Ansible Foundations
What Ansible is
Ansible is an automation tool. It connects to machines over SSH (no agent required on the target) and performs tasks like:
- Installing packages
- Writing config files from templates
- Managing services
- Creating users
- Running commands
Why people use it
Instead of manually editing 20 servers, you define the desired state once and apply it consistently. Most modules are idempotent: running them again changes nothing once the host is already in the desired state.
Key terms
- inventory: The list of hosts and groups Ansible can target.
- playbook: A YAML file describing what to do; contains one or more plays.
- play: A section of a playbook that maps a set of hosts to a list of tasks.
- task: A single action using one module.
- module: The actual piece of functionality: dnf, service, copy, template, etc.
- role: A reusable structure bundling tasks, templates, handlers, vars, and files.
- handler: A task that runs only when notified by a change; typically used for service restarts.
- idempotent: Produces the same result if run multiple times. A key Ansible goal.
- fact: Data Ansible discovers about the target host automatically (OS, IPs, memory, etc.).
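To tie the terms together, here is a minimal sketch of a playbook with one play, one task, and one module that prints a few facts. It assumes a web group exists in the inventory and that the targets are Linux hosts (memtotal_mb is a Linux fact); the values printed depend entirely on the host.

```yaml
# One playbook containing one play.
- name: Show basic facts                 # play
  hosts: web                             # assumed inventory group
  gather_facts: true                     # the default; shown for clarity
  tasks:
    - name: Print distribution and memory    # task
      ansible.builtin.debug:                  # module
        msg: >-
          {{ ansible_facts['distribution'] }}
          {{ ansible_facts['distribution_version'] }},
          {{ ansible_facts['memtotal_mb'] }} MB RAM
```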
Example playbook
- name: Install and start nginx
  hosts: web
  become: true
  tasks:
    - name: Install nginx
      dnf:
        name: nginx
        state: present
    - name: Start nginx
      service:
        name: nginx
        state: started
        enabled: true
What each part means:
- name: is the human-readable label for the play. It appears in output.
- hosts: web targets the web group from inventory.
- become: true uses privilege escalation (sudo).
- tasks: begins the list of actions.
- The first task uses dnf to ensure nginx is installed (state: present).
- The second task uses service to ensure nginx is running and starts on boot.
Inventory
INI style
[web]
web01
web02
YAML style
all:
  children:
    web:
      hosts:
        web01:
        web02:
Common commands
ansible all -m ping
ansible all -i inventory.yml -m ping
Tests whether Ansible can reach all hosts. Note: -m ping is an Ansible module check, not ICMP ping.
ansible-playbook
ansible-playbook -i inventory.yml site.yml
ansible-playbook -i inventory.yml site.yml --check
ansible-playbook -i inventory.yml site.yml --diff
ansible-playbook -i inventory.yml site.yml --limit web01
ansible-playbook -i inventory.yml site.yml --tags nginx
--check is a dry-run — shows what would change without applying it. --diff shows file differences. Always run --check first on production systems.
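The --tags nginx filter only selects tasks that carry that tag, and command tasks are normally skipped under --check. A minimal sketch of how tasks might be tagged, with a read-only validation forced to run even in check mode; the task names and tag choices are illustrative assumptions, and the validation step assumes nginx is already installed on the host.

```yaml
- name: Configure web servers
  hosts: web
  become: true
  tasks:
    - name: Install nginx
      ansible.builtin.dnf:
        name: nginx
        state: present
      tags: [nginx, packages]

    - name: Validate nginx configuration
      ansible.builtin.command: nginx -t
      changed_when: false     # read-only check, never reports "changed"
      check_mode: false       # run for real even when --check is given
      tags: [nginx]
```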
Common modules
- ansible.builtin.package / dnf / apt: install/remove packages
- ansible.builtin.service / systemd: manage services
- ansible.builtin.copy: copy static files to hosts
- ansible.builtin.template: render a Jinja2 template and deploy it
- ansible.builtin.file: manage file/directory state and permissions
- ansible.builtin.user / group: manage OS users and groups
- ansible.builtin.lineinfile: manage individual lines in a file
- ansible.builtin.command: run a command (not through a shell)
- ansible.builtin.shell: run a shell command (use sparingly)
- ansible.builtin.debug: print a variable or message for debugging
- ansible.builtin.assert: validate assumptions before doing risky work
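As a rough sketch of how a few of these modules combine, the play below checks an assumption, creates a directory, and manages one line in a config file. The paths, the memory threshold, and the log_level setting are hypothetical.

```yaml
- name: Prepare application directory
  hosts: web
  become: true
  tasks:
    - name: Refuse to continue on hosts with little memory
      ansible.builtin.assert:
        that:
          - ansible_facts['memtotal_mb'] >= 1024
        fail_msg: "At least 1 GiB of RAM is required"

    - name: Ensure the application directory exists
      ansible.builtin.file:
        path: /srv/app                 # hypothetical path
        state: directory
        owner: root
        group: root
        mode: "0755"

    - name: Ensure a setting is present in the config file
      ansible.builtin.lineinfile:
        path: /srv/app/app.conf        # hypothetical file
        regexp: '^log_level='
        line: log_level=info
        create: true
        mode: "0644"
```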
Why the name field matters
Every task should have a name. It appears in output and helps when something fails.
# Hard to debug
- shell: some command
# Easy to debug
- name: Restart nginx after config change
  service:
    name: nginx
    state: restarted
become: true
Run tasks with privilege escalation (typically sudo). Set at the play level to apply to all tasks, or per task for finer control.
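A minimal sketch of the per-task form, assuming a play that mostly runs unprivileged and escalates for a single task:

```yaml
- name: Mixed privilege levels
  hosts: web
  become: false                 # play-level default: no escalation
  tasks:
    - name: Read the connecting user (no escalation needed)
      ansible.builtin.command: whoami
      changed_when: false

    - name: Install nginx as root
      ansible.builtin.dnf:
        name: nginx
        state: present
      become: true              # per-task override of the play default
```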
register
Capture the output of a task into a variable for use in later tasks:
- name: Check nginx version
  command: nginx -v
  register: nginx_version
  ignore_errors: true
- name: Show the registered result (nginx -v prints to stderr)
  debug:
    var: nginx_version
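A registered result is a dictionary, so later tasks can read individual fields such as rc, stdout, and stderr. The sketch below assumes nginx may or may not be installed and reports the version only when the command succeeded.

```yaml
- name: Inspect a registered result
  hosts: web
  tasks:
    - name: Check nginx version
      ansible.builtin.command: nginx -v
      register: nginx_version
      changed_when: false
      failed_when: false        # a missing binary is handled by the condition below

    - name: Report the version (nginx -v writes to stderr)
      ansible.builtin.debug:
        msg: "Found {{ nginx_version.stderr }}"
      when: nginx_version.rc == 0
```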
when (conditionals)
Skips the task if the condition is false. The condition is a Jinja2 expression evaluated at runtime.
# Fact-based: only run on RedHat-family systems
- name: Install SELinux tools
  ansible.builtin.dnf:
    name: policycoreutils-python-utils
    state: present
  when: ansible_os_family == "RedHat"
# Registered variable: act on the result of a previous task
- name: Check if config file exists
  ansible.builtin.stat:
    path: /etc/myapp/myapp.conf
  register: conf_stat
- name: Migrate old config format
  ansible.builtin.command: /usr/local/bin/migrate-config.sh
  when: conf_stat.stat.exists and conf_stat.stat.size < 100
# Multiple conditions: all must be true (and)
- name: Enable TLS only on production web servers
  ansible.builtin.include_tasks: enable-tls.yml
  when:
    - env == "production"
    - ansible_os_family == "RedHat"
    - nginx_port == 443
# Multiple conditions: any can be true (or)
- name: Restart on RedHat or Debian
  ansible.builtin.service:
    name: nginx
    state: restarted
  when: ansible_os_family == "RedHat" or ansible_os_family == "Debian"
# when with loop: the condition is evaluated per item
- name: Create dirs, skipping /tmp/scratch in production
  ansible.builtin.file:
    path: "{{ item.path }}"
    state: directory
    mode: "{{ item.mode }}"
  loop:
    - { path: /srv/app, mode: "0755" }
    - { path: /srv/app/logs, mode: "0750" }
    - { path: /tmp/scratch, mode: "0777" }
  when: item.path != "/tmp/scratch" or env != "production"
when: written as a list (indented items) is an implicit and — all conditions must be true. Use or inline for "any of" logic. Conditions can reference facts (ansible_*), registered vars, inventory vars, and play vars.
Handlers
Handlers run only when notified by a task that reported a change. They run once at the end of the play, even if notified multiple times.
tasks:
  - name: Deploy nginx config
    template:
      src: nginx.conf.j2
      dest: /etc/nginx/nginx.conf
    notify: Restart nginx
handlers:
  - name: Restart nginx
    service:
      name: nginx
      state: restarted
Why use handlers: if the config did not change, there is no notification, so nginx is not unnecessarily restarted.
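Because handlers normally wait until the end of the play, a later task that depends on the restarted service may need them to run earlier. One way to do that is a meta task, sketched here; the port number and timeout are assumptions.

```yaml
- name: Configure nginx and verify it
  hosts: web
  become: true
  tasks:
    - name: Deploy nginx config
      ansible.builtin.template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      notify: Restart nginx

    - name: Run pending handlers now instead of at the end of the play
      ansible.builtin.meta: flush_handlers

    - name: Wait for nginx to accept connections
      ansible.builtin.wait_for:
        port: 80                # assumed listen port
        timeout: 30

  handlers:
    - name: Restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted
```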
Linting
Linting checks code for style problems, risky patterns, bad practices, and likely mistakes:
ansible-lint playbook.yml
Run this before every deployment. It catches common errors that would cause failures or produce unexpected behaviour.
Best practices
- Prefer modules over shell/command: most modules are idempotent, while shell and command run every time unless you guard them
- Keep tasks small and focused on one thing
- Use handlers for service restarts instead of restarting directly in a task
- Use templates for config files rather than hardcoding them
- Name every task clearly
- Avoid hardcoding values; use variables instead
- Test with --check and --diff before running for real
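The first point can be sketched as follows: a module checks the current state by itself, while a command needs a hint about when it can be skipped. The script path and marker file are hypothetical.

```yaml
- name: Idempotence with and without a dedicated module
  hosts: web
  become: true
  tasks:
    # Preferred: the module compares current and desired state itself.
    - name: Ensure git is installed
      ansible.builtin.package:
        name: git
        state: present

    # When only a command will do, tell Ansible when it can be skipped.
    - name: Initialise the application database once
      ansible.builtin.command:
        cmd: /usr/local/bin/init-db.sh          # hypothetical script
        creates: /var/lib/myapp/.initialised    # skip if this marker file exists
```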
group_vars / host_vars
Ansible automatically loads variable files based on the name of a group or host. This lets you define different values for different environments without touching your playbook.
Directory structure:
inventory/
  hosts.ini
  group_vars/
    all.yml          # applies to every host
    webservers.yml   # applies to hosts in [webservers]
    staging.yml      # applies to hosts in [staging]
  host_vars/
    web01.yml        # applies only to web01
Ansible picks these up automatically — no import needed. Variables in host_vars override group_vars for that specific host.
Example group_vars/webservers.yml:
---
nginx_port: 443
app_user: deploy
Use group_vars/all.yml for defaults, and override specific values per group or host. This keeps your playbook clean and free of hardcoded values.
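A short sketch of a playbook that relies on these variables without defining them itself. The defaults shown in the comments for group_vars/all.yml are hypothetical; hosts in webservers would pick up nginx_port 443 and app_user deploy from the group file, since group-specific values override those in all.

```yaml
# group_vars/all.yml (hypothetical defaults for every host):
#   nginx_port: 80
#   app_user: www-data

# site.yml
- name: Configure web servers
  hosts: webservers
  become: true
  tasks:
    - name: Ensure the application user exists
      ansible.builtin.user:
        name: "{{ app_user }}"

    - name: Show which port this host will use
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }} serves on port {{ nginx_port }}"
```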
Ansible Vault
Ansible Vault encrypts sensitive values — passwords, API keys, private keys — so you can safely commit them to version control without exposing secrets.
Encrypt an individual string:
ansible-vault encrypt_string 'my_secret_password' --name 'db_password'
This outputs an encrypted block you paste directly into a YAML variable file. The value is unreadable in the file but Ansible decrypts it at runtime.
Encrypt or decrypt an entire file:
ansible-vault encrypt secrets.yml
ansible-vault decrypt secrets.yml
ansible-vault view secrets.yml
Run a playbook that uses vault-encrypted values:
ansible-playbook site.yml --ask-vault-pass
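Once the encrypted block is in a variable file, tasks use the variable like any other. The sketch below shows the general shape of a pasted encrypt_string block in a comment (ciphertext deliberately elided) and a task that consumes it; the destination path and file format are hypothetical.

```yaml
# group_vars/all.yml: general shape of the pasted encrypt_string output.
# The ciphertext is elided here; paste the real block from the command.
#   db_password: !vault |
#     $ANSIBLE_VAULT;1.1;AES256
#     ...

- name: Use a vaulted value
  hosts: web
  become: true
  tasks:
    - name: Write database credentials
      ansible.builtin.copy:
        content: "DB_PASSWORD={{ db_password }}\n"
        dest: /etc/myapp/db.env          # hypothetical path
        owner: root
        group: root
        mode: "0600"
      no_log: true                       # keep the secret out of task output
```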
ansible.cfg
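Ansible uses the first configuration file it finds, in this order: the path in the ANSIBLE_CONFIG environment variable, ansible.cfg in the current directory, ~/.ansible.cfg, then /etc/ansible/ansible.cfg. Settings are not merged across files. A project-level ansible.cfg is the standard for team repos.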
# Comments sit on their own lines: an inline # would be read as part of the value.
[defaults]
# default inventory path
inventory = inventory/hosts.ini
# where to find roles
roles_path = roles
# skip SSH host key check (dev only)
host_key_checking = False
# don't create *.retry files on failure
retry_files_enabled = False
# cleaner output format
stdout_callback = yaml
# run on 10 hosts in parallel
forks = 10
[privilege_escalation]
# become root by default (can override per play)
become = True
become_method = sudo
Set host_key_checking = False only in development. In CI and production, pre-populate known_hosts instead. Commit ansible.cfg to your repo — it documents your project's conventions and removes the need for per-run flags.
Loops
- name: Install required packages
  ansible.builtin.dnf:
    name: "{{ item }}"
    state: present
  loop:
    - nginx
    - git
    - python3
- name: Create application users
  ansible.builtin.user:
    name: "{{ item.name }}"
    shell: "{{ item.shell }}"
  loop:
    - { name: deploy, shell: /bin/bash }
    - { name: app, shell: /sbin/nologin }
- name: Loop with custom variable name (avoids 'item' collision in nested loops)
  ansible.builtin.debug:
    msg: "Processing {{ pkg }}"
  loop: "{{ packages }}"
  loop_control:
    loop_var: pkg
loop: is the modern syntax. The legacy with_items:, with_dict:, etc. still work but are not recommended for new playbooks — loop: with filters covers all the same cases more consistently.
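As one example of loop: with a filter, a dictionary that would previously have been iterated with with_dict: can be converted by dict2items, which yields items with key and value fields. The app_users data here is hypothetical.

```yaml
- name: Loop over a dictionary
  hosts: web
  become: true
  vars:
    app_users:                 # hypothetical data: user name -> login shell
      deploy: /bin/bash
      app: /sbin/nologin
  tasks:
    - name: Create application users
      ansible.builtin.user:
        name: "{{ item.key }}"
        shell: "{{ item.value }}"
      loop: "{{ app_users | dict2items }}"
```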
Error handling
By default, when a task fails on a host, Ansible stops running the rest of the play on that host (other hosts continue). Use block/rescue/always to handle failures gracefully; the structure is equivalent to try/except/finally.
- name: Deploy application with rollback on failure
  block:
    - name: Pull latest image
      ansible.builtin.command: docker pull myapp:latest
    - name: Restart service
      ansible.builtin.service:
        name: myapp
        state: restarted
  rescue:
    - name: Rollback to previous image
      ansible.builtin.command: docker pull myapp:previous
    - name: Log failure
      ansible.builtin.debug:
        msg: "Deployment failed, rolled back."
  always:
    - name: Verify service is running
      ansible.builtin.service:
        name: myapp
        state: started
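Inside rescue, Ansible exposes ansible_failed_task and ansible_failed_result, which describe what went wrong in the block. A minimal sketch that reports them, assuming a hypothetical deploy script:

```yaml
- name: Report what failed
  hosts: web
  tasks:
    - name: Guarded section
      block:
        - name: Run a step that may fail
          ansible.builtin.command: /usr/local/bin/deploy.sh   # hypothetical script
      rescue:
        - name: Show the failing task and its error
          ansible.builtin.debug:
            msg: >-
              Task "{{ ansible_failed_task.name }}" failed:
              {{ ansible_failed_result.msg | default('no message') }}
```

If every rescue task succeeds, the host is treated as recovered and the play continues with it.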
failed_when and changed_when
- name: Check if service is healthy
  ansible.builtin.command: /usr/local/bin/health-check.sh
  register: health
  failed_when: health.rc != 0 and "degraded" not in health.stdout
  changed_when: false  # this command never makes changes; don't mark changed
failed_when replaces Ansible's built-in failure detection for the task, which is useful when a command returns non-zero but you don't always want that treated as a failure. changed_when: false prevents read-only tasks from reporting "changed", which keeps the play recap meaningful.
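changed_when can also take an expression, so a command reports a change only when it actually did something. The sketch below assumes a hypothetical script that prints "applied" when it modified anything and exits with code 2 when there was nothing to do.

```yaml
- name: Report changes accurately for a command
  hosts: web
  become: true
  tasks:
    - name: Apply pending schema migrations
      ansible.builtin.command: /usr/local/bin/migrate.sh   # hypothetical script
      register: migrate
      # Assumed behaviour: the script prints "applied" when it changed something.
      changed_when: "'applied' in migrate.stdout"
      # Assumed behaviour: exit code 2 means "nothing to do", not an error.
      failed_when: migrate.rc not in [0, 2]
```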
Vault in CI/CD
--ask-vault-pass is interactive — it cannot be used in CI. Use a vault password file instead.
# Store the vault password in a CI/CD secret variable, then write it to a file
echo "$VAULT_PASSWORD" > .vault-pass
chmod 600 .vault-pass
ansible-playbook site.yml --vault-password-file .vault-pass
# Or set the path in ansible.cfg
# vault_password_file = .vault-pass
In GitLab CI, store the vault password as a masked, protected CI variable (e.g. VAULT_PASSWORD). Write it to a temp file at the start of the job and ensure you do not commit or print it. With vault IDs you can use multiple passwords simultaneously: ansible-vault encrypt --vault-id prod@.vault-prod secrets.yml and ansible-playbook --vault-id prod@.vault-prod site.yml.
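A rough sketch of a GitLab CI job following that advice. It assumes a runner image with Ansible already installed, a masked VAULT_PASSWORD variable, and the inventory path used earlier; the job name is arbitrary.

```yaml
# .gitlab-ci.yml (sketch)
deploy:
  stage: deploy
  before_script:
    - umask 077                                        # new files readable by the owner only
    - printf '%s\n' "$VAULT_PASSWORD" > .vault-pass    # never echo or commit this file
  script:
    - ansible-playbook -i inventory/hosts.ini site.yml --vault-password-file .vault-pass
  after_script:
    - rm -f .vault-pass                                # clean up even if the job fails
```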