Ansible Debugging
- Check connectivity first
- Verbose mode (-v to -vvv)
- --check and --diff
- debug module
- --start-at-task and --step
- Common errors and what they mean
  - UNREACHABLE
  - MODULE FAILURE
  - Undefined variable
  - Template errors
- ansible-lint for catching mistakes early
- Inventory introspection
- Playbook introspection (dry-scope)
- assert module — pre-flight checks
Check connectivity first
Before troubleshooting a playbook failure, confirm Ansible can actually reach the hosts.
# Ping all hosts
ansible all -i inventories/production/hosts.ini -m ping
# Ping one host
ansible web01 -i inventories/production/hosts.ini -m ping
# Check what user Ansible would connect as
ansible web01 -i inventories/production/hosts.ini -m debug -a "var=ansible_user"
A successful ping returns pong. An UNREACHABLE error means SSH cannot connect — wrong host, wrong user, key not in authorized_keys, or firewall blocking port 22.
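The same check can be kept as a small playbook so it is easy to run before a larger change. This is a minimal sketch, assuming a hypothetical file named connectivity-check.yml; the 30-second timeout is an arbitrary choice.

```yaml
# connectivity-check.yml (hypothetical file name)
- name: Confirm every host is reachable before doing real work
  hosts: all
  gather_facts: false   # fact gathering would itself fail on an unreachable host
  tasks:
    - name: Wait up to 30 seconds for a usable connection
      ansible.builtin.wait_for_connection:
        timeout: 30

    - name: Verify that modules can execute on the target
      ansible.builtin.ping:
```

It would be run like any other playbook, for example with ansible-playbook -i inventories/production/hosts.ini connectivity-check.yml.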
Verbose mode (-v to -vvv)
Add -v flags to get progressively more output:
ansible-playbook site.yml -v # task output and return values
ansible-playbook site.yml -vv # more connection detail
ansible-playbook site.yml -vvv # full SSH connection info, module args
ansible-playbook site.yml -vvvv # connection plugin debugging
With -vvv, each task shows:
- The exact module arguments being sent to the host
- The raw return value from the module
- Which SSH connection is being used
Start with -v when a task fails but the error message is not clear. Escalate to -vvv for connection and SSH problems.
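Verbosity can also be used from inside a playbook. As a minimal sketch, the debug module's verbosity parameter keeps diagnostic output hidden on normal runs and prints it only at -vv or higher. The variable chrony_servers is borrowed from the other examples in this guide.

```yaml
- name: Show NTP servers only when running with -vv or higher
  ansible.builtin.debug:
    var: chrony_servers
    verbosity: 2   # task is skipped at the default verbosity and at -v
```

This lets diagnostic tasks stay in the role permanently without cluttering everyday output.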
--check and --diff
The two most important flags for safe production use:
# Dry run — show what would change without changing anything
ansible-playbook site.yml --check
# Show file diffs — what the template would render vs what is on disk
ansible-playbook site.yml --diff
# Both together (most useful)
ansible-playbook site.yml --check --diff
With --diff, when a template or file task would make a change, Ansible shows a unified diff:
--- before: /etc/chrony.conf
+++ after: /etc/chrony.conf
@@ -1,4 +1,5 @@
 server 0.pool.ntp.org iburst
+server ntp1.internal.example.com iburst
 driftfile /var/lib/chrony/drift
 makestep 1.0 3
 rtcsync
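Individual tasks can opt in or out of these behaviours with the check_mode and diff task keywords. The sketch below assumes a read-only chronyc command and a hypothetical secrets template; the file names and paths are placeholders.

```yaml
- name: Read current chrony sources (read-only, safe under --check)
  ansible.builtin.command: chronyc sources
  register: chrony_sources
  changed_when: false
  check_mode: false      # run for real even when the playbook is in --check

- name: Deploy a file containing secrets
  ansible.builtin.template:
    src: app-secrets.conf.j2        # hypothetical template
    dest: /etc/app/secrets.conf     # hypothetical path
    mode: "0600"
  diff: false            # keep secret contents out of --diff output
```

Without check_mode: false, a command task is skipped in a dry run, so any later task that reads its registered result would have nothing to work with.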
debug module
Print variable values during a playbook run:
- name: Show what NTP servers will be used
  ansible.builtin.debug:
    msg: "chrony_servers: {{ chrony_servers }}"
- name: Show entire variable dict
  ansible.builtin.debug:
    var: chrony_servers
- name: Show multiple values
  ansible.builtin.debug:
    msg: |
      hostname: {{ inventory_hostname }}
      NTP servers: {{ chrony_servers | join(', ') }}
      TLS enabled: {{ enable_tls }}
Print the output of a registered task result:
- name: Run config check
  ansible.builtin.command: nginx -t
  register: nginx_check
  ignore_errors: true
- name: Show result
  ansible.builtin.debug:
    var: nginx_check
# nginx_check will contain:
# - nginx_check.rc (return code; 0 = success)
# - nginx_check.stdout (standard output)
# - nginx_check.stderr (standard error)
# - nginx_check.changed (true by default for command; set changed_when to control it)
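A registered result is more useful when the playbook acts on it. The following is a sketch of one possible pattern, not the only one: the check is marked as never changing anything, its failure is handled explicitly, and stderr is printed only when the return code is non-zero.

```yaml
- name: Run config check
  ansible.builtin.command: nginx -t
  register: nginx_check
  changed_when: false     # a syntax check never modifies the host
  failed_when: false      # failure is handled explicitly below
  check_mode: false       # read-only, so let it run under --check too

- name: Show stderr only when the check failed
  ansible.builtin.debug:
    var: nginx_check.stderr_lines
  when: nginx_check.rc != 0

- name: Stop the play with a clear message
  ansible.builtin.fail:
    msg: "nginx -t failed with rc={{ nginx_check.rc }}"
  when: nginx_check.rc != 0
```

Compared with ignore_errors: true, this keeps the run output free of red "ignored" failures while still stopping the play on a bad config.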
--start-at-task and --step
Skip to a specific task without re-running everything:
# Start from a specific task by name
ansible-playbook site.yml --start-at-task "Deploy nginx config"
# Interactive mode — confirm each task before running
ansible-playbook site.yml --step
--start-at-task is useful when a long playbook fails at step 47 and you have already fixed the problem — you can resume from there instead of starting over.
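One caveat is worth sketching: tasks before the starting point are not executed, so anything they would have registered does not exist on the resumed run. The example assumes a hypothetical later task that depends on the nginx_check result from the debug module section, and guards against that result being missing.

```yaml
- name: Reload nginx only when a config check result exists and passed
  ansible.builtin.service:
    name: nginx
    state: reloaded
  become: true
  # nginx_check is undefined if its registering task was skipped by --start-at-task
  when:
    - nginx_check is defined
    - nginx_check.rc | default(1) == 0
```

If a resumed run needs such values, the simpler option is often to start one task earlier so the registering task runs again.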
Common errors and what they mean
These are the errors you will see most often. Each has a predictable cause and a fast diagnostic path.
UNREACHABLE
fatal: [web01]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ...", "unreachable": true}
Ansible could not SSH to the host. Diagnose in order:
# Try SSH manually with the ansible user
ssh ansible@web01
# Check the right key is being used
ansible web01 -m ping -vvv # look for "identity file" lines
# Confirm the host is in DNS/resolves
dig web01
# Check port 22 is reachable
nc -zv web01 22
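If the manual SSH test only works with a particular address, user, port, or key, those settings can be pinned in inventory. This is a minimal sketch, assuming a hypothetical host_vars file for web01; every value is a placeholder rather than something taken from this guide's inventory.

```yaml
# inventories/production/host_vars/web01.yml (hypothetical file and values)
ansible_host: 192.0.2.10                          # address to use if web01 does not resolve
ansible_user: ansible                             # remote login user
ansible_port: 22                                  # SSH port
ansible_ssh_private_key_file: ~/.ssh/id_ed25519   # key offered to the host
```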
MODULE FAILURE
fatal: [web01]: FAILED! => {
"changed": false,
"msg": "Could not find the requested service chronyd: host",
"rc": 1
}
The module ran but the action failed. Read the msg field — it usually tells you exactly what went wrong. Common causes:
- Package not found: wrong package name for this OS; check with dnf search or apt-cache search
- Service not found: the service name differs between distros (e.g. httpd vs apache2)
- Permission denied: task needs become: true
- File not found: template or file source path is wrong
Add -v to see the full module output and narrow down the cause.
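For the distro-specific service name case, one possible fix is to choose the name from gathered facts. The sketch assumes fact gathering is enabled for the play and uses the httpd and apache2 names mentioned above.

```yaml
- name: Start the web server under its distro-specific name
  ansible.builtin.service:
    # RedHat-family hosts call the service httpd; this sketch treats everything else as apache2
    name: "{{ 'httpd' if ansible_facts['os_family'] == 'RedHat' else 'apache2' }}"
    state: started
  become: true   # managing services normally needs root
```

For more than two families, a per-OS variables file is usually easier to maintain than an inline conditional.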
Undefined variable
fatal: [web01]: FAILED! => {
"msg": "The task includes an option with an undefined variable.
The error was: 'chrony_servers' is undefined"
}
The variable is referenced in a task or template but was never defined. Check:
# Is it in the role's defaults?
grep chrony_servers roles/chrony/defaults/main.yml
# Is it in group_vars?
grep -r "chrony_servers" inventories/
# Inspect what Ansible sees for this host
ansible web01 -m debug -a "var=chrony_servers"
If the variable should be optional, use a default filter in the template: {{ chrony_servers | default([]) }}
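The same default filter works in tasks. As a small sketch, these two debug tasks tolerate a missing chrony_servers instead of failing with an undefined variable error.

```yaml
- name: Warn when no NTP servers are configured
  ansible.builtin.debug:
    msg: "chrony_servers is empty or undefined for {{ inventory_hostname }}"
  when: chrony_servers | default([]) | length == 0

- name: Show each configured NTP server
  ansible.builtin.debug:
    msg: "NTP server: {{ item }}"
  loop: "{{ chrony_servers | default([]) }}"   # an empty list means zero iterations
```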
Template errors
fatal: [web01]: FAILED! => {
"msg": "AnsibleError: template error while templating string:
expected token 'end of print statement', got '|'. ..."
}
Syntax error in a Jinja2 template. Common causes:
- Unclosed {% if %} without {% endif %}
- Unclosed {% for %} without {% endfor %}
- Variable name typo inside {{ }}
- YAML quoting issue: a {{ }} expression at the start of a YAML value must be quoted: dest: "{{ path }}"
Test the template in isolation:
# Render the template locally (shows what it would produce)
# Pass list values as JSON; key=value extra vars are always treated as strings
ansible localhost -m template \
  -a "src=roles/chrony/templates/chrony.conf.j2 dest=/tmp/chrony.conf.test" \
  -e '{"chrony_servers": ["0.pool.ntp.org"]}'
cat /tmp/chrony.conf.test
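The YAML quoting issue listed above is easiest to see side by side. In this sketch, chrony_conf_dir is a hypothetical variable; the commented-out line is the form that breaks.

```yaml
- name: Deploy chrony config
  ansible.builtin.template:
    src: chrony.conf.j2
    # Invalid: YAML reads the leading "{" as the start of an inline mapping
    # dest: {{ chrony_conf_dir }}/chrony.conf
    # Valid: the whole value is quoted
    dest: "{{ chrony_conf_dir }}/chrony.conf"
```

An expression that appears later in the value, such as dest: /etc/{{ name }}.conf, does not need quotes, though quoting it anyway is harmless.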
ansible-lint for catching mistakes early
# Lint a specific playbook
ansible-lint site.yml
# Lint everything
ansible-lint
Run this before every push. ansible-lint catches:
- Missing task names
- Using shell when a module would be better
- become_user set without a matching become
- Deprecated module syntax
- YAML formatting issues
If ansible-lint gives false positives for rules you disagree with, you can skip specific rules with # noqa: rule-name on the task or configure them in .ansible-lint.
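A project-wide configuration might look like the sketch below. The keys skip_list, warn_list, and exclude_paths are ansible-lint configuration options; the rule names are only examples and should be checked against the rules your ansible-lint version reports.

```yaml
# .ansible-lint (example rule selections, adjust to your project)
skip_list:
  - yaml[line-length]     # do not report long lines at all
warn_list:
  - no-changed-when       # report, but do not fail the run
exclude_paths:
  - .git/
```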
Inventory introspection
Before running any playbook, you can inspect exactly what Ansible sees from your inventory — merged variables, group memberships, and all. This is the fastest way to debug "wrong variable" or "wrong host" issues.
# Show all hosts and groups as JSON (merged variable view)
ansible-inventory -i inventories/production/ --list
# Show host tree (groups → hosts)
ansible-inventory -i inventories/production/ --graph
# Show all merged variables for a specific host
ansible-inventory -i inventories/production/ --host web01
--host is the most useful flag. It shows the final merged variable values Ansible will use for that host — including group_vars, host_vars, and dynamic inventory. If the value here is wrong, the problem is in inventory, not in your playbook.
# Show the group tree with the variables attached to each group and host
ansible-inventory -i inventories/production/ --graph --vars
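To make the merging concrete, here is a sketch with two hypothetical group_vars files that both define chrony_servers. The file names and values are assumptions for illustration.

```yaml
# inventories/production/group_vars/all.yml (hypothetical)
---
chrony_servers:
  - 0.pool.ntp.org

# inventories/production/group_vars/webservers.yml (hypothetical)
---
chrony_servers:
  - ntp1.internal.example.com
```

For a host in the webservers group, and assuming nothing in host_vars overrides it, --host would report only ntp1.internal.example.com: the more specific group wins, and the list is replaced rather than merged.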
Playbook introspection (dry-scope)
These flags let you see the scope of a playbook run without executing anything — essential for confirming tags, hosts, and task order before a maintenance window.
# Validate YAML and role syntax without running tasks
ansible-playbook site.yml --syntax-check
# List all tasks that would run (in order)
ansible-playbook site.yml --list-tasks
# List all tags defined across the playbook
ansible-playbook site.yml --list-tags
# List which hosts would be targeted
ansible-playbook site.yml -i inventories/production/ --list-hosts
# Combine: see what tasks would run with a specific tag on specific hosts
ansible-playbook site.yml -i inventories/production/ \
--tags chrony \
--limit webservers \
--list-tasks
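Run --list-tasks before every production change to confirm which tasks are in scope for the tags you selected. It does not evaluate when: conditions, so a listed task can still be skipped at run time; use --check to see how conditions resolve on each host.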
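As an illustration of how tags shape that listing, consider this sketch of a play. The task names, template, and paths are hypothetical; the comments describe what --tags chrony --list-tasks would be expected to include.

```yaml
- name: Configure web servers
  hosts: webservers
  become: true
  tasks:
    - name: Verify required variables
      ansible.builtin.assert:
        that:
          - chrony_servers is defined
      tags: [always]        # listed for every --tags selection

    - name: Install chrony
      ansible.builtin.package:
        name: chrony
        state: present
      tags: [chrony]        # listed with --tags chrony

    - name: Deploy nginx config
      ansible.builtin.template:
        src: nginx.conf.j2              # hypothetical template
        dest: /etc/nginx/nginx.conf
      tags: [nginx]         # not listed with --tags chrony
```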
assert module — pre-flight checks
Add assert tasks at the start of a role or playbook to fail loudly with a helpful message if required variables are missing or invalid. This catches problems before any change is made.
# Pre-flight checks at the start of a play
- name: Pre-flight - verify required variables
  ansible.builtin.assert:
    that:
      - db_host is defined
      - db_host | length > 0
      - db_port is defined
      - db_port | int > 0 and db_port | int < 65536
      - env in ['staging', 'production']
    fail_msg: >
      Required variable missing or invalid.
      db_host={{ db_host | default('UNDEFINED') }},
      db_port={{ db_port | default('UNDEFINED') }},
      env={{ env | default('UNDEFINED') }}
    success_msg: "Pre-flight checks passed"
Practical patterns:
# Check that a variable matches a pattern (e.g. version string)
- ansible.builtin.assert:
    that:
      - app_version is match('^[0-9]+\.[0-9]+\.[0-9]+$')
    fail_msg: "app_version must be semver (got: {{ app_version }})"
# Check OS is supported (requires gathered facts)
- ansible.builtin.assert:
    that:
      - ansible_os_family in ['RedHat', 'Debian']
    fail_msg: "This role only supports RedHat and Debian families"
# Check a service is running before proceeding
# (requires ansible.builtin.service_facts to have run earlier in the play)
- ansible.builtin.assert:
    that:
      - ansible_facts.services['postgresql.service'].state == 'running'
    fail_msg: "PostgreSQL must be running before applying app config"
Put assert tasks in a block with tags: always so they run even when you use --tags to run only part of a playbook.
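A minimal sketch of that layout, reusing the db_host and env checks from above; the block name is arbitrary.

```yaml
- name: Pre-flight checks
  tags: [always]          # inherited by every task in the block
  block:
    - name: Verify required variables
      ansible.builtin.assert:
        that:
          - db_host is defined
          - env in ['staging', 'production']
        fail_msg: "db_host must be set and env must be staging or production"

    - name: Verify supported OS family
      ansible.builtin.assert:
        that:
          - ansible_facts['os_family'] in ['RedHat', 'Debian']
        fail_msg: "This role only supports RedHat and Debian families"
```

With this in place, a run such as ansible-playbook site.yml --tags chrony still executes the checks first; they are skipped only if always is explicitly excluded with --skip-tags.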