Ansible Architecture: Control Node, Inventory, Module and SSH
Article 0 said Ansible is "agentless, over SSH, idempotent". This article dissects exactly that: Ansible's components and, most importantly, what actually happens when you run a command. Once you understand this mechanism, three properties (agentless, idempotent, and modules you can write yourself) stop being a mystery.
The components
Ansible has an important asymmetry: everything lives on the control node, and the target needs almost nothing.
Control node (your machine)              Managed node (target host)
┌──────────────────────────┐             ┌───────────────────────────┐
│ • Ansible (installed)    │             │ • sshd (already there)    │
│ • module library         │             │ • python3 (already there) │
│ • inventory (host list)  │     SSH     │                           │
│ • playbook (YAML)        │ ──────────► │ NO agent installed        │
└──────────────────────────┘             └───────────────────────────┘
- Control node: the machine where Ansible is installed and where you run commands. It holds the module library, the inventory, and your playbooks. (The control node runs only on Linux/macOS; on Windows, use WSL.)
- Managed node (host): the target machine. The only requirement: you can SSH in and it has Python. No Ansible, no agent.
- Inventory: the list of hosts and how they're grouped (Article 3).
- Module: a piece of code (usually Python) that does one job on the host: ping, apt, copy, service... Ansible ships thousands of modules.
- Plugin: extends Ansible itself (connection, filter, lookup, callback...; see Article 12).
What happens when you run a command
This is the deep-dive part. Run the simplest ad-hoc command with the -vvv flag (very verbose) to look inside:
ansible all -m ping -vvv
Filtering for the important lines, we see the exact sequence:
<203.0.113.10> ESTABLISH SSH CONNECTION FOR USER: ec2-user
<203.0.113.10> SSH: EXEC ssh ... '/bin/sh -c ... mkdir -p ".../.ansible/tmp/ansible-tmp-..."'
Using module file .../ansible/modules/ping.py
<203.0.113.10> PUT /tmp/.../tmpXXXX TO /home/ec2-user/.ansible/tmp/.../AnsiballZ_ping.py
<203.0.113.10> SSH: EXEC sftp ... put ... AnsiballZ_ping.py
<203.0.113.10> SSH: EXEC ssh ... chmod u+rwx .../AnsiballZ_ping.py
... python3 .../AnsiballZ_ping.py → JSON back
... rm -rf .../ansible-tmp-...
Walking through each step — this is the "lifecycle" of a task:
1. SSH CONNECT  open an SSH connection to the host (using key, user ec2-user)
2. MKDIR TEMP   create a temp dir ~/.ansible/tmp/ansible-tmp-... on the HOST
3. SHIP MODULE  package the module into "AnsiballZ_ping.py", PUT (sftp) it to the host
4. EXECUTE      run: python3 AnsiballZ_ping.py (WITH the HOST'S Python)
5. RETURN JSON  module prints its result to stdout as JSON → Ansible reads it back
6. CLEANUP      delete the temp dir on the host
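To make the lifecycle concrete, here is a minimal Python sketch that replays steps 2 to 6 on the local machine. It is an illustration under stated assumptions, not Ansible's actual code: there is no SSH (step 1 is skipped), a plain file write stands in for the sftp PUT, and `MODULE_SOURCE` and `run_task_locally` are hypothetical names made up for this example.

```python
import json
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for a module: a tiny program that prints JSON.
# The real ping module is larger; this only mimics its shape.
MODULE_SOURCE = '''
import json
print(json.dumps({"ping": "pong", "changed": False}))
'''


def run_task_locally(module_source: str) -> dict:
    """Replay steps 2-6 of the task lifecycle without any SSH."""
    # 2. MKDIR TEMP: create a scratch directory for this one task.
    tmp_dir = Path(tempfile.mkdtemp(prefix="ansible-tmp-"))
    try:
        # 3. SHIP MODULE: a local file write stands in for the sftp PUT.
        module_path = tmp_dir / "AnsiballZ_ping.py"
        module_path.write_text(module_source)
        # 4. EXECUTE: run the shipped file with a Python interpreter.
        proc = subprocess.run(
            [sys.executable, str(module_path)],
            capture_output=True,
            text=True,
            check=True,
        )
        # 5. RETURN JSON: the module's stdout is the result.
        return json.loads(proc.stdout)
    finally:
        # 6. CLEANUP: remove the scratch directory, even on failure.
        shutil.rmtree(tmp_dir, ignore_errors=True)


result = run_task_locally(MODULE_SOURCE)
print(result)
```

The `try`/`finally` mirrors the idea that cleanup is part of the task itself: nothing is meant to stay behind on the host after the run.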
The core point: Ansible doesn't "remote-control" the host; it SHIPS the code to the host and runs it there. The module is packaged (with the libraries it needs) into a self-contained Python file called AnsiballZ, shipped to the host over SSH, and executed by the host's Python. It returns JSON, and then the temp directory is removed.
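The "self-contained Python file" idea can be sketched in a few lines: zip the module together with a helper library, embed the zip as base64 text inside a small wrapper script, and let the wrapper unpack and run it. This is a simplified illustration of the packaging idea, not a copy of the real AnsiballZ wrapper; `build_payload`, `module_main`, and `helper` are hypothetical names chosen for the example.

```python
import base64
import io
import subprocess
import sys
import zipfile

# Hypothetical helper library and module that must travel together.
HELPER_SOURCE = 'def pong():\n    return "pong"\n'
MODULE_SOURCE = (
    "import json\n"
    "from helper import pong\n"
    'print(json.dumps({"ping": pong(), "changed": False}))\n'
)

# Wrapper script: carries the zip as text, unpacks it, runs the module.
WRAPPER_TEMPLATE = '''\
import base64, os, runpy, sys, tempfile
ZIPDATA = "{zipdata}"
# Write the embedded zip to disk so Python can import from it.
fd, zip_path = tempfile.mkstemp(suffix=".zip")
with os.fdopen(fd, "wb") as handle:
    handle.write(base64.b64decode(ZIPDATA))
try:
    sys.path.insert(0, zip_path)
    runpy.run_module("module_main", run_name="__main__")
finally:
    os.remove(zip_path)
'''


def build_payload(module_source: str, helper_source: str) -> str:
    """Bundle module + helper into one self-contained Python script."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("module_main.py", module_source)
        archive.writestr("helper.py", helper_source)
    zipdata = base64.b64encode(buffer.getvalue()).decode("ascii")
    return WRAPPER_TEMPLATE.format(zipdata=zipdata)


payload = build_payload(MODULE_SOURCE, HELPER_SOURCE)
# The single payload string is all the "host" needs besides Python.
proc = subprocess.run(
    [sys.executable, "-c", payload],
    capture_output=True,
    text=True,
    check=True,
)
print(proc.stdout.strip())
```

The point of the design is that the receiving side needs nothing preinstalled except a Python interpreter: every import the module relies on arrives inside the same file.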
Why this mechanism explains everything
Three of Ansible's characteristics now become obvious:
- Agentless: no agent needed because Ansible brings the code over on each run, using the SSH + Python that already exist on the host. The only requirement of the host is "SSH-able + has Python". There's no background service installed on the host.
- Idempotent: the module (running on the host) checks the current state itself before acting, then returns JSON with a changed: true/false field. It's real code, not a blind command, so it can tell whether the host is already in the right state and skip redundant work (Article 5).
- Writable modules: since a module is just a program that takes parameters (JSON in) and returns JSON out, you can write one yourself in Python (Article 11).
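The last two points can be seen together in a small sketch: a function that takes its parameters as JSON, checks the current state before acting, and reports `changed` in its JSON result. It is a toy written for this article, not a real Ansible module (real modules normally build on Ansible's own helper classes); `ensure_line` and its `path`/`line` parameters are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def ensure_line(params: dict) -> dict:
    """Make sure params['line'] is present in the file at params['path']."""
    path = Path(params["path"])
    line = params["line"]
    text = path.read_text() if path.exists() else ""
    # Check the current state first: if the line is there, do nothing.
    if line in text.splitlines():
        return {"changed": False}
    # Otherwise act, taking care not to glue onto an unterminated last line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    path.write_text(text + separator + line + "\n")
    return {"changed": True}


with tempfile.TemporaryDirectory() as tmp_dir:
    # JSON in: parameters arrive as a JSON document.
    raw_params = json.dumps(
        {"path": str(Path(tmp_dir) / "demo.conf"), "line": "PermitRootLogin no"}
    )
    params = json.loads(raw_params)
    first = ensure_line(params)   # file is missing the line, so it is added
    second = ensure_line(params)  # line already present, nothing to do
    # JSON out: the result is printed as JSON.
    print(json.dumps(first))
    print(json.dumps(second))
```

Running the same call twice is the whole test of idempotence: the first run reports a change, the second finds the desired state already in place and reports none.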
A few details worth noticing in the trace
Looking closely at the -vvv output reveals two SSH options worth understanding:
- ControlMaster=auto ControlPersist=60s: Ansible reuses one SSH connection for multiple tasks (keeping the connection for 60s), instead of opening a new SSH for every task. This is an important optimization when a playbook has many tasks.
- StrictHostKeyChecking=no (which we set in ansible.cfg): skip the host-key confirmation prompt, handy for a lab. In production you should manage known_hosts properly.
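To show where these options sit, the sketch below builds an ssh argument list by hand. It only constructs and prints the command; it does not connect anywhere. The helper name `build_ssh_argv` and the `ControlPath` value are assumptions for illustration (Ansible chooses its own control socket path), and the host-key option is deliberately left out.

```python
import shlex


def build_ssh_argv(host: str, user: str, remote_command: str,
                   persist: str = "60s") -> list:
    """Build an ssh command that shares one master connection."""
    return [
        "ssh",
        # Reuse an existing master connection, or become the master.
        "-o", "ControlMaster=auto",
        # Keep the master open in the background for reuse by later tasks.
        "-o", f"ControlPersist={persist}",
        # Hypothetical socket location; %r, %h, %p expand to the
        # remote user, host, and port.
        "-o", "ControlPath=~/.ssh/cm-%r@%h:%p",
        f"{user}@{host}",
        remote_command,
    ]


argv = build_ssh_argv("203.0.113.10", "ec2-user", "echo hello")
print(shlex.join(argv))
```

With options like these, only the first command pays for the full SSH handshake; commands issued while the master is still alive go through the existing socket.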
There's one more speedup option, pipelining (sending the module over SSH's stdin instead of an sftp PUT, reducing the number of SSH round trips) — we revisit it in Article 13 (optimization).
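The idea behind pipelining can be sketched locally: instead of writing the module to a file and then running that file, feed the source straight to the interpreter's stdin. This is an analogy under assumptions, not Ansible's implementation; there is no SSH here, and `run_via_stdin` is a hypothetical helper.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for a packaged module.
MODULE_SOURCE = (
    "import json\n"
    'print(json.dumps({"ping": "pong", "changed": False}))\n'
)


def run_via_stdin(module_source: str) -> dict:
    """Run module source without creating any file: pipe it to Python."""
    proc = subprocess.run(
        [sys.executable, "-"],  # "-" tells Python to read the program from stdin
        input=module_source,
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)


piped_result = run_via_stdin(MODULE_SOURCE)
print(piped_result)
```

Compared with the file-based lifecycle, the separate "create temp dir", "put file", and "chmod" steps disappear, which is where the saved round trips come from.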
Push model: the control node drives
Unlike agent-based tools (Chef/Puppet) following a pull model (an agent on the host periodically pulls its config down on its own), Ansible follows a push model: the control node actively pushes changes down to the host when you run a command. The upshot: changes happen immediately and under control (you know exactly when they're applied), but the host doesn't "fix itself" between runs — to guarantee state, re-run the playbook (periodically or via CI).
Wrap-up
Ansible puts everything on the control node (Ansible + modules + inventory + playbooks); the managed node only needs SSH + Python, no agent. When running a task, Ansible opens SSH, ships the module (AnsiballZ) to the host, runs it with the host's Python, gets JSON back, then cleans up — observable with -vvv. This "bring the code over and run it in place" mechanism is exactly why Ansible is agentless (nothing to install), idempotent (the module checks state itself), and extensible (a module is just a JSON-in/JSON-out program). This is the push model: the control node actively pushes changes.
Article 2 rolls up its sleeves: install Ansible as the control node, stand up an EC2 as the host, and run the first ad-hoc command — the very ping we just dissected.