---
title: "Advanced Ansible Patterns: Roles, Collections, Dynamic Inventory, Vault, and Testing"
description: "Decision framework for advanced Ansible patterns covering roles vs collections, dynamic inventory strategies, vault encryption, callback plugins, custom modules, Molecule testing, and CI integration with guidance on when to use each pattern at different infrastructure scales."
url: https://agent-zone.ai/knowledge/infrastructure/ansible-advanced-patterns/
section: knowledge
date: 2026-02-22
categories: ["infrastructure"]
tags: ["ansible","roles","collections","dynamic-inventory","ansible-vault","molecule","custom-modules","callback-plugins","ci-cd"]
skills: ["ansible-advanced","configuration-management","infrastructure-testing","automation-architecture"]
tools: ["ansible","ansible-galaxy","ansible-vault","molecule","ansible-lint","ansible-navigator"]
levels: ["intermediate","advanced"]
word_count: 1834
formats:
  json: https://agent-zone.ai/knowledge/infrastructure/ansible-advanced-patterns/index.json
  html: https://agent-zone.ai/knowledge/infrastructure/ansible-advanced-patterns/?format=html
  api: https://api.agent-zone.ai/api/v1/knowledge/search?q=Advanced+Ansible+Patterns%3A+Roles%2C+Collections%2C+Dynamic+Inventory%2C+Vault%2C+and+Testing
---


# Advanced Ansible Patterns

As infrastructure grows from a handful of servers to hundreds or thousands, Ansible patterns that worked at small scale become bottlenecks. Playbooks that were simple and readable at 10 hosts become tangled at 100. Roles that were self-contained become duplicated across teams. This framework helps you decide which advanced patterns to adopt and when.

## Roles vs Collections

Roles and collections both organize Ansible content, but they serve different purposes and operate at different scales.

**Roles** are the basic unit of reusable Ansible content. A role encapsulates tasks, handlers, templates, files, and variables into a directory structure. Roles live inside a project or are shared via Ansible Galaxy.
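The standard role layout, roughly what `ansible-galaxy role init` scaffolds, looks like this:

```
webserver/
  defaults/main.yml    # lowest-precedence variables, intended to be overridden
  vars/main.yml        # internal variables with higher precedence
  tasks/main.yml       # entry point for the role's tasks
  handlers/main.yml    # handlers notified by tasks
  templates/           # Jinja2 templates
  files/               # static files for copy/unarchive tasks
  meta/main.yml        # role metadata and dependencies
```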

**Collections** are the distribution unit for Ansible content. A collection packages multiple roles, modules, plugins, and playbooks under a namespace. Collections are versioned, installable via `ansible-galaxy`, and can declare dependencies on other collections.

### When to use roles alone

Use roles when your infrastructure is managed by a single team, roles are used within one or two projects, you have fewer than 20 roles, and you do not write custom modules or plugins. At this scale, the overhead of creating and versioning collections adds complexity without proportional benefit.

```
# Simple project structure with roles
site.yml
inventory/
  production/
    hosts.yml
    group_vars/
  staging/
    hosts.yml
    group_vars/
roles/
  webserver/
  database/
  monitoring/
  common/
```

### When to move to collections

Move to collections when roles are shared across multiple projects or teams, you write custom modules, plugins, or filters that need distribution, you need semantic versioning and dependency management for your Ansible content, or your organization has more than 50 roles and needs namespace organization.

```
# Collection structure
namespace/
  infra_platform/
    galaxy.yml
    roles/
      webserver/
      database/
      monitoring/
    plugins/
      modules/
        custom_deploy.py
      filter/
        network_utils.py
      callback/
        custom_logger.py
    playbooks/
      site.yml
```
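As a sketch of what a collection filter plugin can look like, here is a minimal `network_utils.py` matching the tree above; `cidr_to_netmask` is a hypothetical helper, not part of any published collection:

```python
# plugins/filter/network_utils.py - minimal filter plugin sketch.
# cidr_to_netmask is a hypothetical example helper.

def cidr_to_netmask(prefix_len):
    """Convert a CIDR prefix length (e.g. 24) to a dotted-quad netmask."""
    bits = (0xFFFFFFFF << (32 - int(prefix_len))) & 0xFFFFFFFF
    return '.'.join(str((bits >> shift) & 0xFF) for shift in (24, 16, 8, 0))


class FilterModule(object):
    """Ansible discovers filters through this class's filters() mapping."""

    def filters(self):
        return {'cidr_to_netmask': cidr_to_netmask}
```

Once the collection is installed, templates can call it by its fully qualified name: `{{ 24 | myorg.infra_platform.cidr_to_netmask }}`.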

The `galaxy.yml` file defines the collection metadata:

```yaml
# galaxy.yml
namespace: myorg
name: infra_platform
version: 2.1.0
description: Infrastructure platform collection
dependencies:
  community.general: ">=6.0.0"
  ansible.posix: ">=1.5.0"
```

Build and install the collection:

```bash
ansible-galaxy collection build
ansible-galaxy collection install myorg-infra_platform-2.1.0.tar.gz
```
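Consuming projects then pin the collection in a `requirements.yml` (the version range shown is illustrative) and install it with `ansible-galaxy collection install -r requirements.yml`:

```yaml
# requirements.yml
collections:
  - name: myorg.infra_platform
    version: ">=2.1.0,<3.0.0"
```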

### Decision summary

| Factor | Roles | Collections |
|--------|-------|-------------|
| Team count | 1 team | Multiple teams |
| Reuse scope | Within project | Across projects/orgs |
| Custom modules/plugins | No | Yes |
| Versioning needs | Git tags sufficient | Semantic versioning required |
| Distribution | Git clone / Galaxy roles | Galaxy collections / Automation Hub |
| Overhead | Low | Medium |

## Dynamic Inventory

Static inventory files list hosts manually. Dynamic inventory queries an external source (cloud provider API, CMDB, service discovery) to generate the inventory at runtime.

### When to use static inventory

Static inventory works when your infrastructure is stable (hosts rarely change), you manage fewer than 50 hosts, and hosts are provisioned manually or through a slow process. Static inventory is simple, auditable, and version-controlled.
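At this scale a static YAML inventory stays short enough to review at a glance (hostnames and addresses are illustrative):

```yaml
# inventory/production/hosts.yml
all:
  children:
    webservers:
      hosts:
        web1.example.com:
          ansible_host: 10.0.1.10
        web2.example.com:
          ansible_host: 10.0.1.11
    databases:
      hosts:
        db1.example.com:
          ansible_host: 10.0.2.10
```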

### When to switch to dynamic inventory

Switch to dynamic inventory when hosts are provisioned and destroyed dynamically (auto-scaling groups, cloud VMs, containers), you manage more than 50 hosts and manual updates are error-prone, your source of truth for hosts is already a cloud provider, CMDB, or service discovery system, or you need host grouping based on cloud metadata (tags, regions, instance types).

### Cloud provider plugins

```yaml
# aws_ec2.yml - AWS dynamic inventory
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1
  - us-west-2
keyed_groups:
  - key: tags.Environment
    prefix: env
    separator: "_"
  - key: instance_type
    prefix: type
  - key: placement.availability_zone
    prefix: az
filters:
  tag:ManagedBy: ansible
  instance-state-name: running
compose:
  ansible_host: private_ip_address
```

This generates groups like `env_production`, `env_staging`, `type_t3_large`, and `az_us_east_1a` automatically from EC2 instance tags and metadata.
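Those generated groups compose with Ansible's group-intersection syntax, so a play can target, say, only production hosts of a given instance type:

```yaml
# Patch only production t3.large instances
- name: Update production app servers
  hosts: env_production:&type_t3_large
  become: true
  tasks:
    - name: Apply pending package updates
      ansible.builtin.apt:
        upgrade: dist
        update_cache: true
```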

```yaml
# azure_rm.yml - Azure dynamic inventory
plugin: azure.azcollection.azure_rm
auth_source: auto
keyed_groups:
  - key: tags.environment | default('untagged')
    prefix: env
  - key: location
    prefix: region
  - key: resource_group
    prefix: rg
conditional_groups:
  webservers: "'web' in tags.role"
  databases: "'db' in tags.role"
```

### Custom inventory plugins

When your source of truth is a CMDB, internal API, or custom database, write an inventory plugin (plugins supersede the older executable inventory scripts):

```python
# plugins/inventory/cmdb_inventory.py
from ansible.plugins.inventory import BaseInventoryPlugin

class InventoryModule(BaseInventoryPlugin):
    NAME = 'myorg.infra.cmdb_inventory'

    def verify_file(self, path):
        # Claim only config files named for this plugin
        return super().verify_file(path) and path.endswith(
            ('cmdb_inventory.yml', 'cmdb_inventory.yaml'))

    def parse(self, inventory, loader, path, cache=True):
        super().parse(inventory, loader, path, cache)
        config = self._read_config_data(path)

        # _query_cmdb is the integration point you implement: call the CMDB
        # API and return dicts like {'hostname': ..., 'ip': ..., 'groups': [...]}
        hosts = self._query_cmdb(config.get('cmdb_url'))
        for host in hosts:
            self.inventory.add_host(host['hostname'])
            self.inventory.set_variable(host['hostname'], 'ansible_host', host['ip'])
            for group in host.get('groups', []):
                self.inventory.add_group(group)
                self.inventory.add_child(group, host['hostname'])
```
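The plugin is activated by a small YAML config file whose `plugin:` key matches the `NAME` attribute; `cmdb_url` is the hypothetical option the sketch above reads:

```yaml
# inventory/cmdb_inventory.yml
plugin: myorg.infra.cmdb_inventory
cmdb_url: https://cmdb.example.internal/api/hosts
```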

### Decision summary

| Factor | Static | Dynamic |
|--------|--------|---------|
| Host count | < 50 | > 50 |
| Host lifecycle | Stable | Dynamic (auto-scaling) |
| Source of truth | Inventory files | Cloud API / CMDB |
| Audit trail | Git history | External system's audit |
| Setup complexity | None | Medium |

## Vault Encryption

Ansible Vault encrypts sensitive data (passwords, API keys, certificates) so it can be stored in version control alongside playbooks.

### Encrypting individual variables vs entire files

**Individual variables** (encrypt_string) embed encrypted values inline in YAML files:

```bash
ansible-vault encrypt_string 'SuperSecretPassword' --name 'db_password'
```

This produces:

```yaml
db_password: !vault |
  $ANSIBLE_VAULT;1.1;AES256
  62313365396662343061393464336163383764316462...
```

**Entire file** encryption encrypts the full file:

```bash
ansible-vault encrypt group_vars/production/secrets.yml
```

### When to use each approach

Use **encrypt_string** when you have a few secrets mixed with non-secret variables in the same file and you want non-secret values to remain readable in code review. Use **entire file encryption** when a file contains mostly secrets, you want a clear separation between secret and non-secret files, or you use a vault password file per environment.
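A common hybrid is the paired-file pattern: a plaintext vars file maps each variable to a `vault_`-prefixed counterpart in an encrypted file, so variable names stay greppable while values stay encrypted:

```yaml
# group_vars/production/vars.yml (plaintext, readable in review)
db_password: "{{ vault_db_password }}"
api_key: "{{ vault_api_key }}"
```

```yaml
# group_vars/production/vault.yml (encrypted with ansible-vault)
vault_db_password: SuperSecretPassword
vault_api_key: abc123
```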

### Multi-vault strategy

For environments with different access levels, use multiple vault IDs:

```bash
# Encrypt with environment-specific vault IDs
ansible-vault encrypt --vault-id dev@prompt group_vars/dev/secrets.yml
ansible-vault encrypt --vault-id prod@/path/to/prod-vault-pass group_vars/prod/secrets.yml

# Run playbook with multiple vault passwords
ansible-playbook site.yml --vault-id dev@prompt --vault-id prod@/path/to/prod-vault-pass
```

This ensures that developers who know the dev vault password cannot decrypt production secrets. In CI/CD, each environment's vault password is stored as a separate pipeline secret.
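The vault IDs can also be registered once in `ansible.cfg` via `vault_identity_list`, so runs pick them up without repeating `--vault-id` flags (paths are illustrative):

```ini
# ansible.cfg
[defaults]
vault_identity_list = dev@~/.vault/dev-pass, prod@~/.vault/prod-pass
```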

### Decision summary for vault strategy

| Scenario | Approach |
|----------|----------|
| Small team, one environment | Single vault password, encrypt_string |
| Multiple environments | Vault ID per environment, file encryption |
| CI/CD integration | Vault password files from pipeline secrets |
| Rotation required | Use external secrets manager (HashiCorp Vault, AWS Secrets Manager) with lookup plugins instead of Ansible Vault |
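For the rotation row, a lookup plugin replaces stored ciphertext with a runtime fetch. One sketch using the `community.hashi_vault` collection; the secret path and field name are assumptions:

```yaml
# Fetch the secret at runtime instead of committing ciphertext
- name: Read database password from HashiCorp Vault
  ansible.builtin.set_fact:
    db_password: "{{ lookup('community.hashi_vault.hashi_vault', 'secret/data/myapp:db_password') }}"
  no_log: true
```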

## Callback Plugins

Callback plugins customize Ansible's output and reporting. They hook into events like task start, task completion, play start, and play end.

### Built-in callbacks worth enabling

```ini
# ansible.cfg
[defaults]
callbacks_enabled = ansible.posix.profile_tasks, ansible.posix.timer

[callback_profile_tasks]
task_output_limit = 20
sort_order = descending
```

`profile_tasks` shows execution time per task, making it easy to identify slow tasks. `timer` shows the total playbook execution time. These are essential for optimizing playbook performance.

### Custom callback for notifications

```python
# plugins/callback/slack_notify.py
from ansible.plugins.callback import CallbackBase
import os
import requests

class CallbackModule(CallbackBase):
    CALLBACK_VERSION = 2.0
    CALLBACK_TYPE = 'notification'
    CALLBACK_NAME = 'slack_notify'

    def __init__(self):
        super().__init__()
        # Simplest wiring: read the webhook from the environment. A production
        # plugin would declare this through the callback options machinery.
        self.webhook_url = os.environ.get('SLACK_WEBHOOK_URL')

    def v2_playbook_on_stats(self, stats):
        if not self.webhook_url:
            return
        hosts = sorted(stats.processed.keys())
        failures = any(stats.summarize(h).get('failures', 0) > 0 for h in hosts)
        message = f"Playbook {'FAILED' if failures else 'completed'}: {len(hosts)} hosts"
        requests.post(self.webhook_url, json={"text": message}, timeout=10)
```

### When to write custom callbacks

Write custom callbacks when you need to integrate with external notification systems (Slack, PagerDuty, custom dashboards), you want structured logging in a specific format (JSON for log aggregation), or you need to track playbook execution metrics (duration, success rate, change rate) for compliance or auditing.
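For the structured-logging case specifically, check the built-in JSON stdout callback before writing your own:

```ini
# ansible.cfg
[defaults]
stdout_callback = ansible.builtin.json
```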

## Custom Modules

Custom modules extend Ansible's capabilities when no existing module handles your use case.

### When to write a custom module

Write a custom module when you interact with an internal API that has no community module, you need idempotent management of a resource that the `command` or `shell` modules cannot provide, or an existing module does not expose the specific parameters you need.

```python
# plugins/modules/custom_deploy.py
from ansible.module_utils.basic import AnsibleModule
import requests

def main():
    module = AnsibleModule(
        argument_spec=dict(
            app_name=dict(type='str', required=True),
            version=dict(type='str', required=True),
            api_url=dict(type='str', required=True),
            api_token=dict(type='str', required=True, no_log=True),
        ),
        supports_check_mode=True,
    )

    resp = requests.get(
        f"{module.params['api_url']}/apps/{module.params['app_name']}",
        headers={"Authorization": f"Bearer {module.params['api_token']}"},
        timeout=30,
    )
    if resp.status_code != 200:
        module.fail_json(msg=f"Failed to query current version: {resp.text}")
    current = resp.json()

    if current.get('version') == module.params['version']:
        module.exit_json(changed=False, msg="Already at target version")

    if module.check_mode:
        module.exit_json(changed=True, msg="Would deploy new version")

    response = requests.post(
        f"{module.params['api_url']}/apps/{module.params['app_name']}/deploy",
        headers={"Authorization": f"Bearer {module.params['api_token']}"},
        json={"version": module.params['version']},
        timeout=30,
    )

    if response.status_code != 200:
        module.fail_json(msg=f"Deploy failed: {response.text}")

    module.exit_json(changed=True, version=module.params['version'])

if __name__ == '__main__':
    main()
```

Key requirements for custom modules: support `check_mode` so `--check` works, use `no_log=True` for sensitive parameters, return `changed=True/False` accurately for idempotency, and use `module.fail_json()` for errors.
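Packaged in the collection from earlier, the module is invoked by its fully qualified name like any other; parameter values here are illustrative:

```yaml
- name: Deploy the billing app via the internal API
  myorg.infra_platform.custom_deploy:
    app_name: billing
    version: "3.4.1"
    api_url: https://deploy.example.internal
    api_token: "{{ vault_deploy_token }}"
```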

## Testing with Molecule

Molecule is the standard testing framework for Ansible roles. It creates ephemeral instances (Docker containers, cloud VMs), applies the role, and runs verification tests.

### When to invest in Molecule testing

Invest in Molecule when roles are shared across teams or projects, roles manage critical infrastructure (databases, security configurations), you have a CI/CD pipeline for Ansible content, or you are building collections for distribution.

Skip Molecule when roles are simple and used by one team, the cost of maintaining test infrastructure exceeds the cost of occasional bugs, or roles are throwaway (one-time migration playbooks).

### Basic Molecule setup

```yaml
# molecule/default/molecule.yml
dependency:
  name: galaxy
driver:
  name: docker
platforms:
  - name: ubuntu-test
    # Stock ubuntu:22.04 ships without systemd or Python; a systemd-enabled
    # test image (e.g. geerlingguy/docker-ubuntu2204-ansible) avoids
    # service-related test failures
    image: geerlingguy/docker-ubuntu2204-ansible:latest
    pre_build_image: true
    command: /lib/systemd/systemd
    privileged: true
    volumes:
      - /sys/fs/cgroup:/sys/fs/cgroup:rw
  - name: rocky-test
    image: geerlingguy/docker-rockylinux9-ansible:latest
    pre_build_image: true
    command: /lib/systemd/systemd
    privileged: true
provisioner:
  name: ansible
verifier:
  name: ansible
```

```yaml
# molecule/default/converge.yml
- name: Converge
  hosts: all
  roles:
    - role: webserver
      vars:
        webserver_port: 8080
```

```yaml
# molecule/default/verify.yml
- name: Verify
  hosts: all
  tasks:
    - name: Check nginx is installed
      ansible.builtin.package:
        name: nginx
        state: present
      check_mode: true
      register: nginx_installed
      failed_when: nginx_installed.changed

    - name: Check nginx is running
      ansible.builtin.service:
        name: nginx
        state: started
      check_mode: true
      register: nginx_running
      failed_when: nginx_running.changed

    - name: Check nginx is listening
      ansible.builtin.wait_for:
        port: 8080
        timeout: 5
```

Run the test lifecycle:

```bash
molecule create      # Create test instances
molecule converge    # Apply the role
molecule verify      # Run verification tests
molecule destroy     # Clean up
molecule test        # Run the full sequence (create, converge, idempotence check, verify, destroy)
```

## CI Integration

### When to add CI for Ansible

Add CI when multiple people contribute to the Ansible codebase, changes are deployed to production environments, or compliance requires an audit trail of configuration changes.

### Pipeline structure

```yaml
# .github/workflows/ansible-ci.yml
name: Ansible CI
on:
  pull_request:
    paths: ['roles/**', 'playbooks/**', 'collections/**']
  push:
    branches: [main]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run ansible-lint
        uses: ansible/ansible-lint@main

  molecule:
    runs-on: ubuntu-latest
    needs: lint
    strategy:
      matrix:
        role: [webserver, database, monitoring]
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: pip install ansible molecule "molecule-plugins[docker]"
      - name: Run Molecule tests
        run: molecule test
        working-directory: roles/${{ matrix.role }}

  deploy-staging:
    runs-on: ubuntu-latest
    needs: molecule
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    steps:
      - uses: actions/checkout@v4
      - name: Run playbook against staging
        run: |
          printf '%s' "$STAGING_VAULT_PASSWORD" > .vault-pass
          ansible-playbook -i inventory/staging site.yml --diff --check --vault-password-file .vault-pass
          ansible-playbook -i inventory/staging site.yml --diff --vault-password-file .vault-pass
          rm -f .vault-pass
        env:
          STAGING_VAULT_PASSWORD: ${{ secrets.VAULT_PASSWORD_STAGING }}
```

The pipeline follows a progression: lint first (fast, catches syntax and style issues), then Molecule tests (slower, catches functional issues), then staging deployment (only on merge to main). Production deployment should require manual approval via a separate workflow or deployment tool.
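One way to implement that manual approval without a separate tool is a GitHub Actions `environment` with required reviewers configured in the repository settings; the job shape below is a sketch extending the workflow above:

```yaml
  deploy-production:
    runs-on: ubuntu-latest
    needs: deploy-staging
    # Pauses until a configured reviewer approves the 'production' environment
    environment: production
    steps:
      - uses: actions/checkout@v4
      - name: Run playbook against production
        run: ansible-playbook -i inventory/production site.yml --diff
```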

## Common Mistakes

**Using collections too early.** Creating a collection for three roles used by one team adds build, versioning, and distribution overhead without benefit. Start with roles in a project directory. Move to collections when the sharing and versioning needs justify it.

**Dynamic inventory without caching.** Every `ansible-playbook` run queries the cloud API for inventory. For large inventories, this adds seconds or minutes to every run. Enable inventory caching:

```ini
[inventory]
cache = true
cache_plugin = ansible.builtin.jsonfile
cache_connection = /tmp/ansible-inventory-cache
cache_timeout = 300
```

**Testing the Ansible module, not the result.** Molecule verify tasks that check "did Ansible run this task" are tautological. Instead, verify the observable outcome: is the port open, does the service respond, is the file present with the correct content.
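For example, instead of re-running the package task in check mode, probe the service from the outside:

```yaml
- name: Verify the web server answers requests
  ansible.builtin.uri:
    url: http://localhost:8080/
    status_code: 200
```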

**Vault password in the repository.** The vault password file must never be committed. Add it to `.gitignore` and distribute it through a secrets manager or CI/CD pipeline secrets. The encrypted files themselves are safe to commit -- that is the point of vault encryption.

