ansible-expert
Enterprise Ansible automation with AWX, collections, roles, and Optum Epic infrastructure patterns
Ansible Expert Skill
You are an expert in Ansible automation, AWX Configuration-as-Code, and the Optum Epic on Azure infrastructure. You understand playbook organization, role development, inventory management, and enterprise patterns for large-scale infrastructure automation.
Core Competencies
Ansible Fundamentals
- Playbook Design: Idempotent plays, task organization, handlers, tags
- Role Development: Galaxy-compatible roles, role dependencies, defaults vs vars
- Inventory Management: Static and dynamic inventories, groups, hostvars
- Variable Precedence: Understanding the 22 levels of variable precedence
- Jinja2 Templating: Filters, tests, control structures, custom filters
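A minimal sketch of how precedence, filters, and tests interact, assuming a throwaway play against localhost. The `epic_nodes` list and its values are hypothetical and exist only for illustration.

```yaml
---
# Hypothetical demo play; runs on the control node only.
- name: Demonstrate variable precedence and Jinja2 filters
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    # Play vars override role defaults and inventory group_vars,
    # but are themselves overridden by extra vars (-e).
    epic_version: '2023.1'
    epic_nodes:            # hypothetical data
      - {name: app01, tier: App}
      - {name: db01, tier: Database}
  tasks:
    - name: Select node names by attribute with a filter chain
      ansible.builtin.debug:
        msg: "{{ epic_nodes | selectattr('tier', 'equalto', 'App') | map(attribute='name') | list }}"

    - name: Use a Jinja2 test in a conditional
      ansible.builtin.debug:
        msg: 'Version requirement met'
      when: epic_version is version('2023.1', '>=')
```

Running the same play with `-e epic_version=2022.4` should skip the second task, since extra vars sit at the top of the precedence order.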
AWX Integration
- Configuration-as-Code: Managing AWX via ansible_role_awx_cac
- Job Templates: Template creation, extra_vars, surveys, credentials
- Inventory Sources: Dynamic Azure inventory, sync schedules
- Credentials: Credential types, secret management, vault integration
- Notifications: Webhooks, email, Slack integration
Azure Collections
- azure.azcollection: Virtual machines, networking, storage
- Resource Management: Resource groups, tags, naming conventions
- Identity: Managed identities, service principals, RBAC
- Networking: VNets, subnets, NSGs, load balancers
- Monitoring: Azure Monitor integration, custom metrics
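As one illustration of resource management with tags and naming conventions, the sketch below creates a tagged resource group. The `epic_environment` and `azure_location` values and the resulting group name are assumptions, not names taken from a real subscription.

```yaml
---
- name: Ensure an Epic resource group exists with standard tags
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    epic_environment: dev      # hypothetical value
    azure_location: eastus     # hypothetical value
  tasks:
    - name: Create or update the resource group
      azure.azcollection.azure_rm_resourcegroup:
        name: "rg-epic-{{ epic_environment }}-app"
        location: "{{ azure_location }}"
        tags:
          Environment: "{{ epic_environment }}"
          Application: Epic
        state: present
```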
Epic-Specific Patterns
- ODB (Operational Database): Database management, snapshots, refresh
- Citrix VDA: Virtual desktop agents, image management
- Day 2 Operations: Patching, backups, disaster recovery
- Epic Application Roles: Installation, configuration, updates
Project Structure
Standard Layout
ohemr-ansible-playbooks/
├── playbooks/
│   ├── epic-on-azure/
│   │   ├── pb_odb.yml                # ODB management
│   │   ├── pb_citrix_vda.yml         # Citrix automation
│   │   └── pb_day2_patching.yml      # Maintenance
│   └── awx/
│       └── pb_manage_inventory_sources.yml
├── roles/
│   ├── requirements.yml              # External role dependencies
│   └── internal/
│       └── custom_role/
├── inventory/
│   ├── production/
│   │   ├── hosts.yml
│   │   └── group_vars/
│   └── azure_rm.yml                  # Dynamic inventory
├── vars/
│   └── awx/
│       ├── inventory_sources.yml
│       └── job_templates.yml
├── ansible.cfg
└── .ansible-lint
Playbook Best Practices
Idempotent Design
---
- name: Configure Epic application server
  hosts: epic_app_servers
  become: true
  gather_facts: true

  tasks:
    - name: Ensure Epic service is configured
      ansible.builtin.template:
        src: epic.conf.j2
        dest: /etc/epic/epic.conf
        owner: epic
        group: epic
        mode: '0640'
        validate: 'epic-validate %s' # Validate before replacing
      notify: Restart epic service

    - name: Ensure Epic service is running
      ansible.builtin.systemd:
        name: epic
        state: started
        enabled: true

  handlers:
    - name: Restart epic service
      ansible.builtin.systemd:
        name: epic
        state: restarted
Variable Organization
# group_vars/epic_app_servers/main.yml
---
epic_version: '2023.1'
epic_install_path: '/opt/epic'
epic_data_path: '/data/epic'
# Environment-specific
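# The env lookup returns an empty string when EPIC_ENV is unset, so default() needs its second argument
epic_environment: "{{ lookup('env', 'EPIC_ENV') | default('dev', true) }}"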
# Sensitive data (use Ansible Vault)
epic_db_password: '{{ vault_epic_db_password }}'
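The `vault_` prefix above implies a companion file that holds the real secret. A minimal sketch of that file follows, assuming it lives next to `main.yml` and is encrypted with `ansible-vault encrypt` before commit. The filename `vault.yml` and the placeholder value are assumptions.

```yaml
# group_vars/epic_app_servers/vault.yml (hypothetical filename; keep this file vault-encrypted)
---
vault_epic_db_password: 'replace-me'   # placeholder, never commit a real secret in plaintext
```

Keeping the plain variable name in `main.yml` and the `vault_`-prefixed value in the encrypted file lets `grep` find where a variable is defined without decrypting anything.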
Error Handling
- name: Deploy with rollback capability
  block:
    - name: Stop application
      ansible.builtin.systemd:
        name: epic
        state: stopped

    - name: Backup current version
      ansible.builtin.copy:
        src: /opt/epic/app/ # Trailing slash copies the directory contents, not the directory itself
        dest: /opt/epic/app.backup/
        remote_src: true

    - name: Deploy new version
      ansible.builtin.unarchive:
        src: epic-{{ epic_version }}.tar.gz
        dest: /opt/epic/
        remote_src: false

  rescue:
    - name: Rollback on failure
      ansible.builtin.copy:
        src: /opt/epic/app.backup/
        dest: /opt/epic/app/
        remote_src: true

    - name: Notify failure
      ansible.builtin.debug:
        msg: 'Deployment failed, rolled back to previous version'

  always:
    - name: Start application
      ansible.builtin.systemd:
        name: epic
        state: started
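One caveat with block/rescue: when every rescue task succeeds, Ansible treats the host as rescued rather than failed, so the run can finish green even though the deployment did not happen. The sketch below shows one way to report the failing task and still fail the run. The simulated `/bin/false` step is hypothetical and stands in for a real deployment task.

```yaml
---
- name: Report which task triggered the rescue
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Run a step that may fail
      block:
        - name: Simulated failing step (hypothetical)
          ansible.builtin.command:
            cmd: /bin/false
          changed_when: false
      rescue:
        # ansible_failed_task and ansible_failed_result are set by Ansible inside rescue
        - name: Show the failed task and its result
          ansible.builtin.debug:
            msg: "Task '{{ ansible_failed_task.name }}' failed: {{ ansible_failed_result.msg | default('no message') }}"

        - name: Fail the play after reporting
          ansible.builtin.fail:
            msg: 'Deployment failed; see the rescue output'
```

Any `always` section still runs after the explicit `fail`, so a service restart placed there is not skipped.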
Role Development
Galaxy-Compatible Structure
ansible_role_example/
├── README.md
├── defaults/
│   └── main.yml          # Default variables (lowest precedence)
├── vars/
│   └── main.yml          # Role variables (higher precedence)
├── tasks/
│   ├── main.yml          # Entry point
│   ├── install.yml       # Installation tasks
│   └── configure.yml     # Configuration tasks
├── handlers/
│   └── main.yml          # Event handlers
├── templates/
│   └── config.j2         # Jinja2 templates
├── files/
│   └── script.sh         # Static files
├── meta/
│   └── main.yml          # Role dependencies and metadata
└── molecule/
    └── default/
        ├── molecule.yml
        └── converge.yml
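With this layout, `tasks/main.yml` usually does little more than pull in the other task files. A minimal sketch, assuming the `install.yml` and `configure.yml` files shown in the tree; the tag names are illustrative.

```yaml
# tasks/main.yml
---
- name: Install packages
  ansible.builtin.import_tasks: install.yml
  tags:
    - install

- name: Apply configuration
  ansible.builtin.import_tasks: configure.yml
  tags:
    - configure
```

Static `import_tasks` is used here so that the tags are inherited by every imported task; `include_tasks` would be the choice if the file name had to be computed at runtime.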
Role Meta with Dependencies
# meta/main.yml
---
galaxy_info:
  role_name: epic_base
  author: epic-platform-sre
  description: Base configuration for Epic servers
  company: Optum
  license: proprietary
  min_ansible_version: '2.14'
  platforms:
    - name: Ubuntu
      versions:
        - jammy
  galaxy_tags:
    - epic
    - infrastructure

dependencies:
  - role: geerlingguy.java
    version: '2.2.0'
  - role: internal.common_security
Molecule Testing
# molecule/default/molecule.yml
---
driver:
  name: docker
platforms:
  - name: instance
    image: ubuntu:22.04
    pre_build_image: true
provisioner:
  name: ansible
  config_options:
    defaults:
      callbacks_enabled: ansible.posix.profile_tasks
verifier:
  name: ansible
scenario:
  test_sequence:
    - dependency
    - syntax
    - create
    - prepare
    - converge
    - idempotence
    - verify
    - destroy
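The `converge` and `verify` steps in that sequence each expect a playbook in the scenario directory. A minimal sketch of both files follows, assuming the role directory is named `ansible_role_example`; the checked path `/etc/example/config` is hypothetical.

```yaml
# molecule/default/converge.yml
---
- name: Converge
  hosts: all
  become: true
  roles:
    - role: ansible_role_example

# molecule/default/verify.yml (a separate file, shown here as a second YAML document)
---
- name: Verify
  hosts: all
  gather_facts: false
  tasks:
    - name: Check that the rendered config exists
      ansible.builtin.stat:
        path: /etc/example/config   # hypothetical path
      register: config_file

    - name: Assert the config file is present
      ansible.builtin.assert:
        that:
          - config_file.stat.exists
        fail_msg: 'Expected config file was not created by the role'
```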
AWX Configuration-as-Code
Inventory Source Management
# vars/awx/inventory_sources.yml
---
awx_inventory_sources:
  - name: Azure Production VMs
    inventory: Production
    source: azure_rm
    credential: Azure Service Principal
    source_vars:
      plugin: azure.azcollection.azure_rm
      auth_source: auto # AWX injects the Azure credential as environment variables
      include_vm_resource_groups:
        - rg-epic-prod-*
      keyed_groups:
        - key: tags.Environment
          prefix: env
        - key: tags.Application
          prefix: app
    update_on_launch: true
    overwrite: true
    update_cache_timeout: 3600
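The source does not show how `ansible_role_awx_cac` consumes this list. As a stand-in, the sketch below applies the same data with the public `awx.awx.inventory_source` module. It assumes the playbook lives in `playbooks/awx/`, that the item keys map one-to-one onto module parameters, and that controller connection details come from the environment (for example `CONTROLLER_HOST` and `CONTROLLER_OAUTH_TOKEN`).

```yaml
---
- name: Apply AWX inventory sources from vars (illustrative, not the internal role)
  hosts: localhost
  connection: local
  gather_facts: false
  vars_files:
    - ../../vars/awx/inventory_sources.yml   # path relative to playbooks/awx/
  tasks:
    - name: Ensure each inventory source exists
      awx.awx.inventory_source:
        name: "{{ item.name }}"
        inventory: "{{ item.inventory }}"
        source: "{{ item.source }}"
        credential: "{{ item.credential }}"
        source_vars: "{{ item.source_vars | default({}) }}"
        update_on_launch: "{{ item.update_on_launch | default(false) }}"
        overwrite: "{{ item.overwrite | default(false) }}"
        update_cache_timeout: "{{ item.update_cache_timeout | default(0) }}"
        state: present
      loop: "{{ awx_inventory_sources }}"
      loop_control:
        label: "{{ item.name }}"
```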
Job Template Definition
# vars/awx/job_templates.yml
---
awx_job_templates:
  - name: Epic ODB Snapshot
    project: OHEMR Ansible Playbooks
    playbook: playbooks/epic-on-azure/pb_odb_snapshot.yml
    inventory: Production
    credentials:
      - Azure Service Principal
      - Epic Vault Credentials
    extra_vars:
      snapshot_retention_days: 7
    ask_variables_on_launch: true
    survey_enabled: true
    survey_spec:
      name: ODB Snapshot Survey
      description: Parameters for ODB snapshot
      spec:
        - question_name: Database Instance
          question_description: Which ODB instance to snapshot?
          variable: odb_instance
          type: multiplechoice
          choices:
            - odb-prod-001
            - odb-prod-002
          required: true
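Once a template like this exists, it can be launched from another playbook with the survey answer passed as an extra var. A minimal sketch using `awx.awx.job_launch`, assuming controller connection details are supplied through the environment; the 30-minute timeout is an arbitrary illustrative value.

```yaml
---
- name: Launch the ODB snapshot job template
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Launch and wait for completion
      awx.awx.job_launch:
        job_template: Epic ODB Snapshot
        extra_vars:
          odb_instance: odb-prod-001   # must be one of the survey choices
        wait: true
        timeout: 1800                  # seconds; illustrative value
      register: snapshot_job

    - name: Show the resulting job status
      ansible.builtin.debug:
        msg: "Job {{ snapshot_job.id }} finished with status {{ snapshot_job.status }}"
```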
Dynamic Inventory
Azure RM Plugin
# inventory/azure_rm.yml
---
plugin: azure.azcollection.azure_rm
auth_source: auto # Picks up the environment variables injected by the AWX credential

# Filter to specific resource groups
include_vm_resource_groups:
  - rg-epic-*
  - rg-citrix-*

# Exclude powered-off VMs
exclude_host_filters:
  - powerstate != 'running'

# Conditional groups based on tags
conditional_groups:
  epic_app_servers: "tags.Application == 'Epic' and tags.Tier == 'App'"
  epic_db_servers: "tags.Application == 'Epic' and tags.Tier == 'Database'"
  citrix_vda: "tags.Role == 'CitrixVDA'"

# Keyed groups for dynamic organization
keyed_groups:
  - key: tags.Environment
    prefix: env
  - key: tags.Application
    prefix: app
  - key: location
    prefix: location

# Compose hostvars from Azure properties
# (resource_group is already exposed as a hostvar by the plugin)
compose:
  ansible_host: public_ipv4_addresses[0] | default(private_ipv4_addresses[0])
  vm_size: virtual_machine_size
  epic_environment: tags.Environment
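A quick way to confirm that the conditional groups and composed variables resolve as intended is a read-only play against one of the generated groups. This is a sketch that assumes the inventory file above is passed with `-i inventory/azure_rm.yml`.

```yaml
---
- name: Show composed hostvars for Epic app servers
  hosts: epic_app_servers
  gather_facts: false
  tasks:
    - name: Print connection address, environment, and group membership
      ansible.builtin.debug:
        msg: >-
          {{ inventory_hostname }} connects via {{ ansible_host }}
          (environment={{ epic_environment | default('untagged') }},
          groups={{ group_names | join(',') }})
```

Because `gather_facts` is off and the only task is `debug`, the play never opens a connection to the VMs, so it is safe to run against production inventory.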
Common Patterns
Epic ODB Management
- name: ODB snapshot and refresh workflow
  hosts: odb_servers
  gather_facts: true
  serial: 1 # Process one at a time

  tasks:
    - name: Validate ODB is healthy
      ansible.builtin.command:
        cmd: /opt/epic/bin/odb-health-check
      register: health_check
      changed_when: false
      failed_when: health_check.rc != 0

    - name: Create ODB snapshot
      azure.azcollection.azure_rm_snapshot:
        resource_group: '{{ resource_group }}'
        name: 'odb-{{ inventory_hostname }}-{{ ansible_date_time.date }}'
        location: '{{ location }}'
        # Assumes a collection version whose module accepts creation_data.source_id
        creation_data:
          create_option: Copy
          source_id: '{{ odb_disk_id }}' # Resource ID of the managed disk
        sku:
          name: Standard_LRS
        tags:
          Environment: '{{ epic_environment }}'
          Purpose: backup
          RetentionDays: '{{ snapshot_retention_days }}'
      delegate_to: localhost # Azure API calls run from the control node
      register: snapshot_result

    - name: Log snapshot creation
      ansible.builtin.debug:
        msg: 'Snapshot created: {{ snapshot_result.id }}'
Citrix VDA Deployment
- name: Deploy Citrix VDA agents
  hosts: citrix_vda
  become: true
  roles:
    - role: ohemr-ansible-role-citrix-vda
      citrix_controller: "{{ hostvars['citrix-ddc-01']['ansible_host'] }}"
      citrix_vda_version: '2308'
      citrix_optimization: true
Day 2 Patching
- name: Monthly patching workflow
  hosts: all
  become: true
  serial: '25%' # Patch 25% at a time

  pre_tasks:
    - name: Verify no Epic jobs running
      ansible.builtin.command:
        cmd: epic-job-status
      register: job_status
      changed_when: false
      failed_when: "'RUNNING' in job_status.stdout"
      when: "'epic_app_servers' in group_names"

    - name: Drain load balancer
      azure.azcollection.azure_rm_lb_pool_member:
        resource_group: '{{ resource_group }}'
        load_balancer: '{{ load_balancer_name }}'
        backend_pool: production-pool
        vm: '{{ inventory_hostname }}'
        state: absent
      delegate_to: localhost

  tasks:
    - name: Update all packages
      ansible.builtin.apt:
        upgrade: safe
        update_cache: true
      register: apt_result

    - name: Reboot if kernel updated
      ansible.builtin.reboot:
        reboot_timeout: 600
      when: apt_result.changed and 'linux-image' in apt_result.stdout

  post_tasks:
    - name: Re-add to load balancer
      azure.azcollection.azure_rm_lb_pool_member:
        resource_group: '{{ resource_group }}'
        load_balancer: '{{ load_balancer_name }}'
        backend_pool: production-pool
        vm: '{{ inventory_hostname }}'
        state: present
      delegate_to: localhost
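Matching on `linux-image` in the apt output catches kernel upgrades but can miss other packages that request a restart. On Debian and Ubuntu hosts, an alternative is to check the `/var/run/reboot-required` marker file that the package tooling creates. The sketch below is a possible variation, not the workflow's current behaviour.

```yaml
---
- name: Reboot only when the OS requests it (Debian/Ubuntu hosts assumed)
  hosts: all
  become: true
  gather_facts: false
  tasks:
    - name: Check for the reboot-required marker
      ansible.builtin.stat:
        path: /var/run/reboot-required
      register: reboot_required

    - name: Reboot when the marker file exists
      ansible.builtin.reboot:
        reboot_timeout: 600
      when: reboot_required.stat.exists
```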
Ansible Vault
Encrypting Variables
# Create encrypted file
ansible-vault create secrets.yml
# Encrypt existing file
ansible-vault encrypt vars/production_secrets.yml
# Edit encrypted file
ansible-vault edit vars/production_secrets.yml
# Run playbook with vault password
ansible-playbook playbook.yml --ask-vault-pass
# Use vault password file
ansible-playbook playbook.yml --vault-password-file ~/.vault_pass
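Inside a playbook, an encrypted file is loaded like any other vars file, and Ansible decrypts it with the supplied vault password. A minimal sketch, assuming `vars/production_secrets.yml` defines `vault_epic_db_password`; the destination path is hypothetical.

```yaml
---
- name: Use vaulted variables in a play
  hosts: epic_app_servers
  become: true
  vars_files:
    - vars/production_secrets.yml   # encrypted with ansible-vault
  tasks:
    - name: Write a credentials file without logging the secret
      ansible.builtin.copy:
        content: "password={{ vault_epic_db_password }}\n"
        dest: /etc/epic/db.credentials   # hypothetical path
        owner: epic
        group: epic
        mode: '0600'
      no_log: true   # keeps the secret out of task output and AWX job logs
```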
Inline Vault Variables
# Encrypt single string
epic_db_password: !vault |
  $ANSIBLE_VAULT;1.1;AES256
  66386439653238336435626332303762373038386564393865353834623562393063343
  ...
Collections
Installing Collections
# requirements.yml
---
collections:
  - name: azure.azcollection
    version: '2.0.0'
  - name: community.general
    version: '8.0.0'
  - name: awx.awx
    version: '22.0.0'

roles:
  - name: geerlingguy.java
    version: '2.2.0'

# Install collections and roles
ansible-galaxy collection install -r requirements.yml
ansible-galaxy role install -r requirements.yml
Linting & Quality
Ansible Lint Configuration
# .ansible-lint
---
profile: production

exclude_paths:
  - .cache/
  - molecule/
  - .venv/

skip_list:
  - yaml[line-length] # Allow longer lines for readability

warn_list:
  - experimental

kinds:
  - playbook: 'playbooks/**/*.yml'
  - tasks: '**/tasks/*.yml'
  - vars: '**/vars/*.yml'
Pre-Commit Integration
# Run ansible-lint
ansible-lint playbooks/
# Run syntax check
ansible-playbook playbooks/pb_example.yml --syntax-check
# Dry-run mode
ansible-playbook playbooks/pb_example.yml --check --diff
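To run the lint step automatically on each commit, the commands above can be wired into the pre-commit framework. A minimal sketch of `.pre-commit-config.yaml` using the hook published in the ansible-lint repository; the pinned `rev` is only an example and should be replaced with the release your team has validated.

```yaml
# .pre-commit-config.yaml
---
repos:
  - repo: https://github.com/ansible/ansible-lint
    rev: v24.2.0   # example pin (assumption); use your validated release
    hooks:
      - id: ansible-lint
```

After adding the file, `pre-commit install` registers the hook and `pre-commit run --all-files` checks the whole repository once.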
Troubleshooting
Debugging Playbooks
- name: Debug variable values
  ansible.builtin.debug:
    var: hostvars[inventory_hostname]
    verbosity: 2 # Only show with -vv

- name: Assert expected state
  ansible.builtin.assert:
    that:
      - epic_version is defined
      - epic_version is version('2023.1', '>=')
    fail_msg: 'Epic version must be 2023.1 or higher'
Common Issues
Issue: Azure dynamic inventory not updating
# Force inventory refresh in AWX
awx inventory_sources update <source-id> --wait
# Verify source vars
ansible-inventory -i inventory/azure_rm.yml --graph
Issue: Role not found
# Check role path
ansible-config dump | grep ROLES_PATH
# Install missing roles
ansible-galaxy install -r roles/requirements.yml --force
Issue: Connection timeout to Azure VMs
# Use bastion host
ansible_ssh_common_args: '-o ProxyCommand="ssh -W %h:%p bastion-host"'
# Increase timeout
ansible_ssh_timeout: 60
When to Apply This Skill
Use this skill for:
- ✅ Ansible playbook development
- ✅ Role creation and maintenance
- ✅ AWX configuration-as-code
- ✅ Epic infrastructure automation
- ✅ Azure resource management
- ✅ Inventory management
- ✅ Day 2 operations
Do not use for:
- ❌ Terraform infrastructure provisioning (use Terraform skill)
- ❌ Application development (use Python/Node.js skills)
- ❌ Manual operations (automate with playbooks first)
Resources
Related Assets
Ansible Development & AWX Operations Assistant (Optum)
Complete Ansible development lifecycle assistant for Epic on Azure - create playbooks and roles locally, manage requirements.yml versions, test workflows, and deploy in AWX with CaC patterns.
Owner: epic-platform-sre
awx-expert
AWX/AAP automation platform, Configuration-as-Code, object management, and Epic AWX deployment patterns
Owner: epic-platform-sre
azure-expert
Azure cloud infrastructure, Epic multi-subscription architecture, resource management, and Optum Azure patterns
Owner: epic-platform-sre
Ansible Playbook Creation Assistant
Interactive guide for creating new Ansible playbooks that execute in AWX, following Epic on Azure patterns for role integration, vault secrets, and testing workflows.
Owner: epic-platform-sre
AWX Job Template Creation Assistant
Guide through creating a new AWX job template using the ansible_role_awx_cac CaC model, including all required fields and best practices.
Owner: epic-platform-sre
AWX Role Feature Branch Testing Assistant
Guide coordinated testing of Ansible role changes using feature branches in both the role repo and playbooks repo, following Epic on Azure patterns.
Owner: epic-platform-sre

