Month: November 2014

Ansible: an easy way to start with config management

Many, many years back, I developed a script to help me easily run commands on many Unix machines or pieces of network equipment. I called it tabatool (yeah, weird name, don’t ask). It used Tcl/expect and had a config file with host names, groups, usernames for login and sudo, etc. I eventually stopped working on it and settled on a mix of parallel-ssh, clusterssh, fabric, and more specific expect scripts when needed. And then, later, I got into config management with Puppet.

But I really miss a good push mode for Puppet sometimes. Some way of saying “take this code I have right here live in front of me, and apply it to that machine (or these machines), NOW” and bam, it goes and does it. Not always, but sometimes. Like adding more powerful config management to fabric or to my old script.

Ansible is like that. It is all that I ever wanted my old script to be, and much, much more. It is very flexible and powerful, and it tries very hard to KISS as much as possible. It runs over SSH and needs no agent on the remote machine. Having played with it a little bit already, I believe Ansible is about the easiest way for a sysadmin to get started with configuration management. As a plus, there is a very good chance you won’t need or want to leave it for other tools as the journey progresses.

If you want to follow the examples below hands-on, let’s get installation out of the way: see the docs here. They publish packages for Ubuntu and CentOS/RHEL, and it’s also pretty common to run it from a source checkout from GitHub. On Debian and Ubuntu, a straight apt-get install ansible will also work, but the versions in the official repos might be quite old.

Basic start: ad-hoc usage

Say you want to make sure NTP is installed on all your Debian/Ubuntu machines. List their names in a file (it will be your Ansible inventory) and run the command below. I’ll call my file production.

ansible -i production all -u morales --ask-pass --sudo \
  --ask-sudo-pass -m apt -a "name=ntp update_cache=yes"

Depending on your machines’ setup, you might not need any or all of the user and password options above. You can also set them as defaults in a configuration file. Ansible recommends using authorized SSH keys to log in, but it is fine with passwords, sudo, su and all that. The all right after -i production means all hosts in that inventory, -m apt selects the Ansible module for apt, and -a passes parameters to that module.

You can also run an arbitrary command in a similar way:

ansible -i production all -u morales --ask-pass --sudo \
  --ask-sudo-pass -a "rm -f /tmp/sad_unwanted_file"

This uses the command module, which happens to be the default module, so you don’t need to say -m command in this case. There are many, many other modules available. Want to just upload a script and run it? Fine, use the script module. Bonus: with the optional creates= parameter, it only runs the script if the given file does not exist.

ansible -i production all -u morales --ask-pass --sudo \
  --ask-sudo-pass -m script -a "/tmp/myscript.sh \
  --some-arguments 1234 creates=/etc/blah/file.conf"

Enter configuration management

Ok, that’s cool, you can already dump parallel-ssh in favor of ansible. But the real deal comes next. We can organize several of those module calls together in a file or set of files, called playbooks. Very briefly, a playbook is an ordered set of plays, and a play is an ordered set of tasks applied to a set of hosts. Playbooks and plays are written in YAML (“YAML Ain’t Markup Language”, originally “Yet Another Markup Language”), a format that’s pretty easy for humans to work with. The tasks are performed by modules like apt and script and so on.

Let’s see that ad-hoc ntp install, in a playbook called ntp.yml:

---
- hosts: all
  tasks:
  - name: Install NTP
    apt: name=ntp update_cache=yes

You now run it using the ansible-playbook executable instead of ansible:

ansible-playbook -i production -u morales --ask-pass --sudo \
--ask-sudo-pass ntp.yml

Too much work for just installing a lame package. So let’s also set the configuration file and make sure NTP is restarted whenever ansible changes that file:

---
- name: Install and Configure NTP
  hosts: all
  vars:
    ntp_server: [pool.ntp.org, south-america.pool.ntp.org]
  tasks:
  - name: Install NTP
    apt: name=ntp update_cache=yes
  - name: Copy the ntp.conf template file
    template: src=ntp.conf.j2 dest=/etc/ntp.conf
    notify:
      - restart ntp
  handlers:
  - name: restart ntp
    service: name=ntp state=restarted

Now the play has two tasks and a handler that restarts NTP when notified. It also sets variables at the beginning of the play (they could be set in other files instead), and one of the tasks fills in a template of the ntp.conf file using those variables. The templating language is Jinja2. Here’s a possible ntp.conf.j2 for use above:

driftfile /var/lib/ntp/drift

{% for i in ntp_server %}
server {{ i }}
{% endfor %}

restrict -4 default kod notrap nomodify nopeer noquery
restrict 127.0.0.1
restrict -6 default kod notrap nomodify nopeer noquery
restrict ::1

statistics loopstats peerstats clockstats
filegen loopstats file loopstats type day enable
filegen peerstats file peerstats type day enable
filegen clockstats file clockstats type day enable
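
The remark that the variables could be set in other files can be sketched like this. As an assumption not shown in this post, the file is called group_vars/all and sits next to the inventory or the playbook, where Ansible picks it up for every host. Variables declared in the play itself take precedence, so the vars: block would have to be removed from ntp.yml for these values to be used.

```yaml
---
# group_vars/all (assumed location: next to the inventory or the playbook)
# Same list as the vars: block in the play; the template loops over it unchanged.
ntp_server:
  - pool.ntp.org
  - south-america.pool.ntp.org
```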

Let’s finish by showing Ansible’s output for the ntp.yml playbook above (assuming the machines didn’t have ntp installed):

 $ ansible-playbook -i production -u morales --ask-pass --sudo \
     --ask-sudo-pass ntp.yml
 
 PLAY [Install and Configure NTP] ***********************************
 
 GATHERING FACTS ****************************************************
 ok: [box1]
 ok: [box2]
 
 TASK: [Install NTP] ************************************************
 changed: [box2]
 changed: [box1]
 
 TASK: [Copy the ntp.conf template file] ****************************
 changed: [box1]
 changed: [box2]
 
 NOTIFIED: [restart ntp] ********************************************
 changed: [box2]
 changed: [box1]
 
 PLAY RECAP *********************************************************
 box1 : ok=4 changed=3 unreachable=0 failed=0 
 box2 : ok=4 changed=3 unreachable=0 failed=0

If you decide to change the ntp_server list, for example, and run it again, here’s what you get:

 $ ansible-playbook -i production -u morales --ask-pass --sudo \
     --ask-sudo-pass ntp.yml

 PLAY [Install and Configure NTP] ***********************************
 
 GATHERING FACTS ****************************************************
 ok: [box2]
 ok: [box1]
 
 TASK: [Install NTP] ************************************************
 ok: [box2]
 ok: [box1]
 
 TASK: [Copy the ntp.conf template file] ****************************
 changed: [box2]
 changed: [box1]
 
 NOTIFIED: [restart ntp] ********************************************
 changed: [box1]
 changed: [box2]
 
 PLAY RECAP *********************************************************
 box1 : ok=4 changed=2 unreachable=0 failed=0 
 box2 : ok=4 changed=2 unreachable=0 failed=0 

It figures out ntp is already installed, but changes the file and restarts the service. And if you run it again without changing anything, the handler is never notified, so it will just report ok=3 changed=0 at the end: nothing needs doing, so nothing is done.
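
That behaviour comes from the modules themselves: apt and template look at the current state before touching anything. Free-form modules such as command and script cannot know what your command does, which is where the creates= parameter from the ad-hoc example helps. A minimal sketch of that earlier script call as a playbook task, reusing the same made-up script path and arguments:

```yaml
---
- name: Run a one-off setup script
  hosts: all
  tasks:
  # The script path, its arguments and the target file are the illustrative
  # values from the ad-hoc example, not real files.
  - name: Run myscript.sh only while /etc/blah/file.conf is missing
    script: /tmp/myscript.sh --some-arguments 1234 creates=/etc/blah/file.conf
```

On a second run the task should no longer be reported as changed, as long as the script really does create that file.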

And there you have it. A small piece of your infrastructure codified in a text file, which can be versioned, reviewed, debugged, shared, and… run. It’s also executable documentation, the only kind of documentation that has a real chance of not getting outdated. New guy arrives in the team, and needs to understand how we do X? Read the code.

Beautiful. Welcome to configuration management.

That was just a very tiny scratch on the surface of Ansible

Really tiny. Let me list some other things Ansible can do:

  • The inventory can be organized in groups. You can define variables based on those groups and use them inside plays (think stuff like “this group of servers uses NTP/DNS servers A and B, that other one uses NTP/DNS servers B and C”).
  • There are some automatically available variables with facts about the system: things like hostname, FQDN, os version, and much more.
  • You have loops and conditionals.
  • You can “package” your tasks in reusable roles, and then apply them to hosts in several different plays/playbooks.
  • There are many community-made roles on the Ansible Galaxy (like the Forge for Puppet).
  • You can do a dry run before doing it for real.
  • Sensitive vars/data may be encrypted using ansible-vault, so you don’t need to store it in clear text inside your code.
  • You can run commands on your local control machine, keeping them in the same playbook as the regular remote tasks. You can also prompt yourself for information.
  • You can even wait for reboots.
  • Windows support already exists, and modules for it are starting to appear.
  • You can do everything from the command line, but you can also pay for support and access to a web interface called Ansible Tower.
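
A few of the items above (facts, a conditional, a loop, and a task on the control machine) can be sketched in one short play. Treat it as an illustration under assumptions rather than something from a real setup: the webservers group and the package list are made up, and the syntax follows the same style as the playbooks earlier in this post.

```yaml
---
- name: Base packages on Debian-family web servers
  hosts: webservers          # hypothetical inventory group
  tasks:
  - name: Print some facts gathered from the host
    debug: msg="{{ ansible_hostname }} runs {{ ansible_distribution }} {{ ansible_distribution_version }}"

  - name: Install each package, but only on Debian/Ubuntu
    apt: name={{ item }} state=present
    with_items:              # illustrative package list
      - ntp
      - curl
    when: ansible_os_family == "Debian"

  - name: Run a command on the control machine instead of the remote host
    local_action: command uptime
```

Adding --check to the ansible-playbook command line asks for the dry run mentioned in the list, for the modules that support it.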

The Ansible documentation could maybe be richer, but it’s also pretty easy to follow.

Final Remarks

I guess the main thing I would like to get across here can be summed up like this: ANY sysadmin (at least *nix ones) can find an immediate use for Ansible and can start using it without studying it any more than reading this post. From there, they can move on to automating more and more stuff with it, little by little, along an easy learning curve.

There are other alternatives. I have used a lot of Puppet, and intend to take Salt for a spin next. But most of all I find Ansible’s KISS attitude really compelling, especially for people starting with this class of tools.