Testing Ansible

Ansible is an ingenious piece of software that saves the time you would otherwise put into documenting configuration procedures and schemes, and makes running those procedures and bootstrapping your infrastructure a whole lot easier.

By design, Ansible does not force you to use any particular style or structure. If you’d rather use long playbooks instead of granular, neatly organized roles, so be it, you will still achieve your goal!

The problem is that resources are scarce when it comes to testing deployed configuration, and the method for testing configurations must be as flexible as Ansible itself.

Every journey starts with analysis, so here come the four types of tests.

Syntax test

The first thing you should do after writing any configuration script is to test its syntax for errors. Many errors are syntax related, so don’t hesitate to run a quick check.
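
For reference, a syntax check can be run with the --syntax-check option of ansible-playbook, which parses the playbook without executing any task. A minimal sketch, assuming a hypothetical provision.yml like the one below (the webservers group and the task itself are illustrative, not taken from a real setup):

```yaml
# provision.yml: hypothetical playbook used for illustration.
# Check it without touching any machine:
#   ansible-playbook provision.yml --syntax-check
- hosts: webservers          # assumed inventory group name
  become: true
  tasks:
    - name: Install Nginx
      ansible.builtin.apt:
        name: nginx
        state: present
```

A misplaced indent or a missing colon in a file like this is the kind of mistake the check is meant to catch before anything is deployed.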

Dry run

ansible-playbook provision.yml --check

Dry running will run your specified playbook without making changes to the target machines. You will see if the tasks can run, not if they will run.

When doing dry runs you will encounter some spurious failures. For example, I want to install Elasticsearch from a .deb package that I built myself rather than pulling it from a repository. The way I’m doing that is:

  1. Copy es.deb to /tmp/
  2. Install debian package located in /tmp/es.deb
  3. Remove the package file /tmp/es.deb

The dry run will fail in this case: it checks whether it can install the file located at /tmp/es.deb, but that file was never actually copied, because the dry run only checked whether the copy was possible.

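The way around this problem is to mark the tasks that must really run, whether it is a dry run or a regular one, so that they ignore check mode; in this case, the copy and clean-up tasks. Older Ansible releases used the always_run flag for this; since Ansible 2.2 it has been replaced by check_mode: no.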
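
The three steps above might be sketched as the following tasks. This is a minimal sketch, assuming a reasonably recent Ansible; the local source path files/es.deb is hypothetical, and check_mode: false is the newer spelling of what the always_run flag used to do.

```yaml
# Hypothetical task list for installing a locally built Elasticsearch package.
- name: Copy es.deb to /tmp/
  ansible.builtin.copy:
    src: files/es.deb        # assumed location on the control machine
    dest: /tmp/es.deb
  check_mode: false          # really copy the file, even during a dry run

- name: Install the Debian package located in /tmp/es.deb
  ansible.builtin.apt:
    deb: /tmp/es.deb         # honours check mode: only reports what it would do

- name: Remove the package file from /tmp/
  ansible.builtin.file:
    path: /tmp/es.deb
    state: absent
  check_mode: false          # really clean up, even during a dry run
```

Keep in mind that tasks marked this way do modify the target machine during a dry run, so the flag is best reserved for harmless steps such as staging and removing a temporary file.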

Idempotency test

Idempotency is a fancy word borrowed from maths with a very simple meaning: no matter how many times you paint the wall green, the wall stays green. In other words, repeating the same action in succession does not change the outcome. Idempotency is what separates ad-hoc from production-grade Ansible scripts. If I want to install a new utility on my entire infrastructure and, for some reason, the script fails after running on 50% of it, I don’t want to cherry-pick the failed or unconfigured machines. I want to run the same script again, knowing that the first half will stay A-OK.
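
To make the difference concrete, here is a small sketch contrasting a task that is not idempotent with one that is. The file path and the setting are hypothetical. One common way to test idempotency, offered here as an assumption rather than the only method, is to run the playbook twice and check that the second run reports changed=0 in the play recap.

```yaml
# idempotency.yml: hypothetical playbook for illustration.
# Run it twice; an idempotent playbook reports changed=0 on the second run:
#   ansible-playbook idempotency.yml
- hosts: all
  tasks:
    # Not idempotent: appends the line again on every run
    # and always reports "changed".
    - name: Append a setting with a raw shell command
      ansible.builtin.shell: echo 'setting=on' >> /tmp/example.conf

    # Idempotent: adds the line only if it is missing,
    # so a repeated run reports "ok" instead of "changed".
    - name: Ensure the setting is present
      ansible.builtin.lineinfile:
        path: /tmp/example.conf
        line: "setting=on"
        create: true
```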

Delayed assertion

Delayed assertion is what a Java programmer thinks of when you mention the word test.

So far we’ve only done smoke tests: tests which tell us if the code has any chance of achieving our result, not whether it actually achieves it.

Idempotency, dry run and syntax tests are easily automated, but the computer won’t know what we want unless we explicitly state our intent.

How the development process looks so far:

  1. You want to have an Nginx web server listening on port 80.
  2. You write an Ansible script that passes all the aforementioned tests.
  3. … you come to the sad realization that the only thing you are sure of is that the script you wrote in step 2 does something right, not that it does what you defined in step 1.

How the process should look:

  1. You put your wish into code.
  2. You write an Ansible script that passes the syntax, dry run and idempotency tests, aiming for the fulfilment of the wish.
  3. You check if the wish you put into code is satisfied.

Putting the wish into code in this example would look like this:

  • I want to have an open TCP socket on port 80
  • I want to curl http://localhost:80 and receive status code 200
  • I want to have Nginx started & listening on port 80

Making assertions in Ansible can be done using the script and assert modules.
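
As a rough illustration, the three wishes above could be written as the task file below. This is a sketch under stated assumptions, not the only way to do it: the file name assert_nginx.yml is hypothetical, the service is assumed to be called nginx, and the wait_for, uri and service modules are used here alongside assert to gather the facts being asserted.

```yaml
# assert_nginx.yml: hypothetical task file holding the "wish" as assertions.
- name: Port 80 accepts TCP connections
  ansible.builtin.wait_for:
    host: localhost
    port: 80
    timeout: 10              # fail if nothing is listening within 10 seconds

- name: Request the front page
  ansible.builtin.uri:
    url: http://localhost:80
    status_code: 200         # any other status code fails the task
  register: front_page

- name: Ask, in check mode, whether Nginx would need to be started
  ansible.builtin.service:
    name: nginx              # assumed service name
    state: started
  check_mode: true           # only report, never start the service here
  register: nginx_service

- name: Assert that the wish is satisfied
  ansible.builtin.assert:
    that:
      - front_page.status == 200
      - not nginx_service.changed
    fail_msg: "Nginx is not serving on port 80 as expected"
```

The service task runs in check mode on purpose: if it reports a change, Nginx was not running, and the final assertion fails instead of quietly fixing the problem.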

But why put “delayed” next to assertion?

Let’s say I want to check if all of my tasks needed for installation of Nginx executed properly on the remote machine.

I will repeat the dry run against the remote machine and see whether anything would change:

ansible-playbook provision.yml --check

If the output of the above command reports that nothing needs to be changed (in other words, no differences exist), I know that the machine is configured according to my Nginx role and that all tasks were executed.

But who will guarantee that my server is listening on port 80 and returning 200 status codes? 

The answer can’t be simpler: I will have it checked by Ansible!

A delayed assertion looks at the outcome and tests whether you can actually use it once all the individual steps have been carried out properly.

Why delayed? Because we know that all the steps do what they are supposed to, but we don’t know whether the purpose is met in the bigger picture. We have to delay the test of the bigger picture until after we have assembled the smaller bits and pieces.
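
One way to express that delay in a playbook is to put the assertion in post_tasks, which Ansible runs only after the roles (and any handlers they notified) have finished. A minimal sketch, assuming a hypothetical role named nginx and an inventory group named webservers:

```yaml
# site.yml: hypothetical playbook showing where a delayed assertion can live.
- hosts: webservers          # assumed inventory group name
  become: true
  roles:
    - nginx                  # assumed role that installs and configures Nginx
  post_tasks:
    # Runs after the role has assembled all the smaller pieces.
    - name: Check that something is listening on port 80
      ansible.builtin.wait_for:
        host: localhost
        port: 80
        timeout: 10
```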
