Ansible DTAP – Development, testing, acceptance and production

Quis custodiet ipsos custodes?

A majority of Ansible use cases are in application deployment and continuous delivery, a job at which Ansible truly excels. But when using Ansible for such mission-critical things, an age-old question might arise:
Who is going to guard the guardians?
In other words, how can we claim continuous delivery and super-cool automated deployments if the Ansible scripts themselves don’t pass through the same process?
In my previous post, Testing Ansible, I identified four steps in testing our Ansible scripts.
This DTAP process should give an overview and a loosely coupled framework for getting our Ansible code into production.

Development – “Ground zero”

The most important thing about development is to be completely fearless about making errors and to make errors completely and easily reversible.

To achieve that comfort in development, a virtual environment is a must, be it a container, a VPS, or a VM: whatever suits the project you are developing. My recommended development environment is HashiCorp’s Vagrant.
Vagrant gives you the comfort of bringing up multiple virtual machines in a single environment, properly abstracting the production infrastructure you might have.
When the development is finished, the following tests need to be run:

  1. Syntax test – did we write code or did we write gibberish?
  2. Dry run – are all the prerequisites for the configuration changes present?
  3. Run the scripts – will the script actually run?
  4. Idempotency test – will I do any harm by running the configuration script again on an already configured machine?

It’s important to note that these tests can be fully automated and take almost no time to run – ideal for development.
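As a rough illustration of how the four checks could be automated, here is a minimal Python sketch. It assumes `ansible-playbook` is on the PATH and uses hypothetical file names (`site.yml`, `inventory`); the idempotency check simply looks for `changed=0` on every host line of the second run's PLAY RECAP, which is one common convention rather than the only way to do it.

```python
import re
import subprocess

# Hypothetical file names; adjust them to the project layout.
PLAYBOOK = "site.yml"
INVENTORY = "inventory"


def build_commands(playbook=PLAYBOOK, inventory=INVENTORY):
    """Return the four checks as (name, argument list) pairs, in order."""
    base = ["ansible-playbook", "-i", inventory, playbook]
    return [
        ("syntax", base + ["--syntax-check"]),
        ("dry run", base + ["--check"]),
        ("run", base),
        ("idempotency", base),  # second real run; must report no changes
    ]


def is_idempotent(output):
    """True when every host line in the PLAY RECAP reports changed=0."""
    counts = re.findall(r"\bchanged=(\d+)", output)
    return bool(counts) and all(int(count) == 0 for count in counts)


def run_checks(runner=subprocess.run):
    """Run the checks in order; return (ok, name of the failed check)."""
    for name, cmd in build_commands():
        result = runner(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            return False, name
        if name == "idempotency" and not is_idempotent(result.stdout):
            return False, name
    return True, None
```

Calling `run_checks()` inside the development environment returns `(True, None)` when everything passes, or `False` together with the name of the first check that failed.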

Testing – “The first trials”

Delayed assertion

The development is finished – great! We’re a quarter of the way to production.
We are left with one important test mentioned in Testing Ansible: Delayed assertion.
Delayed assertion is just you writing more code to accurately test if all conditions required by the feature are met.
After running the smoke tests mentioned in the Development phase and running the Delayed assertion tests, we need to ask the authority to give us clearance for staging.
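To make the idea concrete, here is a small sketch of what such assertion code might look like. It assumes the feature under test is "a service answers on a TCP port"; the helper names (`port_is_open`, `run_assertions`) and the host and port in the usage note are made up for illustration and are not taken from the Testing Ansible post.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_assertions(checks):
    """Run every (name, check) pair and return the names that failed."""
    return [name for name, check in checks if not check()]
```

After the playbook has run, something like `run_assertions([("web server listening", lambda: port_is_open("192.168.33.10", 80))])` returns an empty list when every condition holds, and the names of the broken conditions otherwise.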

Authorization for staging

Our code is now swimming with the big fish: no more development comfort, ad-hoc changes and SSH sessions…

Once we ask the authority for permission to go to staging, we are in the rapids flowing to production, and everything from now on is fully automated.
The authority, in our case, is the CI Server.

The CI Server’s role in this step is to re-run all the steps done in development, which guards against a common cause of failure: a developer not testing properly.

The CI Server workflow:

  1. Code is committed to the repository.
  2. The needed VMs/VPSs are spun up exactly as in the development environment.
  3. Tests are run on those machines exactly as they are run in the development environment.
  4. If everything is ok – we are ready to get serious.
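
The workflow above could be scripted roughly like this. It is a sketch under assumptions: the machines are managed with Vagrant (`vagrant up`, `vagrant destroy -f`), and `./run_tests.sh` is a hypothetical wrapper around the development checks. A real CI server would normally express these steps in its own job configuration.

```python
import subprocess

# Assumed commands: Vagrant for the machines, and a hypothetical
# run_tests.sh wrapper around the checks from the development phase.
PROVISION = ["vagrant", "up"]
TESTS = ["./run_tests.sh"]
TEARDOWN = ["vagrant", "destroy", "-f"]


def ci_build(run=subprocess.call):
    """Spin the machines up, run the tests, and always tear them down."""
    try:
        for step in (PROVISION, TESTS):
            if run(step) != 0:
                return False  # stop at the first failing step
        return True
    finally:
        run(TEARDOWN)  # CI machines are disposable either way
```

`ci_build()` returns `True` only when both provisioning and the tests exit with status 0, which is the signal that the revision is ready for the next phase.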

Staging / Acceptance – “Getting serious”

We have the code, we have the proof that the tests are passing. Now comes staging.
The attitude we have towards the staging environment should be identical to the production environment – otherwise we simply haven’t set the stage well.
The only difference between the staging and the production environment is that no end users are using staging!
The same tests as in the previous step are run, but this time on an exact copy of the production infrastructure.
Depending on the CI tool you are using, this will be easier or harder to set up, but the ideal workflow should be the following:

  1. Run tests in the staging environment
  2. If the tests fail, mark the build as unsafe and don’t destroy the staging environment
  3. If the tests pass, mark the build as passing and destroy the staging environment

 

Why don’t we destroy the staging environment when the tests are failing, but quickly dispose of it when they pass?
Simply because we want to keep access to the environment that failed to provision normally, to gather data on the failure and to make sure we avoid it in the next build. Marking the build as unsafe in this case simply means that this specific revision CANNOT end up in the production environment. No excuses.
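
The staging rules can be captured in a few lines. This is only a sketch: the `builds` dictionary stands in for whatever build status the CI tool actually records, and the function names are invented for illustration.

```python
def record_staging_run(revision, tests_passed, builds):
    """Record the build status; return True if staging may be destroyed."""
    if tests_passed:
        builds[revision] = "passing"
        return True  # nothing left to investigate, dispose of the environment
    builds[revision] = "unsafe"
    return False  # keep the failed environment around for inspection


def may_deploy(revision, builds):
    """Only a revision that explicitly passed staging may reach production."""
    return builds.get(revision) == "passing"
```

Note that `may_deploy` also refuses revisions that never went through staging at all, not just the ones marked unsafe.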

Production – “The point of no return”

There isn’t much to say about production. I recommend visiting the List of religions and spiritual traditions on Wikipedia and picking who to pray to that nothing breaks. Once you stop praying that nothing breaks and start praying that the tests you have written have good coverage, you know you’re getting better. Once you stop praying even that the tests are OK, and leave the office immediately after deploying to production, you know that you are a sociopath who just likes to watch the world burn. Congratulations!

Recommendations

CI tool – I’m currently experimenting with Go CD which, after a short and troublesome experience with Jenkins, seems like a refreshing change.

Anti-concurrent deployments – this is a major issue when you’ve built an automated workflow from development to production. You don’t want people running deployments at the same time, because something will break, and it will break hard. If you can’t set up this kind of control in your CI tool, I recommend Etsy’s PushBot, an IRC bot that lets developers queue up for their turn to deploy.
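
If neither option is available, a crude lock file can at least refuse a second deployment started from the same machine. The sketch below is an assumption-laden illustration, not how PushBot or any CI tool works: the lock path is hypothetical, it only protects deployments launched from one host (or one shared filesystem), and a crashed deployment can leave a stale lock behind.

```python
import os
from contextlib import contextmanager


class DeploymentInProgress(RuntimeError):
    """Raised when somebody else already holds the deployment lock."""


@contextmanager
def deployment_lock(path):
    """Hold an exclusive lock file for the duration of a deployment."""
    try:
        # O_EXCL makes the creation fail if the file already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise DeploymentInProgress(path) from None
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(str(os.getpid()))  # record who holds the lock
        yield
    finally:
        os.remove(path)
```

Wrapping the deployment in `with deployment_lock("/tmp/deploy.lock"):` means a second attempt raises `DeploymentInProgress` instead of running alongside the first.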

Military-grade ACLs – you don’t want to trust anyone, not even yourself. Make access to each part of the workflow as granular as possible, wherever and whenever you can. A good practice would be to implement a sharded key, shared by multiple members of the team, for deploying changes to production after successfully passing the tests in the staging environment.
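
One simple way to picture a sharded key is an XOR split, where every holder must contribute a share before the deployment key can be rebuilt. This is an illustrative sketch of that all-or-nothing variant only; a scheme where any k of n holders suffice (for example Shamir's secret sharing) needs a proper library, and the function names here are made up.

```python
import secrets


def split_key(key, holders):
    """Split `key` (bytes) into `holders` shares; all are needed to rebuild."""
    if holders < 2:
        raise ValueError("a sharded key needs at least two holders")
    # All shares but the last are pure random noise of the key's length.
    shares = [secrets.token_bytes(len(key)) for _ in range(holders - 1)]
    last = key
    for share in shares:
        last = bytes(a ^ b for a, b in zip(last, share))
    return shares + [last]


def combine_shares(shares):
    """XOR all shares together to recover the original key."""
    key = bytes(len(shares[0]))  # all-zero starting value
    for share in shares:
        key = bytes(a ^ b for a, b in zip(key, share))
    return key
```

With `shares = split_key(deploy_key, 3)`, each of three team members keeps one share, and `combine_shares(shares)` yields the key only when all three are present.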
