Testing your automation

I was on a conference call the other day and asked why the team was applauding in the background. The answer prompted me to write this: I recognised that a deploy is still fraught and laden with unnecessary angst.

It’s time for an admission!

A long time ago I was part of a team that spent six months automating the delivery of a major platform for a client.

We automated everything, tested it and waited for deployment day.

So far, so good…

We tested every unit, tested everything in isolation, then integration tested in a range of pre-production environments.

But it went horribly wrong on release day!

Why?

Storage, security, authentication, routing, name services, timing and perhaps just poor planning or bad luck.

Actually, it was a combination of these things, all of which I should have foreseen.

To name just a few of the root causes:
1) firewalls
2) routing and name service deltas
3) latency
4) user permissions.

It all sounds simple and straightforward, but each of these caused problems, as did perhaps 20 other things. Our unit and integration tests did not protect us.

The solution: we developed a framework called probes. These were non-destructive tests, some as simple as a traceroute or ping, run repeatedly in the weeks, days and hours before deployment, and again immediately before and after it, to help determine readiness and identify failure. We choreographed the probes using iConclude, now HP Operations Orchestration, in a matter of days, and it was one of the most pleasing aspects of our project.
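To give a flavour of what a probe can look like, here is a minimal sketch in Python. It is not the original framework, which was built on iConclude. The names `ProbeResult`, `tcp_probe` and `dns_probe` are invented for illustration, and the checks assume that simply opening a TCP connection or resolving a name is harmless in your environment.

```python
import socket
import time
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """Outcome of one probe (hypothetical structure, for illustration)."""
    name: str
    ok: bool
    detail: str
    elapsed_ms: float


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> ProbeResult:
    """Check that a TCP connection can be opened; sends no data."""
    name = f"tcp:{host}:{port}"
    start = time.monotonic()
    try:
        # Open and immediately close the connection. A blocked firewall
        # port or a missing route shows up here as an OSError or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            ok, detail = True, "connected"
    except OSError as exc:
        ok, detail = False, f"{type(exc).__name__}: {exc}"
    elapsed_ms = (time.monotonic() - start) * 1000
    return ProbeResult(name, ok, detail, elapsed_ms)


def dns_probe(hostname: str) -> ProbeResult:
    """Check that a hostname resolves from where the probe runs."""
    name = f"dns:{hostname}"
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(hostname, None)
        # Each entry ends with a sockaddr tuple whose first item is the address.
        addresses = sorted({info[4][0] for info in infos})
        ok, detail = True, ", ".join(addresses)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        ok, detail = False, f"{type(exc).__name__}: {exc}"
    elapsed_ms = (time.monotonic() - start) * 1000
    return ProbeResult(name, ok, detail, elapsed_ms)
```

The elapsed time on each result gives a rough latency reading, so the same probe can flag a connection that works but is slower in production than it was in pre-production.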

Any time we needed to check status, we ran the probe suite and validated our tests. Prior to the deploy, we ran the probe suite and looked for red alarms. No alarms meant all pre-conditions were met. Run at deployment time, the probe suite also raised and closed tickets in the release system and updated the monitoring tools. Deployment went down from 48 hours of overtime to minutes on a normal day.
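The "run the suite and look for red alarms" step can be sketched in a few lines of Python. This is an illustration under assumptions, not what we ran: `run_suite` and `SuiteReport` are made-up names, each probe is assumed to be a callable that returns True when healthy, and the `on_alarm` callback merely stands in for whatever raises a ticket or updates a monitoring tool in your setup.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple

# A probe here is just a name plus a zero-argument check (an assumption).
Probe = Tuple[str, Callable[[], bool]]


class SuiteReport(NamedTuple):
    ready: bool         # True only when no probe raised an alarm
    alarms: List[str]   # names of the probes that failed


def run_suite(
    probes: List[Probe],
    on_alarm: Optional[Callable[[str], None]] = None,
) -> SuiteReport:
    """Run every probe and collect the red alarms."""
    alarms: List[str] = []
    for name, check in probes:
        try:
            passed = bool(check())
        except Exception as exc:
            # A probe that crashes counts as a red alarm, not a suite failure.
            passed = False
            name = f"{name} ({type(exc).__name__})"
        if not passed:
            alarms.append(name)
            if on_alarm is not None:
                on_alarm(name)  # hypothetical hook: ticketing, monitoring
    return SuiteReport(ready=not alarms, alarms=alarms)


if __name__ == "__main__":
    report = run_suite(
        [("always-green", lambda: True), ("always-red", lambda: False)],
        on_alarm=lambda name: print(f"ALARM: {name}"),
    )
    print("ready to deploy" if report.ready else "not ready")
```

Every probe runs even after one fails, so a single pass shows all the outstanding problems rather than just the first.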

Sounds simple. It was simple!

As I look at the current state of DevOps I cannot see anything as elegant as this being developed.

I think there is huge value in this approach. Perhaps people are already doing it, so I will continue to look and hope to see it soon.

Oh, and by the way, the probes were also our friends in finding problems with the solution once it had gone live.
