What to monitor and why it matters

In a previous post I made it clear that I am no believer in enterprise systems management.

In most cases ESM is nothing more than IT monitoring which has little relevance.

ESM adds no value yet it remains central to most modern IT strategies – the industry must do better.

Applications must be developed to address both functional and non-functional requirements; this means instrumenting the code. Developers increasingly insert instrumentation that helps search engines, data scientists and marketing, so why are they not routinely coding for operations?
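As a rough illustration of what "coding for operations" could look like, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the `instrumented` decorator, the in-memory `METRICS` store and the `place_order` function are hypothetical names, and a real system would send these figures to whatever metrics backend operations actually uses.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory store, keyed by "service.operation".
# A real deployment would forward these figures to a metrics backend.
METRICS = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def instrumented(service, operation):
    """Record call count, error count and elapsed time for one operation."""
    key = f"{service}.{operation}"

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                # Count the failure, then let the caller see the exception.
                METRICS[key]["errors"] += 1
                raise
            finally:
                # Runs on success and on failure.
                METRICS[key]["calls"] += 1
                METRICS[key]["total_seconds"] += time.perf_counter() - start

        return wrapper

    return decorator


@instrumented("checkout", "place_order")
def place_order(basket):
    """Hypothetical business function, here only to show the decorator in use."""
    if not basket:
        raise ValueError("empty basket")
    return {"items": len(basket)}
```

The point of the design is that the measurements are named after the service and the business operation, not after a server or a process, so operations can reason about the service without knowing where the code runs.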

Few people in Operations challenge the norm, and fewer still have developed the concept of a service or have direct exposure to customers. My approach is to monitor only services that have an SLA, and an SLA should be about the performance of a service, not a single IT object.

We have been through a cycle of in-sourced, out-sourced, off-shore and right-shored IT operations, and they all have their merits.

Surely we are at a point where our operations approach should mature: an industrial revolution for IT in which automation, machine learning and autonomics drive ALL routine IT functions. The motor industry managed this transition, so what is stopping IT?

My guidance :-

  1. Monitor the service [end-user experience that abstracts all IT]
    1. the experience of the user is key
    2. deviation from business norms is a good metric
    3. be careful of the calendar and of statistical outliers
  2. Present service health using language and with pictures that make sense to the business
  3. Ensure your services are properly described [with meta-data]
    1. enforce this rigour when developing|enhancing new services
    2. use a discovery methodology or maybe tool to determine what makes a service
    • this might make it sound like I am an advocate of the enterprise service bus; I am, but I have yet to see one implemented [not implemented properly, simply implemented], so I retain some skepticism
  4. Alarm at the service-tier only
  5. Remove opportunities for prima donnas
    1. develop operating procedures for common fixes
    2. automate [guided|fully] these procedures
  6. Integrate the change/release/testing cycles
    1. no change without a ticket, no exceptions
    2. forensically assess the impact of a release and an incident
    3. a change to the service demands that the service map is updated [automatically]
    4. expose fools and frauds
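
Point 1 of the guidance above (deviation from business norms, with care for the calendar and for statistical outliers) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a recommended implementation: the function names are hypothetical, the business metric is assumed to be a simple number such as orders per hour, the "calendar" is reduced to weekday and hour of day, and the median and median absolute deviation (MAD) are my choice of outlier-resistant statistics. The 0.6745 factor is the usual scaling that makes the MAD-based score comparable to a z-score for normally distributed data.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def build_baseline(history):
    """Group historical (timestamp, value) samples by (weekday, hour)."""
    buckets = defaultdict(list)
    for ts, value in history:
        buckets[(ts.weekday(), ts.hour)].append(value)
    return buckets


def deviation_score(buckets, ts, value, min_samples=4):
    """Score how far a value sits from the norm for its calendar slot.

    Returns None when the slot has too little history to judge.
    """
    samples = buckets.get((ts.weekday(), ts.hour), [])
    if len(samples) < min_samples:
        return None
    centre = median(samples)
    # The median absolute deviation is barely moved by a single freak sample.
    mad = median([abs(s - centre) for s in samples])
    if mad == 0:
        return 0.0 if value == centre else float("inf")
    return 0.6745 * (value - centre) / mad
```

A service-tier alarm would then fire only when the absolute score passes a threshold agreed with the business (3.5 is a commonly quoted starting point for this kind of score), and a slot that returns None would be treated as "not enough history" and not as a fault.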

I must|might share my thoughts on enterprise service buses

ESM: It’s never worked for me!

I was an early adopter, installing Tivoli way back in the mid-90s.

I have to say that, after lots of effort, the promise of enterprise systems management was never achieved. While ESM is ingrained in most enterprises, I wonder how many have achieved the goals that they established.

Even with advances in products from HP and IBM, and the change from scripts and thresholds (including dynamic ones), the space has moved on little and success is still painfully elusive.

With the coming of the Cloud can a static and largely human-driven approach to monitoring ever be successful? Obviously it cannot – so what is the answer?

Does the Cloud and/or virtualisation change the game? Is monitoring actually necessary and, if it is, can an analytics-based approach solve the problem?

Perhaps the answer we are looking for is mixed into the DevOps scene; all too often IT management (monitoring) is an afterthought that is dealt with by the operations guys.

I will be focusing on an approach and techniques to build management into the design and delivery processes to see if we can actually move away from the old modes of operation and create intelligent systems in future projects.

Watch this space to see if this is successful or not!