What to monitor and why it matters

In a previous post I made it clear that I am no believer in enterprise systems management (ESM).

In most cases ESM is nothing more than IT monitoring which has little relevance.

ESM adds no value yet it remains central to most modern IT strategies – the industry must do better.

Applications must be developed to address both functional and non-functional requirements, and that means instrumenting the code. Developers increasingly insert instrumentation that helps search engines, data scientists and marketing, so why are they not routinely coding for operations?
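
As a rough illustration of what "coding for operations" could look like, here is a minimal Python sketch. The `instrumented` helper, the record fields and the logger name are all my own assumptions for illustration, not an established API; the idea is simply that each business operation reports its own outcome and duration in a form operations can consume.

```python
import json
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name; route it to whatever operations actually reads.
logger = logging.getLogger("ops.instrumentation")


@contextmanager
def instrumented(service, operation, emit=None):
    """Time one business operation and emit a structured record for operations."""
    # Default sink: one JSON line per operation via the standard logging module.
    emit = emit or (lambda record: logger.info(json.dumps(record)))
    start = time.perf_counter()
    outcome = "ok"
    try:
        yield
    except Exception:
        # Record the failure, then let the exception propagate unchanged.
        outcome = "error"
        raise
    finally:
        emit({
            "service": service,
            "operation": operation,
            "outcome": outcome,
            "duration_ms": round((time.perf_counter() - start) * 1000, 3),
        })
```

A developer would wrap a unit of work in `with instrumented("online-banking", "transfer-funds"):` (both names are made up), and every call then yields a record tied to a service rather than to a server or a process.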

Few people in Operations challenge the norm, and fewer still have developed the concept of a service or have direct exposure to customers. My approach is to monitor only services that have an SLA, and an SLA should be about the performance of a service, not of a single IT object.

We have been through a cycle of in-sourced, out-sourced, off-shored and right-shored IT operations, and they all have their merits.

Surely we are at a point where our operations approach should mature: an industrial revolution for IT in which automation, machine learning and autonomics drive ALL routine IT functions. The motor industry managed this transition, so what is stopping IT?

My guidance :-

  1. Monitor the service [end-user experience that abstracts all IT]
    1. the experience of the user is key
    2. deviation from business norms is a good metric
    3. be careful of the calendar and of statistical outliers
  2. Present service health using language and with pictures that make sense to the business
  3. Ensure your services are properly described [with meta-data]
    1. enforce this rigour when developing new services or enhancing existing ones
    2. use a discovery methodology, or maybe a tool, to determine what makes up a service
    • this might make it sound like I am an advocate of the enterprise service bus; I am, yet I have still to see one implemented [not implemented properly, simply implemented], so I retain some skepticism
  4. Alarm at the service-tier only
  5. Remove opportunities for prima donnas
    1. develop operating procedures for common fixes
    2. automate [guided|fully] these procedures
  6. Integrate the change/release/testing cycles
    1. no change without a ticket, no exceptions
    2. forensically assess the impact of a release and an incident
    3. a change to the service demands that the service map is updated [automatically]
    4. expose fools and frauds
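
To make points 1.2 and 1.3 above a little more concrete, here is a minimal Python sketch of one way to measure deviation from a business norm while respecting the calendar and tolerating outliers. Everything in it is an assumption of mine: the weekday-and-hour slots, the use of the median and the median absolute deviation (a standard robust alternative to mean and standard deviation), the function names and the 3.5 threshold.

```python
import math
from collections import defaultdict
from statistics import median


def build_baseline(history):
    """Group historical (timestamp, value) samples by (weekday, hour) slot."""
    slots = defaultdict(list)
    for ts, value in history:
        slots[(ts.weekday(), ts.hour)].append(value)
    return slots


def deviation(slots, ts, value, min_samples=4):
    """Robust z-score of value against its calendar slot; None if history is thin."""
    samples = slots.get((ts.weekday(), ts.hour), [])
    if len(samples) < min_samples:
        return None
    centre = median(samples)
    # Median absolute deviation: a single freak sample barely moves it.
    mad = median(abs(s - centre) for s in samples)
    if mad == 0:
        if value == centre:
            return 0.0
        return math.copysign(math.inf, value - centre)
    # 0.6745 scales the MAD so the score is comparable to a normal z-score.
    return 0.6745 * (value - centre) / mad


def is_abnormal(score, threshold=3.5):
    """Treat a slot with too little history as 'unknown', not as an alarm."""
    return score is not None and abs(score) > threshold
```

Comparing Monday 09:00 only with previous Mondays at 09:00 stops the weekly rhythm of the business from looking like an incident, and the median-based score means one past outage does not permanently widen what counts as normal. Month-end and public holidays would need their own slots, which this sketch does not attempt.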

I must|might share my thoughts on enterprise service buses
