HOW THINGS USED TO BE:
Some of our readers may remember when SAP Basis was the one-stop shop for everything technical: SAP, database, OS, servers, storage, and yes, even network and security. Basis teams designed, built and operated, which meant we also scheduled jobs, monitored performance, and of course were on call for all things technical in SAP. Those days of working in Basis Operations may be long gone for large enterprises, where segregation of duties and specialization have followed from increased complexity and compliance requirements, but sometimes we wonder if those were the good old days when things were more efficient.
Today, most enterprise IT divisions comprise many technology silos and teams that function as shared services. IT Operations is one of those centralized shared services, and SAP Basis is separated from System Administrators, DBAs, etc. Everybody relies on IT Operations (sometimes referred to as the NOC, or Network Operations Center) to provide centralized monitoring and to share incident management with IT Helpdesks. Sounds rosy, right? It can be, but the law of unintended consequences often kicks in and things become really inefficient. ITSM (IT Service Management) specialists would know: people, processes and products must all be in place to make it operational. It's like a 3-legged stool; all three legs must be present for it to be functional.
I have been on both sides, Basis and Operations. Here are 7 telltale signs that an organization's IT Operations may not be operational, or at least not optimally so, along with why we think it's broken and what the typical fixes are.
7 TELLTALE SIGNS:
These are listed in no particular order of importance, and yes you can have your own list of telltale signs reflective of your organization and culture.
Sign 1: The monitoring plan does not make clear which services are monitored, how they are monitored, or what the documented responses to exceptions are
Why it's broke: Technical teams do not know what Operations is capable of monitoring, and in turn the Technical teams have not provided a minimum set of requirements for monitoring critical services
Fix: Operations should have some domain expertise about the environment and know the capabilities of its application monitoring tool, while Technical teams should already have a list of critical things to monitor, along with the frequency, appropriate thresholds, and exception handling
Sign 2: Lack of availability monitoring
Why it's broke: Operations should be the first to know when unplanned downtime occurs
Fix: Availability monitoring should be top priority, and when an outage is unplanned, appropriate actions should be taken according to predetermined criteria such as an availability percentage and a time window
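To make the "percentage and time window" criteria concrete, here is a minimal sketch of how such a check might look. The `Probe` record, the `planned` flag and the 99% threshold are all assumptions made for illustration; they do not come from any particular monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    minute: int            # minutes since the start of the time window (hypothetical)
    up: bool               # result of the availability check
    planned: bool = False  # True if the probe fell inside approved maintenance


def unplanned_availability(probes):
    """Percentage of probes outside planned maintenance that found the service up."""
    counted = [p for p in probes if not p.planned]
    if not counted:
        return 100.0  # nothing to judge, so nothing to escalate
    return 100.0 * sum(p.up for p in counted) / len(counted)


def should_escalate(probes, threshold_pct=99.0):
    """Escalate when availability over the window drops below the agreed threshold."""
    return unplanned_availability(probes) < threshold_pct
```

Probes taken during planned maintenance are left out of the calculation, so a scheduled downtime does not page anyone, while a single unplanned failure in a short window pushes the percentage below the threshold and triggers the escalation.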
Sign 3: Complaints of too many alerts
Why it's broke: When Operations gets flooded with alerts and doesn't know why, it tends to ignore them
Fix: The Technical team needs to filter what goes to Operations for monitoring and define the frequency of events that would indicate a potential problem. Operations should have the capability to de-duplicate repetitive events while still alerting on the main problem
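As an illustration of de-duplication, the sketch below raises one alert per source and check, then suppresses repeats until a quiet period has passed. The event tuple layout and the 300-second window are assumptions for the example, and real tools typically add correlation on top of this.

```python
def deduplicate(events, window_seconds=300):
    """Collapse repeated events into one alert per (source, check) per window.

    events: iterable of (timestamp_seconds, source, check), sorted by time.
    Returns the events that should actually be raised as alerts.
    """
    last_raised = {}  # (source, check) -> timestamp of the last alert raised
    alerts = []
    for ts, source, check in events:
        key = (source, check)
        prev = last_raised.get(key)
        # Raise on first sight, or once the suppression window has elapsed.
        if prev is None or ts - prev >= window_seconds:
            alerts.append((ts, source, check))
            last_raised[key] = ts
    return alerts
```

With a 300-second window, a disk alert that repeats every minute reaches Operations once every five minutes instead of five times, while a different check on the same system still gets through immediately.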
Sign 4: Auto-assignment of trouble tickets to Technical team without any first level diagnosis
Why it's broke: This is like the Helpdesk passing on every user complaint to the Technical team without first running a basic checklist to confirm whether there is an issue
Fix: Operations should have basic troubleshooting capability with the tools to do first level diagnosis before escalating as an incident to Technical team
Sign 5: Service levels are not being monitored, or sufficiently managed
Why it's broke: Service level management (SLM) provides metrics or KPIs on service health and quality objectives. Without SLM, application monitoring would be too noisy
Fix: Tools are essential to automate service level monitoring, alerting, and reporting. Tools enable proactive SLM rather than just historical reporting of why service levels were missed.
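One way to picture proactive SLM is as a downtime budget that is tracked during the reporting period instead of after it. The sketch below is only an illustration: the function names, the 75% warning level and the status labels are assumptions, not features of any specific SLM product.

```python
def downtime_budget_minutes(target_pct, period_minutes):
    """Downtime allowed in the period before the availability target is missed."""
    return period_minutes * (100.0 - target_pct) / 100.0


def slm_status(target_pct, period_minutes, downtime_so_far, warn_fraction=0.75):
    """Classify a service as ok, at_risk or breached against its downtime budget."""
    budget = downtime_budget_minutes(target_pct, period_minutes)
    if downtime_so_far > budget:
        return "breached"
    if downtime_so_far >= warn_fraction * budget:
        return "at_risk"  # warn while there is still time to act
    return "ok"
```

For a 99.5% target over a 30-day month (43,200 minutes), the budget works out to 216 minutes, so a service with 170 minutes of downtime would be flagged as at risk before the service level is actually missed.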
Sign 6: Lack of centralized enterprise monitoring
Why it's broke: Every technical team prefers its own set of tools, but it's essential to integrate them for centralized management and correlation
Fix: A 360-degree view of services across application and infrastructure would provide better root cause analysis and top-down service management
Sign 7: Us and Them
Why it's broke: Operations, IT Support, and SMEs are all critical parts of IT Service Management (ITSM). Operational excellence is a shared role and responsibility of all teams
Fix: IT Service Management should incorporate cross-functional DevOps to automate the data flow between traditional silos and to share information and intelligence, rather than keeping Sysadmins, DBAs, etc. segregated.
Whether you choose to be philosophical about these signs or prefer your own set of signs and remedies, the premise is that IT Ops is a critical part of any enterprise IT organization, without which the services delivered to end-users are poorly managed. So where do we go from here?
HOW THINGS OUGHT TO BE:
We can debate where the future of enterprise IT is heading, but one thing is for sure: with the proliferation of business services and apps, and with every kind of organization needing to innovate with software to be efficient, automation of IT Operations will be one of the major keys to high quality service delivery.
How is the health of your IT Operations?
Want to find out more about how we can help automate monitoring and service level management?