Is Your SAP HANA Highly Available?

  

Pull out your SAP HANA Technical Operations Manual for the latest SPS09 release and you will find section 2.4 about Monitoring the SAP HANA Database, and section 2.9 discussing High Availability for SAP HANA at a very shallow depth. After reading them, are you confident that your HANA setup is well-monitored and highly available enough to go into Production? If the answer is NO, then you've got some work to do before that in-memory investment can go live.

Let's start with Monitoring. Here are your choices from SAP:

  1. SAP HANA Studio
  2. SAP HANA Cockpit from a Web browser (essentially SAP DBA Cockpit)
  3. SAP Management Console
  4. SAP Solution Manager technical monitoring
  5. SAP Landscape and Virtualization Manager (LVM)

Basic monitoring should provide information on your landscape (hosts, services, detail/coordinator type, status), performance utilization, and alerts. I have set up and played with all of the above. Some, like HANA Studio or the Management Console, work right out of the box but are not enterprise solutions. Others, like Solution Manager and LVM, can be time-consuming to set up and maintain (see my other article, Thoughts on SAP LVM). Most IT operations teams are not going to use Solution Manager for enterprise-wide monitoring; it's just too much to manage!

SAP HANA Studio Administration Console - Landscape Monitor
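
As a minimal sketch of what basic landscape monitoring boils down to, the snippet below flags any service that is not reported as active. It assumes rows shaped like the result of a query against the M_SERVICES system view (HOST, SERVICE_NAME, COORDINATOR_TYPE, ACTIVE_STATUS) and treats anything other than 'YES' as a problem. The host names and sample rows are made up for illustration, and fetching the rows from a live system with a SQL client is left out.

```python
from typing import NamedTuple

# Query one might run against the M_SERVICES system view.
# Column names are an assumption; check them against your release.
LANDSCAPE_SQL = (
    "SELECT HOST, SERVICE_NAME, COORDINATOR_TYPE, ACTIVE_STATUS "
    "FROM SYS.M_SERVICES"
)


class ServiceRow(NamedTuple):
    host: str
    service: str
    coordinator_type: str
    active_status: str


def inactive_services(rows):
    """Return the rows whose ACTIVE_STATUS is anything other than 'YES'."""
    return [r for r in rows if r.active_status.upper() != "YES"]


# Illustrative rows only; not taken from a real system.
sample_rows = [
    ServiceRow("hana01", "nameserver", "MASTER", "YES"),
    ServiceRow("hana01", "indexserver", "MASTER", "YES"),
    ServiceRow("hana02", "indexserver", "SLAVE", "NO"),
]

for row in inactive_services(sample_rows):
    print(f"ALERT: {row.service} on {row.host} is not active ({row.active_status})")
```

Whatever tool you choose should be able to answer this same question continuously and route the result to your operations team.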

If you can get some combination of the above to work together for your organization, you have a head start. Otherwise, look for a third-party solution and worry about running your HANA environment instead of the monitoring solution.

Once you have monitoring taken care of, how do you test your HANA scenario, whether it's a scale-out or scale-up deployment, for operational robustness during a failure? Sure, you have the best hardware redundancy and operating system setup, but that doesn't mean HANA services or servers won't fail. Here is a brief set of HA scenarios you need to test. For each one, document expected versus actual results, and correct anything that does not meet your service level or operational objectives.

  1. Shut off a single instance of a system (whether you have one or multiple HANA instances on a cluster)
  2. Shut off one node of a system (if you have multiple nodes in a scale-out cluster)
  3. Restart one of the physical servers of the system
  4. Failover of a HANA Master node
  5. Failover of a HANA Slave node
  6. Kill the STATISTICS service on a system
  7. Kill the INDEXSERVER service on a system
  8. Kill the NAMESERVER service on a system
  9. Kill the DAEMON service on a system
  10. Fill up the memory on one node by creating a rogue memory hog
  11. Fill up one of the HANA data volumes by populating a table with dummy data
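
One way to keep the expected-versus-actual bookkeeping honest is to record each scenario in a small structure and let a script list the mismatches. This is only a sketch: the `ScenarioResult` class and the outcome strings are hypothetical placeholders, not results from a real system, and your own expected outcomes should come from your service level objectives.

```python
from dataclasses import dataclass


@dataclass
class ScenarioResult:
    name: str
    expected: str      # outcome agreed on before the test
    actual: str = ""   # outcome observed during the test

    @property
    def passed(self) -> bool:
        # Case-insensitive comparison, ignoring surrounding whitespace.
        return self.actual.strip().lower() == self.expected.strip().lower()


def failed_scenarios(results):
    """Return the scenarios whose actual outcome differs from the expected one."""
    return [r for r in results if not r.passed]


# Placeholder outcomes for illustration only.
results = [
    ScenarioResult(
        "Kill the INDEXSERVER service",
        expected="service restarted, alert raised",
        actual="service restarted, alert raised",
    ),
    ScenarioResult(
        "Failover of a HANA Master node",
        expected="standby took over, alert raised",
        actual="standby took over, no alert",
    ),
]

for r in failed_scenarios(results):
    print(f"FAILED: {r.name}: expected '{r.expected}', got '{r.actual}'")
```

A spreadsheet does the same job; the point is that every scenario has an expected result written down before the test, and an untested scenario (empty `actual`) shows up as a failure.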

As you test these scenarios (and I'm sure you can come up with many variations to suit your needs), you must ensure that your monitoring solution detects the failures and handles the alerts and notifications appropriately.
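
To check detection rather than assume it, you can compare the time you injected a fault with the time the first alert arrived. The sketch below assumes you can export alert timestamps from whatever monitoring tool you use; the function names, the timestamps, and the five-minute limit are all illustrative.

```python
from datetime import datetime, timedelta


def detection_delay(fault_time, alert_times):
    """Return the delay to the first alert at or after fault_time, or None."""
    later = [t for t in alert_times if t >= fault_time]
    return min(later) - fault_time if later else None


def detected_in_time(fault_time, alert_times, limit):
    """True if an alert arrived within `limit` of the injected fault."""
    delay = detection_delay(fault_time, alert_times)
    return delay is not None and delay <= limit


# Illustrative timestamps only.
fault = datetime(2015, 1, 1, 10, 0, 0)
alerts = [
    datetime(2015, 1, 1, 9, 59, 0),    # older alert, unrelated to this test
    datetime(2015, 1, 1, 10, 2, 30),   # first alert after the fault
]

print("Detection delay:", detection_delay(fault, alerts))
print("Within 5 minutes:", detected_in_time(fault, alerts, timedelta(minutes=5)))
```

A scenario where no alert ever arrives returns `None` for the delay and counts as a monitoring failure, even if HANA itself recovered.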

Please plan your operational readiness for SAP HANA adequately!

What's been your experience with the effort, time, and cost of getting SAP HANA production-ready?

For HANA availability monitoring using our best-practices template, try our SaaS solution.