If you work in IT, you have no doubt heard over the past week about Meltdown and Spectre, the flaws affecting modern computers. In fact, as I prepare this article, I have an urgent System Update popup that I cannot dismiss, and I'm sure it's related to the chip flaw. According to the article Kernel-memory-leaking Intel processor design flaw forces Linux and Windows redesign, performance hits are expected. The effects are still being benchmarked, but some of our SAP monitoring already suggests an impact. We know from past work that Patches can Fix or Break SAP Performance. Let's dig deeper.
Expected Performance Hit
As the article cited above puts it, "a fundamental design flaw in Intel's processor chips has forced a significant redesign of the Linux and Windows kernels to defang the chip-level security bug ... these updates to both Linux and Windows will incur a performance hit on Intel products. The effects are still being benchmarked, however, we're looking at a ballpark figure of five to 30 percent slow down, depending on the task and the processor model".
How SAP Performance May Show Up
Below is a series of charts taken from a sample Production system showing CPU utilization, users, and average dialog response times, comparing recent days with a similar period last month, as monitored by IT-Conductor.
The CPU utilization of this series of SAP app servers initially caught our eye because it is higher than in any period last month. This could be attributed to new-year processing, but for now we'll assume the workload is the same as last month and address that point later.
The composite chart above shows average CPU utilization across all app servers more than 10% higher.
The charts below break the figures down by individual app server on a daily basis. Each is incrementally higher; not by much, but the increases fall in line with the 5-30% impact cited by earlier benchmarks.
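The comparison described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming daily CPU averages have been exported per app server; the server names, the sample values, and the helper names `percent_increase` and `within_cited_range` are hypothetical and are not taken from the monitored system or from IT-Conductor.

```python
from statistics import mean


def percent_increase(baseline, current):
    """Relative change of the current mean over the baseline mean, in percent."""
    base_avg = mean(baseline)
    if base_avg == 0:
        raise ValueError("baseline average is zero; relative change is undefined")
    return (mean(current) - base_avg) / base_avg * 100.0


def within_cited_range(increase_pct, low=5.0, high=30.0):
    """True if the increase falls inside the 5-30% ballpark quoted earlier."""
    return low <= increase_pct <= high


# Placeholder daily CPU utilization averages (%), NOT real measurements.
baseline_cpu = {"app01": [40.0, 42.0, 38.0], "app02": [50.0, 50.0, 50.0]}
current_cpu = {"app01": [44.0, 46.0, 42.0], "app02": [51.0, 52.0, 50.0]}

# Relative increase per app server, rounded to one decimal place.
report = {
    server: round(percent_increase(baseline_cpu[server], current_cpu[server]), 1)
    for server in baseline_cpu
}

for server, pct in report.items():
    note = "within" if within_cited_range(pct) else "outside"
    print(f"{server}: {pct:+.1f}% vs. baseline ({note} the 5-30% range)")
```

Note that this measures relative change (a move from 40% to 44% CPU counts as 10%), which is one reasonable reading of the benchmark figures; a percentage-point comparison would give different numbers.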
Average Dialog Response Times
SAP performance on transactional systems is typically baselined by average dialog response time. For the same system as above, the chart below shows the average to be at least 10% higher than in the same period last month. Tabular data was used to examine each day more closely.
What If It's Just More Workload?
The earlier question was "What if the higher CPU and response times are due to a higher workload for the new year?" Two simple measures of workload are the number of active users and the number of SAP dialog steps. We'll show just the user count by day below. With the exception of the last couple of days, it shows that the performance impact had already started several days earlier, when the user count was lower than in the same period last month. Ideally, we would compare against the same period last year, but this system has only been monitored for the last few months.
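The reasoning above, slower responses on days when the user count did not grow, can be expressed as a simple check. The sketch below is only an assumption-laden illustration: the `DayStats` type, the `slower_without_more_load` helper, and all sample numbers are hypothetical and do not come from the system discussed here.

```python
from dataclasses import dataclass


@dataclass
class DayStats:
    users: int            # active users that day
    avg_dialog_ms: float  # average dialog response time in milliseconds


def slower_without_more_load(baseline, current):
    """True when response time rose although the user count did not."""
    return (
        current.avg_dialog_ms > baseline.avg_dialog_ms
        and current.users <= baseline.users
    )


# Placeholder (label, last month, this month) pairs, NOT real measurements.
pairs = [
    ("day 1", DayStats(900, 500.0), DayStats(850, 560.0)),   # fewer users, slower
    ("day 2", DayStats(900, 500.0), DayStats(1000, 560.0)),  # more users, slower
]

# Days where extra workload cannot explain the slowdown.
flagged = [label for label, base, cur in pairs if slower_without_more_load(base, cur)]
print(flagged)
```

Days that end up in `flagged` are the ones where a higher user count alone does not account for the slowdown; the number of dialog steps could be substituted for the user count in the same way.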
Cloud-based Systems and Databases
AWS and Azure, two of the most dominant public cloud vendors for SAP, have issued accelerated patch schedules to address the issues. According to the AWS Processor Speculative Execution Research Disclosure, they "have not observed meaningful performance impact for the overwhelming majority of EC2 workloads". Yet our monitoring, shown below, clearly indicates a CPU utilization increase on our EC2 and database instances after we patched.
It's still early, and some industry analysts have said this deep chip-level flaw will have long-lasting repercussions. Our advice is to stay on top of patches as required to prevent possible security exploits. Equally important is to stay vigilant in monitoring and managing the performance impact on critical enterprise applications by following the SAP Performance Best Practices.