Assess performance impact of Spectre & Meltdown patches using vRealize Operations Manager


With this article, I wanted to share some quick tips and tricks which can help you manage the performance of your workloads while you go through the patching process for Spectre & Meltdown vulnerabilities. Before my recommendations, here are a few things which you should know about Spectre & Meltdown:

 

What are Spectre & Meltdown?

“Meltdown and Spectre exploit critical vulnerabilities in modern processors. These hardware vulnerabilities allow programs to steal data which is currently processed on the computer. While programs are typically not permitted to read data from other programs, a malicious program can exploit Meltdown and Spectre to get hold of secrets stored in the memory of other running programs. This might include your passwords stored in a password manager or browser, your personal photos, emails, instant messages and even business-critical documents.”

Reference – https://meltdownattack.com

What are impacted vendors recommending?

Since the vulnerabilities are exposed at the processor layer, almost every layer of the stack is affected. Be it hardware, operating systems, hypervisors or applications, each layer of the stack needs to be patched.

Multiple vendors impacted by these vulnerabilities have announced product patches so far, and this website has done a great job of tracking them. VMware advisories can be found at this link.

Patching recommendations from the various vendors are continuously being updated, so it is important to contact your vendors to get the level of detail required before taking any action.

 

What would be the impact of patching?

Based on advisories from multiple operating system vendors, the patches affect the speculative execution capabilities of the processor. In a shared infrastructure, overcommitting CPU is a common practice, based on the underlying assumption that not every workload will peak at the same time. With these patches, however, almost every patched OS is expected to show increased CPU utilization because of slower processing times. Taken together, this sudden growth in utilization may affect the overall performance of your workloads. It could also mean that you run out of capacity on clusters where you thought you had just enough.
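To make the capacity concern concrete, here is a minimal sketch of the arithmetic. The cluster names, capacity figures and the 10% uplift are all hypothetical placeholders, not measured values; substitute the numbers you observe in your own environment.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    capacity_ghz: float     # usable CPU capacity of the cluster
    peak_demand_ghz: float  # observed peak CPU demand before patching


def projected_shortfall(cluster: Cluster, uplift: float) -> float:
    """Return the CPU shortfall in GHz after an assumed demand uplift.

    `uplift` is a fraction (0.10 means 10% more CPU demand after patching).
    A result of 0.0 means the cluster still fits within its capacity.
    """
    projected = cluster.peak_demand_ghz * (1.0 + uplift)
    return max(0.0, projected - cluster.capacity_ghz)


# Hypothetical clusters: one with headroom, one that is already tight.
clusters = [
    Cluster("prod-a", capacity_ghz=400.0, peak_demand_ghz=300.0),
    Cluster("prod-b", capacity_ghz=400.0, peak_demand_ghz=380.0),
]

for c in clusters:
    shortfall = projected_shortfall(c, uplift=0.10)  # 10% is an assumption
    print(f"{c.name}: shortfall {shortfall:.1f} GHz")
```

A cluster that looked comfortable at 95% of capacity ends up short once the same uplift is applied to every workload at once, which is the scenario described above.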

 

How to be proactive?

At this juncture, you need to strike a balance by deciding which areas of your environment require security more than performance, and your patching strategy should incorporate the following recommendations:

 

How can vRealize Operations Manager help?

While it is obvious that you cannot stop this increase in utilization if you choose to patch the vulnerability, with vRealize Operations Manager you can track the following key areas of your environment:

 

The following out of the box dashboards and features of these products can help you track the impact of these patches.

 

Note – All these capabilities are available with vRealize Operations 6.6.1 Standard Edition and above. For customers with vRealize Operations Advanced Edition, please download the Spectre & Meltdown dashboard kit mentioned at the end of this post.

If you are not entitled to a vRealize Operations license, you can download the evaluation version of the product and use these capabilities for at least 60 days to prepare for the performance and capacity impact of the Spectre & Meltdown patches.

a) During patching, track the BIOS and ESXi version of each host using “Host Configuration” Dashboard

 

 

b) After a VM is patched, check if performance is impacted using “VM Utilization” Dashboard

 

 

c) Track the growth of CPU and Memory Usage per cluster using “Capacity Overview” Dashboard

 

 

d) Find the heavy hitter VMs using “Heavy Hitters” Dashboard

 

 

e) Use “Projects” to run a What-if Scenario to model added growth and see the Capacity Shortfall for each cluster.

 

f) For large customers with multiple clusters of same purpose, relieve the CPU and Memory pressure by using “Workload Balance” to move VMs from Stressed Clusters to Underused Clusters.

 

g) Power off idle VMs to reduce CPU and Memory pressure with “Capacity Reclaimable” Dashboard.
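If you export CPU usage samples for a VM from before and after patching, the check in step b) above comes down to a simple comparison of averages. The sketch below assumes you already have the samples as plain lists of CPU usage percentages; the sample values are made up for illustration, and the function name is my own rather than anything from the product.

```python
from statistics import mean
from typing import Sequence


def cpu_change_percent(before: Sequence[float], after: Sequence[float]) -> float:
    """Relative change in mean CPU usage, in percent of the pre-patch mean."""
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("baseline mean CPU usage is zero")
    return (mean(after) - baseline) / baseline * 100.0


# Hypothetical CPU usage samples (%) for one VM, before and after patching.
before_patch = [40.0, 42.0, 38.0, 40.0]
after_patch = [44.0, 46.0, 42.0, 44.0]

print(f"CPU usage change: {cpu_change_percent(before_patch, after_patch):.1f}%")
```

Compare like with like: use sample windows that cover the same kind of workload period (for example the same weekday and hours), otherwise the change reflects the workload rather than the patch.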

 

Special Spectre & Meltdown Custom Dashboards Kit (Needs vRealize Operations Manager Advanced Edition or above)

 

vRealize Operations Manager Advanced Edition allows you to create powerful custom dashboards. Three VMware employees, Iwan Rahabok, Mark Scott and Luciano Gomes, have used this functionality to create a small library of three custom dashboards that focus specifically on the configuration, performance, capacity and utilization of your virtual infrastructure. I am happy to share that you can download and import these dashboards into your environment to get a dedicated set of dashboards for this purpose. Here is what you will get after the import:

 

 

 

 

Performance Monitoring Dashboard

 

This dashboard helps you track the CPU and Memory usage of your environment at all levels. You can see the impact of patching on the entire environment, vSphere Clusters, ESXi Hosts and every virtual machine. This helps you keep tabs on the increase and allows you to take precautions before you hit the high watermark.

 

 

Guest Operating System Patching

vRealize Operations Manager keeps track of the usage of your virtual machines. As recommended, it is a good idea to phase the rollout of guest OS patches so that you can steadily measure their impact on overall utilization. With vRealize Operations, you can pull up a list of idle VMs, which should be your first target for patching. The heavy hitters, which run high on CPU usage most of the time, should be patched last and with more caution. This dashboard helps you plan by providing that list based on the current and historical workload behavior of these virtual machines.
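The ordering described above (idle VMs first, heavy hitters last) can be sketched as a simple sort into patching waves. This is only an illustration: the VM names, the average CPU figures and the wave size are hypothetical, and in practice the list would come from the dashboard rather than a hard-coded dictionary.

```python
from typing import Dict, List


def patch_waves(avg_cpu_by_vm: Dict[str, float], wave_size: int) -> List[List[str]]:
    """Group VMs into patching waves, least busy first.

    VMs are ordered by average CPU usage (ties broken by name), then split
    into consecutive waves of at most `wave_size` VMs each.
    """
    if wave_size < 1:
        raise ValueError("wave_size must be at least 1")
    ordered = sorted(avg_cpu_by_vm, key=lambda vm: (avg_cpu_by_vm[vm], vm))
    return [ordered[i:i + wave_size] for i in range(0, len(ordered), wave_size)]


# Hypothetical average CPU usage (%) per VM over the observation period.
usage = {"web01": 55.0, "idle01": 1.0, "db01": 90.0, "test02": 3.0}

for number, wave in enumerate(patch_waves(usage, wave_size=2), start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Patching in small waves like this leaves time between waves to measure the utilization change before the busiest VMs are touched.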

 

 

vSphere Patching

You need to carefully plan the patching of ESXi hosts and the virtual machine hardware versions. While the recommendations are available in the VMware advisory, this kit comes with a dashboard that helps you track which ESXi hosts have been patched, along with the virtual machine hardware versions. Using this custom dashboard, you can track the ESXi build numbers you should be on after patching and the virtual machine hardware versions at the same time.
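The progress tracking this dashboard gives you amounts to comparing each host's build number with the build you expect after patching. A minimal sketch of that comparison follows. It assumes all hosts are on the same ESXi release line, so that a higher build number means a newer patch level; the host names and build numbers are placeholders, not real ESXi builds, so take the actual target build from the VMware advisory.

```python
from typing import Dict, List


def unpatched_hosts(host_builds: Dict[str, int], target_build: int) -> List[str]:
    """Return the hosts whose build number is below the target, sorted by name."""
    return sorted(host for host, build in host_builds.items() if build < target_build)


# Placeholder values for illustration only (not real ESXi build numbers).
TARGET_BUILD = 2000
host_builds = {"esx01": 1000, "esx02": 2000, "esx03": 1500}

remaining = unpatched_hosts(host_builds, TARGET_BUILD)
print(f"{len(remaining)} of {len(host_builds)} hosts still to patch: {remaining}")
```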

 

 

 

How do I get this kit? – Download the kit from this link along with the instructions on how to import these dashboards.

 

Note – Since this topic is still evolving, it is a good idea to keep checking your vendors' advisories for updates.

 

The post Assess performance impact of Spectre & Meltdown patches using vRealize Operations Manager appeared first on VMware Cloud Management.
