VMWare Monitoring Plug-in Technical Reference
Introduction
VMWare delivers the world’s most trusted virtualisation and cloud infrastructure solutions that accelerate IT transformation by reducing complexity and enabling more flexible, agile service delivery.
While virtualisation has tremendous benefits, it adds new complexity when it comes to managing your network. Virtual Machines (VMs) and their host machines need performance and availability monitoring, just like their physical server counterparts. The Geneos VMWare plug-in monitors the VMWare ESXi server by querying its web service API. By collecting key parameters from the VMWare host, Geneos users can correlate both host and guest health with the rich application data they collect.
Application Support teams need to have visibility of the health, performance and availability of their Virtual Machines (VMs) in order to be proactive and provide the best service to the business. When multiple layered operating systems are supporting their applications, it is just as critical to collect and analyze those metrics from guest and host to ensure the best possible performance to an end user.
The VMWare Monitoring plug-in provides Application Support teams with a view of the entire VMWare environment, drill-down details to identify root causes, and out-of-the-box alerts so they can act and fix problems in time, all within the Geneos solution.
The VMware monitoring plug-in has two basic types of dataviews:
- Info View — views based on managed objects, such as the virtual machine or the host. This view shows the summary or status information.
- Monitor View — views based on performance counters. The real-time sampling period is 20 seconds.
Technology
The VMWare plug-in is a Java process that uses the VMWare vSphere API to continuously monitor a VMWare host and associated virtual machines, delivering the real-time monitoring into the Geneos framework using the XML-RPC interface.
Architecture
The VMWare plug-in integrates with your existing architecture. You can connect the plug-in to existing gateways to allow you to correlate VMWare monitoring information with monitoring from the application level that runs on this virtual infrastructure.
Prerequisites
The following requirements must be met prior to the installation and setup of the template:
- Geneos XML-RPC API token.
- VMWare Solution package with dependent libs (these are included in the lib subdirectory).
- VMWare plug-in licence: VMWareMonitor.lic. This requires an additional licence to use. Please contact your ITRS Account Manager for more information.
Java requirements
You must have Java installed on the machine running the Netprobe. For information on supported Java versions, see Java support in 5.x Compatibility Matrix.
Installation
Sampler
Set up a sampler. This is set up as an API plug-in.
- Set the name to “Cluster”. If you wish to change the name, make sure that this value is used in the VMWareMonitor.properties file.
- Set the plugin type to API.
<sampler name="Cluster">
<plugin>
<api></api>
</plugin>
</sampler>
Managed Entity
Set up a managed entity that joins the probe and the sampler.
- Set the name to “VMWare”. If you wish to change the name, make sure that this value is used in the VMWareMonitor.properties file.
- Set Options to probe, and select the probe you set up in Netprobe.
- Reference the sampler you set up in Sampler.
<managedEntity name="VMWare">
<probe ref="VMWare probe"></probe>
<sampler ref="Cluster"></sampler>
</managedEntity>
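As a quick consistency check on the two fragments above, the following Python sketch parses them with the standard `xml.etree.ElementTree` module and confirms that the managed entity references a sampler that is actually defined. The `<config>` wrapper element and the function name `unresolved_sampler_refs` are assumptions made only so the fragments parse together; they do not reflect the real gateway setup file layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element; the real gateway setup file is laid out differently.
FRAGMENT = """
<config>
  <sampler name="Cluster">
    <plugin>
      <api></api>
    </plugin>
  </sampler>
  <managedEntity name="VMWare">
    <probe ref="VMWare probe"></probe>
    <sampler ref="Cluster"></sampler>
  </managedEntity>
</config>
"""

def unresolved_sampler_refs(xml_text):
    """Return sampler refs used by managed entities that no sampler defines."""
    root = ET.fromstring(xml_text)
    # Top-level <sampler name="..."> elements are the definitions.
    defined = {s.get("name") for s in root.findall("sampler")}
    # <sampler ref="..."> elements inside a managed entity are the references.
    used = {s.get("ref") for s in root.findall("managedEntity/sampler")}
    return sorted(used - defined)

print(unresolved_sampler_refs(FRAGMENT))  # an empty list means every reference resolves
```

If you rename the sampler, a non-empty result here is a reminder to update both the managed entity reference and the VMWareMonitor.properties file.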
VMWare Permissions
Using the vSphere client, ensure that the user has full administration permissions.
VMWare Solution with Dependent Libs
Create a directory on the server where you are running the netprobe you want to use to monitor VMWare. Copy the contents of the tar file to this location.
VMWareMonitor/
VMWareMonitor.jar
lib/
log4j-1.2.16.jar
vim25.jar
ws-commons-util-1.0.2.jar
xmlrpc-client-3.1.3.jar
xmlrpc-common-3.1.3.jar
xmlrpc-server-3.1.3.ja
Plug-in Configuration
By default, the plug-in uses a config file called VMWareMonitor.properties. If no config file is found, running the plug-in for the first time generates a default config file. Confirm that the VMWareMonitor.properties file has the correct settings, especially:
netprobeServer=localhost
netprobePort=7036
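The two settings above can be checked before starting the plug-in. The following is a minimal Python sketch, assuming the file uses plain `key=value` lines with `#` or `!` comments; the function names are illustrative, and the sketch does not cover the full Java properties syntax (escapes, line continuations).

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props

def check_netprobe_settings(props):
    """Return a list of problems found in netprobeServer and netprobePort."""
    problems = []
    if not props.get("netprobeServer"):
        problems.append("netprobeServer is missing or empty")
    port = props.get("netprobePort", "")
    if not port.isdecimal() or not 1 <= int(port) <= 65535:
        problems.append("netprobePort must be an integer from 1 to 65535")
    return problems

# The two settings shown above, as they would appear in the file.
sample = "netprobeServer=localhost\nnetprobePort=7036\n"
print(check_netprobe_settings(parse_properties(sample)))  # no problems expected
```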
vSphere Connection Details
You will need to supply the URL of the webservice, which is usually https://<IPADDRESSOFSERVER>/sdk, and the username and password.
The VMware Monitoring plug-in can support only one host at a time. If you want to connect to several hosts, set up as many VMware samplers as needed. For guidance, see Sampler.
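Because each sampler connects to a single host, a multi-host setup needs one web service URL per host. The sketch below builds URLs in the usual `https://<host>/sdk` form described above and checks that a given URL has that shape. The host addresses are placeholder documentation addresses, and the helper names are hypothetical; a server configured with a different path would not pass this check.

```python
from urllib.parse import urlsplit

def sdk_url(host):
    """Build the usual web service URL for one host: https://<host>/sdk."""
    return f"https://{host}/sdk"

def looks_like_sdk_url(url):
    """True if the URL is https, names a host, and uses the /sdk path."""
    parts = urlsplit(url)
    return parts.scheme == "https" and bool(parts.hostname) and parts.path == "/sdk"

# Placeholder addresses; one URL (and one sampler) per host.
hosts = ["192.0.2.10", "192.0.2.11"]
urls = [sdk_url(h) for h in hosts]
for url in urls:
    print(url, looks_like_sdk_url(url))
```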
Upgrade Instructions
You can unpack the tarball directly over an existing installation. A new default.properties file will be created, but the plug-in will continue to use your existing config file. You may need to restart your netprobe and reset your gateway connection to clear out any obsolete views or columns.
Virtual Machine Views
For each webservice that the plug-in connects to, there will be a set of Virtual Machine dataviews.
VM Info
Name | Description |
---|---|
Name | Name of the VM. |
Boot Time | The timestamp when the virtual machine was most recently powered on. This property is updated when the virtual machine is powered on from the poweredOff state, and is cleared when the virtual machine is powered off. This property is not updated when a virtual machine is resumed from a suspended state. Note: This metric might have no value for some virtual machines. |
Connection State | Indicates whether or not the virtual machine is available for management. |
dasVmProtection | The vSphere HA protection state for a virtual machine. Property is unset if vSphere HA is not enabled. Since vSphere API 5.0. Note: This metric might have no value for some virtual machines. |
faultToleranceState | The fault tolerance state of the virtual machine. Since vSphere API 4.0. |
host | The host that is responsible for running a virtual machine. This property is null if the virtual machine is not running and is not assigned to run on a particular host. Note: This metric might have no value for some virtual machines. |
guestMemoryUsage | Guest memory utilisation statistics, in MB. This is also known as active guest memory. The number can be between 0 and the configured memory size of the virtual machine. Valid while the virtual machine is running. Note: This metric might have no value for some virtual machines. |
hostMemoryUsage | Host memory utilisation statistics, in MB. This is also known as consumed host memory. This is between 0 and the configured resource limit. Valid while the virtual machine is running. This includes the overhead memory of the VM. Note: This metric might have no value for some virtual machines. |
overallCpuDemand | Basic CPU performance statistics, in MHz. Valid while the virtual machine is running. Since vSphere API 4.0. Note: This metric might have no value for some virtual machines. |
overallCpuUsage | Basic CPU performance statistics, in MHz. Valid while the virtual machine is running. Note: This metric might have no value for some virtual machines. |
powerState | The current power state of the virtual machine. |
question | The current question, if any, that is blocking the virtual machine’s execution. Note: This metric might have no value for some virtual machines. |
recordReplayState | Record / replay state of this virtual machine. Since vSphere API 4.0. |
suspendInterval | The total time the virtual machine has been suspended since it was initially powered on. This time excludes the current period, if the virtual machine is currently suspended. This property is updated when the virtual machine resumes, and is reset to zero when the virtual machine is powered off. Note: This metric might have no value for some virtual machines. |
suspendTime | The timestamp when the virtual machine was most recently suspended. This property is updated every time the virtual machine is suspended. Note: This metric might have no value for some virtual machines. |
toolsInstallerMounted | Flag to indicate whether or not the VMWare Tools installer is mounted as a CD-ROM. |
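Several VM Info columns are noted above as possibly having no value, so anything that consumes this view should treat those cells as optional. The following is a minimal Python sketch of that handling. It assumes a missing value arrives as an empty or whitespace-only string, which is an illustration rather than documented plug-in behaviour, and the helper name `parse_optional_mb` is hypothetical.

```python
from typing import Optional

def parse_optional_mb(cell: str) -> Optional[int]:
    """Return a memory figure in MB, or None when the cell has no value.

    Assumption: a metric with no value is delivered as an empty or
    whitespace-only string. Adjust this if your dataview shows something else.
    """
    text = cell.strip()
    if not text:
        return None
    return int(text)

# Hypothetical guestMemoryUsage cells for a running VM and a powered-off VM.
print(parse_optional_mb("512"))
print(parse_optional_mb(""))
```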
Disk Monitor
Name | Description |
---|---|
Name | Name of the VM. |
usage | Aggregated disk I/O rate. For hosts, this metric includes the rates for all virtual machines running on the host during the collection interval. |
Read rate | Average number of kilobytes read from the disk each second during the collection interval. read rate = # blocksRead per second x blockSize. |
Write rate | Rate at which data is written to each virtual disk on the virtual machine. write rate = # blocksWritten per second x blockSize. |
Commands issued | Number of SCSI commands issued during the collection interval. |
Commands Aborted | Number of SCSI commands aborted during the collection interval. |
Bus Resets | Number of SCSI-bus reset commands issued during the collection interval. |
physical device Read Latency | Average amount of time, in milliseconds, to complete read from the physical device. |
Kernel Read Latency | Average amount of time, in milliseconds, spent by VMKernel processing each SCSI read command. |
physical device Write Latency | Average amount of time, in milliseconds, to complete write to the physical device. |
Kernel Write Latency | Average amount of time, in milliseconds, spent by VMKernel processing each SCSI write command. |
Write Latency | Average amount of time taken during the collection interval to process a SCSI write command issued by the Guest OS to the virtual machine. The sum of kernelWriteLatency and deviceWriteLatency. |
Queue Write Latency | Average amount of time taken during the collection interval per SCSI write command in the VMKernel queue. |
Highest Latency | Highest latency value across all disks used by the host. Latency measures the time taken to process a SCSI command issued by the guest OS to the virtual machine. The kernel latency is the time VMkernel takes to process an IO request. The device latency is the time it takes the hardware to handle the request. |
Average Read request per second | Number of disk reads during the collection interval. |
Average Write request per second | Number of disk writes during the collection interval. |
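Two of the relationships in the table above are simple arithmetic: the read rate formula and the write latency sum. The sketch below works through both. It assumes the block size is given in bytes and that 1 KB is 1024 bytes; the sample figures are invented to keep the arithmetic easy to follow.

```python
def read_rate_kb_per_s(blocks_read_per_second, block_size_bytes):
    """read rate = # blocksRead per second x blockSize, expressed in KB/s.

    Assumption: blockSize is in bytes and 1 KB = 1024 bytes.
    """
    return blocks_read_per_second * block_size_bytes / 1024

def write_latency_ms(kernel_write_latency_ms, device_write_latency_ms):
    """Write Latency is the sum of kernelWriteLatency and deviceWriteLatency."""
    return kernel_write_latency_ms + device_write_latency_ms

# Hypothetical sample values.
print(read_rate_kb_per_s(200, 512))  # 200 * 512 / 1024 = 100.0
print(write_latency_ms(2.0, 5.0))    # 2.0 + 5.0 = 7.0
```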
VM Virtual Disk
The VM Virtual Disk view is available beginning with the VMWareMonitor1.4.16.tar.gz package.
Name | Description |
---|---|
name | Name of the virtual disk. |
Read rate | Average number of read commands issued per second to the virtual disk during the collection interval. |
Write rate | Average number of write commands issued per second to the virtual disk during the collection interval. |
Read latency | Average amount of time, in milliseconds, for a read operation from the virtual disk. The total latency is computed as |
Write latency | Average amount of time, in milliseconds, for a write operation to the virtual disk. The total latency is computed as |
Read Latency (us) | Average amount of time, in microseconds, for a read operation from the virtual disk. The total latency is computed as |
Write Latency (us) | Average amount of time, in microseconds, for a write operation to the virtual disk. The total latency is computed as |
Average number of outstanding read requests | Average number of outstanding read requests to the virtual disk. |
Average number of outstanding write requests | Average number of outstanding write requests to the virtual disk. |
Average read requests per second | Number of disk read commands completed on each virtual machine disk, per second. |
Average write requests per second | Number of disk write commands completed on each virtual machine disk on the host, per second. |
Read workload metric | Virtual disk metric for the read workload model. |
Write workload metric | Virtual disk metric for the write workload model. |
Read request size | Read IO request size. |
Write request size | Write IO request size. |
Number of small seeks | Number of small disk seeks. |
Number of medium seeks | Number of medium disk seeks. |
Number of large seeks | Number of large disk seeks. |
Memory Monitor
Name | Description |
---|---|
Name | Inventory path to Guest machine (e.g., Datacenter1/vm/myvm). |
Usage | Amount of machine memory or “physical” memory, as follows: Virtual machine - Guest “physical” memory that is mapped to machine memory. Includes shared memory amount. Does not include overhead. |
active | Amount of memory that is actively used, as estimated by VMkernel based on recently touched memory pages. Virtual machine - Amount of guest “physical” memory actively used. |
shared | Amount of guest “physical” memory shared with other virtual machines (through the VMkernel’s transparent page-sharing mechanism, a RAM de-duplication technique). Includes amount of zero memory area. |
consumed | Virtual machine: Amount of guest physical memory consumed by the virtual machine for guest memory. Consumed memory does not include overhead memory. It includes shared memory and memory that might be reserved, but not actually used. Use this metric for charge-back purposes. |
Shared common | Amount of machine memory that is shared by all powered-on virtual machines and vSphere services on the host. Subtract this metric from the shared metric to gauge how much machine memory is saved due to sharing: shared - sharedcommon = machine memory (host memory) savings (KB) |
Swapped used | Current amount of guest physical memory swapped out to the virtual machine’s swap file by the VMkernel. Swapped memory stays on disk until the virtual machine needs it. This statistic refers to VMkernel swapping and not to guest OS swapping. |
heap | VMkernel virtual address space dedicated to VMkernel main heap and related data. Note: For informational purposes only, not useful for performance monitoring. |
Heap free | Free address space in the VMkernel’s main heap. Varies based on number of physical devices and configuration options. There is no direct way for the user to increase or decrease this statistic. Note: For informational purposes only, not useful for performance monitoring. |
state | Amount of free machine memory on the host. VMkernel has four free-memory thresholds that affect memory reclamation: |
overhead | Amount of machine memory used by the VMkernel to run the virtual machine. |
Swap target | Amount of memory available for swapping. Target size for virtual machine swap file, as calculated by the VMkernel. The VMkernel uses values for this metric with the swap metric to stop and start swapping, as follows: Since swapped memory stays swapped until the virtual machine accesses it, swapped memory can be greater than the memory swap target, possibly for a prolonged period of time. This simply means that the swapped memory is not currently needed by the virtual machine and is not a cause for concern. |
Swap in | Total amount of data that has been read into machine memory from the swap file since the virtual machine was powered on. |
Swap out | Total amount of data that the VMkernel has written to the virtual machine’s swap file from machine memory. This statistic refers to VMkernel swapping and not to guest OS swapping. |
Swap in Rate | Rate at which memory is swapped from disk into active memory during the interval. This counter applies to virtual machines and is generally more useful than the swapin counter to determine if the virtual machine is running slow due to swapping, especially when looking at real-time statistics. |
Swap out Rate | Rate at which memory is being swapped from active memory to disk during the current interval. This counter applies to virtual machines and is generally more useful than the swapout counter to determine if the virtual machine is running slow due to swapping, especially when looking at real-time statistics. |
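The sharing-savings formula in the table above (shared - sharedcommon) can be worked through as follows. This is only an arithmetic illustration in Python: the function name and the sample figures are made up, and both inputs are assumed to be in KB as the formula states.

```python
def sharing_savings_kb(shared_kb, shared_common_kb):
    """shared - sharedcommon = machine memory (host memory) savings, in KB."""
    return shared_kb - shared_common_kb

# Hypothetical readings: 4096 KB reported as shared, backed by 1024 KB of machine memory.
print(sharing_savings_kb(4096, 1024))  # 3072
```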
DataStore Info
Name | Description |
---|---|
name | Name of the datastore. |
type | Type of file system volume, such as VMFS or NFS. See type. |
uncommitted | Total additional storage space, in bytes, potentially used by all virtual machines on this datastore. The server periodically updates this value. It can be explicitly refreshed with the RefreshDatastoreStorageInfo operation. This property is valid only if accessible is true. Note: This metric might have no value for some virtual machines. Since vSphere API 4.0. |
url | The unique locator for the datastore. This property is guaranteed to be valid only if accessible is true. |
accessible | The connectivity status of this datastore. If this is set to false, meaning the datastore is not accessible, this datastore’s capacity and freespace properties cannot be validated. Furthermore, if this property is set to false, some of the properties in this summary and in DatastoreInfo should not be used. Refer to the documentation for the property of your interest. For datastores accessed from multiple hosts, vCenter Server reports accessible as an aggregated value of the properties reported in MountInfo. For instance, if a datastore is accessible through a subset of hosts, then the value of accessible will be reported as true by vCenter Server, and the reason for a datastore being inaccessible from a host will be reported in inaccessibleReason. |
capacity | Maximum capacity of this datastore, in bytes. This value is updated periodically by the server. It can be explicitly refreshed with the Refresh operation. This property is guaranteed to be valid only if accessible is true. |
freeSpace | Available space of this datastore, in bytes. The server periodically updates this value. It can be explicitly refreshed with the Refresh operation. This property is guaranteed to be valid only if accessible is true. |
maintenanceMode | The current maintenance mode state of the datastore. The set of possible values is described in DatastoreSummaryMaintenanceModeState. Since vSphere API 5.0. |
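Because capacity and freeSpace are guaranteed to be valid only when accessible is true, any figure derived from them should be guarded by that flag. The Python sketch below computes a used-space percentage under that rule. The function name and sample values are hypothetical, and returning `None` for an inaccessible datastore is one possible convention rather than plug-in behaviour.

```python
def datastore_used_percent(capacity_bytes, free_space_bytes, accessible):
    """Return used space as a percentage, or None if the figures cannot be trusted."""
    # capacity and freeSpace are only guaranteed valid when accessible is true.
    if not accessible or capacity_bytes <= 0:
        return None
    return 100.0 * (capacity_bytes - free_space_bytes) / capacity_bytes

# Hypothetical datastore: 1,000,000,000,000 bytes capacity, 250,000,000,000 bytes free.
print(datastore_used_percent(1_000_000_000_000, 250_000_000_000, True))   # 75.0
print(datastore_used_percent(1_000_000_000_000, 250_000_000_000, False))  # None
```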
Network Monitor
Name | Description |
---|---|
Name | Name of the VM. |
usage | Network Usage (Average). |
Packets Received | Number of packets received by each vNIC (virtual network interface controller) on the virtual machine. |
Packets Transmitted | Number of packets transmitted by each vNIC on the virtual machine. |
Data received rate | The rate at which data is received across the virtual machine’s vNIC (virtual network interface controller). |
Data Transmitted rate | The rate at which data is transmitted across the virtual machine’s vNIC (virtual network interface controller). This represents the bandwidth of the network. |
Received packets dropped | Number of receive packets dropped during the collection interval. |
Transmitted Packets dropped | Number of transmit packets dropped during the collection interval. |
CPU Monitor
Name | Description |
---|---|
Name | Name of the Guest machine. |
usage | CPU usage as a percentage (in units of 1/100th of a percent) during the interval. VM - Amount of actively used virtual CPU, as a percentage of total available CPU. This is the host’s view of the CPU usage, not the guest operating system view. It is the average CPU utilisation over all available virtual CPUs in the virtual machine. For example, if a virtual machine with one virtual CPU is running on a host that has four physical CPUs and the CPU usage is 100%, the virtual machine is using one physical CPU completely. virtual CPU usage = usagemhz / (# of virtual CPUs x core frequency) |
Usage mhz | CPU usage, as measured in megahertz, during the interval. VM - Amount of actively used virtual CPU. This is the host’s view of the CPU usage, not the guest operating system view. |
wait | Total CPU time spent in wait state. |
ready | Percentage of time (in units of 1/100th of a percent) that the virtual machine was ready, but could not get scheduled to run on the physical CPU. CPU ready time is dependent on the number of virtual machines on the host and their CPU loads. |
used | Total CPU usage. |
idle | Total time that the CPU spent in an idle state (meaning that a virtual machine is not runnable). |
system | Amount of time spent on system processes on each virtual CPU in the virtual machine. This is the host view of the CPU usage, not the guest operating system view. |
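The usage and ready columns are described above as being in units of 1/100th of a percent, and virtual CPU usage is defined as usagemhz / (# of virtual CPUs x core frequency). The following Python sketch shows both conversions. The function names and sample readings are illustrative assumptions, with the core frequency taken to be in MHz to match usagemhz.

```python
def hundredths_to_percent(raw_value):
    """Convert a value in units of 1/100th of a percent to a plain percentage."""
    return raw_value / 100.0

def virtual_cpu_usage(usage_mhz, num_virtual_cpus, core_frequency_mhz):
    """virtual CPU usage = usagemhz / (# of virtual CPUs x core frequency)."""
    return usage_mhz / (num_virtual_cpus * core_frequency_mhz)

# Hypothetical readings for one VM.
print(hundredths_to_percent(5000))       # 50.0 percent
print(virtual_cpu_usage(1000, 2, 2000))  # 0.25, i.e. 25% of the VM's virtual CPU capacity
```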
VMWare Monitor Views
Admin
Name | Description |
---|---|
Name | Lists the metrics that give an overview of the host and the VMs for administration purposes. |
Value |
Defines the specified metric in the |
Host Memory Monitor
Name | Description |
---|---|
Name | The machine name of the host. |
Usage | Amount of machine memory used on the host. Consumed memory includes memory used by the Service Console, the VMkernel, vSphere services, plus the total consumed metrics for all running virtual machines. host consumed memory = total host memory - free host memory |
active | Amount of memory that is actively used, as estimated by VMkernel based on recently touched memory pages. This is a sum of all active metrics for all powered-on virtual machines plus vSphere services (such as COS, vpxa) on the host. |
shared | Sum of all shared metrics for all powered-on virtual machines, plus amount for vSphere services on the host. The host’s shared memory may be larger than the amount of machine memory if memory is overcommitted (the aggregate virtual machine configured memory is much greater than machine memory). The value of this statistic reflects how effective transparent page sharing and memory over commitment are for saving machine memory. |
Shared common | Amount of machine memory that is shared by all powered-on virtual machines and vSphere services on the host. Subtract this metric from the shared metric to gauge how much machine memory is saved due to sharing. shared - sharedcommon = machine memory (host memory) savings (KB) |
Swapped used | Current amount of guest physical memory swapped out to the virtual machine’s swap file by the VMkernel. Swapped memory stays on disk until the virtual machine needs it. This statistic refers to VMkernel swapping and not to guest OS swapping. |
heap | VMkernel virtual address space dedicated to VMkernel main heap and related data. Note: For informational purposes only, not useful for performance monitoring. |
state | Amount of free machine memory on the host. VMkernel has four free-memory thresholds that affect memory reclamation: |
overhead | Amount of machine memory used by the VMkernel to run the virtual machine. |
Swap target | Amount of memory available for swapping. Target size for virtual machine swap file, as calculated by the VMkernel. The VMkernel uses values for this metric with the swap metric to stop and start swapping, as follows: Since swapped memory stays swapped until the virtual machine accesses it, swapped memory can be greater than the memory swap target, possibly for a prolonged period of time. This simply means that the swapped memory is not currently needed by the virtual machine and is not a cause for concern. |
Swap in | Total amount of data that has been read into machine memory from the swap file since the virtual machine was powered on. |
Swap out | Total amount of data that the VMkernel has written to the virtual machine’s swap file from machine memory. This statistic refers to VMkernel swapping and not to guest OS swapping. |
Swap in Rate | Rate at which memory is swapped from disk into active memory during the interval. This counter applies to virtual machines and is generally more useful than the swapin counter to determine if the virtual machine is running slow due to swapping, especially when looking at real-time statistics. |
Swap out Rate | Rate at which memory is being swapped from active memory to disk during the current interval. This counter applies to virtual machines and is generally more useful than the swapout counter to determine if the virtual machine is running slow due to swapping, especially when looking at real-time statistics. |
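The host Usage row above defines host consumed memory as total host memory minus free host memory. As a small worked example in Python, with invented figures and a hypothetical function name, and assuming both inputs share the same unit (KB here):

```python
def host_consumed_memory(total_host_memory_kb, free_host_memory_kb):
    """host consumed memory = total host memory - free host memory."""
    return total_host_memory_kb - free_host_memory_kb

# Hypothetical host with 16 GiB of memory (16,777,216 KB), of which 4 GiB (4,194,304 KB) is free.
print(host_consumed_memory(16_777_216, 4_194_304))  # 12582912
```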
Host Disk Monitor
Name | Description |
---|---|
Name | Name of the Host. |
usage | Aggregated disk I/O rate. For hosts, this metric includes the rates for the host during the collection interval. |
Read rate | Average number of kilobytes read from the disk each second during the collection interval. Host - Rate at which data is read from each LUN on the host. read rate = # blocksRead per second x blockSize |
Write rate | Rate at which data is written to each virtual disk on the virtual machine. write rate = # blocksWritten per second x blockSize |
Commands issued | Number of SCSI commands issued during the collection interval. |
Commands Aborted | Number of SCSI commands aborted during the collection interval. |
Bus Resets | Number of SCSI-bus reset commands issued during the collection interval. |
physical device Read Latency | Average amount of time, in milliseconds, to complete read from the physical device. |
Kernel Read Latency | Average amount of time, in milliseconds, spent by VMKernel processing each SCSI read command. |
physical device Write Latency | Average amount of time, in milliseconds, to complete write to the physical device. |
Kernel Write Latency | Average amount of time, in milliseconds, spent by VMKernel processing each SCSI write command. |
Write Latency | Average amount of time taken during the collection interval to process a SCSI write command issued by the Guest OS to the virtual machine. The sum of kernelWriteLatency and deviceWriteLatency. |
Queue Write Latency | Average amount of time taken during the collection interval per SCSI write command in the VMKernel queue. |
Highest Latency | Highest latency value across all disks used by the host. Latency measures the time taken to process a SCSI command issued by the guest OS to the virtual machine. The kernel latency is the time VMkernel takes to process an IO request. The device latency is the time it takes the hardware to handle the request. |
Average Read request per second | Host - Number of times data was read from each LUN on the host during the collection interval. |
Average write request per second | Host - Number of times data was written to each LUN on the host during the collection interval. |
Host CPU Monitor
Name | Description |
---|---|
Name | Name of the Host. |
usage | CPU usage as a percentage (in units of 1/100th of a percent) during the interval. Actively used CPU of the host, as a percentage of the total available CPU. Active CPU is approximately equal to the ratio of the used CPU to the available CPU. available CPU = # of physical CPUs x clock rate. 100% represents all CPUs on the host. For example, if a four-CPU host is running a virtual machine with two CPUs, and the usage is 50%, the host is using two CPUs completely. |
Usage mhz | CPU usage, as measured in megahertz, during the interval. Sum of the actively used CPU of all powered on virtual machines on a host. The maximum possible value is the frequency of the processors multiplied by the number of processors. For example, if you have a host with four 2GHz CPUs running a virtual machine that is using 4000MHz, the host is using two CPUs completely. 4000 / (4 x 2000) = 0.50 |
wait | Total CPU time spent in wait state. |
ready | Percentage (in units of 1/100th of a percent) of time that the virtual machine was ready, but could not get scheduled to run on the physical CPU. CPU ready time is dependent on the number of virtual machines on the host and their CPU loads. |
used | Total CPU usage. |
Latency | Latency is a measure of 3 things: |
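The Usage mhz example in the table above (four 2GHz CPUs, 4000MHz in use) can be reproduced with the short Python sketch below. The function name is hypothetical; the inputs are the figures from that example.

```python
def host_cpu_fraction(usage_mhz, num_cpus, cpu_frequency_mhz):
    """Fraction of host CPU in use: usage / (number of CPUs x clock rate)."""
    return usage_mhz / (num_cpus * cpu_frequency_mhz)

# Figures from the example: four 2GHz CPUs, virtual machine using 4000MHz.
fraction = host_cpu_fraction(4000, 4, 2000)
print(fraction)      # 0.5, matching 4000 / (4 x 2000) = 0.50
print(fraction * 4)  # 2.0, the equivalent of two CPUs fully used
```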
Host Info
Name | Description |
---|---|
Name | Name of the Host machine. |
apiType | Indicates whether the service instance represents a standalone host. If the service instance represents a standalone host, then the physical inventory for that service instance is fixed to that single host. VirtualCenter server provides additional features over single hosts. For example, VirtualCenter offers multi-host management. Examples of values are: Note: This metric might have no value for some virtual machines. |
apiVersion | The version of the API as a dot-separated string. For example, “1.0.0” |
build | Build string for the server on which this call is made. For example, x.y.z-num. This string does not apply to the API. |
fullName | The complete product name, including the version information. |
instanceUuid | A globally unique identifier associated with this service instance. Since vSphere API 4.0 Note: This metric might have no value for some virtual machines. |
licenseProductName | The licence product name. Since vSphere API 4.0 Note: This metric might have no value for some virtual machines. |
licenseProductVersion | The licence product version. Since vSphere API 4.0 Note: This metric might have no value for some virtual machines. |
localeBuild | Build number for the current session’s locale. Typically, this is a small number reflecting a localisation change from the normal product build. Note: This metric might have no value for some virtual machines. |
localeVersion | Version of the message catalog for the current session’s locale. Note: This metric might have no value for some virtual machines. |
osType | Operating system type and architecture. Examples of values are: |
productLineId | The product ID is a unique identifier for a product line. Examples of values are: |
vendor | Name of the vendor of this product. |
version | Dot-separated version string. For example, “1.2”. |