Cassandra

Overview

Cassandra monitoring is a Gateway configuration file that enables monitoring of Cassandra through a set of samplers with customised JMX plug-in settings.

Apache Cassandra is a free and open-source distributed NoSQL database management system that provides scalability and high availability.

Some of Cassandra’s key attributes are:

Fault tolerant - Data is automatically replicated to multiple nodes for fault-tolerance.
Decentralized - There are no single points of failure.
Elastic - Read and write throughput increase linearly as new machines are added, with no downtime or interruption to applications.

It is important to monitor Cassandra performance so that you can identify database slowdowns and interruptions, and take quick and appropriate action to correct them.

Intended audience

This guide is intended for users who are setting up, configuring, troubleshooting and maintaining this integration. This is also intended for users who will be using Active Console to monitor data from Cassandra. Once the integration is set up, the samplers providing the dataviews become available to that Gateway.

As a user, you should be familiar with SQL Server or any other database, and with the administration of the Cassandra services.

Prerequisites

The following requirements must be met before the installation and setup of the template:

A machine running the Netprobe must have access to the host where the Cassandra instance is installed and the port Cassandra is listening to.
A JMX-enabled Cassandra cluster.
Netprobe version 4.6 or higher.
Gateway version 4.8 or higher.
Cassandra version 3.11.1.
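
Before installing the template, it can help to confirm the first prerequisite from the machine running the Netprobe. The sketch below only tests whether a TCP connection to the Cassandra JMX port can be opened. The function name `can_reach` and the timeout value are illustrative and not part of the integration. A successful connection does not prove that JMX itself is enabled or correctly configured.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, `can_reach("localhost", 7199)` checks the default host and JMX port used by this template. Replace both values with those of your own Cassandra node.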

Installation procedure

Ensure that you have read and can follow the system requirements prior to installation and setup of this integration template.

Download the integration package geneos-integration-cassandra-<version>.zip from the Downloads site.
Open Gateway Setup Editor.
In the Navigation panel, click Includes to create a new file.
Enter the location of the file to include in the Location field. In this example, it is include/CassandraMonitoring.xml.
Update the Priority field. This can be any value except 1. If you input a priority of 1, the Gateway Setup Editor returns an error.
Expand the file location in the Includes section.
Select Click to load.
Click Yes to load the new Cassandra include file.
Click Managed entities in the Navigation panel.
Add the Cassandra type to the Managed Entity section that you will use to monitor Cassandra.
Click Validate current document to check your configuration.
Click Save current document to apply the changes.

Set up the samplers

These are the pre-configured samplers available to use in CassandraMonitoring.xml.

Configure the required fields by referring to the table below:

Samplers
Cassandra-GC
Cassandra-Throughput
Cassandra-Latency
Cassandra-DiskUsage
Cassandra-Errors
Cassandra-Tasks

Set up the variables

The CassandraMonitoring.xml template provides the variables that are set in the Environments section:

Variable	Description
CASSANDRA_JMX_HOST	Cassandra host name. Default: localhost
CASSANDRA_JMX_PORT	Cassandra JMX port. Default: 7199
CASSANDRA_MONITORING_GROUP_NAME	Sampler group name. Default: Cassandra
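
As a rough illustration of how these variables combine, the sketch below merges the documented defaults with any overrides and builds a JMX service URL from the host and port. The URL uses the conventional JMX RMI form. Whether the JMX plug-in builds its connection this way internally is an assumption, and the helper names `resolve` and `jmx_service_url` are hypothetical.

```python
from typing import Dict, Optional

# Defaults as documented for the CassandraMonitoring.xml template.
DEFAULTS: Dict[str, str] = {
    "CASSANDRA_JMX_HOST": "localhost",
    "CASSANDRA_JMX_PORT": "7199",
    "CASSANDRA_MONITORING_GROUP_NAME": "Cassandra",
}


def resolve(overrides: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return the defaults with any user-supplied overrides applied on top."""
    values = dict(DEFAULTS)
    values.update(overrides or {})
    return values


def jmx_service_url(values: Dict[str, str]) -> str:
    """Build a conventional JMX RMI service URL (assumed format)."""
    return "service:jmx:rmi:///jndi/rmi://{host}:{port}/jmxrmi".format(
        host=values["CASSANDRA_JMX_HOST"],
        port=values["CASSANDRA_JMX_PORT"],
    )
```

Calling `jmx_service_url(resolve({"CASSANDRA_JMX_HOST": "10.0.0.5"}))` keeps the default port and changes only the host. The address `10.0.0.5` is a placeholder.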

Set up the rules

The CassandraMonitoring-SampleRules.xml template provides a separate set of sample rules that you can use in the Gateway Setup Editor.

Your configuration rules must be set in the Includes section. In the Navigation panel, click Rules.

The table below shows the included rule setup in the configuration file:

Sample Rules	Description
Errors - Read Unavailable	Sets the severity to critical if the number of read unavailable exceptions exceeds CASSANDRA_ERRORS_READ_UNAVAILABLE_THERESHOLD. Default: Threshold is set to 0
Errors - Write Unavailable	Sets the severity to critical if the number of write unavailable exceptions exceeds CASSANDRA_ERRORS_WRITE_UNAVAILABLE_THERESHOL. Default: Threshold is set to 0
Disk Usage - High Load	Sets the severity to critical if the storage load exceeds CASSANDRA_DISK_USAGE_HIGH_LOAD_THRESHOLD. The threshold must be set for the rule to take effect. There is no default value provided.
Throughput - High Read Throughput	Sets the severity to critical if the read throughput of the last minute exceeds CASSANDRA_THROUGHPUT_HIGH_READ_THRESHOLD. The threshold must be set for the rule to take effect. There is no default value provided.
Throughput - High Write Throughput	Sets the severity to critical if the write throughput of the last minute exceeds CASSANDRA_THROUGHPUT_HIGH_WRITE_THRESHOLD. The threshold must be set for the rule to take effect. There is no default value provided.
GC - ParNew - High Last GC Duration	Sets the severity to critical if the last ParNew GC duration exceeds CASSANDRA_GC_LONG_PARNEW_GC_DURATION_THRESHOLD. Default: Threshold is set to 300 milliseconds
GC - CMS - High Last GC Duration	Sets the severity to critical if the last CMS GC duration exceeds CASSANDRA_GC_LONG_CMS_GC_DURATION_THRESHOLD. Default: Threshold is set to 300 milliseconds
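
The sample rules share one pattern: a value becomes critical only when it exceeds its threshold, and rules without a default do nothing until a threshold is set. A minimal sketch of that logic follows. The function name `evaluate` and the labels `"ok"` and `"inactive"` are illustrative and do not correspond to Gateway rule syntax.

```python
from typing import Optional


def evaluate(value: float, threshold: Optional[float]) -> str:
    """Mimic the sample rules: critical only when value strictly exceeds threshold."""
    if threshold is None:
        # No threshold configured, so the rule has no effect.
        return "inactive"
    return "critical" if value > threshold else "ok"


# Thresholds with documented defaults.
READ_UNAVAILABLE_DEFAULT = 0      # exceptions
GC_DURATION_DEFAULT_MS = 300      # milliseconds
```

With the defaults, `evaluate(1, READ_UNAVAILABLE_DEFAULT)` is critical because a single read unavailable exception already exceeds 0, while `evaluate(300, GC_DURATION_DEFAULT_MS)` is not, because the duration must exceed 300 milliseconds rather than equal it.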

Metrics and dataviews

Cassandra disk usage

This dataview displays the disk usage-related metrics. Monitoring these node-level metrics is critical to determine whether additional nodes are needed:

Row	Description
Compaction CompletedTasks	Number of completed compactions since the server (re)start.
Compaction PendingTasks	Estimated number of compactions remaining to perform.
Storage Load	Size, in bytes, of the on-disk data this node manages.

MBeans for Cassandra-DiskUsage

org.apache.cassandra.metrics:type=Compaction,name=CompletedTasks
org.apache.cassandra.metrics:type=Compaction,name=PendingTasks
org.apache.cassandra.metrics:type=Storage,name=Load
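
The MBean names in this guide follow the JMX object name layout: a domain, a colon, then comma-separated key=value properties. The sketch below splits a name into those parts, which can be useful when mapping MBeans to dataview rows. It is a simplified parser with a hypothetical function name, and it does not handle quoted values or wildcard patterns.

```python
from typing import Dict, Tuple


def parse_mbean_name(name: str) -> Tuple[str, Dict[str, str]]:
    """Split 'domain:key=value,key=value' into (domain, properties)."""
    domain, _, prop_list = name.partition(":")
    properties: Dict[str, str] = {}
    for item in prop_list.split(","):
        # Each item is expected to look like 'key=value'.
        key, _, value = item.partition("=")
        properties[key] = value
    return domain, properties


domain, props = parse_mbean_name(
    "org.apache.cassandra.metrics:type=Storage,name=Load"
)
```

Here `domain` is `org.apache.cassandra.metrics`, and `props` holds `type` as `Storage` and `name` as `Load`.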

Cassandra errors

This dataview displays the count of specific errors and exceptions encountered by a Cassandra node. These metrics are helpful in identifying problematic nodes:

Column	Description
StorageExceptions	Number of internal exceptions caught. Under normal conditions, this should be zero.
ReadTimeouts	Number of read timeouts encountered.
WriteTimeouts	Number of write timeouts encountered.
ReadUnavailables	Number of read unavailable exceptions encountered.
WriteUnavailables	Number of write unavailable exceptions encountered.

MBeans for Cassandra-Errors

org.apache.cassandra.metrics:type=Storage,name=Exceptions
org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Timeouts
org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Timeouts
org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Unavailables
org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Unavailables

Cassandra GC

This dataview displays the selected JVM Garbage Collector metrics. Cassandra is a Java-based system so it relies on Java garbage collection (GC) processes to free up memory. Any significant increase in GC latency will impact Cassandra’s performance:

Column	Description
ConcurrentMarkSweep CollectionCount	Total number of CMS collections that have occurred.
ConcurrentMarkSweep CollectionTime	Approximate accumulated CMS collection elapsed time in milliseconds.
ConcurrentMarkSweep LastGCDuration	Elapsed time of the last CMS GC in milliseconds.
ParNew CollectionCount	Total number of ParNew collections that have occurred.
ParNew CollectionTime	Approximate accumulated ParNew collection elapsed time in milliseconds.
ParNew LastGCDuration	Elapsed time of the last ParNew GC in milliseconds.

MBeans for Cassandra-GC

java.lang:type=GarbageCollector,name=ConcurrentMarkSweep,*
java.lang:type=GarbageCollector,name=ParNew,*
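
CollectionCount and CollectionTime are cumulative, so recent GC activity is easier to judge from the difference between two samples. The sketch below shows one way to compute that difference. It is not something the integration does for you; the function name and the sample values are made up for illustration.

```python
from typing import Dict


def gc_delta(previous: Dict[str, int], current: Dict[str, int]) -> Dict[str, float]:
    """Return collections, elapsed GC time (ms), and mean pause (ms) between samples."""
    count = current["CollectionCount"] - previous["CollectionCount"]
    time_ms = current["CollectionTime"] - previous["CollectionTime"]
    if count < 0 or time_ms < 0:
        # Cumulative counters only go backwards if the JVM was restarted.
        raise ValueError("counters decreased; the JVM may have restarted")
    mean_ms = time_ms / count if count > 0 else 0.0
    return {"collections": count, "time_ms": time_ms, "mean_ms": mean_ms}


# Two hypothetical ParNew samples taken some time apart.
earlier = {"CollectionCount": 10, "CollectionTime": 400}
later = {"CollectionCount": 14, "CollectionTime": 600}
delta = gc_delta(earlier, later)
```

In this example, 4 collections took 200 milliseconds in total between the two samples, a mean of 50 milliseconds each.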

Cassandra latency

This dataview displays the node-level latency metrics. It gives a view of Cassandra’s performance and can help identify potential network issues or bottlenecks:

Column	Description
Operation	Type of operation (Read or Write).
Events	Number of operation events.
TotalLatency	Accumulated latency in microseconds.
AverageLatency	Average latency in microseconds (TotalLatency divided by Events).
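
The AverageLatency column is derived from the other two numeric columns. As a worked sketch of that division, with made-up sample values and the assumption that zero events should yield an average of zero rather than an error:

```python
def average_latency(total_latency_us: float, events: int) -> float:
    """AverageLatency = TotalLatency / Events, in microseconds."""
    if events == 0:
        # Assumption: report 0.0 when no operations have been recorded.
        return 0.0
    return total_latency_us / events


# 1,500,000 microseconds accumulated over 3,000 read events.
read_average = average_latency(1_500_000, 3_000)
```

In this example `read_average` is 500.0 microseconds per read.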

MBeans for Cassandra-Latency

org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency
org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=TotalLatency
org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency
org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=TotalLatency

Cassandra tasks

This dataview displays the count of pending and blocked tasks in various stages. It identifies bottlenecks and potential problems:

Column	Description
Status	Task status (Blocked or Pending).
CounterMutationStage	Number of tasks in the Counter Mutation stage.
MutationStage	Number of tasks in the Mutation stage.
ReadRepairStage	Number of tasks in the Read Repair stage.
ReadStage	Number of tasks in the Read stage.
RequestResponseStage	Number of tasks in the Request Response stage.

MBeans for Cassandra-Tasks

org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=CounterMutationStage,name=<CurrentlyBlockedTasks | PendingTasks>
org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=MutationStage,name=<CurrentlyBlockedTasks | PendingTasks>
org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ReadRepairStage,name=<CurrentlyBlockedTasks | PendingTasks>
org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ReadStage,name=<CurrentlyBlockedTasks | PendingTasks>
org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=RequestResponseStage,name=<CurrentlyBlockedTasks | PendingTasks>
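
Each entry in this list stands for two MBeans, one per task status. The sketch below expands the five stages and two statuses into the ten full names, which may be handy when checking them one by one in a JMX browser. The constant and function names are illustrative.

```python
from typing import List

STAGES = [
    "CounterMutationStage",
    "MutationStage",
    "ReadRepairStage",
    "ReadStage",
    "RequestResponseStage",
]
TASK_METRICS = ["CurrentlyBlockedTasks", "PendingTasks"]
TEMPLATE = (
    "org.apache.cassandra.metrics:type=ThreadPools,"
    "path=request,scope={stage},name={metric}"
)


def task_mbean_names() -> List[str]:
    """Return one MBean name per (stage, task metric) pair."""
    return [
        TEMPLATE.format(stage=stage, metric=metric)
        for stage in STAGES
        for metric in TASK_METRICS
    ]


names = task_mbean_names()
```

The resulting list has ten entries, ordered by stage and then by task status.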

Cassandra throughput

This dataview displays the node-level throughput metrics. It gives a high-level view on the node’s activity levels and is important in understanding how and how much the node is being used:

Column	Description
TimePeriod	Time period (1 or 5 minutes).
ReadThroughput	Read events per second during the last time period.
WriteThroughput	Write events per second during the last time period.

MBeans for Cassandra-Throughput

org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency
org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency
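
The throughput dataview reads from the same Latency MBeans as the latency dataview. In Cassandra's metrics these MBeans are timers that expose rate attributes such as OneMinuteRate and FiveMinuteRate. The sketch below assumes those two attributes feed the 1-minute and 5-minute rows, which this guide does not state explicitly. The function name and sample values are made up.

```python
from typing import Dict, List

# Assumed mapping from dataview time period to timer attribute.
PERIODS = [("1 minute", "OneMinuteRate"), ("5 minutes", "FiveMinuteRate")]


def throughput_rows(
    read_attrs: Dict[str, float], write_attrs: Dict[str, float]
) -> List[Dict[str, object]]:
    """Arrange read and write rates into rows shaped like the dataview."""
    return [
        {
            "TimePeriod": label,
            "ReadThroughput": read_attrs[attribute],
            "WriteThroughput": write_attrs[attribute],
        }
        for label, attribute in PERIODS
    ]


# Hypothetical attribute values, in events per second.
rows = throughput_rows(
    {"OneMinuteRate": 120.0, "FiveMinuteRate": 95.5},
    {"OneMinuteRate": 40.0, "FiveMinuteRate": 42.25},
)
```

Each row pairs the read and write rates for the same time period, matching the TimePeriod, ReadThroughput, and WriteThroughput columns.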
