Cassandra Monitoring Technical Reference

Overview

Cassandra monitoring is a Gateway configuration file that enables monitoring of Cassandra through a set of samplers with customised JMX plug-in settings.

Apache Cassandra is a free and open-source distributed NoSQL database management system that provides scalability and high availability.

Some of Cassandra's key attributes are:

Fault tolerant - Data is automatically replicated to multiple nodes for fault-tolerance.
Decentralized - There are no single points of failure.
Elastic - Read and write throughput increase linearly as new machines are added, with no downtime or interruption to applications.

It is important to monitor Cassandra performance to identify database slowdowns, interruptions, or pressing resource limitations - and take quick and appropriate actions to correct them.

This technical reference provides information on the metrics and dataviews for the samplers available through the Cassandra integration. If you are setting up the Cassandra integration for the first time, see Cassandra Monitoring User Guide.

Intended audience

This technical reference is intended for users who will be using Active Console to monitor data from Cassandra.

Metrics and dataviews

Cassandra disk usage

This dataview displays disk usage-related metrics. Monitoring these node-level metrics is critical to determining whether additional nodes are needed:

Row	Description
Compaction CompletedTasks	Number of completed compactions since the server (re)start.
Compaction PendingTasks	Estimated number of compactions remaining to perform.
Storage Load	Size, in bytes, of the on-disk data this node manages.

MBeans for Cassandra-DiskUsage

org.apache.cassandra.metrics:type=Compaction,name=CompletedTasks
org.apache.cassandra.metrics:type=Compaction,name=PendingTasks
org.apache.cassandra.metrics:type=Storage,name=Load
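
The MBean names listed in this reference follow the standard JMX ObjectName form, domain:key=value,key=value. As a minimal sketch, the following Python helper splits such a name into its domain and key properties. The function name parse_object_name is hypothetical and is not part of the Cassandra integration, and the parser deliberately ignores quoted values.

```python
from typing import Dict, Tuple


def parse_object_name(name: str) -> Tuple[str, Dict[str, str]]:
    """Split a JMX ObjectName string into (domain, key properties).

    Illustrative only: quoted property values are not handled.
    """
    domain, _, property_list = name.partition(":")
    properties: Dict[str, str] = {}
    for pair in property_list.split(","):
        if pair == "*":
            # Property-list wildcard, used by some patterns in this reference.
            continue
        key, _, value = pair.partition("=")
        properties[key] = value
    return domain, properties


domain, props = parse_object_name(
    "org.apache.cassandra.metrics:type=Compaction,name=PendingTasks"
)
print(domain)  # org.apache.cassandra.metrics
print(props)   # {'type': 'Compaction', 'name': 'PendingTasks'}
```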

Cassandra errors

This dataview displays the count of specific errors and exceptions encountered by a Cassandra node. These metrics are helpful in identifying problematic nodes:

Column	Description
StorageExceptions	Number of internal exceptions caught. Under normal conditions, this should be zero.
ReadTimeouts	Number of read timeouts encountered.
WriteTimeouts	Number of write timeouts encountered.
ReadUnavailables	Number of read unavailable exceptions encountered.
WriteUnavailables	Number of write unavailable exceptions encountered.

MBeans for Cassandra-Errors

org.apache.cassandra.metrics:type=Storage,name=Exceptions
org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Timeouts
org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Timeouts
org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Unavailables
org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Unavailables

Cassandra GC

This dataview displays selected JVM garbage collector metrics. Cassandra is a Java-based system, so it relies on Java garbage collection (GC) processes to free up memory. Any significant increase in GC latency will impact Cassandra’s performance:

Column	Description
ConcurrentMarkSweep CollectionCount	Total number of CMS collections that have occurred.
ConcurrentMarkSweep CollectionTime	Approximate accumulated CMS collection elapsed time in milliseconds.
ConcurrentMarkSweep LastGCDuration	Elapsed time of the last CMS GC in milliseconds.
ParNew CollectionCount	Total number of ParNew collections that have occurred.
ParNew CollectionTime	Approximate accumulated ParNew collection elapsed time in milliseconds.
ParNew LastGCDuration	Elapsed time of the last ParNew GC in milliseconds.

MBeans for Cassandra-GC

java.lang:type=GarbageCollector,name=ConcurrentMarkSweep,*
java.lang:type=GarbageCollector,name=ParNew,*
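
CollectionCount and CollectionTime are accumulated values, so a rise in GC latency is easier to see by comparing two successive samples. The sketch below is one possible way to do that. It is not something the dataview itself provides: the function name average_collection_ms and the sample dictionaries are hypothetical, and the figures are made up for illustration.

```python
from typing import Dict, Optional


def average_collection_ms(prev: Dict[str, int], curr: Dict[str, int]) -> Optional[float]:
    """Average milliseconds per collection between two samples of one collector."""
    collections = curr["CollectionCount"] - prev["CollectionCount"]
    elapsed_ms = curr["CollectionTime"] - prev["CollectionTime"]
    if collections <= 0:
        # No collections in the interval, or the counters were reset by a restart.
        return None
    return elapsed_ms / collections


# Hypothetical ParNew samples taken one sampling interval apart.
previous = {"CollectionCount": 120, "CollectionTime": 3600}
current = {"CollectionCount": 125, "CollectionTime": 3850}
print(average_collection_ms(previous, current))  # 50.0
```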

Cassandra latency

This dataview displays the node-level latency metrics. It gives a view of Cassandra's performance and can help identify potential network issues or bottlenecks:

Column	Description
Operation	Type of operation (Read or Write).
Events	Number of operation events.
TotalLatency	Accumulated latency in microseconds.
AverageLatency	Average latency in microseconds (TotalLatency divided by Events).
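
As a worked example of the AverageLatency definition above (TotalLatency divided by Events), the following sketch computes the value in microseconds. The function name and the sample figures are hypothetical, and returning no value when there are no events is an assumption rather than documented behaviour.

```python
from typing import Optional


def average_latency_us(total_latency_us: int, events: int) -> Optional[float]:
    """Return TotalLatency / Events in microseconds, or None when there are no events."""
    if events == 0:
        # Assumption: an average is undefined until at least one event is recorded.
        return None
    return total_latency_us / events


# 1,250,000 microseconds accumulated over 5,000 read events.
print(average_latency_us(1_250_000, 5_000))  # 250.0
```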

MBeans for Cassandra-Latency

org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency
org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=TotalLatency
org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency
org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=TotalLatency

Cassandra tasks

This dataview displays the count of pending and blocked tasks in various stages. It identifies bottlenecks and potential problems:

Column	Description
Status	Task status (Blocked or Pending).
CounterMutationStage	Number of tasks in the Counter Mutation stage.
MutationStage	Number of tasks in the Mutation stage.
ReadRepairStage	Number of tasks in the Read Repair stage.
ReadStage	Number of tasks in the Read stage.
RequestResponseStage	Number of tasks in the Request Response stage.

MBeans for Cassandra-Tasks

org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=CounterMutationStage,name=<CurrentlyBlockedTasks | PendingTasks>
org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=MutationStage,name=<CurrentlyBlockedTasks | PendingTasks>
org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ReadRepairStage,name=<CurrentlyBlockedTasks | PendingTasks>
org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ReadStage,name=<CurrentlyBlockedTasks | PendingTasks>
org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=RequestResponseStage,name=<CurrentlyBlockedTasks | PendingTasks>
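
The name placeholder in the patterns above stands for either CurrentlyBlockedTasks or PendingTasks, so each stage corresponds to two concrete MBeans. The sketch below expands the patterns into the full set of names. It assumes that the Blocked status maps to CurrentlyBlockedTasks and the Pending status to PendingTasks, and the helper task_mbean_names is hypothetical.

```python
from typing import Dict, Tuple

STAGES = [
    "CounterMutationStage",
    "MutationStage",
    "ReadRepairStage",
    "ReadStage",
    "RequestResponseStage",
]

# Assumed mapping from the dataview's Status value to the MBean name property.
STATUS_TO_NAME = {"Blocked": "CurrentlyBlockedTasks", "Pending": "PendingTasks"}


def task_mbean_names() -> Dict[Tuple[str, str], str]:
    """Map each (status, stage) pair to its concrete MBean name."""
    return {
        (status, stage): (
            "org.apache.cassandra.metrics:type=ThreadPools,path=request,"
            f"scope={stage},name={name}"
        )
        for status, name in STATUS_TO_NAME.items()
        for stage in STAGES
    }


names = task_mbean_names()
print(len(names))  # 10
print(names[("Pending", "ReadStage")])
```

The final line prints the ReadStage pattern with name=PendingTasks filled in.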

Cassandra throughput

This dataview displays the node-level throughput metrics. It gives a high-level view of the node’s activity levels and helps in understanding how, and how much, the node is being used:

Column	Description
TimePeriod	Time period (1 or 5 minutes).
ReadThroughput	Read events per second during the last time period.
WriteThroughput	Write events per second during the last time period.

MBeans for Cassandra-Throughput

org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency
org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency
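
The Latency MBeans are timers that also expose rate attributes such as OneMinuteRate and FiveMinuteRate. The sketch below shows how throughput rows could be assembled from those attributes. It rests on an assumption this reference does not confirm, namely that the 1-minute and 5-minute rows are taken from these two attributes. The function name and the attribute values are hypothetical.

```python
from typing import Dict, List

# Assumed mapping from the TimePeriod row (minutes) to a timer rate attribute.
RATE_ATTRIBUTE = {1: "OneMinuteRate", 5: "FiveMinuteRate"}


def throughput_rows(
    read_attrs: Dict[str, float], write_attrs: Dict[str, float]
) -> List[Dict[str, float]]:
    """Build one row per time period from Read and Write Latency MBean attributes."""
    return [
        {
            "TimePeriod": minutes,
            "ReadThroughput": read_attrs[attribute],
            "WriteThroughput": write_attrs[attribute],
        }
        for minutes, attribute in RATE_ATTRIBUTE.items()
    ]


# Made-up attribute values, in events per second.
read = {"OneMinuteRate": 420.5, "FiveMinuteRate": 390.0}
write = {"OneMinuteRate": 88.0, "FiveMinuteRate": 91.25}
for row in throughput_rows(read, write):
    print(row)
```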