Cassandra Monitoring Technical Reference
Overview
Cassandra monitoring is a Gateway configuration file that enables monitoring of Cassandra through a set of samplers with customised JMX plug-in settings.
Apache Cassandra is a free and open-source distributed NoSQL database management system that provides scalability and high availability.
Some of Cassandra's key attributes are:
- Fault tolerant - Data is automatically replicated to multiple nodes for fault-tolerance.
- Decentralized - There are no single points of failure.
- Elastic - Read and write throughput increase linearly as new machines are added, with no downtime or interruption to applications.
It is important to monitor Cassandra performance to identify database slowdowns, interruptions, or pressing resource limitations, and to take quick and appropriate action to correct them.
This technical reference provides information on the metrics and dataviews for the samplers available through the Cassandra integration. If you are setting up the Cassandra integration for the first time, see Cassandra Monitoring User Guide.
Intended audience
This technical reference is intended for users who will be using Active Console to monitor data from Cassandra. If you are setting up the integration for the first time, see Cassandra Monitoring User Guide.
Cassandra disk usage
This dataview displays disk usage-related metrics. Monitoring these node-level metrics is critical for determining whether additional nodes are needed:
Row | Description |
---|---|
Compaction CompletedTasks | Number of completed compactions since the server (re)start. |
Compaction PendingTasks | Estimated number of compactions remaining to perform. |
Storage Load | Size, in bytes, of the on-disk data this node manages. |
MBeans for Cassandra-DiskUsage
- org.apache.cassandra.metrics:type=Compaction,name=CompletedTasks
- org.apache.cassandra.metrics:type=Compaction,name=PendingTasks
- org.apache.cassandra.metrics:type=Storage,name=Load
Cassandra errors
This dataview displays the count of specific errors and exceptions encountered by a Cassandra node. These metrics are helpful in identifying problematic nodes:
Column | Description |
---|---|
StorageExceptions | Number of internal exceptions caught. Under normal conditions, this should be zero. |
ReadTimeouts | Number of read timeouts encountered. |
WriteTimeouts | Number of write timeouts encountered. |
ReadUnavailables | Number of read unavailable exceptions encountered. |
WriteUnavailables | Number of write unavailable exceptions encountered. |
MBeans for Cassandra-Errors
- org.apache.cassandra.metrics:type=Storage,name=Exceptions
- org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Timeouts
- org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Timeouts
- org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Unavailables
- org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Unavailables
Cassandra GC
This dataview displays the selected JVM Garbage Collector metrics. Cassandra is a Java-based system so it relies on Java garbage collection (GC) processes to free up memory. Any significant increase in GC latency will impact Cassandra’s performance:
Column | Description |
---|---|
ConcurrentMarkSweep CollectionCount | Total number of CMS collections that have occurred. |
ConcurrentMarkSweep CollectionTime | Approximate accumulated CMS collection elapsed time in milliseconds. |
ConcurrentMarkSweep LastGCDuration | Elapsed time of the last CMS GC in milliseconds. |
ParNew CollectionCount | Total number of ParNew collections that have occurred. |
ParNew CollectionTime | Approximate accumulated ParNew collection elapsed time in milliseconds. |
ParNew LastGCDuration | Elapsed time of the last ParNew GC in milliseconds. |
MBeans for Cassandra-GC
- java.lang:type=GarbageCollector,name=ConcurrentMarkSweep,*
- java.lang:type=GarbageCollector,name=ParNew,*
Cassandra latency
This dataview displays the node-level latency metrics. It gives a view of Cassandra's performance and can help identify potential network issues or bottlenecks:
Column | Description |
---|---|
Operation | Type of operation (Read or Write). |
Events | Number of operation events. |
TotalLatency | Accumulated latency in microseconds. |
AverageLatency | Average latency in microseconds (TotalLatency divided by Events). |
MBeans for Cassandra-Latency
- org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency
- org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=TotalLatency
- org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency
- org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=TotalLatency
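The AverageLatency column is described above as TotalLatency divided by Events. A minimal sketch of that arithmetic follows. The function name, the sample values, and the choice to return 0.0 when no events have been recorded are assumptions made for illustration, not documented sampler behaviour.

```python
def average_latency_us(total_latency_us: int, events: int) -> float:
    """Return the mean latency in microseconds (TotalLatency / Events).

    Returning 0.0 when there are no events is an assumption of this sketch;
    it only avoids a division by zero.
    """
    if events <= 0:
        return 0.0
    return total_latency_us / events


# Hypothetical sample values for the Read and Write rows:
# (accumulated latency in microseconds, number of events).
samples = {"Read": (1_250_000, 5_000), "Write": (300_000, 0)}

for operation, (total, events) in samples.items():
    print(operation, average_latency_us(total, events))
```

Because both inputs accumulate over time, the result is an average over the whole period covered by the counters rather than over a recent window.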
Cassandra tasks
This dataview displays the count of pending and blocked tasks in various stages. It identifies bottlenecks and potential problems:
Column | Description |
---|---|
Status | Task status (Blocked or Pending). |
CounterMutationStage | Number of tasks in the Counter Mutation stage. |
MutationStage | Number of tasks in the Mutation stage. |
ReadRepairStage | Number of tasks in the Read Repair stage. |
ReadStage | Number of tasks in the Read stage. |
RequestResponseStage | Number of tasks in the Request Response stage. |
MBeans for Cassandra-Tasks
- org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=CounterMutationStage,name=<CurrentlyBlockedTasks | PendingTasks>
- org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=MutationStage,name=<CurrentlyBlockedTasks | PendingTasks>
- org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ReadRepairStage,name=<CurrentlyBlockedTasks | PendingTasks>
- org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=ReadStage,name=<CurrentlyBlockedTasks | PendingTasks>
- org.apache.cassandra.metrics:type=ThreadPools,path=request,scope=RequestResponseStage,name=<CurrentlyBlockedTasks | PendingTasks>
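Each entry in the list above stands for two MBean names, one per value in the angle brackets. The sketch below expands that pattern into the full set of names. The helper name `task_object_names` and the constants are hypothetical; only the stage names, metric names, and name pattern come from this reference.

```python
from itertools import product

# Stage and metric names as listed in this reference.
STAGES = [
    "CounterMutationStage",
    "MutationStage",
    "ReadRepairStage",
    "ReadStage",
    "RequestResponseStage",
]
METRICS = ["CurrentlyBlockedTasks", "PendingTasks"]

TEMPLATE = (
    "org.apache.cassandra.metrics:type=ThreadPools,"
    "path=request,scope={stage},name={metric}"
)


def task_object_names() -> list[str]:
    """Expand every stage/metric combination into a full MBean name."""
    return [
        TEMPLATE.format(stage=stage, metric=metric)
        for stage, metric in product(STAGES, METRICS)
    ]


for name in task_object_names():
    print(name)
```

The two metric names correspond to the Blocked and Pending values of the Status column, and the five stages correspond to the remaining columns.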
Cassandra throughput
This dataview displays the node-level throughput metrics. It gives a high-level view of the node’s activity and helps show how, and how heavily, the node is being used:
Column | Description |
---|---|
TimePeriod | Time period (1 or 5 minutes). |
ReadThroughput | Read events per second during the last time period. |
WriteThroughput | Write events per second during the last time period. |
MBeans for Cassandra-Throughput
- org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency
- org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency
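The throughput dataview reads the same Latency MBeans as the latency dataview. One plausible way to arrive at the rows described above is to read the per-second rate attributes of each Latency MBean. The sketch below assumes those attributes are named `OneMinuteRate` and `FiveMinuteRate`. That mapping is an assumption and is not stated in this reference, and the attribute values shown are invented sample data.

```python
# Assumed mapping from the TimePeriod column to MBean rate attributes.
RATE_ATTRIBUTES = {
    "1 minute": "OneMinuteRate",
    "5 minutes": "FiveMinuteRate",
}


def throughput_rows(read_attrs: dict, write_attrs: dict) -> list[dict]:
    """Build one row per time period from Read and Write Latency attributes."""
    rows = []
    for period, attribute in RATE_ATTRIBUTES.items():
        rows.append(
            {
                "TimePeriod": period,
                "ReadThroughput": read_attrs[attribute],
                "WriteThroughput": write_attrs[attribute],
            }
        )
    return rows


# Invented sample attribute values (events per second).
read_latency = {"OneMinuteRate": 120.5, "FiveMinuteRate": 98.2}
write_latency = {"OneMinuteRate": 40.0, "FiveMinuteRate": 37.5}

for row in throughput_rows(read_latency, write_latency):
    print(row)
```

In practice, the attribute dictionaries would be populated from whatever JMX client is in use; that step is omitted here.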