Elasticsearch Monitoring Technical Reference

Overview

Elasticsearch monitoring is a Gateway configuration file that enables monitoring of an Elasticsearch cluster through the Toolkit plug-in.

Elasticsearch is a distributed search and analytics engine that scales horizontally: you can add more nodes to the cluster to increase capacity. This allows it to search and analyze large volumes of data.

The elements that make Elasticsearch work are defined as follows:

A node is a running instance of Elasticsearch that knows where documents are located.
A cluster consists of one or more nodes that have the same cluster name and share their data and load.

Track the following key areas when using Elasticsearch monitoring:

Key Area	Description
Search performance	Determine how the search function performs over time by monitoring query operations, load or latency, and the fielddata cache and its evictions.
Indexing performance	Each shard in an index is updated through the refresh and flush processes. A shard is a container for data and can be either a primary or a replica shard; it is how Elasticsearch distributes data across the cluster. Index refresh - creates a new in-memory segment, making newly indexed documents searchable. Index flush - after new documents are added to the in-memory buffer, the segments are committed and the transaction log is cleared.
Cluster health and node availability	Monitors the current state of all clusters and nodes.
Resource utilisation	Provides information on thread pool queues and rejections for bulk, index, and merge operations.
System and network metrics	Shows information about every node in the cluster, resource and memory usage, and active connections opened over time.

In this Elasticsearch monitoring template, you will see these metrics in your dataview:

Cluster health
Indexing performance
Search performance
Node and resource information
Thread pool

Intended audience

This technical reference is intended for users who will be using Active Console to monitor data from Elasticsearch. If you are setting up the integration for the first time, see the Elasticsearch Monitoring User Guide.

Metrics and dataviews

Elasticsearch cluster health

This monitors the overall health of the cluster by indicating how it is functioning:

Column Name	Description
cluster	Name of the cluster.
status	Health status of the cluster: Green - all primary and replica shards are active. Yellow - at least one replica shard is missing or not properly allocated. Red - at least one primary shard is missing, which can cause data loss.
nodeTotal	Total number of nodes in the cluster.
nodeData	Total number of nodes in the cluster that can store data.
shardsTotal	Total number of shards.
shardsInitializing	Number of initialising shards.
shardsUnassigned	Number of unassigned shards.
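
The status rules in the table above can be read as a simple precedence: a missing primary shard outranks a missing replica. The following Python sketch illustrates that reading only; the function name and its two arguments are hypothetical and are not columns of this dataview, which reports a single shardsUnassigned count.

```python
def cluster_status(missing_primaries: int, missing_replicas: int) -> str:
    """Return the health status implied by the number of missing shards.

    Both arguments are illustrative inputs, not dataview columns.
    """
    if missing_primaries > 0:
        return "Red"     # at least one primary shard is missing
    if missing_replicas > 0:
        return "Yellow"  # primaries are active, but a replica is missing
    return "Green"       # all primary and replica shards are active


if __name__ == "__main__":
    print(cluster_status(0, 0))
    print(cluster_status(0, 2))
    print(cluster_status(1, 0))
```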

Elasticsearch indexingPerf-ByIndex

This dataview monitors indexing performance by index. Data is grouped per index:

Column Name	Description
index	Name of the index.
indexingIndexTotal	Total number of indexing operations.
indexingIndexTime	Time spent in indexing. Unit: millisecond (ms)
indexingIndexCurrent	Number of current indexing operations.
refreshTotal	Total number of refreshes.
refreshTime	Time spent in refresh operations. Unit: millisecond (ms)
flushTotal	Total number of flushes.
flushTotalTime	Time spent in flushes. Unit: millisecond (ms)
averageIndexingLatency	Average time spent in indexing. This is computed from indexingIndexTime / indexingIndexTotal. Unit: millisecond (ms) per indexing operation
averageRefreshLatency	Average time spent in refresh operations. This is computed from refreshTime / refreshTotal. Unit: millisecond (ms) per refresh
averageFlushLatency	Average time spent in flush operations. This is computed from flushTotalTime / flushTotal. Unit: millisecond (ms) per flush
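
The three average latency columns are ratios of a time column to its matching total column. The Python sketch below reproduces that arithmetic. The function name, the sample values, and the choice to return None when no operations have been recorded are assumptions made for illustration; how the template itself handles a zero total is not stated here.

```python
from typing import Optional


def average_latency(total_time_ms: float, total_ops: int) -> Optional[float]:
    """Return milliseconds per operation, or None if there were no operations."""
    if total_ops == 0:
        return None  # assumption: avoid dividing by zero on an idle index
    return total_time_ms / total_ops


# Illustrative row; the numbers are made up for this example.
row = {
    "indexingIndexTime": 5000, "indexingIndexTotal": 2000,
    "refreshTime": 1200, "refreshTotal": 40,
    "flushTotalTime": 0, "flushTotal": 0,
}

if __name__ == "__main__":
    print(average_latency(row["indexingIndexTime"], row["indexingIndexTotal"]))
    print(average_latency(row["refreshTime"], row["refreshTotal"]))
    print(average_latency(row["flushTotalTime"], row["flushTotal"]))
```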

Elasticsearch indexingPerf-ByNode

This monitors indexing performance by node. Data is grouped per node:

Column Name	Description
nodeID	Unique node ID.
name	Name of the node.
indexingIndexTotal	Total number of indexing operations.
indexingIndexTime	Time spent in indexing. Unit: millisecond (ms)
indexingIndexCurrent	Number of current indexing operations.
refreshTotal	Total number of refreshes.
refreshTime	Time spent in refresh operations. Unit: millisecond (ms)
flushTotal	Total number of flushes.
flushTotalTime	Time spent in flushes. Unit: millisecond (ms)
averageIndexingLatency	Average time spent in indexing. This is computed from indexingIndexTime / indexingIndexTotal. Unit: millisecond (ms) per indexing operation
averageRefreshLatency	Average time spent in refresh operations. This is computed from refreshTime / refreshTotal. Unit: millisecond (ms) per refresh
averageFlushLatency	Average time spent in flush operations. This is computed from flushTotalTime / flushTotal. Unit: millisecond (ms) per flush

Elasticsearch nodeInfo

This displays information about the nodes in the cluster:

Column Name	Description
nodeID	Unique node ID.
name	Name of the node.
IP	IP address.
port	Bound transport port.
http	Bound http address and port.
version	Elasticsearch version.
build	Elasticsearch build hash.
jdk	JDK version.
nodeRole	Role of the node. This can have more than one value: m - master-eligible node. d - data node. i - ingest node.
master	Current master node in the cluster: * (asterisk) - current master. - (hyphen) - non-master.
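
As an illustration of how the nodeRole and master columns might be interpreted downstream, here is a small Python sketch. It assumes that multiple roles appear as concatenated single-letter flags (for example "mdi"); that encoding, the function name, and the returned dictionary keys are assumptions rather than documented behaviour of the dataview.

```python
# Role letters as listed in the table above.
ROLE_NAMES = {"m": "master-eligible", "d": "data", "i": "ingest"}


def describe_node(node_role: str, master: str) -> dict:
    """Expand role flags and report whether the node is the current master."""
    roles = [ROLE_NAMES.get(flag, "unknown (" + flag + ")") for flag in node_role]
    return {"roles": roles, "is_current_master": master == "*"}


if __name__ == "__main__":
    print(describe_node("mdi", "*"))
    print(describe_node("d", "-"))
```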

Elasticsearch resource

This monitors the resources of each node in the cluster:

Column Name	Description
nodeID	Unique node ID.
name	Name of the node.
cpu	CPU usage in percentage (%).
heapCurrent	Current heap usage. Unit: bytes
heapPercent	Percentage of heap used.
ramCurrent	Current RAM usage. Unit: bytes
ramPercent	Percentage of RAM used.
diskUsed	Used disk space. Unit: bytes
diskAvail	Available disk space.
diskUsedPercent	Percentage of disk space used.
fileDescriptorCurrent	Number of used file descriptors.
fileDescriptorPercent	Percentage of file descriptors used.

Elasticsearch SearchPerf-ByIndex

This monitors search performance by index. Data is grouped per index:

Column Name	Description
index	Name of the index.
searchQueryTotal	Number of query phase operations.
searchQueryTime	Time spent in query phase. Unit: millisecond (ms)
searchQueryCurrent	Number of current query phase operations.
searchFetchTotal	Number of fetch phase operations.
searchFetchTime	Time spent in fetch phase. Unit: millisecond (ms)
searchFetchCurrent	Number of current fetch phase operations.
fielddataMemory	Memory used by the fielddata cache.
fielddataEvictions	Number of fielddata cache evictions.
averageQueryLatency	Average time spent in the query phase, computed from searchQueryTime / searchQueryTotal. Unit: millisecond (ms) per query
averageFetchLatency	Average time spent in the fetch phase, computed from searchFetchTime / searchFetchTotal. Unit: millisecond (ms) per fetch

Elasticsearch searchPerf-ByNode

This monitors search performance by node. Data is grouped per node:

Column Name	Description
nodeID	Unique node ID.
name	Name assigned to the node.
searchQueryTotal	Number of query phase operations.
searchQueryTime	Time spent in query phase. Unit: millisecond (ms)
searchQueryCurrent	Number of current query phase operations.
searchFetchTotal	Number of fetch phase operations.
searchFetchTime	Time spent in fetch phase. Unit: millisecond (ms)
searchFetchCurrent	Number of current fetch phase operations.
fielddataMemory	Memory used by the fielddata cache.
fielddataEvictions	Number of fielddata cache evictions.
averageQueryLatency	Average time spent in the query phase, computed from searchQueryTime / searchQueryTotal. Unit: millisecond (ms) per query
averageFetchLatency	Average time spent in the fetch phase, computed from searchFetchTime / searchFetchTotal. Unit: millisecond (ms) per fetch
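
The averages above divide a time column by its total column. If those columns are cumulative counters (an assumption, not something this reference states), the ratio reflects the whole counting period. A rough way to estimate latency over a recent window is to take the same ratio over the difference between two samples. The Python sketch below shows the idea; the function name and the sample values are hypothetical.

```python
from typing import Optional


def interval_latency(prev: dict, curr: dict, time_key: str, total_key: str) -> Optional[float]:
    """Return ms per operation between two samples, or None if no new operations."""
    ops = curr[total_key] - prev[total_key]
    if ops <= 0:
        return None  # no new operations, or the counters were reset
    return (curr[time_key] - prev[time_key]) / ops


# Two illustrative samples of the same node, taken some time apart.
earlier = {"searchQueryTime": 10000, "searchQueryTotal": 4000}
later = {"searchQueryTime": 10900, "searchQueryTotal": 4300}

if __name__ == "__main__":
    print(interval_latency(earlier, later, "searchQueryTime", "searchQueryTotal"))
```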

Elasticsearch ThreadPool

This monitors the bulk, index, and search thread pools of each node in the cluster:

Column Name	Description
node_id/name	Node ID/Thread Pool Name.
node_name	Name of the node.
name	Thread Pool name.
type	Thread Pool Type.
active	Number of active threads.
queue	Number of tasks currently in queue.
rejected	Number of rejected tasks.
size	Number of threads.
queue_size	Maximum number of pending requests that can wait in the queue when no threads are available to execute them.
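
One way to read the queue, queue_size, and rejected columns together is to check how full each queue is and whether any tasks have been rejected. The Python sketch below is a minimal illustration of that reading. The function name and the returned keys are hypothetical, and treating a non-positive queue_size as an unbounded queue is an assumption.

```python
from typing import Optional


def thread_pool_pressure(queue: int, queue_size: int, rejected: int) -> dict:
    """Summarise queue fill ratio and rejections for one thread pool row."""
    # Assumption: a queue_size of zero or less means the queue has no fixed limit.
    fill: Optional[float] = queue / queue_size if queue_size > 0 else None
    return {"queue_fill": fill, "has_rejections": rejected > 0}


if __name__ == "__main__":
    print(thread_pool_pressure(queue=150, queue_size=200, rejected=3))
    print(thread_pool_pressure(queue=0, queue_size=-1, rejected=0))
```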