Collection Agent configuration reference
Overview
This reference describes how to configure the Collection Agent's collectors, reporters, workflow, and plugins.
Note: The latest version of the Collection Agent is 2.2.2.
Configuration reference
Below is an annotated example YAML file; adjust the values to suit your project's configuration:
# Collection Agent Configuration Reference
# Directory containing plugin artifacts. Required.
plugin-directory: /usr/local/lib/geneos/plugins
# Agent monitoring and metrics settings.
monitoring:
# Health and metrics reporting interval in milliseconds. Defaults to 10 seconds.
reporting-interval: 10000
# The agent will listen on an HTTP port so that an external system can check its health.
# In Kubernetes, this can be used in conjunction with the readiness/liveness probes.
# 200 is returned if the agent is started, 500 otherwise.
health-check:
# Defaults to true.
enabled: true
# HTTP listen port, defaults to 8080.
listen-port: 8080
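# For example, in Kubernetes this port can back a readiness or liveness probe. The sketch
# below is illustrative only; the probe path is an assumption, so confirm the exact
# health-check endpoint for your agent version:
#
#   livenessProbe:
#     httpGet:
#       path: /          # assumed endpoint; returns 200 once the agent is started, 500 otherwise
#       port: 8080
#     initialDelaySeconds: 10
#     periodSeconds: 30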
# Agent metrics.
# See the documentation for a full list of metrics that are collected.
metrics:
# Whether metric reporting is enabled. Defaults to true.
enabled: true
# Whether to disable per-interval windowing (resetting) of counter values. Defaults to false.
disable-windowing: false
# Dimensions added to all metrics.
dimensions:
namespace: ${env:NAMESPACE}
pod_name: ${env:POD_NAME}
container: ${env:CONTAINER_NAME}
# Properties added to all metrics.
properties:
prop1: value
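# The ${env:...} placeholders are resolved from environment variables. For example, if the
# container environment sets NAMESPACE=finance, every agent self-metric is reported with the
# dimension namespace=finance (illustrative value; use whatever your deployment injects).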
#
# Collector definitions
#
collectors:
# Collector type (all collectors are of type 'plugin').
- type: plugin
# Optional name used in logging. If omitted, an auto-generated name will be assigned.
name: statsd
# Simple class name of the collector in the plugin jar.
class-name: StatsdServer
# Data point processors applied to data points published from this collector.
# This optional processing chain allows for manipulating and/or filtering data points prior
# to workflow publication. This is the recommended way to perform edge processing, when applicable, so that
# unneeded data can be dropped before incurring workflow overhead.
processors:
# For example, drop all events collected from statsd. See "workflow -> common -> processors" section for
# details on each type of processor.
- type: drop-filter
matchers:
- type: kind
kind: generic-event
# Additional properties are specific to each collector type. See plugin's configuration reference for details.
listen-port: 8125
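# For example, once the agent is running, a test counter can be sent to this collector with
# any StatsD client. The one-liner below assumes the standard StatsD UDP transport on the
# listen port configured above (hypothetical metric name):
#
#   echo "my_counter:1|c" | nc -u -w1 localhost 8125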
#
# Reporter definitions
#
reporters:
# Kafka Reporter
- type: kafka
# Reporter name. Referenced from a pipeline's 'reporter' setting.
name: kafka-metrics
# Remaining settings are specific to each reporter type.
# Comma-separated list of Kafka broker addresses and ports.
brokers: ${env:KAFKA_BROKERS}
# Topic name/format. Can be a literal string, an environment/message variable, or a combination thereof.
# Valid variables:
#
# Metric name: ${datapoint.name}
# Dimension value: ${datapoint.dimension:DIMENSION_NAME}
# Property value: ${datapoint.property:PROPERTY_NAME}
# Environment variable: ${env:SOME_VAR}
#
topic: metrics-${datapoint.dimension:namespace}
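# For example, with the format above, a metric data point carrying the dimension
# namespace=finance is published to the Kafka topic "metrics-finance".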
# Retry/consistency behavior. Valid values:
#
# exactly-once: The kafka producer will operate with idempotence enabled. This method is recommended
# for logs and events pipelines.
#
# at-most-once: Messages are published once and no retries are made if unsuccessful. This method is recommended
# for the metrics pipeline.
delivery-method: at-most-once
- type: kafka
name: kafka-logs
brokers: ${env:KAFKA_BROKERS}
topic: logs-${datapoint.dimension:namespace}
delivery-method: exactly-once
- type: kafka
name: kafka-events
brokers: ${env:KAFKA_BROKERS}
topic: events-${datapoint.dimension:namespace}
delivery-method: exactly-once
# Logging reporter that simply logs each data point. This is intended only for testing purposes.
- type: logging
# Reporter name. Referenced from a pipeline's 'reporter' setting.
name: stdout
# Log level at which each data point is logged. Can be: error, info (default), warn, debug or trace.
level: info
# TCP reporter sends internal DataModel messages over a TCP connection.
- type: tcp
# Reporter name. Referenced from a pipeline's 'reporter' setting.
name: my-tcp-reporter
# The TCP server hostname. Default value shown.
hostname: localhost
# The TCP server port. Default value shown.
port: 7137
# The TCP server connection timeout in milliseconds. Default value shown.
connection-timeout-millis: 10000
# The TCP server write timeout in milliseconds. Default value shown.
write-timeout-millis: 10000
# Maximum message length in bytes. Default value shown.
max-message-length: 65536
# External/custom reporters are defined using the 'plugin' type.
- type: plugin
# Reporter name. Referenced from a pipeline's 'reporter' setting.
name: my-custom-reporter
# Simple class name of the reporter in the plugin jar.
class-name: CustomReporter
# Additional properties are specific to each reporter type. See plugin's configuration reference for details.
custom-prop: asdf
#
# Workflow settings for controlling the flow of data points from plugins to reporters.
#
workflow:
# Directory to store pipeline persistence. Required only if at least one pipeline is configured to buffer data
# points on disk. The directory must be writable.
store-directory: /var/lib/geneos/collection-agent
# Pipelines.
#
# A pipeline exists for each class of data (metrics/logs/events)
#
# At least one pipeline must be configured. A runtime error will occur if a plugin attempts delivery to a pipeline
# that is not configured.
#
# Metrics pipeline.
metrics:
# Reporter to which all data points on this pipeline are sent.
# This property is optional if there is only one reporter configured. Otherwise the value is required and
# must correspond to the 'name' of a reporter defined above.
reporter: kafka-metrics
# Store settings.
#
# Data points are stored either in memory or on disk before delivery to a reporter.
#
# If a reporter's target becomes unavailable, data points are queued until either the store is full or
# the reporter target becomes available again.
#
# Plugins are informed when a store becomes full and are free to handle the situation in a way that makes
# sense for that plugin (e.g., dropping the message if not critical, or waiting for the store to re-open before
# collecting any more data).
store:
# Store type.
#
# Permitted values:
# 'memory': A circular, fixed-size, in-memory store that provides no persistence. The oldest data point
# is removed when adding to a full store, therefore this store never rejects new data points
# and will begin to drop data if a slow reporter cannot keep up.
#
# 'disk': A fixed-size store that is persisted to disk. Requires the workflow 'store-directory' setting
# to be configured.
#
# For the metrics pipeline, it is recommended (and the default) to use a memory store, as metric data is
# generally non-critical and loses relevance if delayed.
#
type: memory
# Maximum number of data points to hold before the store is considered full and new data points are rejected.
# The default capacity for a memory store is 8192 data points and 10,000,000 data points for a disk store.
capacity: 8192
# Number of retries after initial delivery fails. Defaults to 3. For infinite retries set to -1.
# The interval between consecutive retries for the same message increases from 1 second up to 120 seconds.
max-retries: 3
# Custom processing of data points on this pipeline. Processors can manipulate, enrich and/or filter
# data points before reporting.
#
# See the 'common' pipeline for more details.
processors:
- type: enrichment
name: metrics-enricher
dimensions:
custom_dimension: value
# Logs pipeline.
logs:
reporter: kafka-logs
store:
# For logs, it is recommended (and the default) to use a disk store if data loss is not tolerable.
type: disk
# For logs, it is recommended (and the default) to retry infinitely if data loss is not tolerable.
max-retries: -1
# Events pipeline.
events:
reporter: kafka-events
store:
# For events, it is recommended (and the default) to use a disk store if data loss is not tolerable.
type: disk
# For events, it is recommended (and the default) to retry infinitely if data loss is not tolerable.
max-retries: -1
# Common pipeline.
#
# This is a unique pipeline that only has data-point processors (there is no reporter). The processors are applied
# to data points on all pipelines, before any pipeline-specific processors are applied.
common:
# Data-point processors.
#
# Processors can manipulate, enrich and/or filter data points before reporting. They are applied before
# a data point is saved in the pipeline's store.
#
processors:
# Enrichment processor. Adds dimensions and/or properties to all data points.
- type: enrichment
# Optional name used in logging. If omitted, an auto-generated name will be assigned.
name: enricher
# Whether to overwrite an existing dimension or property with the same name (defaults to false)
overwrite: false
# Dimensions to add
dimensions:
node_name: ${env:NODE_NAME}
# Properties to add
properties:
prop: value
# Drop filter processor. Drops data points that match the configured criteria.
- type: drop-filter
# One or more match criteria.
# For a data point to be dropped, all configured criteria must match, otherwise the data point
# will be forwarded. If no matchers are configured, all data points will be forwarded.
matchers:
# Match by data point name, either exactly or via regex.
- type: name
# Exact match
name: kubernetes_node_cpu_usage
# Regex match (only one of 'name' or 'name-pattern' can be configured)
name-pattern: kubernetes_.*
# Match by data point dimension key and either an exact value or a regex pattern.
- type: dimension
key: namespace
# Exact value match
value: finance
# Regex match (only one of 'value' or 'value-pattern' can be configured)
value-pattern: ns.*
# Match by data point property key and either an exact value or a regex pattern.
- type: property
key: someProperty
# Exact value match
value: someValue
# Regex match (only one of 'value' or 'value-pattern' can be configured)
value-pattern: value.*
# Match by data point kind. Valid kinds are: [attribute|counter|gauge|generic-event|log-event|timer-histogram]
- type: kind
kind: counter
# Forward filter processor. Forwards data points that match the configured criteria.
# This behaves inversely to "drop-filter" above but is configured identically.
- type: forward-filter
# One or more match criteria.
# For a data point to be forwarded, all configured criteria must match, otherwise the data point
# will be dropped. If no matchers are configured, all data points will be dropped.
# See "drop-filter" for details on each type of matcher.
matchers:
- type: name
name: myCounter
# Normalize processor. Normalizes dimension names for consistency in subsequent processing and reporting.
- type: normalize
# Optional name used in logging. If omitted, an auto-generated name will be assigned.
name: normalize
# Dimension normalization settings.
dimensions:
# Default overwrite behavior, can be overridden per mapping. Defaults to false.
overwrite: false
# Dimension mappings.
mappings:
# Old dimension name.
- from: project
# New dimension name.
to: namespace
# Whether to overwrite if a dimension already exists with the same name. Defaults to parent setting.
overwrite: false
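# For example, with the mapping above, a data point arriving with the dimension
# project=payments is reported with namespace=payments instead (illustrative value;
# assumes no 'namespace' dimension already exists, since 'overwrite' is false).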
# External/custom processors are defined using the 'plugin' type.
- type: plugin
# Optional name used in logging. If omitted, an auto-generated name will be assigned.
name: kube-enricher
# Simple class name of the processor in the plugin jar.
class-name: KubernetesEnricher
# Additional properties are specific to each processor type. See plugin's configuration reference for details.
custom-prop: abc
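Minimal example
A working configuration needs only a plugin directory, at least one collector, at least one reporter, and at least one pipeline. The sketch below combines the StatsD collector and the logging reporter documented above into a minimal starting point; it is not a production template, and if your collectors also publish logs or events you must configure those pipelines as well, since delivery to an unconfigured pipeline causes a runtime error.
# Minimal Collection Agent configuration (sketch; values are illustrative)
plugin-directory: /usr/local/lib/geneos/plugins

collectors:
  # StatsD collector listening on the default port.
  - type: plugin
    name: statsd
    class-name: StatsdServer
    listen-port: 8125

reporters:
  # Logging reporter that prints each data point; intended for testing only.
  - type: logging
    name: stdout
    level: info

workflow:
  metrics:
    # Optional here because only one reporter is configured, but shown for clarity.
    reporter: stdout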