AWS
Overview
The AWS plugin is a Collection Agent plugin that gathers metrics through AWS CloudWatch. This plugin also provides an API Destination that can interact with AWS services, such as EventBridge and SNS. The AWS plugin improves Geneos cloud monitoring capabilities by providing an easier-to-use and more scalable way to interface with AWS CloudWatch and monitor the various services deployed in AWS.
In addition, the AWS plugin allows you to:
- Combine metrics from AWS with on-premise, multi-cloud, and hybrid environments for end-to-end visibility.
- Monitor real-time alerts using rule and alert capabilities.
- Minimise cost by using a single enterprise tool for monitoring.
Monitored services, logs, and events
You can use the AWS plugin to monitor different services by using the following collectors in the Collection Agent YAML file:
- AwsCollector: for the list of collected services, see AWS Plugin services.
- AwsBillingCollector: for billing services, including a breakdown of estimated charges by service. See AWS/Billing in Plugin services.
- ApiDestinationCollector: for receiving real-time data from the Amazon Events, Logs, and Alarms services through its API destination endpoint. Logs and events that are reported to Netprobe result in stream messages that can be monitored by the FKM plugin of Netprobe. See AWS API destination collector.
- AwsSdkUsageMetricsCollector: for publishing the aggregated SDK Usage metrics from the last 5-minute window. See AWS SDK Usage metrics collector.
- AwsCustomNamespaceCollector: for collecting and publishing custom metrics published into custom namespaces created in AWS CloudWatch.
- AwsMetricStreamCollector: for exposing an HTTPS endpoint to connect to Data Firehose. The data received is processed to provide datapoints similar to those of AwsCollector.
Deployment recommendations
Launch ITRS Geneos EC2 instance in AWS Marketplace
Use the Plugin in AWS Marketplace deployment option for the following reasons:
- To deploy the AWS plugin through the AWS Marketplace with built-in dynamic entity mapping to minimise Gateway configuration.
- To deploy the AWS plugin through the marketplace with minimal configuration.
Configure Geneos to deploy the AWS plugin
Use the Configure Geneos to deploy AWS CloudWatch deployment option in the following cases:
- When you require a native deployment option where you configure the Gateway and Netprobe in Geneos on your local machine.
Prerequisites
Geneos environment
The AWS Collection Agent plugin requires the following versions of Geneos components:
- Gateway and Netprobe 5.11.x or higher. The same version must be used for the GSE schema.
- Collection Agent 2.x or higher. To run a Collection Agent, see Collection Agent setup.
The AWS binaries are packaged with Netprobe, and are stored in the collection_agent folder. Alternatively, you can download separate binaries for the AWS plugin from ITRS Downloads.
Caution
Collection Agent and its plugins are no longer packaged with the Netprobe in Geneos 5.14.7 and the subsequent 5.x versions. If you want to run Collection Agent via Netprobe, please upgrade to the current 6.x version of Geneos.
AWS environment
The AWS plugin requires valid AWS credentials to use, such as an Access Key ID and a Secret Access Key. Please refer to the Setting the default credentials page for how to specify your AWS credentials on your machine.
To see the required permissions for some of the monitored services, see Required AWS plugin permissions in Plugin services.
CloudWatch API usage
Since the AWS plugin interacts with AWS CloudWatch using Amazon-provided APIs, you should be aware of the CloudWatch service quotas.
Otherwise, you might encounter an error similar to this: 2021-12-02 14:05:11.833 [EC2Service-Processor-0] ERROR com.itrsgroup.collection.plugins.aws.AwsCollector(awsSG) - CloudwatchMetricDataSource Get Metrics Error: Rate exceeded (Service: CloudWatch, Status Code: 400, Request ID: 31d9a57e-c6c5-46fd-94da-7fe62e74010c, Extended Request ID: null)
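One way to notice throttling early is to scan the Collection Agent log for this message. The following Python sketch is only an illustration: the regular expression is derived from the single sample line above, so the layout of other log lines is an assumption, and the name parse_rate_exceeded is hypothetical.

```python
import re
from typing import Optional

# Pattern derived from the one sample log line in this section.
# This is an assumption, not a documented log format.
RATE_EXCEEDED = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"\[(?P<thread>[^\]]+)\] ERROR (?P<logger>\S+) - "
    r".*Rate exceeded \(Service: (?P<service>[^,]+), "
    r"Status Code: (?P<status>\d+), Request ID: (?P<request_id>[^,)]+)"
)


def parse_rate_exceeded(line: str) -> Optional[dict]:
    """Return the parsed fields of a 'Rate exceeded' error line, or None."""
    match = RATE_EXCEEDED.search(line)
    return match.groupdict() if match else None


# The sample error line from this section, split across string literals.
SAMPLE = (
    "2021-12-02 14:05:11.833 [EC2Service-Processor-0] ERROR "
    "com.itrsgroup.collection.plugins.aws.AwsCollector(awsSG) - "
    "CloudwatchMetricDataSource Get Metrics Error: Rate exceeded "
    "(Service: CloudWatch, Status Code: 400, "
    "Request ID: 31d9a57e-c6c5-46fd-94da-7fe62e74010c, Extended Request ID: null)"
)

info = parse_rate_exceeded(SAMPLE)
print(info["service"], info["status"], info["request_id"])
```

If such lines appear regularly, reducing the number of enabled services or regions, or lengthening the collection interval, lowers the request rate.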
CloudWatch query time windows
The values obtained from AWS are the values averaged over the last complete 5-minute window. This is because CloudWatch makes the data available with a 5-minute latency.
For example, if the collection time is 2021-07-04T03:31:12.34Z, then the time window used to query the data in CloudWatch is:
StartTime: 2021-07-04T03:25:00.00Z
EndTime: 2021-07-04T03:30:00.00Z
If the collection interval is less than 5 minutes, the AWS plugin queries CloudWatch using the adjusted time window on the first collection. For succeeding samples where the adjusted time window is the same as the previous window, no query is made and the plugin generates no data.
For example, when the collectionInterval is set to 1 minute, the AWS plugin queries CloudWatch on the first collection, and then returns no data until it reaches the next complete 5-minute window (for example, after 5 minutes).
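The window arithmetic described above can be sketched in Python. This is an illustration of the documented behaviour, not the plugin's actual implementation. The names query_window and windows_to_query are hypothetical, and the handling of a collection time that falls exactly on a 5-minute boundary is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Tuple

WINDOW = timedelta(minutes=5)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def query_window(collection_time: datetime) -> Tuple[datetime, datetime]:
    """Return (StartTime, EndTime) of the last complete 5-minute window.

    collection_time must be timezone-aware.
    """
    completed = (collection_time - EPOCH) // WINDOW  # whole windows elapsed
    end = EPOCH + completed * WINDOW
    return end - WINDOW, end


def windows_to_query(collection_times: Iterable[datetime]) -> List[Tuple[datetime, datetime]]:
    """Keep only the windows that differ from the previous collection's window."""
    result = []
    previous = None
    for collection_time in collection_times:
        window = query_window(collection_time)
        if window != previous:  # same window as last time: nothing new to query
            result.append(window)
            previous = window
    return result


first = datetime(2021, 7, 4, 3, 31, 12, 340000, tzinfo=timezone.utc)
start, end = query_window(first)
print(start.isoformat(), end.isoformat())

# Six collections, one minute apart, starting at 03:31:12.34Z.
one_minute_samples = [first + timedelta(minutes=i) for i in range(6)]
print(len(windows_to_query(one_minute_samples)))
```

With a 1-minute interval, the six collections in this sketch fall into only two distinct windows, which is why most samples produce no new data.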
Configure Geneos to deploy the AWS plugin
The AWS plugin supports Collection Agent publication into Geneos using dynamic Managed Entities. Setting up this plugin in Geneos involves these primary steps:
- Set up your Collection Agent plugin.
- Configure your mappings.
- Configure your other Dynamic Entities in the Gateway. For a more detailed procedure, see Create Dynamic Entities in Collection Agent setup.
Set up your Collection Agent plugin
Use one of the following options to configure the plugin.
- Setting up your collector in the Gateway Setup Editor by adding the following configuration in Dynamic Entities > Collectors. For more information, see Collectors in Dynamic Entities.
Below are the available collectors for the AWS plugin:
Collectors | Description |
---|---|
AwsCollector | Enables AWS collector configuration. To add more AWS services to monitor, you can add them in |
AwsBillingCollector | Enables AWS Billing collector configuration. See AWS/Billing. |
ApiDestinationCollector | Enables AWS API destination configuration to monitor the following AWS logs and events: |
AwsSdkUsageMetricsCollector | Publishes the aggregated SDK Usage metrics from the last 5-minute window. See AWS SDK Usage metrics collector. |
AwsCustomNamespaceCollector | Collects and publishes custom metrics published into custom namespaces created in AWS CloudWatch. See AWS Custom namespace collector. |
AwsMetricStreamCollector | Exposes an HTTPS endpoint to connect to Data Firehose. The received data is processed to provide datapoints similar to those of AwsCollector. See CloudWatch Metric Streaming. |
- Adding the following configuration in the collection-agent.yml file on your local machine.
collectors:
  # AWS collector configuration
  - name: aws
    type: plugin
    className: AwsCollector
    # Interval (in millis) between collections (optional, defaults to five minutes).
    collectionInterval: 300000
    # AWS regions from which metrics will be collected
    regions:
      - ap-southeast-1
      - ap-southeast-2
    # List of services to collect metrics from (optional, case-sensitive, and defaults to all metrics)
    enabledServices:
      - AWS/EC2
      - AWS/EBS
      - AWS/EKS
      - AWS/RDS
      - AWS/ECS
    # Publish SDK usage metrics
    sdkMetrics: true
    # FOR ENGINEERING USE ONLY - Override AWS to connect to the endpoint of your choice (optional).
    awsUrl: "http://localhost:4566"
  # AWS custom metric collector configuration
  - name: awscustom
    type: plugin
    className: AwsCustomNamespaceCollector
    # Interval (in millis) between collections (optional, defaults to five minutes).
    collectionInterval: 300000
    # AWS regions from which metrics will be collected
    regions:
      - ap-southeast-1
      - ap-southeast-2
    # Publish SDK usage metrics
    sdkMetrics: true
    # FOR ENGINEERING USE ONLY - Override AWS to connect to the endpoint of your choice (optional).
    awsUrl: "http://localhost:4566"
    # List of custom namespaces to collect metrics from
    customMonitoredNamespaces:
      - namespace: ECS/ContainerInsights
        # List of metrics to collect from the custom namespace (optional, defaults to collect all available metrics if not specified)
        customMonitoredMetrics:
          # Regex pattern to match the metrics to collect from the namespace
          - nameIncludes: Network*
            # Length of time used to aggregate the metric (optional, defaults to 300 seconds)
            period: 300
            # Aggregation to be used (optional, defaults to "average")
            statistic: average
  # AWS Billing collector configuration
  - name: aws-billing
    type: plugin
    className: AwsBillingCollector
    # Interval (in millis) between collections (optional, defaults to 6 hours).
    collectionInterval: 21600000
    # Publish SDK usage metrics
    sdkMetrics: true
  # AWS SDK Metrics collector configuration
  # Publishes the aggregated SDK Usage Metrics from the last 5-minute window
  - name: aws-sdk-metrics
    type: plugin
    className: AwsSdkUsageMetricsCollector
    # Interval (in millis) between collections (optional, defaults to five minutes).
    collectionInterval: 300000
  # AWS API destination configuration
  - name: apidestination
    type: plugin
    className: ApiDestinationCollector
    # AWS region from which alarm notifications are expected.
    snsRegion: ap-southeast-1
    # Port on which to receive API destination events
    port: ${env:API_DESTINATION_PORT}
    # Acceptor thread pool size (optional, defaults to 2)
    acceptorThreadPoolSize: 2
    # Worker thread pool size (optional, defaults to 4)
    workerThreadPoolSize: 4
    # TLS configuration (TLS is required by AWS API Destination to be a valid endpoint)
    tlsConfig:
      certFile: ${env:CERT_FILE}
      keyFile: ${env:KEY_FILE}
      trustChainFile: ${env:TRUST_CHAIN_FILE}
    # Authentication type (at the moment, only basic authentication for EventBridge is supported)
    authentication:
      # The basic authentication credentials here should match the ones set in EventBridge
      basicAuthentication:
        username: ${env:BASIC_AUTH_USERNAME}
        password: ${env:BASIC_AUTH_PASSWORD}
    # FOR ENGINEERING USE ONLY - Override AWS to connect to the endpoint of your choice (optional).
    awsUrl: "http://localhost:4566"
    # FOR ENGINEERING USE ONLY
    # This option specifies whether the signatures of SNS messages will be verified first.
    # Verifying SNS message signatures ensures that the message was sent from Amazon SNS.
    # Set this option to false in order to handle SNS messages that are not from Amazon SNS.
    # (optional, defaults to true)
    verifySnsSignature: true
  # AWS Metric Stream configuration
  - name: metricstream
    type: plugin
    className: AwsMetricStreamCollector
    # TLS configuration (TLS is required by AWS Metric Stream to be a valid endpoint)
    tlsConfig:
      certFile: ${env:CERT_FILE}
      keyFile: ${env:KEY_FILE}
    # Metric stream format (optional, defaults to otel-10)
    # Options are case-insensitive: json, otel-07, and otel-10
    metricFormat: json
    # Statistics configuration for each metric (optional, default uses internal table statistics from aws-collector)
    # This configuration can overwrite or add to the default internal table of statistics
    # Options are case-insensitive: average, sum, minimum, and maximum
    # statistics:
    #   AWS/EC2:
    #     CPUUtilization: Sum
Configure your mappings
To show metrics and events in Geneos, dynamic mappings must be configured and attached to the Netprobe receiving the data from the Collection Agent. Use one of the following options to configure your dynamic mappings.
- Adding templates/aws_mapping.xml as an include file in your Gateway. This mapping template also includes sample FKM streams that you can customise to handle your intended Events, Logs, and Alarms.
- Choosing a built-in mapping in > Mapping. For more information, see Mapping and mapping group in Dynamic Entities.
- Setting up a custom mapping in Dynamic Entities Health > Mapping. For more information, see Mapping and mapping group in Dynamic Entities.
Access AWS cloud through a proxy host
Accessing AWS cloud through a proxy can be configured by adding the http.* properties in the JVM arguments. For example, you can access the cloud via a proxy host and port by adding the following properties:
-Dhttp.proxyHost=webcache.example.com -Dhttp.proxyPort=8080
For more information on adding JVM arguments, see Managed Collection Agent.
To learn more about the available properties to enable proxy access, see Java Networking and Proxies.
Dealing with large volumes of data
When the AWS plugin collects and publishes a large volume of data, the Collection Agent may perform garbage collection more frequently. This can lead to performance issues.
To address this, you can increase the heap size of the Collection Agent by adding the following JVM arguments, for example:
-Xms1024m -Xmx1024m
This sets both the initial and the maximum heap size to 1024 MB. The default value is 512 MB for both.
For more information on adding JVM arguments, see Managed Collection Agent.
AWS API destination collector
Aside from the AWS Plugin services, you can also use this collector to monitor the following AWS services:
You can use the sample mapping included in the Gateway package, templates/aws_mapping.xml, which contains the sample FKM streams to handle AWS Events, Logs, and Alarms services.
Note
Ensure that you enable the AWS API destination configuration in the Collection Agent YAML file before you set up these services in the AWS Management Console site.
EventBridge event and CloudWatch log streaming
Amazon EventBridge, a serverless event bus service, streams real-time events and logs from various AWS resources and applies its rules to route events to its targets, one of which is the ApiDestinationCollector collector. The ApiDestinationCollector collector exposes an HTTPS endpoint to connect to EventBridge, and the received data can then be processed as FKM streams.
Logs and events are not formatted and are displayed as is. A source dimension is available with the following behaviour:
- For logs, the value is always lambda since logs are passed through a lambda function.
- For events, the value is the source of the event. For example, for EC2 Instance State-change events, the source is aws.ec2.
HTTPS and authentication
For the ApiDestinationCollector to be considered a valid endpoint by AWS, it needs to use the HTTPS protocol. This can be achieved by using the TLS configuration of the ApiDestinationCollector. Using self-signed TLS certificates will not work. Events sent by AWS can only be read by the collector if certificates from a trusted CA (certificate authority) are used.
For EventBridge events, the ApiDestinationCollector endpoint requires basic authentication. An event can only reach the API destination endpoint if it has the proper basic authentication credentials.
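To illustrate what "proper basic authentication credentials" means on the wire, the following Python sketch builds the standard HTTP Basic Authorization header value and compares it against an incoming one. It shows the generic HTTP scheme only. How the collector itself validates the header is not described here, and the function names are hypothetical.

```python
import base64
import hmac


def basic_auth_header(username: str, password: str) -> str:
    """Build the Authorization header value for HTTP Basic authentication."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def is_authorized(header_value: str, username: str, password: str) -> bool:
    """Compare a received header with the expected one in constant time."""
    expected = basic_auth_header(username, password)
    return hmac.compare_digest(header_value.encode("utf-8"), expected.encode("utf-8"))


# Placeholder credentials for illustration only.
header = basic_auth_header("user", "pass")
print(header)
print(is_authorized(header, "user", "pass"))
```

The username and password configured in the EventBridge connection must match the values set under basicAuthentication in the collector configuration, otherwise the header comparison fails and the event is rejected.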
Set up EventBridge event bus events
To receive events from EventBridge, create an FKM stream in your Gateway, then set up the following in your AWS Management Console. If you need more information about EventBridge, see Amazon EventBridge.
- Navigate to Amazon EventBridge > Event buses > Create an event bus where you will stream the logs or events.
- In the Integration > API destinations > Connections, click Create connection.
  - In the Authorization type, select Basic (Username/Password) to input your desired username and password.
  - Authorization for EventBridge is required, but at the moment only basic authentication is supported by the plugin.
- In the Integration > API destinations > API destinations, click Create API destination.
  - In the API destination endpoint, enter https://<url-where-aws-plugin-is-hosted>:<port-defined-in-api-destination-collector-config>.
  - In the HTTP method, choose POST.
- Navigate to Amazon EventBridge > Rules to create a rule.
  - In the Define pattern, select the type of event that the rule will apply to.
  - For lambda functions, the source is lambda and detail-type is either Lambda Function Invocation Result - Success or Lambda Function Invocation Result - Failure.
  - In the Target, select API destination. Select the API destination that you created.
API destination Events FKM dataview
Set up CloudWatch Logs
CloudWatch logs can be sent to the ApiDestinationCollector collector through a subscription filter to a Lambda function, which then passes the logs to EventBridge. Similar to events, data is then sent from the EventBridge event bus to the collector.
Prerequisites
- An EventBridge event bus and rule to the API destination. To create an event bus, follow the steps in Set up EventBridge event bus events.
- A lambda function.
  - When creating a lambda function, make sure that the execution role used has EventBridge PutEvents permission.
  - Sample code from AWS is available here. Ensure that the lambda function's destination points to the event bus created in Set up EventBridge event bus events.
To receive CloudWatch logs, create an FKM stream in your Gateway then set up the following in AWS Management Console. If you need more information about AWS CloudWatch, see CloudWatch.
- Navigate to CloudWatch > Log groups to select the AWS resource log group that you want to stream.
- In the Subscription filters, click Create > Create Lambda subscription filter.
- In the Choose destination, select the lambda function you have set up as a prerequisite.
- In the Configure log format and filters, choose a filter pattern to match the logs that you want to stream.
- Click Start streaming.
- Navigate to Lambda > Functions.
- Select the lambda function you have set up as a prerequisite, then verify that the subscription filter is under Configuration > Triggers and that the correct EventBridge event bus is set as the destination under Configuration > Destinations.
API destination Logs FKM dataview
Alarm Notifications
The ApiDestinationCollector collector can handle alarm notifications from SNS messages that are sent from standard topics. These alarm notifications are then published to the Collection Agent as log events.
Note
Only alarm notifications from the snsRegion specified in the ApiDestinationCollector collector configuration are handled.
The following describes some of the parameters of the log events related to Alarm notifications:
Parameter | Description |
---|---|
name | Name of the alarm that triggered the notification. This corresponds to the stream name of the FKM plugin in the Netprobe. |
message | Shows the details regarding the alarm that triggered the notification. This corresponds to the triggerDetails column of the FKM plugin in the Netprobe. Format: Namespace, Metric, and Dimensions fields. Sample message for Metric alarm: 2021-12-03T06:38:23.047Z Namespace=AWS/EC2 Metric=CPUUtilization State=ALARM Reason="Threshold Crossed: 6 out of the last 30 datapoints were less than or equal to the threshold (0.7). The most recent datapoints which crossed the threshold: [0.0650449497620362 (03/12/21 06:31:00), 0.0327868852458998 (03/12/21 06:26:00), 0.0666666666666628 (03/12/21 06:21:00), 0.09947207557655299 (03/12/21 06:16:00), 0.0333333333333314 (03/12/21 06:11:00)] (minimum 30 datapoints for OK -> ALARM transition)." Dimensions=[{"value":"i-07d7e8d7dd0ee675f", "name":"InstanceId"}] Sample message for Composite alarm: 2021-12-03T08:18:20.602Z State=OK Reason="arn:aws:cloudwatch:ap-southeast-2:164181677543:alarm:alarm_when_greaterthan_0.07 transitioned to INSUFFICIENT_DATA at Friday 03 December, 2021 08:18:20 UTC" |
entity dimensions | Entity dimensions parameters: |
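If you post-process these messages outside FKM, the leading fields can be pulled out with a small parser. The Python sketch below is based only on the two sample messages in the table. The general message grammar is an assumption, the sample strings are abbreviated versions of those samples, and parse_alarm_message is a hypothetical name.

```python
import re
from typing import Dict

# Fields that appear before Reason= in the sample messages.
FIELD = re.compile(r"\b(Namespace|Metric|State)=(\S+)")


def parse_alarm_message(message: str) -> Dict[str, str]:
    """Extract the timestamp and the Namespace/Metric/State fields, if present."""
    timestamp, _, rest = message.partition(" ")
    fields = {"timestamp": timestamp}
    # Only inspect the text before Reason= so that free text inside the
    # reason cannot be mistaken for a field.
    head = rest.split(" Reason=", 1)[0]
    for key, value in FIELD.findall(head):
        fields[key] = value
    return fields


# Abbreviated from the samples in the table above (Reason text shortened).
metric_alarm = (
    "2021-12-03T06:38:23.047Z Namespace=AWS/EC2 Metric=CPUUtilization "
    'State=ALARM Reason="Threshold Crossed" '
    'Dimensions=[{"value":"i-07d7e8d7dd0ee675f", "name":"InstanceId"}]'
)
composite_alarm = '2021-12-03T08:18:20.602Z State=OK Reason="transitioned to INSUFFICIENT_DATA"'

print(parse_alarm_message(metric_alarm))
print(parse_alarm_message(composite_alarm))
```

In the samples, the composite alarm message carries no Namespace or Metric field, so the parser simply omits those keys for it.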
Set up for Alarm Notifications
To receive SNS messages related to alarm notifications, create an FKM stream in your Gateway, then set up the following in AWS Management Console. If you need more information about Alarm notifications, see Amazon SNS and Amazon CloudWatch alarms.
- Navigate to Amazon SNS > Topic to create an HTTPS subscription under the SNS topic that receives the alarm notifications that will be collected.
- In the Create subscription > Details, set the Endpoint of this subscription to the URL corresponding to the server started by the collector.
- Keep the Enable raw message delivery unchecked. The ApiDestinationCollector will not handle SNS messages where raw message delivery is enabled. For more information, see Subscribing to an Amazon SNS topic.
Once an HTTPS subscription is confirmed, the collector will be able to receive alarm notifications from this subscription.
- The collector should log Successfully confirmed subscription to the topic TOPIC_NAME, and in the AWS SNS Console, the status of the subscription should be Confirmed.
- If the collector was not running when the HTTPS subscription was created, the status of the subscription will remain as Pending confirmation. To confirm this subscription, ensure that the collector is running and then request confirmation for this subscription. To request this confirmation:
  - Go to either the topic page or the AWS SNS Console > Subscription Page.
  - Select the subscription with pending confirmation, and then click the Request confirmation button.
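For background, Amazon SNS delivers both the subscription confirmation and the later alarm notifications as JSON envelopes with a Type field. The Python sketch below shows how such envelopes can be told apart. It illustrates the SNS message format, not the collector's code. The topic ARN, URL, and function name are made-up placeholders.

```python
import json
from typing import Optional, Tuple


def classify_sns_message(body: str) -> Tuple[str, str, Optional[str]]:
    """Return (kind, topic_name, payload) for an SNS HTTPS delivery body."""
    envelope = json.loads(body)
    # The topic name is the last colon-separated component of the topic ARN.
    topic_name = envelope["TopicArn"].rsplit(":", 1)[-1]
    kind = envelope["Type"]
    if kind == "SubscriptionConfirmation":
        # Visiting SubscribeURL confirms the subscription.
        return "confirm", topic_name, envelope["SubscribeURL"]
    if kind == "Notification":
        # For CloudWatch alarms, Message holds the alarm details.
        return "notification", topic_name, envelope["Message"]
    return "ignored", topic_name, None


# Hypothetical envelope for illustration only.
confirmation = json.dumps({
    "Type": "SubscriptionConfirmation",
    "TopicArn": "arn:aws:sns:ap-southeast-1:123456789012:example-alarms",
    "SubscribeURL": "https://sns.example.invalid/confirm",
})
print(classify_sns_message(confirmation))
```

With raw message delivery enabled, SNS sends only the message body without this envelope, which is consistent with the requirement above to keep that option unchecked.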
API destination Alarms FKM dataview
AWS SDK Usage metrics collector
Since the AWS plugin interacts with AWS CloudWatch using Amazon-provided APIs, take note of the CloudWatch service quotas.
To monitor the overall SDK Usage of the AWS plugin, follow these steps:
- Enable the sdkMetrics configuration under the AwsCollector and the AwsBillingCollector. See the collection-agent.yml in Configure Geneos to deploy the AWS plugin.
- Configure the AwsSdkUsageMetricsCollector to publish the SDK Usage metrics under the namespace AWS/SdkUsage.
The AwsSdkUsageMetricsCollector reports the SDK Usage from the last 5-minute window. Below is an example dataview for AWS SDK Usage metrics:
For more information, see AWS/SDK Usage in Plugin services.
AWS Custom namespace collector
The AWS plugin can collect custom metrics for custom namespaces created in AWS CloudWatch.
To monitor the custom namespace for the AWS plugin, follow these steps:
- Add an AwsCustomNamespaceCollector class name in your collection-agent.yml file. For more information, see Configure Geneos to deploy the AWS plugin.
- Enable the sdkMetrics configuration under the AwsCustomNamespaceCollector.
- Add and configure your custom namespace and the metrics to be collected under the AwsCustomNamespaceCollector class.
- Set up and configure your custom mappings and other Dynamic Entities. Follow the steps in Create Dynamic Entities in Collection Agent setup.
Below is an example dataview for the following custom namespace configuration.
Sample configuration:
# AWS custom metric collector configuration
- name: awscustom
  type: plugin
  className: AwsCustomNamespaceCollector
  # Interval (in millis) between collections (optional, defaults to five minutes).
  collectionInterval: 300000
  # AWS regions from which metrics will be collected.
  regions:
    - eu-west-1
  # Publish SDK usage metrics
  sdkMetrics: true
  # List of custom namespaces to collect metrics from
  customMonitoredNamespaces:
    - namespace: ECS/ContainerInsights
      # List of metrics to collect from the custom namespace (optional, defaults to collect all available metrics if not specified)
      customMonitoredMetrics:
        # Regex pattern to match the metrics to collect from the namespace
        - nameIncludes: Network*
        - nameIncludes: Storage*
          # Length of time used to aggregate the metric (optional, defaults to 300 seconds)
          period: 300
          # Aggregation to be used (optional, defaults to "average")
          statistic: average
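Because nameIncludes is described as a regex pattern, Network* reads as "Networ" followed by zero or more "k" characters, not as a shell-style wildcard. The Python sketch below shows which names such patterns select. It assumes the plugin searches for the pattern anywhere in the metric name, which is not confirmed here, and the metric names and function name are illustrative.

```python
import re
from typing import Iterable, List


def select_metrics(metric_names: Iterable[str], name_includes: Iterable[str]) -> List[str]:
    """Keep the metric names in which at least one pattern is found."""
    patterns = [re.compile(pattern) for pattern in name_includes]
    return [
        name for name in metric_names
        if any(pattern.search(name) for pattern in patterns)  # substring search (assumed)
    ]


# Illustrative metric names for a container namespace.
available = [
    "NetworkRxBytes",
    "NetworkTxBytes",
    "StorageReadBytes",
    "StorageWriteBytes",
    "CpuUtilized",
    "MemoryUtilized",
]

selected = select_metrics(available, ["Network*", "Storage*"])
print(selected)
```

Under the same assumption, a stricter pattern such as ^Network.* would limit the match to names that start with Network.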
Sample dataview:
CloudWatch Metric Streaming
Amazon CloudWatch Metric Streams, together with Amazon Data Firehose, can send real-time metrics from various AWS resources to its configured HTTPS destination endpoint where the AwsMetricStreamCollector collector is located. The collector exposes an HTTPS endpoint to connect to Data Firehose, and the received data is processed to provide datapoints similar to those of AwsCollector.
Note
Failed communication between AWS Data Firehose and AwsMetricStreamCollector results in logs being stored in S3 buckets, leading to increased storage usage.
HTTPS and Authentication
For the AwsMetricStreamCollector to be considered a valid endpoint by AWS, it needs to use the HTTPS protocol. This can be achieved by using the TLS configuration of the AwsMetricStreamCollector.
Note
Using self-signed or free TLS certificates, like Let’s Encrypt, will not work. Metrics sent by AWS can only be read by the collector if certificates from a trusted CA (certificate authority) are used.
Setup for CloudWatch Data Firehose
- Select the source and destination:
  - Source: Direct PUT
  - Destination: HTTP Endpoint
- Transform records: not supported under the current revision
- Destination settings:
  - HTTP endpoint URL: endpoint of the collector
  - Authentication/Access key: not supported under the current revision
  - Content encoding: GZIP is not supported under the current revision
  - Buffer hints: lower buffer size and interval values cause Firehose to send metrics more frequently.
- Advanced settings:
  - Server-side encryption: not supported under the current revision
Setup for CloudWatch Metric Streams
CloudWatch metrics can be sent to the AwsMetricStreamCollector collector through an AWS Data Firehose stream.
- Destination:
  - Custom setup with Firehose
  - Select your Amazon Data Firehose stream: select the Data Firehose stream that you created.
  - Change output format: ensure that this setting matches the metricFormat in the collector configuration.
- Metrics to be streamed:
  - Select either All metrics or Select metrics. For Select metrics, select individual metrics to be included or excluded.
  - Add additional statistics: only default statistics are currently supported (Minimum, Maximum, Sample Count, and Sum).
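When the json output format is used, each streamed record carries the default statistics as min, max, sum, and count values, so an average can be derived as sum divided by count. The Python sketch below illustrates that arithmetic on a newline-delimited JSON payload. It is not the collector's implementation, how the collector derives its statistics is an assumption, and the sample record values are invented.

```python
import json
from typing import Dict, List


def records_with_average(payload: str) -> List[Dict]:
    """Parse newline-delimited metric stream JSON and add a derived average."""
    results = []
    for line in payload.splitlines():
        if not line.strip():
            continue  # skip blank lines between records
        record = json.loads(line)
        value = record["value"]
        count = value["count"]
        results.append({
            "namespace": record["namespace"],
            "metric_name": record["metric_name"],
            "dimensions": record.get("dimensions", {}),
            "timestamp": record["timestamp"],
            # Average is not streamed by default; derive it from sum and count.
            "average": value["sum"] / count if count else None,
        })
    return results


# Invented sample record for illustration only.
sample_payload = json.dumps({
    "metric_stream_name": "example-stream",
    "account_id": "123456789012",
    "region": "ap-southeast-1",
    "namespace": "AWS/EC2",
    "metric_name": "CPUUtilization",
    "dimensions": {"InstanceId": "i-0123456789abcdef0"},
    "timestamp": 1638513480000,
    "value": {"min": 1.0, "max": 4.0, "sum": 10.0, "count": 4.0},
    "unit": "Percent",
}) + "\n"

parsed = records_with_average(sample_payload)
print(parsed[0]["metric_name"], parsed[0]["average"])
```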