AWS Plugin services

Overview

The AWS plugin is a Collection Agent plugin that gathers metrics through AWS CloudWatch. This plugin also provides an API Destination that can interact with AWS services, such as EventBridge and SNS.

Monitored AWS services

The AWS plugin can also get CloudWatch metrics from the following AWS services. See the Required AWS plugin permissions for each service.

AWS/ApplicationELB

The AWS/ApplicationELB service collects metrics from Application Load Balancers.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
ActiveConnectionCount gauge LoadBalancer, Namespace, Region sum 300 Total number of concurrent TCP connections active from clients to the load balancer and from the load balancer to targets.
AvailabilityZones attribute LoadBalancer, Namespace, Region Subnets for the load balancer.
ClientTlsNegotiationErrorCount gauge LoadBalancer, Namespace, Region sum 300 Number of TLS connections initiated by the client that did not establish a session with the load balancer due to a TLS error. Possible causes include a mismatch of ciphers or protocols or the client failing to verify the server certificate and closing the connection.
ConsumedLCUs gauge LoadBalancer, Namespace, Region average 300 Number of load balancer capacity units (LCU) used by your load balancer. You pay for the number of LCUs that you use per hour.
CreatedTime attribute LoadBalancer, Namespace, Region Date and time the load balancer was created.
DesyncmitigationmodeNoncompliantRequestCount gauge LoadBalancer, Namespace, Region sum 300 Number of requests that do not comply with RFC 7230.
DnsName attribute LoadBalancer, Namespace, Region Public DNS name of the load balancer.
DroppedInvalidHeaderRequestCount gauge LoadBalancer, Namespace, Region average 300 Number of requests where the load balancer removed HTTP headers with header fields that are not valid before routing the request. The load balancer removes these headers only if the routing.http.drop_invalid_header_fields.enabled attribute is set to true.
ELBAuthError gauge LoadBalancer, Namespace, Region sum 300 Number of user authentications that could not be completed because an authenticate action was misconfigured, the load balancer cannot establish a connection with the IdP, or the load balancer cannot complete the authentication flow due to an internal error.
ELBAuthFailure gauge LoadBalancer, Namespace, Region sum 300 Number of user authentications that could not be completed because the IdP denied access to the user or an authorisation code was used more than once.
ELBAuthLatency gauge LoadBalancer, Namespace, Region average 300 Time elapsed, in milliseconds, to query the IdP for the ID token and user info. If one or more of these operations fail, this is the time to failure.
ELBAuthRefreshTokenSuccess gauge LoadBalancer, Namespace, Region sum 300 Number of times the load balancer successfully refreshed user claims using a refresh token provided by the IdP.
ELBAuthSuccess gauge LoadBalancer, Namespace, Region sum 300 Number of authenticate actions that were successful.
ELBAuthUserClaimsSizeExceeded gauge LoadBalancer, Namespace, Region sum 300 Number of times that a configured IdP returned user claims that exceeded 11K bytes in size.
ForwardedInvalidHeaderRequestCount gauge LoadBalancer, Namespace, Region average 300 Number of requests routed by the load balancer that had HTTP headers with header fields that are not valid. The load balancer forwards requests with these headers only if the routing.http.drop_invalid_header_fields.enabled attribute is set to false.
GrpcRequestCount gauge LoadBalancer, Namespace, Region average 300 Number of gRPC requests processed over IPv4 and IPv6.
HealthyHostCount gauge LoadBalancer, Namespace, Region sum 300 Number of targets that are considered healthy.
HealthyStateDNS gauge LoadBalancer, Namespace, Region minimum 300 Number of zones that meet the DNS healthy state requirements.
HealthyStateRouting gauge LoadBalancer, Namespace, Region minimum 300 Number of zones that meet the routing healthy state requirements.
HTTP_Fixed_Response_Count gauge LoadBalancer, Namespace, Region sum 300 Number of fixed-response actions that were successful.
HTTP_Redirect_Count gauge LoadBalancer, Namespace, Region sum 300 Number of redirect actions that were successful.
HTTP_Redirect_Url_Limit_Exceeded_Count gauge LoadBalancer, Namespace, Region sum 300 Number of redirect actions that cannot be completed.
HTTPCode_ELB_3XX_Count gauge LoadBalancer, Namespace, Region sum 300 Number of HTTP 3XX redirection codes that originate from the load balancer. This count does not include response codes generated by targets.
HTTPCode_ELB_4XX_Count gauge LoadBalancer, Namespace, Region sum 300 Number of HTTP 4XX client error codes that originate from the load balancer. This count does not include response codes generated by targets.
HTTPCode_ELB_500_Count gauge LoadBalancer, Namespace, Region sum 300 Number of HTTP 500 error codes that originate from the load balancer.
HTTPCode_ELB_502_Count gauge LoadBalancer, Namespace, Region sum 300 Number of HTTP 502 error codes that originate from the load balancer.
HTTPCode_ELB_503_Count gauge LoadBalancer, Namespace, Region sum 300 Number of HTTP 503 error codes that originate from the load balancer.
HTTPCode_ELB_504_Count gauge LoadBalancer, Namespace, Region sum 300 Number of HTTP 504 error codes that originate from the load balancer.
HTTPCode_ELB_5XX_Count gauge LoadBalancer, Namespace, Region sum 300 Number of HTTP 5XX server error codes that originate from the load balancer. This count does not include response codes generated by targets.
HTTPCode_Target_2XX_Count gauge LoadBalancer, Namespace, Region sum 300 Number of HTTP 2XX response codes generated by the targets. This does not include any response codes generated by the load balancer.
HTTPCode_Target_3XX_Count gauge LoadBalancer, Namespace, Region sum 300 Number of HTTP 3XX response codes generated by the targets. This does not include any response codes generated by the load balancer.
HTTPCode_Target_4XX_Count gauge LoadBalancer, Namespace, Region sum 300 Number of HTTP 4XX response codes generated by the targets. This does not include any response codes generated by the load balancer.
HTTPCode_Target_5XX_Count gauge LoadBalancer, Namespace, Region sum 300 Number of HTTP 5XX response codes generated by the targets. This does not include any response codes generated by the load balancer.
IpAddressType attribute LoadBalancer, Namespace, Region Type of IP addresses used by the subnets for your load balancer.
Ipv6ProcessedBytes gauge bytes LoadBalancer, Namespace, Region sum 300 Total number of bytes processed by the load balancer over IPv6. This count is included in ProcessedBytes.
Ipv6RequestCount gauge LoadBalancer, Namespace, Region sum 300 Number of IPv6 requests received by the load balancer.
LambdaInternalError gauge LoadBalancer, Namespace, Region sum 300 Number of requests to a Lambda function that failed because of an issue internal to the load balancer or AWS Lambda. To get the error reason codes, check the error_reason field of the access log.
LambdaTargetProcessedBytes gauge LoadBalancer, Namespace, Region sum 300 Total number of bytes processed by the load balancer for requests to and responses from a Lambda function.
LambdaUserError gauge LoadBalancer, Namespace, Region sum 300 Number of requests to a Lambda function that failed because of an issue with the Lambda function.
NewConnectionCount gauge LoadBalancer, Namespace, Region sum 300 Total number of new TCP connections established from clients to the load balancer and from the load balancer to targets.
NonStickyRequestCount gauge LoadBalancer, Namespace, Region sum 300 Number of requests where the load balancer chose a new target because it couldn’t use an existing sticky session. For example, the request was the first request from a new client and no stickiness cookie was presented, a stickiness cookie was presented but it did not specify a target that was registered with this target group, the stickiness cookie was malformed or expired, or an internal error prevented the load balancer from reading the stickiness cookie.
ProcessedBytes gauge bytes LoadBalancer, Namespace, Region sum 300 Total number of bytes processed by the load balancer over IPv4 and IPv6. This count includes traffic to and from clients and Lambda functions, and traffic from an Identity Provider (IdP) if user authentication is enabled.
RejectedConnectionCount gauge LoadBalancer, Namespace, Region sum 300 Number of connections that were rejected because the load balancer had reached its maximum number of connections.
RequestCount gauge LoadBalancer, Namespace, Region sum 300 Number of requests processed over IPv4 and IPv6. This metric is only incremented for requests where the load balancer node was able to choose a target. Requests rejected before a target is chosen (for example, HTTP 460, HTTP 400, some kinds of HTTP 503 and 500) are not reflected in this metric.
RequestCountPerTarget gauge LoadBalancer, Namespace, Region sum 300 Average number of requests received by each target in a target group. You must specify the target group using the TargetGroup dimension. This metric does not apply if the target is a Lambda function.
RuleEvaluations gauge LoadBalancer, Namespace, Region sum 300 Number of rules processed by the load balancer given a request rate averaged over an hour.
Scheme attribute LoadBalancer, Namespace, Region Nodes of an Internet-facing load balancer that have public IP addresses.
State attribute LoadBalancer, Namespace, Region State of the load balancer.
TargetConnectionErrorCount gauge LoadBalancer, Namespace, Region sum 300 Number of connections that were not successfully established between the load balancer and target. This metric does not apply if the target is a Lambda function.
TargetResponseTime gauge seconds LoadBalancer, Namespace, Region average 300 Time elapsed, in seconds, after the request leaves the load balancer until a response from the target is received. This is equivalent to the target_processing_time field in the access logs.
TargetTlsNegotiationErrorCount gauge LoadBalancer, Namespace, Region sum 300 Number of TLS connections initiated by the load balancer that did not establish a session with the target. Possible causes include a mismatch of ciphers or protocols. This metric does not apply if the target is a Lambda function.
UnHealthyHostCount gauge LoadBalancer, Namespace, Region max 300 Number of targets that are considered unhealthy.
UnhealthyRoutingRequestCount gauge LoadBalancer, Namespace, Region max 300 Number of requests that are routed using the routing failover action (fail open).
UnhealthyStateDNS gauge LoadBalancer, Namespace, Region minimum 300 Number of zones that do not meet the DNS healthy state requirements and therefore were marked unhealthy in DNS.
UnhealthyStateRouting gauge LoadBalancer, Namespace, Region minimum 300 Number of zones that do not meet the routing healthy state requirements, and therefore the load balancer distributes traffic to all targets in the zone, including the unhealthy targets.
VpcId attribute LoadBalancer, Namespace, Region ID of the VPC for the load balancer.
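
Most metrics in the table above use the sum statistic over a 300-second period, so a raw value is a count per period rather than a rate. The following is a minimal sketch, not part of the plugin, of how such values could be turned into a per-second rate and an error ratio. The function names and the sample values are hypothetical.

```python
PERIOD_SECONDS = 300  # collection period listed in the table above


def per_second_rate(period_sum: float, period_seconds: int = PERIOD_SECONDS) -> float:
    """Convert a 'sum' statistic for one period into an average per-second rate."""
    return period_sum / period_seconds


def error_ratio(error_count: float, request_count: float) -> float:
    """Return the fraction of requests that produced errors (0.0 with no traffic)."""
    if request_count <= 0:
        return 0.0
    return error_count / request_count


# Hypothetical sample values for a single 300-second period.
sample = {"RequestCount": 6000, "HTTPCode_Target_5XX_Count": 30}

rate = per_second_rate(sample["RequestCount"])
ratio = error_ratio(sample["HTTPCode_Target_5XX_Count"], sample["RequestCount"])
print(f"{rate:.1f} requests/s, {ratio:.2%} target 5XX")
```

Because RequestCount only counts requests for which a target was chosen, a ratio built from it describes target-side errors and does not cover requests the load balancer rejected earlier.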

AWS/Billing

The AWS/Billing service collects billing metrics, including a breakdown of estimated charges by service.

To get these metrics, you need to enable the AWS Billing collector configuration, AwsBillingCollector, in the Collection Agent YAML file. See Configure Geneos to deploy AWS CloudWatch plugin in AWS.

Note

The AWS/Billing service is only available through the AwsBillingCollector and not as a service under AwsCollector.
Metric name Metric type Unit name Dimension Statistic Period(s) Description
ActualSpend gauge USD BudgetName, Namespace, Region average 21600 Actual spending costs for your budget period.
BudgetLimit gauge USD BudgetName, Namespace, Region average 21600 Spending limit for your budget period.
BudgetType attribute BudgetName, Namespace, Region Specifies if the budget tracks costs, usage, RI utilization, RI coverage, Savings Plans utilization, or Savings Plans coverage.
EstimatedCharges gauge USD Currency, Namespace, Region, ServiceName average 21600 Estimated charges for your AWS usage.
ForecastedSpend gauge USD BudgetName, Namespace, Region average 21600 Forecasted spending costs for your budget period.
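
The budget metrics above can be combined into simple derived values. The following is a minimal sketch under stated assumptions: the function names and the sample figures are hypothetical, and the values are assumed to be in USD as listed in the table.

```python
def budget_used_percent(actual_spend: float, budget_limit: float) -> float:
    """Return ActualSpend as a percentage of BudgetLimit."""
    if budget_limit <= 0:
        raise ValueError("budget_limit must be positive")
    return actual_spend / budget_limit * 100


def forecast_exceeds_limit(forecasted_spend: float, budget_limit: float) -> bool:
    """Return True when ForecastedSpend is above BudgetLimit."""
    return forecasted_spend > budget_limit


# Hypothetical sample values for one budget.
sample = {"ActualSpend": 750.0, "BudgetLimit": 1000.0, "ForecastedSpend": 1200.0}

used = budget_used_percent(sample["ActualSpend"], sample["BudgetLimit"])
over = forecast_exceeds_limit(sample["ForecastedSpend"], sample["BudgetLimit"])
print(f"{used:.1f}% of budget used, forecast exceeds limit: {over}")
```

With a period of 21600 seconds, these values refresh roughly every six hours, so a check like this does not need to run more often than that.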

AWS/AutoScaling

The AWS/AutoScaling service collects metrics from Auto Scaling groups.

Metric name Metric type Dimension Statistic Period(s) Description
GroupAndWarmPoolDesiredCapacity gauge AutoScalingGroupName, Namespace, Region average 300 Desired capacity of the Auto Scaling group and the warm pool combined.
GroupAndWarmPoolTotalCapacity gauge AutoScalingGroupName, Namespace, Region average 300 Total capacity of the Auto Scaling group and the warm pool combined.
GroupDesiredCapacity gauge AutoScalingGroupName, Namespace, Region average 300 Number of instances that the Auto Scaling group attempts to maintain.
GroupInServiceCapacity gauge AutoScalingGroupName, Namespace, Region average 300 Number of capacity units that are running as part of the Auto Scaling group.
GroupInServiceInstances gauge AutoScalingGroupName, Namespace, Region average 300 Number of instances that are running as part of the Auto Scaling group.
GroupMaxSize gauge AutoScalingGroupName, Namespace, Region average 300 Maximum size of the Auto Scaling group.
GroupMinSize gauge AutoScalingGroupName, Namespace, Region average 300 Minimum size of the Auto Scaling group.
GroupPendingCapacity gauge AutoScalingGroupName, Namespace, Region average 300 Number of capacity units that are pending.
GroupPendingInstances gauge AutoScalingGroupName, Namespace, Region average 300 Number of instances that are pending.
GroupStandbyCapacity gauge AutoScalingGroupName, Namespace, Region average 300 Number of capacity units that are in a Standby state.
GroupStandbyInstances gauge AutoScalingGroupName, Namespace, Region average 300 Number of instances that are in a Standby state.
GroupTerminatingCapacity gauge AutoScalingGroupName, Namespace, Region average 300 Number of capacity units that are in the process of terminating.
GroupTerminatingInstances gauge AutoScalingGroupName, Namespace, Region average 300 Number of instances that are in the process of terminating.
GroupTotalCapacity gauge AutoScalingGroupName, Namespace, Region average 300 Total number of capacity units in the Auto Scaling group.
GroupTotalInstances gauge AutoScalingGroupName, Namespace, Region average 300 Total number of instances in the Auto Scaling group.
WarmPoolDesiredCapacity gauge AutoScalingGroupName, Namespace, Region average 300 Amount of capacity that Amazon EC2 Auto Scaling attempts to maintain in the warm pool.
WarmPoolMinSize gauge AutoScalingGroupName, Namespace, Region average 300 Minimum size of the warm pool.
WarmPoolPendingCapacity gauge AutoScalingGroupName, Namespace, Region average 300 Amount of capacity in the warm pool that is pending.
WarmPoolTerminatingCapacity gauge AutoScalingGroupName, Namespace, Region average 300 Amount of capacity in the warm pool that is in the process of terminating.
WarmPoolTotalCapacity gauge AutoScalingGroupName, Namespace, Region average 300 Total capacity of the warm pool, including instances that are running, stopped, pending, or terminating.
WarmPoolWarmedCapacity gauge AutoScalingGroupName, Namespace, Region average 300 Amount of capacity available to enter the Auto Scaling group during scale out.

AWS/CertificateManager

The AWS/CertificateManager service collects metrics from available certificates in ACM.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
DaysToExpiry gauge days CertificateArn, Namespace, Region minimum 86400 Number of remaining days until the certificate expires.
DomainName attribute CertificateArn, Namespace, Region Domain name defined in the certificate.
InUse attribute CertificateArn, Namespace, Region Indicates whether the certificate is in use by another AWS service. Possible values are Yes or No.
RenewalEligibility attribute CertificateArn, Namespace, Region Indicates whether the certificate is eligible for renewal.
Status attribute CertificateArn, Namespace, Region Certificate status (EXPIRED, FAILED, INACTIVE, ISSUED, PENDING_VALIDATION, REVOKED, VALIDATION_TIMED_OUT, or UNKNOWN_TO_SDK_VERSION).
Type attribute CertificateArn, Namespace, Region Certificate type (AMAZON_ISSUED, IMPORTED, PRIVATE, or UNKNOWN_TO_SDK_VERSION).

AWS/DynamoDB

The AWS/DynamoDB service collects metrics from DynamoDB tables.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
AccountMaxReads gauge TableName, Namespace, Region max 300 Maximum number of read capacity units that can be used by an account. This limit does not apply to on-demand tables or global secondary indexes.
AccountMaxTableLevelReads gauge TableName, Namespace, Region max 300 Maximum number of read capacity units that can be used by a table or global secondary index of an account. For on-demand tables this limit caps the maximum read request units a table or a global secondary index can use.
AccountMaxTableLevelWrites gauge TableName, Namespace, Region max 300 Maximum number of write capacity units that can be used by a table or global secondary index of an account. For on-demand tables this limit caps the maximum write request units a table or a global secondary index can use.
AccountMaxWrites gauge TableName, Namespace, Region max 300 Maximum number of write capacity units that can be used by an account. This limit does not apply to on-demand tables or global secondary indexes.
AccountProvisionedReadCapacityUnits gauge TableName, Namespace, Region max 300 Sum of read capacity units provisioned for all tables and global secondary indexes of an account.
AccountProvisionedReadCapacityUtilization gauge percent TableName, Namespace, Region average 300 Percentage of provisioned read capacity units utilized by an account.
AccountProvisionedWriteCapacityUnits gauge TableName, Namespace, Region max 300 Sum of write capacity units provisioned for all tables and global secondary indexes of an account.
AccountProvisionedWriteCapacityUtilization gauge percent TableName, Namespace, Region average 300 Percentage of provisioned write capacity units utilized by an account.
AgeOfOldestUnreplicatedRecord gauge milliseconds TableName, Namespace, Region max 300 Elapsed time since a record yet to be replicated to the Kinesis data stream first appeared in the DynamoDB table.
ConditionalCheckFailedRequests gauge TableName, Namespace, Region average 300 Number of failed attempts to perform conditional writes. The PutItem, UpdateItem, and DeleteItem operations let you provide a logical condition that must evaluate to true before the operation can proceed. If this condition evaluates to false, ConditionalCheckFailedRequests is incremented by one. ConditionalCheckFailedRequests is also incremented by one for PartiQL Update and Delete statements where a logical condition is provided and that condition evaluates to false.
ConsumedChangeDataCaptureUnits gauge TableName, Namespace, Region average 300 Number of consumed change data capture units.
ConsumedReadCapacityUnits gauge TableName, Namespace, Region sum 300 Number of read capacity units consumed over the specified time period, so you can track how much of your provisioned throughput is used. You can retrieve the total consumed read capacity for a table and all of its global secondary indexes, or for a particular global secondary index.
ConsumedWriteCapacityUnits gauge TableName, Namespace, Region sum 300 Number of write capacity units consumed over the specified time period, so you can track how much of your provisioned throughput is used. You can retrieve the total consumed write capacity for a table and all of its global secondary indexes, or for a particular global secondary index.
Encryption attribute TableName, Namespace, Region Server-side encryption type. The only supported value is KMS if encryption is defined.
FailedToReplicateRecordCount gauge TableName, Namespace, Region average 300 Number of records that DynamoDB failed to replicate to your Kinesis data stream.
Indexes attribute TableName, Namespace, Region Sum of the global secondary indexes and the local secondary indexes.
ItemCount attribute TableName, Namespace, Region Number of items in a table.
MaxProvisionedTableReadCapacityUtilization gauge percentage TableName, Namespace, Region max 300 Percentage of provisioned read capacity utilized by the highest provisioned read table or global secondary index of an account.
MaxProvisionedTableWriteCapacityUtilization gauge percentage TableName, Namespace, Region max 300 Percentage of provisioned write capacity utilized by the highest provisioned write table or global secondary index of an account.
OnlineIndexConsumedWriteCapacity gauge TableName, Namespace, Region average 300 Number of write capacity units consumed when adding a new global secondary index to a table. If the write capacity of the index is too low, the incoming write activity during the backfill phase might be throttled. This can increase the time it takes to create the index. You should monitor this statistic while the index is being built to determine whether the write capacity of the index is underprovisioned.
OnlineIndexPercentageProgress gauge TableName, Namespace, Region average 300 Percentage of completion when a new global secondary index is being added to a table. DynamoDB must first allocate resources for the new index, and then backfill attributes from the table into the index. For large tables, this process might take a long time. You should monitor this statistic to view the relative progress as DynamoDB builds the index.
OnlineIndexThrottleEvents gauge TableName, Namespace, Region average 300 Number of write throttle events that occur when adding a new global secondary index to a table. These events indicate that the index creation will take longer to complete, because incoming write activity is exceeding the provisioned write throughput of the index.
PartitionKey attribute TableName, Namespace, Region Value of the defined partition key.
PendingReplicationCount gauge TableName, Namespace, Region average 300 Metric is for DynamoDB global tables. The number of item updates that are written to one replica table, but that have not yet been written to another replica in the global table.
ProvisionedReadCapacityUnits gauge TableName, Namespace, Region average 300 Number of provisioned read capacity units for a table or a global secondary index.
ProvisionedWriteCapacityUnits gauge TableName, Namespace, Region average 300 Number of provisioned write capacity units for a table or a global secondary index.
ReadCapacityMode attribute TableName, Namespace, Region Read capacity can either be On-Demand or Provisioned depending on the read capacity settings.
ReadThrottleEvents gauge TableName, Namespace, Region sum 300 Requests to DynamoDB that exceed the provisioned read capacity units for a table or a global secondary index.
Replicas attribute TableName, Namespace, Region Number of times the given table has been replicated in other regions.
ReplicationLatency gauge milliseconds TableName, Namespace, Region average 300 Metric is for DynamoDB global tables. The elapsed time between an updated item appearing in the DynamoDB stream for one replica table, and that item appearing in another replica in the global table.
ReturnedBytes gauge bytes TableName, Namespace, Region average 300 Number of bytes returned by GetRecords operations (Amazon DynamoDB Streams) during the specified time period.
ReturnedItemCount gauge TableName, Namespace, Region average 300 Number of items returned by Query, Scan or ExecuteStatement (select) operations during the specified time period.
ReturnedRecordsCount gauge TableName, Namespace, Region average 300 Number of stream records returned by GetRecords operations (Amazon DynamoDB Streams) during the specified time period.
Size attribute bytes TableName, Namespace, Region Total size of the specified table, in bytes. DynamoDB updates this value approximately every six hours. Recent changes might not be reflected in this value.
SortKey attribute TableName, Namespace, Region Value of the sort key if sort key has been defined.
SuccessfulRequestLatency gauge milliseconds TableName, Namespace, Region average 300 Successful requests to DynamoDB or Amazon DynamoDB Streams during the specified time period. SuccessfulRequestLatency can provide two different kinds of information: the elapsed time for successful requests (Minimum, Maximum, Sum, or Average) or the number of successful requests (SampleCount). SuccessfulRequestLatency reflects activity only within DynamoDB or Amazon DynamoDB Streams, and does not take into account network latency or client-side activity.
SystemErrors gauge TableName, Namespace, Region sum 300 Requests to DynamoDB or Amazon DynamoDB Streams that generate an HTTP 500 status code during the specified time period. An HTTP 500 usually indicates an internal service error.
TableClass attribute TableName, Namespace, Region The table class of the specified table. Valid values are STANDARD and STANDARD_INFREQUENT_ACCESS.
TableCount gauge TableName, Namespace, Region max 300 Number of active tables of an account.
TableStatus attribute TableName, Namespace, Region Current state of the table.
ThrottledPutRecordCount gauge TableName, Namespace, Region average 300 Number of records that were throttled by the Kinesis data stream due to insufficient Kinesis Data Streams capacity.
ThrottledRequests gauge TableName, Namespace, Region sum 300 Requests to DynamoDB that exceed the provisioned throughput limits on a resource (such as a table or an index). ThrottledRequests is incremented by one if any event within a request exceeds a provisioned throughput limit. For example, if you update an item in a table with global secondary indexes, there are multiple events: a write to the table, and a write to each index. If one or more of these events are throttled, then ThrottledRequests is incremented by one.
TimeToLiveDeletedItemCount gauge TableName, Namespace, Region average 300 Number of items deleted by Time to Live (TTL) during the specified time period. This metric helps you monitor the rate of TTL deletions on the table.
TransactionConflict gauge TableName, Namespace, Region sum 300 Rejected item-level requests due to transactional conflicts between concurrent requests on the same items.
UserErrors gauge TableName, Namespace, Region sum 300 Requests to DynamoDB or Amazon DynamoDB Streams that generate an HTTP 400 status code during the specified time period. An HTTP 400 usually indicates a client-side error, such as an invalid combination of parameters, an attempt to update a non-existent table, or an incorrect request signature.
WriteCapacityMode attribute TableName, Namespace, Region Write capacity can either be On-Demand or Provisioned depending on the write capacity settings.
WriteThrottleEvents gauge TableName, Namespace, Region sum 300 Requests to DynamoDB that exceed the provisioned write capacity units for a table or a global secondary index.
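
The consumed capacity metrics above are period sums, while provisioned capacity is expressed in units per second. The following is a minimal sketch, assuming a 300-second period as listed in the table, of how consumed and provisioned values could be compared. The function names and the sample values are hypothetical and are not part of the plugin.

```python
PERIOD_SECONDS = 300  # collection period listed in the table above


def consumed_per_second(consumed_sum: float, period_seconds: int = PERIOD_SECONDS) -> float:
    """Average capacity units consumed per second during one period."""
    return consumed_sum / period_seconds


def utilization_percent(
    consumed_sum: float,
    provisioned_units: float,
    period_seconds: int = PERIOD_SECONDS,
) -> float:
    """Consumed capacity as a percentage of provisioned capacity."""
    if provisioned_units <= 0:
        # On-demand tables have no provisioned capacity to compare against.
        raise ValueError("provisioned_units must be positive")
    return consumed_per_second(consumed_sum, period_seconds) / provisioned_units * 100


# Hypothetical sample values for one table and one 300-second period.
sample = {"ConsumedReadCapacityUnits": 1500, "ProvisionedReadCapacityUnits": 10}

percent = utilization_percent(
    sample["ConsumedReadCapacityUnits"],
    sample["ProvisionedReadCapacityUnits"],
)
print(f"Read capacity utilization: {percent:.1f}%")
```

The same calculation applies to ConsumedWriteCapacityUnits and ProvisionedWriteCapacityUnits. Sustained values near 100% would typically coincide with non-zero ReadThrottleEvents or WriteThrottleEvents.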

AWS/EBS

The AWS/EBS service collects metrics from non-deleted and non-error EBS volumes.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
AttachedInstances attribute VolumeId, Namespace, Region List of instance IDs attached to this volume.
BurstBalance gauge percent VolumeId, Namespace, Region average 300 Provides information about the percentage of I/O credits (for gp2) or throughput credits (for st1 and sc1) remaining in the burst bucket.
CreateTime attribute VolumeId, Namespace, Region Time when the volume creation was initiated.
FastSnapshotRestoreCreditsBalance gauge VolumeId, Namespace, Region average 300 Number of volume create credits available. This metric is reported per snapshot per Availability Zone.
FastSnapshotRestoreCreditsBucketSize gauge VolumeId, Namespace, Region average 300 Maximum number of volume create credits that can be accumulated. This metric is reported per snapshot per Availability Zone.
Size attribute gibibyte VolumeId, Namespace, Region average 300 Size of the volume in GiBs.
VolumeConsumedReadWriteOps gauge VolumeId, Namespace, Region sum 300 Used with Provisioned IOPS SSD volumes only. The total amount of read and write operations (normalized to 256K capacity units) consumed in a specified period of time.
VolumeIdleTime gauge seconds VolumeId, Namespace, Region average 300 Total number of seconds in a specified period of time when no read or write operations were submitted.
VolumeQueueLength gauge VolumeId, Namespace, Region average 300 Number of read and write operation requests waiting to be completed in a specified period of time.
VolumeReadBytes gauge bytes VolumeId, Namespace, Region average 300 Provides information on the read operations in a specified period of time.
VolumeReadOps gauge VolumeId, Namespace, Region average 300 Total number of read operations in a specified period of time. Note that read operations are counted on completion.
VolumeThroughputPercentage gauge percent VolumeId, Namespace, Region average 300 Used with Provisioned IOPS SSD volumes only. The percentage of I/O operations per second (IOPS) delivered of the total IOPS provisioned for an Amazon EBS volume.
VolumeTotalReadTime gauge seconds VolumeId, Namespace, Region average 300 Total number of seconds spent by all read operations that completed in a specified period of time.
VolumeTotalWriteTime gauge seconds VolumeId, Namespace, Region average 300 Total number of seconds spent by all write operations that completed in a specified period of time.
VolumeType attribute VolumeId, Namespace, Region Type of the volume.
VolumeWriteBytes gauge bytes VolumeId, Namespace, Region average 300 Provides information on the write operations in a specified period of time.
VolumeWriteOps gauge VolumeId, Namespace, Region average 300 Total number of write operations in a specified period of time. Note that write operations are counted on completion.

AWS/EC2

The AWS/EC2 service collects metrics from non-stopped and non-terminated EC2 instances.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
Architecture attribute InstanceId, Namespace, Region Architecture of the image.
CPUCreditBalance gauge minutes InstanceId, Namespace, Region average 300 Number of earned CPU credits that an instance has accrued since it was launched or started. For T2 Standard, the CPUCreditBalance also includes the number of launch credits that have been accrued.
CPUCreditUsage gauge minutes InstanceId, Namespace, Region average 300 Number of CPU credits spent by the instance for CPU utilisation. One CPU credit equals one vCPU running at 100% utilisation for one minute, or an equivalent combination of vCPUs, utilisation, and time (for example, one vCPU running at 50% utilisation for two minutes or two vCPUs running at 25% utilisation for two minutes).
CPUSurplusCreditBalance gauge minutes InstanceId, Namespace, Region average 300 Number of surplus credits that have been spent by an unlimited instance when its CPUCreditBalance value is zero.
CPUSurplusCreditsCharged gauge minutes InstanceId, Namespace, Region average 300 Number of spent surplus credits that are not paid down by earned CPU credits, and which thus incur an additional charge.
CPUUtilization gauge percent InstanceId, Namespace, Region average 300 Percentage of the allocated EC2 compute units that are currently in use on the instance. This metric identifies the processing power required to run an application on a selected instance.
CidrIpv4 attribute SecurityGroupRuleId, InstanceId, Namespace, Region Range of IPv4 CIDR.
CidrIpv6 attribute SecurityGroupRuleId, InstanceId, Namespace, Region Range of IPv6 CIDR.
DedicatedHostCPUUtilization gauge percent InstanceId, Namespace, Region average 300 Percentage of allocated compute capacity that is currently in use by the instances running on the dedicated host.
DiskReadBytes gauge bytes InstanceId, Namespace, Region average 300 Bytes read from all instance store volumes available to the instance.
DiskReadOps gauge InstanceId, Namespace, Region average 300 Completed read operations from all instance store volumes available to the instance in a specified period of time.
DiskWriteBytes gauge bytes InstanceId, Namespace, Region average 300 Bytes written to all instance store volumes available to the instance.
DiskWriteOps gauge InstanceId, Namespace, Region average 300 Completed write operations to all instance store volumes available to the instance in a specified period of time.
EBSByteBalance% gauge percent InstanceId, Namespace, Region average 300 Provides information about the percentage of throughput credits remaining in the burst bucket. This metric is only available for basic monitoring.
EBSIOBalance% gauge percent InstanceId, Namespace, Region average 300 Provides information about the percentage of I/O credits remaining in the burst bucket. This metric is available for basic monitoring only.
EBSReadBytes gauge bytes InstanceId, Namespace, Region average 300 Bytes read from all EBS volumes attached to the instance in a specified period of time.
EBSReadOps gauge InstanceId, Namespace, Region average 300 Completed read operations from all Amazon EBS volumes attached to the instance in a specified period of time.
EBSWriteBytes gauge bytes InstanceId, Namespace, Region average 300 Bytes written to all EBS volumes attached to the instance in a specified period of time.
EBSWriteOps gauge InstanceId, Namespace, Region average 300 Completed write operations to all EBS volumes attached to the instance in a specified period of time.
GroupId attribute SecurityGroupRuleId, InstanceId, Namespace, Region ID of the security group.
GroupOwnerId attribute SecurityGroupRuleId, InstanceId, Namespace, Region ID of the Amazon Web Services account that owns the security group.
Groups attribute NetworkInterfaceId, InstanceId, Namespace, Region Security groups.
InstanceType attribute InstanceId, Namespace, Region Instance type.
InterfaceType attribute NetworkInterfaceId, InstanceId, Namespace, Region Type of the network interface.
IpProtocol attribute SecurityGroupRuleId, InstanceId, Namespace, Region IP protocol name (TCP, UDP, ICMP, ICMPv6, all) or number.
IpV4Prefixes attribute NetworkInterfaceId, InstanceId, Namespace, Region Delegated IPv4 prefixes assigned to the network interface.
IpV6Prefixes attribute NetworkInterfaceId, InstanceId, Namespace, Region Delegated IPv6 prefixes assigned to the network interface.
Ipv6Addresses attribute NetworkInterfaceId, InstanceId, Namespace, Region IPv6 addresses associated with the network interface.
IsEgress attribute SecurityGroupRuleId, InstanceId, Namespace, Region Indicates whether the security group rule is an outbound rule.
LaunchTime attribute InstanceId, Namespace, Region Time when the instance was launched.
MacAddress attribute NetworkInterfaceId, InstanceId, Namespace, Region MAC address.
MetadataNoToken gauge InstanceId, Namespace, Region average 300 Number of times the instance metadata service was successfully accessed using a method that does not use a token.
NetworkIn gauge bytes InstanceId, Namespace, Region average 300

Number of bytes received by the instance on all network interfaces.

This metric identifies the volume of incoming network traffic to a single instance.

NetworkInterfaceDescription attribute NetworkInterfaceId, InstanceId, Namespace, Region Description of the network interface.
NetworkOut gauge bytes InstanceId, Namespace, Region average 300

Number of bytes sent out by the instance on all network interfaces.

This metric identifies the volume of outgoing network traffic from a single instance.

NetworkPacketsIn gauge InstanceId, Namespace, Region average 300

Number of packets received by the instance on all network interfaces.

This metric identifies the volume of incoming traffic in terms of the number of packets on a single instance.

NetworkPacketsOut gauge InstanceId, Namespace, Region average 300

Number of packets sent out by the instance on all network interfaces.

This metric identifies the volume of outgoing traffic in terms of the number of packets on a single instance.

OwnerId attribute NetworkInterfaceId, InstanceId, Namespace, Region ID of the AWS account that created the network interface.
PortRange attribute SecurityGroupRuleId, InstanceId, Namespace, Region Start and end of port range for the TCP and UDP protocols, or an ICMP/ICMPv6 type and code.
PrefixListId attribute SecurityGroupRuleId, InstanceId, Namespace, Region ID of the prefix list.
PrivateDnsName attribute NetworkInterfaceId, InstanceId, Namespace, Region Private DNS name.
PrivateIp attribute InstanceId, Namespace, Region Private IPv4 address assigned to the instance.
PrivateIpAddress attribute NetworkInterfaceId, InstanceId, Namespace, Region IPv4 address of the network interface within the subnet.
PrivateIpAddresses attribute NetworkInterfaceId, InstanceId, Namespace, Region Private IPv4 addresses associated with the network interface.
ResourceCount gauge InstanceId, Namespace, Region max 300 Number of the specified resources running in your account. The resources are defined by the dimensions associated with the metric.
SecurityGroupRuleDescription attribute SecurityGroupRuleId, InstanceId, Namespace, Region Description of the security group rule.
State attribute InstanceId, Namespace, Region average Current state of the instance.
Status status metric NetworkInterfaceId, InstanceId, Namespace, Region Status of the network interface.
StatusCheckFailed attribute InstanceId, Namespace, Region average 300 Status checks for instances and systems.
StatusCheckFailedInstance attribute InstanceId, Namespace, Region average 300 Instance status checks monitor the software and network configuration of your individual instance.
StatusCheckFailedSystem attribute InstanceId, Namespace, Region average 300 System status checks monitor the AWS systems on which your instance runs.
SubnetId attribute NetworkInterfaceId, InstanceId, Namespace, Region ID of the subnet.
VpcId attribute NetworkInterfaceId, InstanceId, Namespace, Region ID of the VPC.
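
The credit arithmetic in the CPUCreditUsage description can be checked with a short calculation. The following is a minimal sketch, not part of the plugin: the function name `cpu_credits_used` is hypothetical, and it assumes only the rule stated in the table (one credit equals one vCPU at 100% for one minute).

```python
def cpu_credits_used(vcpus: int, utilization_percent: float, minutes: float) -> float:
    """Return CPU credits spent, where one credit is one vCPU at 100% for one minute."""
    return vcpus * (utilization_percent / 100.0) * minutes


# The three combinations named in the CPUCreditUsage description.
examples = [
    (1, 100.0, 1.0),  # one vCPU at 100% for one minute
    (1, 50.0, 2.0),   # one vCPU at 50% for two minutes
    (2, 25.0, 2.0),   # two vCPUs at 25% for two minutes
]

for vcpus, percent, minutes in examples:
    credits = cpu_credits_used(vcpus, percent, minutes)
    print(f"{vcpus} vCPU(s) at {percent}% for {minutes} min: {credits} credit(s)")
```

Each of the three combinations works out to exactly one credit, which matches the equivalence the description gives.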

AWS/ECS

The AWS/ECS service collects metrics from non-failed and non-inactive ECS clusters.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
ActiveConnectionCount gauge ClusterName, Namespace, Region sum 300 Total number of concurrent active connections from clients to Amazon ECS Service Connect proxies running in tasks that share the selected DiscoveryName.
CpuReservation gauge percent ClusterName, Namespace, Region average 300 Percentage of CPU units that are reserved by running tasks in the cluster.
CpuUtilization gauge percent ClusterName, Namespace, Region average 300 Percentage of CPU units that are used in the cluster.
CpuUtilization gauge percent ClusterName, ServiceName, Namespace, Region average 300 Percentage of CPU units that are used in the service.
GpuReservation gauge percent ClusterName, Namespace, Region average 300 Percentage of total available GPUs that are reserved by running tasks in the cluster.
GrpcRequestCount gauge ClusterName, Namespace, Region sum 300 Number of gRPC inbound traffic requests processed by the Service Connect proxies.
HTTPCode_Target_2XX_Count gauge ClusterName, Namespace, Region sum 300 Number of HTTP response codes with numbers 200 to 299 generated by the applications in these tasks.
HTTPCode_Target_3XX_Count gauge ClusterName, Namespace, Region sum 300 Number of HTTP response codes with numbers 300 to 399 generated by the applications in these tasks.
HTTPCode_Target_4XX_Count gauge ClusterName, Namespace, Region sum 300 Number of HTTP response codes with numbers 400 to 499 generated by the applications in these tasks.
HTTPCode_Target_5XX_Count gauge ClusterName, Namespace, Region sum 300 Number of HTTP response codes with numbers 500 to 599 generated by the applications in these tasks.
MemoryReservation gauge percent ClusterName, Namespace, Region average 300 Percentage of memory that is reserved by running tasks in the cluster.
MemoryUtilization gauge percent ClusterName, Namespace, Region average 300 Percentage of memory that is used in the cluster.
MemoryUtilization gauge percent ClusterName, ServiceName, Namespace, Region average 300 Percentage of memory that is used in the service.
NewConnectionCount gauge ClusterName, Namespace, Region sum 300 Total number of new connections established from clients to Amazon ECS Service Connect proxies running in tasks that share the selected DiscoveryName.
ProcessedBytes gauge bytes ClusterName, Namespace, Region average 300 Total number of bytes of inbound traffic processed by the Service Connect proxies.
RequestCount gauge ClusterName, Namespace, Region sum 300 Number of inbound traffic requests processed by the Service Connect proxies.
RequestCountPerTarget gauge ClusterName, Namespace, Region sum 300 Average number of requests received by each target that shares the selected DiscoveryName.
Status attribute ClusterName, Namespace, Region average Status of the cluster.
TargetProcessedBytes gauge bytes ClusterName, Namespace, Region average 300 Total number of bytes processed by the Service Connect proxies.
TargetResponseTime gauge milliseconds ClusterName, Namespace, Region average 300 Latency of the application request processing.
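
Some metrics in the table above, such as MemoryUtilization, appear twice: once per cluster and once per service. The sketch below shows how the Dimension, Statistic, and Period(s) columns could translate into parameters for the CloudWatch GetMetricStatistics API. It is an illustration only and does not describe how the plugin queries CloudWatch internally. The helper `build_metric_request`, the `STATISTICS` mapping, and the cluster and service names are hypothetical. It also assumes that the Namespace and Region columns correspond to the CloudWatch namespace and the client region rather than to CloudWatch dimensions, and that the metric name matches the CloudWatch name, which may not hold for every row.

```python
from datetime import datetime, timedelta, timezone

# Assumed mapping from the table's statistic names to CloudWatch statistic names.
STATISTICS = {"average": "Average", "sum": "Sum", "max": "Maximum"}


def build_metric_request(namespace, metric_name, dimensions, statistic, period, end_time):
    """Build keyword arguments for a GetMetricStatistics call covering one period."""
    return {
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": [{"Name": name, "Value": value} for name, value in dimensions.items()],
        "StartTime": end_time - timedelta(seconds=period),
        "EndTime": end_time,
        "Period": period,
        "Statistics": [STATISTICS[statistic]],
    }


end = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

# Cluster-level row: ClusterName only.
cluster_request = build_metric_request(
    "AWS/ECS", "MemoryUtilization", {"ClusterName": "demo-cluster"}, "average", 300, end
)

# Service-level row: ClusterName and ServiceName.
service_request = build_metric_request(
    "AWS/ECS",
    "MemoryUtilization",
    {"ClusterName": "demo-cluster", "ServiceName": "demo-service"},
    "average",
    300,
    end,
)

print(cluster_request)
print(service_request)
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("cloudwatch", region_name=...).get_metric_statistics(**service_request)`. That call is left out of the sketch so that it runs without AWS access.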

AWS/ECS/ManagedScaling

Metric name Metric type Unit name Dimension Statistic Period(s) Description
CapacityProviderReservation gauge percent CapacityProviderName, ClusterName, Namespace, Region average 300 Percent of cluster container instances used for a specific capacity provider.

AWS/EFS

The AWS/EFS service collects metrics from non-deleted and non-error elastic file systems.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
BurstCreditBalance gauge bytes FileSystemId, Namespace, Region average 300 Number of burst credits that a file system has.
ClientConnections gauge FileSystemId, Namespace, Region average 300 Number of client connections to a file system.
DataReadIOBytes gauge bytes FileSystemId, Namespace, Region average 300 Number of bytes for each file system read operation.
DataWriteIOBytes gauge bytes FileSystemId, Namespace, Region average 300 Number of bytes for each file write operation.
LifeCycleState attribute FileSystemId, Namespace, Region average 300 Lifecycle phase of the file system.
MetadataIOBytes gauge bytes FileSystemId, Namespace, Region average 300 Number of bytes for each metadata operation.
MeteredIOBytes gauge bytes FileSystemId, Namespace, Region average 300 Number of metered bytes for each file system operation, including data read, data write, and metadata operations, with read operations metered at one-third the rate of other operations.
PercentIOLimit gauge percent FileSystemId, Namespace, Region average 300 Shows how close a file system is to reaching the I/O limit of the General Purpose performance mode.
PermittedThroughput gauge bytes_per_second FileSystemId, Namespace, Region average 300 Maximum amount of throughput that a file system can drive.
StorageBytes gauge bytes FileSystemId, StorageClass, Namespace, Region sum 300 Size of the file system in bytes, including the amount of data stored in the EFS Standard and EFS Standard–Infrequent Access (EFS Standard-IA) storage classes.
TimeSinceLastSync gauge seconds FileSystemId, Namespace, Region max 300 Shows the amount of time that has passed since the last successful sync to the destination file system in a replication configuration.
TotalIOBytes gauge bytes FileSystemId, Namespace, Region average 300 Number of bytes for each file system operation, including data read, data write, and metadata operations.
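
The difference between MeteredIOBytes and TotalIOBytes can be illustrated with a small calculation. This is a rough sketch under a stated assumption: it treats only data reads as the "read operations" that are metered at one-third, and counts data writes and metadata operations in full. The function names and the sample byte counts are hypothetical.

```python
def total_io_bytes(data_read: float, data_write: float, metadata: float) -> float:
    """Sum all I/O bytes without any discount (TotalIOBytes-style)."""
    return data_read + data_write + metadata


def metered_io_bytes(data_read: float, data_write: float, metadata: float) -> float:
    """Sum I/O bytes with data reads counted at one-third (MeteredIOBytes-style)."""
    return data_read / 3 + data_write + metadata


# Hypothetical byte counts for one period.
data_read, data_write, metadata = 3000, 1000, 500

print("total:", total_io_bytes(data_read, data_write, metadata))
print("metered:", metered_io_bytes(data_read, data_write, metadata))
```

For these sample values the total is 4500 bytes, while the metered figure is 2500 bytes because the 3000 bytes of reads count as 1000.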

AWS/EKS

The AWS/EKS service collects metrics from non-failed EKS clusters.

Metric name Metric type Dimension Description
Status attribute ClusterName, Namespace, Region Current status of the cluster.
Status attribute ClusterName, NodeGroupName, Namespace, Region Current status of the managed node group.

AWS/ElastiCache

The AWS/ElastiCache service collects metrics from non-deleted Amazon ElastiCache clusters.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
ActiveDefragHits gauge CacheClusterId, Namespace, Region average 300 Number of value reallocations per minute performed by the active defragmentation process.
AuthenticationFailures gauge CacheClusterId, Namespace, Region average 300 Total number of failed attempts to authenticate to Redis using the AUTH command.
BytesReadFromDisk gauge bytes CacheClusterId, Namespace, Region average 300 Total number of bytes read from disk per minute.
BytesReadIntoMemcached gauge bytes CacheClusterId, Namespace, Region average 300 Number of bytes that have been read from the network by the cache node.
BytesUsedForCache gauge bytes CacheClusterId, Namespace, Region average 300 Total number of bytes allocated by Redis for all purposes, including the dataset, buffers, and so on.
BytesUsedForCacheItems gauge bytes CacheClusterId, Namespace, Region average 300 Number of bytes used to store cache items.
BytesUsedForHash gauge CacheClusterId, Namespace, Region average 300 Number of bytes currently used by hash tables.
BytesWrittenOutFromMemcached gauge bytes CacheClusterId, Namespace, Region average 300 Number of bytes that have been written to the network by the cache node.
BytesWrittenToDisk gauge bytes CacheClusterId, Namespace, Region average 300 Total number of bytes written to disk per minute.
CacheClusterCreateTime attribute CacheClusterId, Namespace, Region Date and time when the cluster was created.
CacheHitRate gauge percent CacheClusterId, Namespace, Region average 300 Indicates the usage efficiency of the Redis instance.
CacheHits gauge CacheClusterId, Namespace, Region average 300 Number of successful read-only key lookups in the main dictionary.
CacheMisses gauge CacheClusterId, Namespace, Region average 300 Number of unsuccessful read-only key lookups in the main dictionary.
CacheNodeType attribute CacheClusterId, Namespace, Region Name of the compute and memory capacity node type for the cluster.
CacheParameterGroupName attribute CacheClusterId, Namespace, Region Name of the cache parameter group.
CacheSubnetGroupName attribute CacheClusterId, Namespace, Region Name of the cache subnet group associated with the cluster.
CasBadval gauge CacheClusterId, Namespace, Region average 300 Number of CAS (check and set) requests the cache has received where the CAS value did not match the CAS value stored.
CasHits gauge CacheClusterId, Namespace, Region average 300 Number of CAS requests the cache has received where the requested key was found and the CAS value matched.
CasMisses gauge CacheClusterId, Namespace, Region average 300 Number of CAS requests the cache has received where the key requested was not found.
ClusterBasedCmds gauge CacheClusterId, Namespace, Region average 300 Total number of commands that are cluster-based.
ClusterBasedCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of cluster-based commands.
CmdConfigGet gauge CacheClusterId, Namespace, Region average 300 Cumulative number of config get requests.
CmdConfigSet gauge CacheClusterId, Namespace, Region average 300 Cumulative number of config set requests.
CmdFlush gauge CacheClusterId, Namespace, Region average 300 Number of flush commands the cache has received.
CmdGet gauge CacheClusterId, Namespace, Region average 300 Number of get commands the cache has received.
CmdSet gauge CacheClusterId, Namespace, Region average 300 Number of set commands the cache has received.
CmdTouch gauge CacheClusterId, Namespace, Region average 300 Cumulative number of touch requests.
CommandAuthorizationFailures gauge CacheClusterId, Namespace, Region average 300 Total number of failed attempts by users to run commands they do not have permission to call.
CPUCreditBalance gauge minutes CacheClusterId, Namespace, Region average 300 Number of earned CPU credits that an instance has accrued since it was launched or started.
CPUCreditUsage gauge minutes CacheClusterId, Namespace, Region average 300 Number of CPU credits spent by the instance for CPU utilization.
CPUUtilization gauge percent CacheClusterId, Namespace, Region average 300 Percentage of CPU utilization for the entire host.
CurrConfig gauge CacheClusterId, Namespace, Region average 300 Current number of configurations stored.
CurrConnections gauge CacheClusterId, Namespace, Region average 300

For Redis, this is the number of client connections, excluding connections from read replicas.

For Memcached, this is a count of the number of connections connected to the cache at an instant in time.

CurrItems gauge CacheClusterId, Namespace, Region average 300 For Redis and Memcached, this is the number of items in the cache.
CurrVolatileItems gauge CacheClusterId, Namespace, Region average 300 Total number of keys in all databases that have a TTL set.
DatabaseCapacityUsageCountedForEvictPercentage gauge percent CacheClusterId, Namespace, Region average 300 Percentage of the total data capacity for the cluster that is in use, excluding the memory used for overhead and COB.
DatabaseCapacityUsagePercentage gauge percent CacheClusterId, Namespace, Region average 300 Percentage of the total data capacity for the cluster that is in use.
DatabaseMemoryUsageCountedForEvictPercentage gauge percent CacheClusterId, Namespace, Region average 300 Percentage of the total data capacity for the cluster that is in use, excluding the memory used for overhead and COB.
DatabaseMemoryUsagePercentage gauge percent CacheClusterId, Namespace, Region average 300 Percentage of the memory available for the cluster that is in use.
DB0AverageTTL gauge milliseconds CacheClusterId, Namespace, Region average 300 Exposes avg_ttl of DB0 from the keyspace statistic of the Redis INFO command.
DecrHits gauge CacheClusterId, Namespace, Region average 300 Number of decrement requests the cache has received where the requested key was found.
DecrMisses gauge CacheClusterId, Namespace, Region average 300 Number of decrement requests the cache has received where the requested key was not found.
DeleteHits gauge CacheClusterId, Namespace, Region average 300 Number of delete requests the cache has received where the requested key was found.
DeleteMisses gauge CacheClusterId, Namespace, Region average 300 Number of delete requests the cache has received where the requested key was not found.
Endpoint attribute CacheClusterId, Namespace, Region Represents a Memcached cluster endpoint which can be used by an application to connect to any node in the cluster.
Engine attribute CacheClusterId, Namespace, Region Name of the cache engine (Memcached or Redis) to be used for this cluster.
EngineCPUUtilization gauge percent CacheClusterId, Namespace, Region average 300 Provides CPU utilization of the Redis engine thread.
EngineVersion attribute CacheClusterId, Namespace, Region Version of the cache engine that is used in this cluster.
EvalBasedCmds gauge CacheClusterId, Namespace, Region average 300 Total number of eval-based commands.
EvalBasedCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of eval-based commands.
EvictedUnfetched gauge CacheClusterId, Namespace, Region average 300 Number of valid items evicted from the least recently used cache (LRU) which were never touched after being set.
Evictions gauge CacheClusterId, Namespace, Region average 300

For Redis, this is the number of keys that have been evicted due to the maxmemory limit.

For Memcached, this is the number of non-expired items the cache evicted to allow space for new writes.

ExpiredUnfetched gauge CacheClusterId, Namespace, Region average 300 Number of expired items reclaimed from the LRU which were never touched after being set.
FreeableMemory gauge bytes CacheClusterId, Namespace, Region average 300 Amount of free memory available on the host.
GeoSpatialBasedCmds gauge CacheClusterId, Namespace, Region average 300 Total number of geospatial-based commands.
GeoSpatialBasedCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of geospatial-based commands.
GetHits gauge CacheClusterId, Namespace, Region average 300 Number of get requests the cache has received where the key requested was found.
GetMisses gauge CacheClusterId, Namespace, Region average 300 Number of get requests the cache has received where the key requested was not found.
GetTypeCmds gauge CacheClusterId, Namespace, Region average 300 Total number of read-only type commands.
GetTypeCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of read commands.
GlobalDatastoreReplicationLag gauge seconds CacheClusterId, Namespace, Region average 300 Lag between the secondary region’s primary node and the primary region’s primary node.
HashBasedCmds gauge CacheClusterId, Namespace, Region average 300 Total number of commands that are hash-based.
HashBasedCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of hash-based commands.
HyperLogLogBasedCmds gauge CacheClusterId, Namespace, Region average 300 Total number of HyperLogLog-based commands.
HyperLogLogBasedCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of HyperLogLog-based commands.
IamAuthenticationExpirations gauge CacheClusterId, Namespace, Region sum 300 Total number of expired IAM-authenticated Redis connections.
IamAuthenticationThrottling gauge CacheClusterId, Namespace, Region sum 300 Total number of throttled IAM-authenticated Redis AUTH or HELLO requests.
IncrHits gauge CacheClusterId, Namespace, Region average 300 Number of increment requests the cache has received where the key requested was found.
IncrMisses gauge CacheClusterId, Namespace, Region average 300 Number of increment requests the cache has received where the key requested was not found.
IsMaster gauge CacheClusterId, Namespace, Region Indicates whether the node is the primary node of the current shard/cluster. The metric can be either 0 (not primary) or 1 (primary).
JsonBasedCmds gauge CacheClusterId, Namespace, Region sum 300 Total number of JSON commands, including both read and write commands.
JsonBasedCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of all JSON commands, including both read and write commands.
JsonBasedGetCmds gauge CacheClusterId, Namespace, Region sum 300 Total number of JSON read-only commands.
JsonBasedGetCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of JSON read-only commands.
JsonBasedSetCmds gauge CacheClusterId, Namespace, Region sum 300 Total number of JSON write commands.
JsonBasedSetCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of JSON write commands.
KeyAuthorizationFailures gauge CacheClusterId, Namespace, Region average 300 Total number of failed attempts by users to access keys they do not have permission to access.
KeyBasedCmds gauge CacheClusterId, Namespace, Region average 300 Total number of commands that are key-based.
KeyBasedCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of key-based commands.
KeysTracked gauge CacheClusterId, Namespace, Region average 300 Number of keys being tracked by Redis key tracking as a percentage of tracking-table-max-keys.
ListBasedCmds gauge CacheClusterId, Namespace, Region average 300 Total number of commands that are list-based.
ListBasedCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of list-based commands.
MasterLinkHealthStatus gauge CacheClusterId, Namespace, Region max 300 This status has two values: 0 or 1. The value 0 indicates that data in the ElastiCache primary node is not in sync with Redis on EC2. The value 1 indicates that the data is in sync.
MemoryFragmentationRatio gauge CacheClusterId, Namespace, Region average 300 Indicates the efficiency in the allocation of memory of the Redis engine.
NetworkBandwidthInAllowanceExceeded gauge CacheClusterId, Namespace, Region average 300 Number of packets shaped because the inbound aggregate bandwidth exceeded the maximum for the instance.
NetworkBandwidthOutAllowanceExceeded gauge CacheClusterId, Namespace, Region average 300 Number of packets shaped because the outbound aggregate bandwidth exceeded the maximum for the instance.
NetworkBytesIn gauge bytes CacheClusterId, Namespace, Region average 300 Number of bytes the host has read from the network.
NetworkBytesOut gauge bytes CacheClusterId, Namespace, Region average 300 Number of bytes sent out on all network interfaces by the instance.
NetworkConntrackAllowanceExceeded gauge CacheClusterId, Namespace, Region average 300 Number of packets shaped because connection tracking exceeded the maximum for the instance and new connections could not be established.
NetworkLinkLocalAllowanceExceeded gauge CacheClusterId, Namespace, Region average 300 Number of packets shaped because the PPS of the traffic to local proxy services exceeded the maximum for the network interface.
NetworkMaxBytesIn gauge bytes CacheClusterId, Namespace, Region max 300 Maximum burst of received bytes within each minute.
NetworkMaxBytesOut gauge bytes CacheClusterId, Namespace, Region max 300 Maximum burst of transmitted bytes within each minute.
NetworkMaxPacketsIn gauge CacheClusterId, Namespace, Region max 300 Maximum burst of received packets within each minute.
NetworkMaxPacketsOut gauge CacheClusterId, Namespace, Region max 300 Maximum burst of transmitted packets within each minute.
NetworkPacketsIn gauge CacheClusterId, Namespace, Region average 300 Number of packets received on all network interfaces by the instance.
NetworkPacketsOut gauge CacheClusterId, Namespace, Region average 300 Number of packets sent out on all network interfaces by the instance.
NetworkPacketsPerSecondAllowanceExceeded gauge CacheClusterId, Namespace, Region average 300 Number of packets shaped because the bidirectional packets per second exceeded the maximum for the instance.
NewConnections gauge CacheClusterId, Namespace, Region average 300

For Redis, this is the total number of connections that have been accepted by the server during this period.

For Memcached, this is the number of new connections the cache has received.

NewItems gauge CacheClusterId, Namespace, Region average 300 Number of new items the cache has stored.
NonKeyTypeCmds gauge CacheClusterId, Namespace, Region sum 300 Total number of commands that are not key-based.
NonKeyTypeCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of non-key-based commands.
NumCacheNodes attribute CacheClusterId, Namespace, Region Number of cache nodes in the cluster.
NumItemsReadFromDisk gauge CacheClusterId, Namespace, Region average 300 Total number of items retrieved from disk per minute.
NumItemsWrittenToDisk gauge CacheClusterId, Namespace, Region average 300 Total number of items written to disk per minute.
PreferredAvailabilityZone attribute CacheClusterId, Namespace, Region Name of the Availability Zone in which the cluster is located or “Multiple” if the cache nodes are located in different Availability Zones.
PreferredMaintenanceWindow attribute CacheClusterId, Namespace, Region Specifies the weekly time range during which maintenance on the cluster is performed.
PubSubBasedCmds gauge CacheClusterId, Namespace, Region average 300 Total number of commands for pub/sub functionality.
PubSubBasedCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of pub/sub-based commands.
Reclaimed gauge CacheClusterId, Namespace, Region average 300

For Redis, this is the total number of key expiration events.

For Memcached, this is the number of expired items the cache evicted to allow space for new writes.

ReplicationBytes gauge bytes CacheClusterId, Namespace, Region average 300 For nodes in a replicated configuration, ReplicationBytes reports the number of bytes that the primary is sending to all of its replicas.
ReplicationGroupId attribute CacheClusterId, Namespace, Region Replication group to which this cluster belongs.
ReplicationLag gauge seconds CacheClusterId, Namespace, Region average 300

This metric is only applicable for a node running as a read replica.

It represents how far behind, in seconds, the replica is in applying changes from the primary node.

SaveInProgress gauge CacheClusterId, Namespace, Region max 300 This binary metric returns 1 whenever a background save (forked or forkless) is in progress, and 0 otherwise.
SetBasedCmds gauge CacheClusterId, Namespace, Region average 300 Total number of commands that are set-based.
SetBasedCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of set-based commands.
SetTypeCmds gauge CacheClusterId, Namespace, Region average 300 Total number of write-type commands.
SetTypeCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of write commands.
SlabsMoved gauge CacheClusterId, Namespace, Region average 300 Total number of slab pages that have been moved.
SnapshotRetentionLimit attribute CacheClusterId, Namespace, Region Number of days for which ElastiCache retains automatic cluster snapshots before deleting them.
SnapshotWindow attribute CacheClusterId, Namespace, Region Daily time range (in UTC) during which ElastiCache begins taking a daily snapshot of your cluster.
SortedSetBasedCmds gauge CacheClusterId, Namespace, Region average 300 Total number of commands that are sorted set-based.
SortedSetBasedCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of sorted set-based commands.
State attribute CacheClusterId, Namespace, Region

Current state of this cluster.

Possible values: available, creating, deleted, deleting, incompatible-network, modifying, rebooting cluster nodes, restore-failed, or snapshotting.

StreamBasedCmds gauge CacheClusterId, Namespace, Region average 300 Total number of commands that are stream-based.
StreamBasedCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of stream-based commands.
StringBasedCmds gauge CacheClusterId, Namespace, Region average 300 Total number of commands that are string-based.
StringBasedCmdsLatency gauge microseconds CacheClusterId, Namespace, Region average 300 Latency of string-based commands.
SwapUsage gauge bytes CacheClusterId, Namespace, Region average 300 Amount of swap used on the host.
TouchHits gauge CacheClusterId, Namespace, Region average 300 Number of keys that have been touched and were given a new expiration time.
TouchMisses gauge CacheClusterId, Namespace, Region average 300 Number of items that have been touched but were not found.
TrafficManagementActive gauge CacheClusterId, Namespace, Region max 300 Indicates whether ElastiCache for Redis is actively managing traffic by adjusting traffic allocated to incoming commands, monitoring, or replication.
UnusedMemory gauge bytes CacheClusterId, Namespace, Region average 300 Amount of memory not used by data.
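
The CacheHitRate row reports a percentage, while CacheHits and CacheMisses report counts. The sketch below relates the three, assuming the common definition of hit rate as hits divided by the sum of hits and misses. The table itself does not state the formula, so treat it as an assumption. The function name and sample counts are hypothetical.

```python
def cache_hit_rate_percent(hits: int, misses: int) -> float:
    """Return the hit rate as a percentage, or 0.0 when there were no lookups."""
    total = hits + misses
    if total == 0:
        # No lookups in the period, so a ratio is undefined; report 0.0 by convention.
        return 0.0
    return 100.0 * hits / total


# Hypothetical counts for one period.
print(cache_hit_rate_percent(900, 100))
print(cache_hit_rate_percent(0, 0))
```

With 900 hits and 100 misses the sketch yields 90.0. Returning 0.0 for an idle period is one possible convention; a monitoring setup might prefer to skip the data point instead.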

AWS/ELB

The AWS/ELB service collects metrics from classic elastic load balancers.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
BackendConnectionErrors gauge LoadBalancerName, Namespace, Region sum 300

Number of connections that were not successfully established between the load balancer and the registered instances.

Since the load balancer retries the connection when there are errors, this count can exceed the request rate. Note that this count also includes any connection errors related to health checks.

DesyncMitigationMode_NonCompliant_Request_Count gauge LoadBalancerName, Namespace, Region sum 300 HTTP listener: Number of requests that do not comply with RFC 7230.
DnsName attribute LoadBalancerName, Namespace, Region DNS name of the load balancer.
EstimatedALBActiveConnectionCount gauge LoadBalancerName, Namespace, Region average 300 Estimated number of concurrent TCP connections active from clients to the load balancer and from the load balancer to targets.
EstimatedALBConsumedLCUs gauge LoadBalancerName, Namespace, Region average 300 Estimated number of load balancer capacity units (LCU) used by an Application Load Balancer. You pay for the number of LCUs that you use per hour.
EstimatedALBNewConnectionCount gauge LoadBalancerName, Namespace, Region average 300 Estimated number of new TCP connections established from clients to the load balancer and from the load balancer to targets.
EstimatedProcessedBytes gauge bytes LoadBalancerName, Namespace, Region average 300 Estimated number of bytes processed by an Application Load Balancer.
HealthyHostCount gauge LoadBalancerName, Namespace, Region max 300

Number of healthy instances registered with the load balancer.

A newly registered instance is considered healthy after it passes the first health check. If cross-zone load balancing is enabled, the number of healthy instances for the LoadBalancerName dimension is calculated across all Availability Zones. Otherwise, it is calculated per Availability Zone.

HTTPCode_Backend_2XX gauge LoadBalancerName, Namespace, Region sum 300

HTTP listener: Number of HTTP 2XX response codes generated by registered instances.

This count does not include any response codes generated by the load balancer.

HTTPCode_Backend_3XX gauge LoadBalancerName, Namespace, Region sum 300

HTTP listener: Number of HTTP 3XX response codes generated by registered instances.

This count does not include any response codes generated by the load balancer.

HTTPCode_Backend_4XX gauge LoadBalancerName, Namespace, Region sum 300

HTTP listener: Number of HTTP 4XX response codes generated by registered instances.

This count does not include any response codes generated by the load balancer.

HTTPCode_Backend_5XX gauge LoadBalancerName, Namespace, Region sum 300

HTTP listener: Number of HTTP 5XX response codes generated by registered instances.

This count does not include any response codes generated by the load balancer.

HTTPCode_ELB_4XX gauge LoadBalancerName, Namespace, Region sum 300

HTTP listener: Number of HTTP 4XX client error codes generated by the load balancer.

Client errors are generated when a request is malformed or incomplete.

HTTPCode_ELB_5XX gauge LoadBalancerName, Namespace, Region sum 300

HTTP listener: Number of HTTP 5XX server error codes generated by the load balancer.

This count does not include any response codes generated by the registered instances. The metric is reported if there are no healthy instances registered to the load balancer, or if the request rate exceeds the capacity of the instances (spillover) or the load balancer.

Latency gauge seconds LoadBalancerName, Namespace, Region average 300

HTTP listener: Total time elapsed, in seconds, from the time the load balancer sent the request to a registered instance until the instance started to send the response headers.

TCP listener: Total time elapsed, in seconds, for the load balancer to successfully establish a connection to a registered instance.

RequestCount gauge LoadBalancerName, Namespace, Region sum 300

Number of requests completed or connections made during the specified interval (1 or 5 minutes).

HTTP listener: Number of requests received and routed, including HTTP error responses from the registered instances.

TCP listener: Number of connections made to the registered instances.

Scheme attribute LoadBalancerName, Namespace, Region Type of load balancer. Valid only for load balancers in a VPC.
SpilloverCount gauge LoadBalancerName, Namespace, Region sum 300

Total number of requests that were rejected because the surge queue is full.

HTTP listener: Load balancer returns an HTTP 503 error code.

TCP listener: Load balancer closes the connection.

SurgeQueueLength gauge LoadBalancerName, Namespace, Region max 300

Total number of requests (HTTP listener) or connections (TCP listener) that are pending routing to a healthy instance.

The maximum size of the queue is 1024. Additional requests or connections are rejected when the queue is full. For more information, see SpilloverCount.

UnHealthyHostCount gauge LoadBalancerName, Namespace, Region max 300

Number of unhealthy instances registered with your load balancer.

An instance is considered unhealthy after it exceeds the unhealthy threshold configured for health checks. An unhealthy instance is considered healthy again after it meets the healthy threshold configured for health checks.

VpcId attribute LoadBalancerName, Namespace, Region ID of the VPC for the load balancer.
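
As an illustration of how a row in the table above relates to a raw CloudWatch query, the following minimal sketch builds request parameters shaped like the input of the CloudWatch GetMetricStatistics API. It is not the plugin's implementation. The helper name `build_metric_query`, the `STATISTIC_NAMES` mapping, and the load balancer name `my-classic-lb` are hypothetical, and the sketch assumes that the Namespace and Region columns are added by the plugin rather than passed as CloudWatch dimensions.

```python
from datetime import datetime, timedelta, timezone

# Assumed mapping from the lowercase statistic names used in the tables
# to the statistic names accepted by CloudWatch.
STATISTIC_NAMES = {
    "sum": "Sum",
    "average": "Average",
    "max": "Maximum",
    "minimum": "Minimum",
}


def build_metric_query(namespace, metric_name, dimension_name,
                       dimension_value, statistic, period_seconds=300,
                       now=None):
    """Build parameters shaped like CloudWatch GetMetricStatistics input.

    The query covers exactly one period ending at `now`.
    """
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(seconds=period_seconds)
    return {
        "Namespace": namespace,
        "MetricName": metric_name,
        "Dimensions": [{"Name": dimension_name, "Value": dimension_value}],
        "StartTime": start,
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": [STATISTIC_NAMES[statistic]],
    }


# RequestCount row: dimension LoadBalancerName, statistic sum, period 300.
query = build_metric_query(
    "AWS/ELB", "RequestCount", "LoadBalancerName", "my-classic-lb", "sum"
)
print(query["MetricName"], query["Statistics"], query["Period"])
```

The resulting dictionary could be passed to a CloudWatch client as keyword arguments. Region is not part of the query because it is normally selected when the client is created.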

AWS/Events

The AWS/Events service collects EventBridge metrics.

Metric name Metric type Dimension Statistic Period(s) Description
DeadLetterInvocations gauge RuleName, EventBusName, Namespace, Region sum 300

Number of times a rule’s target is not invoked in response to an event.

This includes invocations that would result in running the same rule again, causing an infinite loop.

EventPattern attribute RuleName, EventBusName, Namespace, Region Event pattern that triggers this rule.
Events gauge RuleName, EventBusName, Namespace, Region sum 300 Number of partner events ingested by EventBridge.
FailedInvocations gauge RuleName, EventBusName, Namespace, Region sum 300

Number of invocations that failed permanently.

This does not include invocations that are retried or invocations that succeeded after a retry attempt.

It also does not count failed invocations that are counted in DeadLetterInvocations.

IngestionToInvocationStartLatency gauge RuleName, EventBusName, Namespace, Region sum 300 Time to process events measured from when they are ingested by EventBridge to the first invocation of a target in the rules.
Invocations gauge RuleName, EventBusName, Namespace, Region sum 300

Number of times a target is invoked by a rule in response to an event.

This includes successful and failed invocations, but does not include throttled or retried attempts until they fail permanently. It does not include DeadLetterInvocations.

EventBridge only sends this metric to CloudWatch if it is not zero.

InvocationsFailedToBeSentToDlq gauge RuleName, EventBusName, Namespace, Region sum 300

Number of invocations that cannot be moved to a dead-letter queue.

Dead-letter queue errors occur due to permissions errors, unavailable resources, or size limits. EventBridge only sends this metric to CloudWatch if it is not zero.

InvocationsSentToDlq gauge RuleName, EventBusName, Namespace, Region sum 300

Number of invocations that are moved to a dead-letter queue.

EventBridge only sends this metric to CloudWatch if it is not zero.

MatchedEvents gauge RuleName, EventBusName, Namespace, Region sum 300 Number of events that matched with any rule.
ScheduleExpression attribute RuleName, EventBusName, Namespace, Region Rule is triggered based on the specified schedule expression.
State attribute RuleName, EventBusName, Namespace, Region Indicates if a rule is enabled or disabled.
ThrottledRules gauge RuleName, EventBusName, Namespace, Region sum 300 Number of rules that have tried to run but are being throttled.
TriggeredRules gauge RuleName, EventBusName, Namespace, Region 300

Number of rules that have run and matched with any event.

You cannot see this metric in CloudWatch until a rule is triggered.

AWS/GatewayELB

The AWS/GatewayELB service collects metrics from Gateway Load Balancers.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
ActiveFlowCount gauge LoadBalancer, Namespace, Region average 300 Total number of concurrent flows (or connections) from clients to targets.
AvailabilityZones attribute LoadBalancer, Namespace, Region Subnets for the load balancer.
ConsumedLCUs gauge LoadBalancer, Namespace, Region average 300 Number of load balancer capacity units (LCU) used by the load balancer.
CreatedTime attribute LoadBalancer, Namespace, Region Date and time the load balancer was created.
DnsName attribute LoadBalancer, Namespace, Region Public DNS name of the load balancer.
HealthyHostCount gauge LoadBalancer, Namespace, Region max 300 Number of targets that are considered healthy.
IpAddressType attribute LoadBalancer, Namespace, Region Type of IP addresses used by the subnets for the load balancer.
NewFlowCount gauge LoadBalancer, Namespace, Region sum 300 Total number of new flows (or connections) established from clients to targets in the time period.
ProcessedBytes gauge bytes LoadBalancer, Namespace, Region sum 300 Total number of bytes processed by the load balancer.
Scheme attribute LoadBalancer, Namespace, Region Nodes of an internet-facing load balancer that have public IP addresses.
State attribute LoadBalancer, Namespace, Region State of the load balancer.
UnHealthyHostCount gauge LoadBalancer, Namespace, Region max 300 Number of targets that are considered unhealthy.
VpcId attribute LoadBalancer, Namespace, Region ID of the VPC for the load balancer.

AWS/Kinesis

The AWS/Kinesis service collects metrics from Kinesis streams.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
EncryptionType attribute StreamName, Namespace, Region Server-side encryption type used on the stream. Possible values: NONE or KMS.
GetRecords.Bytes gauge bytes StreamName, Namespace, Region average 300 Number of bytes retrieved from the Kinesis stream, measured over the specified time period.
GetRecords.IteratorAgeMilliseconds gauge milliseconds StreamName, Namespace, Region average 300

Age of the last record in all GetRecords calls made against a Kinesis stream, measured over the specified time period.

Age is the difference between the current time and when the last record of the GetRecords call was written to the stream. A value of zero indicates that the records being read are completely caught up with the stream.

GetRecords.Latency gauge milliseconds StreamName, Namespace, Region average 300 Time taken per GetRecords operation, measured over the specified time period.
GetRecords.Records gauge StreamName, Namespace, Region average 300 Number of records retrieved from the shard, measured over the specified time period.
GetRecords.Success gauge StreamName, Namespace, Region average 300 Number of successful GetRecords operations per stream, measured over the specified time period.
IncomingBytes gauge bytes StreamName, Namespace, Region average 300 Number of bytes successfully put to the Kinesis stream over the specified time period. This metric includes bytes from PutRecord and PutRecords operations.
IncomingBytes gauge bytes StreamName, ShardId, Namespace, Region average 300 Number of bytes successfully put to the shard over the specified time period. This metric includes bytes from PutRecord and PutRecords operations.
IncomingRecords gauge StreamName, Namespace, Region average 300 Number of records successfully put to the Kinesis stream over the specified time period. This metric includes record counts from PutRecord and PutRecords operations.
IncomingRecords gauge StreamName, ShardId, Namespace, Region average 300 Number of records successfully put to the shard over the specified time period. This metric includes record counts from PutRecord and PutRecords operations.
IteratorAgeMilliseconds gauge milliseconds StreamName, ShardId, Namespace, Region average 300

Age of the last record in all GetRecords calls made against a shard, measured over the specified time period.

Age is the difference between the current time and when the last record of the GetRecords call was written to the stream.

OutgoingBytes gauge bytes StreamName, ShardId, Namespace, Region average 300 Number of bytes retrieved from the shard, measured over the specified time period.
OutgoingRecords gauge StreamName, ShardId, Namespace, Region average 300 Number of records retrieved from the shard, measured over the specified time period.
PutRecord.Bytes gauge bytes StreamName, Namespace, Region average 300 Number of bytes put to the Kinesis stream using the PutRecord operation over the specified time period.
PutRecord.Latency gauge milliseconds StreamName, Namespace, Region average 300 Time taken per PutRecord operation, measured over the specified time period.
PutRecord.Success gauge StreamName, Namespace, Region average 300 Number of successful PutRecord operations per Kinesis stream, measured over the specified time period. Average reflects the percentage of successful writes to a stream.
PutRecords.Bytes gauge bytes StreamName, Namespace, Region average 300 Number of bytes put to the Kinesis stream using the PutRecords operation over the specified time period.
PutRecords.FailedRecords gauge StreamName, Namespace, Region average 300 Number of records rejected due to internal failures in a PutRecords operation per Kinesis data stream, measured over the specified time period. Occasional internal failures are to be expected and should be retried.
PutRecords.Latency gauge milliseconds StreamName, Namespace, Region average 300 Time taken per PutRecords operation, measured over the specified time period.
PutRecords.Success gauge StreamName, Namespace, Region average 300 Number of successful PutRecords operations per Kinesis stream, measured over the specified time period. Average reflects the percentage of successful writes to a stream.
PutRecords.SuccessfulRecords gauge StreamName, Namespace, Region average 300 Number of successful records in a PutRecords operation per Kinesis data stream, measured over the specified time period.
PutRecords.ThrottledRecords gauge StreamName, Namespace, Region average 300 Number of records rejected due to throttling in a PutRecords operation per Kinesis data stream, measured over the specified time period.
PutRecords.TotalRecords gauge StreamName, Namespace, Region average 300 Total number of records sent in a PutRecords operation per Kinesis data stream, measured over the specified time period.
ReadProvisionedThroughputExceeded gauge StreamName, Namespace, Region average 300 Number of GetRecords calls throttled for the stream over the specified time period.
ReadProvisionedThroughputExceeded gauge StreamName, ShardId, Namespace, Region average 300 Number of GetRecords calls throttled for the shard over the specified time period. This exception count covers all dimensions of the following limits: 5 reads per shard per second or 2 MB per second per shard.
RetentionPeriodHours attribute StreamName, Namespace, Region Current retention period, in hours. The minimum value is 24, while its maximum value is 168.
State attribute StreamName, Namespace, Region Indicates whether the stream is being created, active, updating or being deleted.
SubscribeToShard.RateExceeded gauge StreamName, ConsumerName, Namespace, Region average 300 This metric is emitted when a new subscription attempt fails because there is already an active subscription by the same consumer, or if you exceed the number of calls per second allowed for this operation.
SubscribeToShard.Success gauge StreamName, ConsumerName, Namespace, Region average 300 This metric records whether the SubscribeToShard subscription was successfully established. The subscription only lives for at most 5 minutes. Therefore, this metric gets emitted at least once every 5 minutes.
SubscribeToShardEvent.Bytes gauge bytes StreamName, ConsumerName, Namespace, Region average 300

Number of bytes received from the shard, measured over the specified time period.

Minimum, Maximum, and Average statistics represent the bytes published in a single event for the specified time period.

SubscribeToShardEvent.MillisBehindLatest gauge milliseconds StreamName, ConsumerName, Namespace, Region average 300 Difference between the current time and when the last record of the SubscribeToShard event was written to the stream.
SubscribeToShardEvent.Records gauge StreamName, ConsumerName, Namespace, Region average 300

Number of records received from the shard, measured over the specified time period.

Minimum, Maximum, and Average statistics represent the records in a single event for the specified time period.

SubscribeToShardEvent.Success gauge StreamName, ConsumerName, Namespace, Region average 300 This metric is emitted every time an event is published successfully. It is only emitted when there is an active subscription.
WriteProvisionedThroughputExceeded gauge StreamName, Namespace, Region average 300

Number of records rejected due to throttling for the stream over the specified time period.

This metric includes throttling from PutRecord and PutRecords operations. The most commonly used statistic for this metric is Average.

WriteProvisionedThroughputExceeded gauge StreamName, ShardId, Namespace, Region average 300 Number of records rejected due to throttling for the shard over the specified time period. This metric includes throttling from PutRecord and PutRecords operations and covers all dimensions of the following limits: 1,000 records per second per shard or 1 MB per second per shard.
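
The iterator age metrics in the table above are defined as the difference between the current time and the time the last record returned by GetRecords was written to the stream. A minimal sketch of that arithmetic follows. The function name `iterator_age_ms` and the sample timestamps are hypothetical and only illustrate the definition.

```python
from datetime import datetime, timedelta, timezone


def iterator_age_ms(current_time, last_record_written_at):
    """Return the age of the last read record in whole milliseconds.

    A result of 0 means the reader is completely caught up with the stream.
    """
    return (current_time - last_record_written_at) // timedelta(milliseconds=1)


now = datetime(2024, 1, 1, 12, 0, 5, tzinfo=timezone.utc)
written = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(iterator_age_ms(now, written))  # the reader is 5 seconds behind
```

A value that keeps growing between collection periods suggests that consumers are reading more slowly than producers are writing.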

AWS/KMS

The AWS/KMS service collects metrics from non-deleted KMS keys.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
Aliases attribute KeyId, Namespace, Region Alternative names for the key, in CSV format.
ExternalKeyStoreThrottle gauge KeyId, Namespace, Region max 300 Number of requests for cryptographic operations on KMS keys in each external key store that AWS KMS throttles (responds with a ThrottlingException).
KeySpec attribute KeyId, Namespace, Region Represents the cryptographic configuration of the KMS key.
KeyUsage attribute KeyId, Namespace, Region Indicates the purpose of the key. The value can be either Encrypt and decrypt or Sign and verify.
SecondsUntilKeyMaterialExpiration gauge seconds KeyId, Namespace, Region minimum 300 Number of seconds remaining until the imported key material expires.
Status attribute KeyId, Namespace, Region Indicates whether the key is enabled, disabled or pending deletion.
XksExternalKeyManagerStates gauge KeyId, Namespace, Region sum 300 Count of the number of external key manager instances in each of the following health states: Active, Degraded, and Unavailable.
XksProxyCertificateDaysToExpire gauge days KeyId, Namespace, Region minimum 300 Number of days until the TLS certificate for your external key store proxy endpoint (XksProxyUriEndpoint) expires.
XksProxyCredentialAge gauge days KeyId, Namespace, Region minimum 300 Number of days since the current external key store proxy authentication credential (XksProxyAuthenticationCredential) was associated with the external key store.
XksProxyErrors gauge KeyId, Namespace, Region sum 300 Number of exceptions related to AWS KMS requests to your external key store proxy.
XksProxyLatency gauge milliseconds KeyId, Namespace, Region average 300 Number of milliseconds it takes for an external key store proxy to respond to an AWS KMS request.

AWS/Lambda

The AWS/Lambda service collects metrics from Lambda functions.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
AsyncEventAge gauge FunctionName, Namespace, Region max 300 The time between when Lambda successfully queues the event and when the function is invoked.
AsyncEventsDropped gauge FunctionName, Namespace, Region sum 300 The number of events that are dropped without successfully executing the function.
AsyncEventsReceived gauge FunctionName, Namespace, Region sum 300 The number of events that Lambda successfully queues for processing.
CodeSize attribute bytes FunctionName, Namespace, Region Size of the function’s deployment package, in bytes.
ConcurrentExecutions gauge FunctionName, Namespace, Region max 300 Number of function instances that are processing events. If this number reaches the concurrent executions quota for the region, or the reserved concurrency limit that you configured on the function, Lambda throttles additional invocation requests.
DeadLetterErrors gauge FunctionName, Namespace, Region sum 300 For asynchronous invocation, the number of times that Lambda attempts to send an event to a dead-letter queue but fails. Dead-letter errors can occur due to permissions errors, misconfigured resources, or size limits.
Description attribute FunctionName, Namespace, Region Function’s description.
DestinationDeliveryFailures gauge FunctionName, Namespace, Region sum 300 For asynchronous invocation, the number of times that Lambda attempts to send an event to a destination but fails. Delivery errors can occur due to permissions errors, misconfigured resources, or size limits.
Duration gauge milliseconds FunctionName, Namespace, Region average 300 Amount of time that the function code spends processing an event. The billed duration for an invocation is the value of duration rounded up to the nearest millisecond.
Errors gauge FunctionName, Namespace, Region sum 300 Number of invocations that result in a function error. Function errors include exceptions that your code throws and exceptions that the Lambda runtime throws. The runtime returns errors for issues such as timeouts and configuration errors.
Invocations gauge FunctionName, Namespace, Region sum 300 Number of times that your function code is invoked, including successful invocations and invocations that result in a function error. Invocations aren’t recorded if the invocation request is throttled or otherwise results in an invocation error. This equals the number of requests billed.
IteratorAge gauge milliseconds FunctionName, Namespace, Region average 300 For event source mappings that read from streams, the age of the last record in the event. The age is the amount of time between when a stream receives the record and when the event source mapping sends the event to the function.
LastModified attribute FunctionName, Namespace, Region Date and time that the function was last updated, in ISO-8601 format (YYYY-MM-DDThh:mm:ss.sTZD).
OffsetLag gauge FunctionName, Namespace, Region average 300 For self-managed Apache Kafka and Amazon Managed Streaming for Apache Kafka (Amazon MSK) event sources, the difference in offset between the last record written to a topic and the last record that your Lambda function processed. Though a Kafka topic can have multiple partitions, this metric measures the offset lag at the topic level.
OversizedRecordCount gauge FunctionName, Namespace, Region sum 300 For Amazon DocumentDB event sources, the number of events your function receives from your change stream that are over 6 MB in size.
PackageType attribute FunctionName, Namespace, Region Type of deployment package. Possible values: ZIP or IMAGE.
PostRuntimeExtensionsDuration gauge milliseconds FunctionName, Namespace, Region average 300 Cumulative amount of time that the runtime spends running code for extensions after the function code has completed.
ProvisionedConcurrencyInvocations gauge FunctionName, Namespace, Region sum 300 Number of times that the function code is invoked on provisioned concurrency.
ProvisionedConcurrencySpilloverInvocations gauge FunctionName, Namespace, Region sum 300 Number of times that the function code is invoked on standard concurrency when all provisioned concurrency is in use.
ProvisionedConcurrencyUtilization gauge FunctionName, Namespace, Region max 300 For a version or alias, the value of ProvisionedConcurrentExecutions divided by the total amount of provisioned concurrency allocated. For example, 0.5 indicates that 50 percent of allocated provisioned concurrency is in use.
ProvisionedConcurrentExecutions gauge FunctionName, Namespace, Region max 300 Number of function instances that are processing events on provisioned concurrency. For each invocation of an alias or version with provisioned concurrency, Lambda emits the current count.
RecursiveInvocationsDropped gauge FunctionName, Namespace, Region sum 300 Number of times that Lambda has stopped invocation of your function because it detected that your function is part of an infinite recursive loop.
Runtime attribute FunctionName, Namespace, Region Runtime environment for the Lambda function.
State status FunctionName, Namespace, Region Current state of the function.
Throttles gauge FunctionName, Namespace, Region sum 300 Number of invocation requests that are throttled. When all function instances are processing requests and no concurrency is available to scale up, Lambda rejects additional requests with a TooManyRequestsException error. Throttled requests and other invocation errors do not count as invocations or errors.
UnreservedConcurrentExecutions gauge FunctionName, Namespace, Region max 300 For a region, the number of events that functions without reserved concurrency are processing.
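
The ProvisionedConcurrencyUtilization row above describes a ratio: the number of provisioned concurrent executions divided by the provisioned concurrency allocated. The following minimal sketch shows that calculation. The function name and the sample numbers are hypothetical and are not taken from Lambda or the plugin.

```python
def provisioned_concurrency_utilization(concurrent_executions, allocated):
    """Return the fraction of allocated provisioned concurrency in use."""
    if allocated <= 0:
        raise ValueError("allocated provisioned concurrency must be positive")
    return concurrent_executions / allocated


# 5 busy instances out of 10 allocated means 50 percent utilization.
print(provisioned_concurrency_utilization(5, 10))
```

Because the table lists the max statistic for this metric, the collected value would be the highest ratio observed during the 300-second period.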

AWS/Logs

The AWS/Logs service collects metrics from log groups and their subscription filters.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
CallCount gauge LogGroupName, Namespace, Region sum 300 Number of specified API operations performed in your account.
DeliveryErrors gauge LogGroupName, DestinationType, FilterName, Namespace, Region sum 300 Number of log events for which CloudWatch Logs received an error when forwarding data to the subscription destination.
DeliveryThrottling gauge LogGroupName, DestinationType, FilterName, Namespace, Region sum 300 Number of log events for which CloudWatch Logs was throttled when forwarding data to the subscription destination.
Destination attribute LogGroupName, DestinationType, FilterName, Namespace, Region Destination set for this log group.
EMFParsingErrors gauge LogGroupName, DestinationType, FilterName, Namespace, Region sum 300 Number of parsing errors encountered while processing embedded metric format logs. These errors happen when logs are identified as embedded metric format but do not follow the correct format.
EMFValidationErrors gauge LogGroupName, DestinationType, FilterName, Namespace, Region sum 300 Number of validation errors encountered while processing embedded metric format logs. These errors occur when metric definitions within embedded metric format logs do not adhere to the embedded metric format and MetricDatum specifications.
ErrorCount gauge LogGroupName, DestinationType, FilterName, Namespace, Region sum 300 Number of API operations performed in your account that resulted in errors.
FilterPattern attribute LogGroupName, DestinationType, FilterName, Namespace, Region Sets the FilterPattern property for this object.
ForwardedBytes gauge bytes LogGroupName, DestinationType, FilterName, Namespace, Region sum 300 Volume of log events in compressed bytes forwarded to the subscription destination.
ForwardedLogEvents gauge LogGroupName, DestinationType, FilterName, Namespace, Region sum 300 Number of log events forwarded to the subscription destination.
IncomingBytes gauge bytes LogGroupName, Namespace, Region sum 300 Volume of log events in uncompressed bytes uploaded to CloudWatch Logs.
IncomingLogEvents gauge LogGroupName, Namespace, Region sum 300 Number of log events uploaded to CloudWatch Logs.
LogEventsWithFindings gauge LogGroupName, DestinationType, FilterName, Namespace, Region sum 300 Number of log events that matched a data string that you are auditing using the CloudWatch Logs data protection feature.
SubscriptionFilterCount attribute LogGroupName, Namespace, Region Number of subscription filters for this log group.
ThrottleCount gauge LogGroupName, DestinationType, FilterName, Namespace, Region sum 300 Number of API operations performed in your account that were throttled because of usage quotas.

AWS/NATGateway

The AWS/NATGateway service collects metrics from non-deleted NAT gateways.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
ActiveConnectionCount gauge NatGatewayId, Namespace, Region max 300 Total number of concurrent active TCP connections through the NAT gateway.
BytesInFromDestination gauge bytes NatGatewayId, Namespace, Region sum 300 Number of bytes received by the NAT gateway from the destination.
BytesInFromSource gauge bytes NatGatewayId, Namespace, Region sum 300 Number of bytes received by the NAT gateway from clients in your VPC.
BytesOutToDestination gauge bytes NatGatewayId, Namespace, Region sum 300 Number of bytes sent out through the NAT gateway to the destination.
BytesOutToSource gauge bytes NatGatewayId, Namespace, Region sum 300 Number of bytes sent through the NAT gateway to the clients in your VPC.
ConnectionAttemptCount gauge NatGatewayId, Namespace, Region sum 300 Number of connection attempts made through the NAT gateway.
ConnectionEstablishedCount gauge NatGatewayId, Namespace, Region sum 300 Number of connections established through the NAT gateway.
ConnectivityType attribute NatGatewayId, Namespace, Region Indicates whether the NAT gateway supports public or private connectivity.
ElasticIpAddress attribute NatGatewayId, Namespace, Region Public NAT gateway only: Elastic IP address associated with the NAT gateway.
ErrorPortAllocation gauge NatGatewayId, Namespace, Region sum 300 Number of times the NAT gateway could not allocate a source port.
IdleTimeoutCount gauge NatGatewayId, Namespace, Region sum 300 Number of connections that transitioned from the active state to the idle state.
PacketsDropCount gauge NatGatewayId, Namespace, Region sum 300 Number of packets dropped by the NAT gateway.
PacketsInFromDestination gauge NatGatewayId, Namespace, Region sum 300 Number of packets received by the NAT gateway from the destination.
PacketsInFromSource gauge NatGatewayId, Namespace, Region sum 300 Number of packets received by the NAT gateway from clients in your VPC.
PacketsOutToDestination gauge NatGatewayId, Namespace, Region sum 300 Number of packets sent out through the NAT gateway to the destination.
PacketsOutToSource gauge NatGatewayId, Namespace, Region sum 300 Number of packets sent through the NAT gateway to the clients in your VPC.
PeakBytesPerSecond gauge NatGatewayId, Namespace, Region max 300 Reports the highest 10-second bytes per second average in a given minute.
PeakPacketsPerSecond gauge NatGatewayId, Namespace, Region max 300 Calculates the average packet rate (packets processed per second) every 10 seconds for 60 seconds and then reports the highest average packet rate among the six rates.
PrivateIpAddress attribute NatGatewayId, Namespace, Region Private IP address associated with the NAT gateway.
State attribute NatGatewayId, Namespace, Region State of the NAT gateway.
StateMessage attribute NatGatewayId, Namespace, Region If the NAT gateway could not be created, this specifies the error message for the failure that corresponds to the error code.
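
The PeakPacketsPerSecond row above describes a small algorithm: average the packet rate over each 10-second window in a 60-second interval, then report the highest of the six averages. The sketch below illustrates that calculation. It assumes the input is a list of 60 per-second packet counts, which is an assumption made for the example and not a statement about how the NAT gateway samples traffic. The function name is hypothetical.

```python
def peak_packets_per_second(per_second_counts, window_seconds=10):
    """Return the highest windowed average packet rate.

    per_second_counts: packets processed in each second of the interval.
    """
    if not per_second_counts:
        raise ValueError("at least one sample is required")
    # Split the interval into consecutive, non-overlapping windows.
    windows = [
        per_second_counts[i:i + window_seconds]
        for i in range(0, len(per_second_counts), window_seconds)
    ]
    # Average rate (packets per second) within each window.
    rates = [sum(window) / len(window) for window in windows]
    return max(rates)


# 50 quiet seconds followed by a 10-second burst.
samples = [100] * 50 + [400] * 10
print(peak_packets_per_second(samples))
```

PeakBytesPerSecond follows the same pattern with byte counts in place of packet counts.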

AWS/NetworkELB

The AWS/NetworkELB service collects metrics from Network Load Balancers.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
ActiveFlowCount gauge LoadBalancer, Namespace, Region average 300 Total number of concurrent flows (or connections) from clients to targets.
ActiveFlowCount_TCP gauge LoadBalancer, Namespace, Region average 300 Total number of concurrent TCP flows (or connections) from clients to targets.
ActiveFlowCount_TLS gauge LoadBalancer, Namespace, Region average 300 Total number of concurrent TLS flows (or connections) from clients to targets.
ActiveFlowCount_UDP gauge LoadBalancer, Namespace, Region average 300 Total number of concurrent UDP flows (or connections) from clients to targets.
AvailabilityZones attribute LoadBalancer, Namespace, Region Subnets for the load balancer.
ClientTLSNegotiationErrorCount gauge LoadBalancer, Namespace, Region sum 300 Total number of TLS handshakes that failed during negotiation between a client and a TLS listener.
ConsumedLCUs gauge LoadBalancer, Namespace, Region average 300 Number of load balancer capacity units (LCU) used by the load balancer.
ConsumedLCUs_TCP gauge LoadBalancer, Namespace, Region average 300 Number of load balancer capacity units (LCU) used by the load balancer for TCP.
ConsumedLCUs_TLS gauge LoadBalancer, Namespace, Region average 300 Number of load balancer capacity units (LCU) used by the load balancer for TLS.
ConsumedLCUs_UDP gauge LoadBalancer, Namespace, Region average 300 Number of load balancer capacity units (LCU) used by the load balancer for UDP.
CreatedTime attribute LoadBalancer, Namespace, Region Date and time the load balancer was created.
DnsName attribute LoadBalancer, Namespace, Region Public DNS name of the load balancer.
HealthyHostCount gauge LoadBalancer, Namespace, Region max 300 Number of targets that are considered healthy.
IpAddressType attribute LoadBalancer, Namespace, Region Type of IP addresses used by the subnets for your load balancer.
NewFlowCount gauge LoadBalancer, Namespace, Region sum 300 Total number of new flows (or connections) established from clients to targets in the time period.
NewFlowCount_TCP gauge LoadBalancer, Namespace, Region sum 300 Total number of new TCP flows (or connections) established from clients to targets in the time period.
NewFlowCount_TLS gauge LoadBalancer, Namespace, Region sum 300 Total number of new TLS flows (or connections) established from clients to targets in the time period.
NewFlowCount_UDP gauge LoadBalancer, Namespace, Region sum 300 Total number of new UDP flows (or connections) established from clients to targets in the time period.
PeakBytesPerSecond gauge bytes per second LoadBalancer, Namespace, Region max 300 Highest average throughput (bytes per second), calculated every 10 seconds during the sampling window.
PeakPacketsPerSecond gauge LoadBalancer, Namespace, Region max 300 Highest average packet rate (packets processed per second), calculated every 10 seconds during the sampling window.
PortAllocationErrorCount gauge LoadBalancer, Namespace, Region sum 300 Total number of ephemeral port allocation errors during a client IP translation operation.
ProcessedBytes gauge bytes LoadBalancer, Namespace, Region sum 300 Total number of bytes processed by the load balancer, including TCP/IP headers.
ProcessedBytes_TCP gauge bytes LoadBalancer, Namespace, Region sum 300 Total number of bytes processed by TCP listeners.
ProcessedBytes_TLS gauge bytes LoadBalancer, Namespace, Region sum 300 Total number of bytes processed by TLS listeners.
ProcessedBytes_UDP gauge bytes LoadBalancer, Namespace, Region sum 300 Total number of bytes processed by UDP listeners.
ProcessedPackets gauge LoadBalancer, Namespace, Region sum 300 Total number of packets processed by the load balancer.
Scheme attribute LoadBalancer, Namespace, Region Scheme of the load balancer (internet-facing or internal). The nodes of an internet-facing load balancer have public IP addresses.
SecurityGroupBlockedFlowCount_Inbound_ICMP gauge LoadBalancer, Namespace, Region sum 300 Number of new ICMP messages rejected by the inbound rules of the load balancer security groups.
SecurityGroupBlockedFlowCount_Inbound_TCP gauge LoadBalancer, Namespace, Region sum 300 Number of new TCP flows rejected by the inbound rules of the load balancer security groups.
SecurityGroupBlockedFlowCount_Inbound_UDP gauge LoadBalancer, Namespace, Region sum 300 Number of new UDP flows rejected by the inbound rules of the load balancer security groups.
SecurityGroupBlockedFlowCount_Outbound_ICMP gauge LoadBalancer, Namespace, Region sum 300 Number of new ICMP messages rejected by the outbound rules of the load balancer security groups.
SecurityGroupBlockedFlowCount_Outbound_TCP gauge LoadBalancer, Namespace, Region sum 300 Number of new TCP flows rejected by the outbound rules of the load balancer security groups.
SecurityGroupBlockedFlowCount_Outbound_UDP gauge LoadBalancer, Namespace, Region sum 300 Number of new UDP flows rejected by the outbound rules of the load balancer security groups.
State attribute LoadBalancer, Namespace, Region State of the load balancer.
TCP_Client_Reset_Count gauge LoadBalancer, Namespace, Region sum 300 Total number of reset (RST) packets sent from a client to a target.
TCP_ELB_Reset_Count gauge LoadBalancer, Namespace, Region sum 300 Total number of reset (RST) packets generated by the load balancer.
TCP_Target_Reset_Count gauge LoadBalancer, Namespace, Region sum 300 Total number of reset (RST) packets sent from a target to a client.
TargetTLSNegotiationErrorCount gauge LoadBalancer, Namespace, Region sum 300 Total number of TLS handshakes that failed during negotiation between a TLS listener and a target.
UnHealthyHostCount gauge LoadBalancer, Namespace, Region max 300 Number of targets that are considered unhealthy.
UnhealthyRoutingFlowCount gauge LoadBalancer, Namespace, Region max 300 Number of flows or connections that are routed using the routing failover action (fail open).
VpcId attribute LoadBalancer, Namespace, Region ID of the VPC for the load balancer.

AWS/NetworkFirewall

The AWS/NetworkFirewall service collects metrics from VPC firewalls.

Metric name Metric type Dimension Statistic Period(s) Description
DroppedPackets gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of packets dropped by the Network Firewall firewall.
InvalidDroppedPackets gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of packets dropped for failing packet validation due to issues with the packet.
OtherDroppedPackets gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of packets dropped due to reasons other than those described by InvalidDroppedPackets or DroppedPackets.
Packets gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of packets inspected for a firewall policy or stateless rulegroup for which a custom action is defined.
PassedPackets gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of packets that the Network Firewall firewall allowed through to their destinations.
ReceivedPackets gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of packets received by the Network Firewall firewall.
RejectedPackets gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of packets rejected due to Reject stateful rule actions.
StreamExceptionPolicyPackets gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of packets matching the firewall policy’s stream exception policy.
TLSDroppedPackets gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of packets dropped by Network Firewall while inspecting SSL/TLS packets.
TLSErrors gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of errors observed by Network Firewall while inspecting SSL/TLS packets.
TLSPassedPackets gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of packets passed by Network Firewall while inspecting SSL/TLS packets.
TLSReceivedPackets gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of SSL/TLS packets received by the Network Firewall firewall.
TLSRejectedPackets gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of packets rejected by Network Firewall while inspecting SSL/TLS packets.
TLSRevocationStatusOKConnections gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of SSL/TLS connections to TLS servers whose certificates have been confirmed as not revoked.
TLSRevocationStatusRevokedConnections gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of SSL/TLS connections to TLS servers whose certificates have been confirmed as revoked.
TLSRevocationStatusUnknownConnections gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of SSL/TLS connections to TLS servers whose certificate revocation status is unknown or could not be determined by the firewall.
TLSTimedOutConnections gauge FirewallName, AvailabilityZone, Engine, Namespace, Region sum 300 Number of SSL/TLS connections that timed out during SSL/TLS inspection by Network Firewall.
VpcId attribute FirewallName, Namespace, Region Unique identifier of the VPC where the firewall is in use.

AWS/RDS

The AWS/RDS service collects metrics from non-failed Amazon relational databases.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
AuroraVolumeBytesLeftTotal gauge bytes DBInstanceIdentifier, Namespace, Region average 300 Remaining available space for the cluster volume.
BackupRetentionPeriodStorageUsed gauge bytes DBInstanceIdentifier, Namespace, Region average 300 Total amount of backup storage used to support the point-in-time restore feature within the Aurora DB cluster’s backup retention window.
BinLogDiskUsage gauge bytes DBInstanceIdentifier, Namespace, Region average 300 Amount of disk space occupied by binary logs. If automatic backups are enabled for MySQL and MariaDB instances, including read replicas, binary logs are created.
BurstBalance gauge percent DBInstanceIdentifier, Namespace, Region average 300 Percent of General Purpose SSD (gp2) burst-bucket I/O credits available.
CPUCreditBalance gauge minutes DBInstanceIdentifier, Namespace, Region average 300 T2 instances: Number of earned CPU credits that an instance has accrued since it was launched or started. For T2 Standard, the CPUCreditBalance also includes the number of launch credits that have been accrued.
CPUCreditUsage gauge minutes DBInstanceIdentifier, Namespace, Region average 300 T2 instances: Number of CPU credits spent by the instance for CPU utilization. One CPU credit equals one vCPU running at 100 percent utilization for one minute, or an equivalent combination of vCPUs.
CPUUtilization gauge percent DBInstanceIdentifier, Namespace, Region average 300 Percentage of CPU utilization.
CheckpointLag gauge DBInstanceIdentifier, Namespace, Region average 300 Amount of time since the most recent checkpoint.
ConnectionAttempts gauge DBInstanceIdentifier, Namespace, Region sum 300 Number of attempts to connect to an instance, whether successful or not.
DBClusterIdentifier attribute DBInstanceIdentifier, Namespace, Region If the DB instance is a member of a DB cluster, the identifier of that DB cluster.
DBInstanceStatus attribute DBInstanceIdentifier, Namespace, Region Specifies the current state of this database.
DBLoadCPU gauge DBInstanceIdentifier, Namespace, Region average 300 Number of active sessions where the wait event type is CPU.
DBName attribute DBInstanceIdentifier, Namespace, Region The meaning of this parameter differs according to the database engine you use. This can be the database name or the database ID.
DatabaseClass attribute DBInstanceIdentifier, Namespace, Region Contains the name of the compute and memory capacity class of the DB instance.
DatabaseConnections gauge DBInstanceIdentifier, Namespace, Region sum 300 Number of client network connections to the database instance.
DiskQueueDepth gauge DBInstanceIdentifier, Namespace, Region sum 300 Number of outstanding I/Os (read and write requests) waiting to access the disk.
EBSByteBalance% gauge percent DBInstanceIdentifier, Namespace, Region average 300 Percentage of throughput credits remaining in the burst bucket of your RDS database. This metric is available for basic monitoring only.
EBSIOBalance% gauge percent DBInstanceIdentifier, Namespace, Region average 300 Percentage of I/O credits remaining in the burst bucket of your RDS database. This metric is available for basic monitoring only.
EngineName attribute DBInstanceIdentifier, Namespace, Region Name of the database engine to be used for this DB instance.
FailedSQLServerAgentJobsCount gauge DBInstanceIdentifier, Namespace, Region average 300 Number of failed Microsoft SQL Server Agent jobs during the last minute.
FreeStorageSpace gauge bytes DBInstanceIdentifier, Namespace, Region average 300 Amount of available storage space.
FreeableMemory gauge bytes DBInstanceIdentifier, Namespace, Region average 300 Amount of available random access memory.
MaximumUsedTransactionIDs gauge DBInstanceIdentifier, Namespace, Region average 300 Maximum transaction IDs that have been used. This applies to PostgreSQL.
NetworkReceiveThroughput gauge bytes_per_second DBInstanceIdentifier, Namespace, Region average 300 Incoming (receive) network traffic on the DB instance.
NetworkTransmitThroughput gauge bytes_per_second DBInstanceIdentifier, Namespace, Region average 300 Outgoing (transmit) network traffic on the DB instance.
OldestReplicationSlotLag gauge bytes DBInstanceIdentifier, Namespace, Region average 300 Lagging size of the replica lagging the most in terms of write-ahead log (WAL) data received. This applies to PostgreSQL.
ReadIOPS gauge per_second DBInstanceIdentifier, Namespace, Region average 300 Average number of disk read I/O operations per second.
ReadLatency gauge seconds DBInstanceIdentifier, Namespace, Region average 300 Average amount of time taken per disk I/O operation.
ReadThroughput gauge bytes_per_second DBInstanceIdentifier, Namespace, Region average 300 Average number of bytes read from disk per second.
ReplicaLag gauge seconds DBInstanceIdentifier, Namespace, Region average 300 Amount of time a read replica DB instance lags behind the source DB instance. This applies to MySQL.
ReplicationSlotDiskUsage gauge bytes DBInstanceIdentifier, Namespace, Region average 300 Disk space used by replication slot files. This applies to PostgreSQL.
SnapshotStorageUsed gauge bytes DBInstanceIdentifier, Namespace, Region average 300 Total amount of backup storage consumed by all Aurora snapshots for an Aurora DB cluster outside its backup retention window.
SwapUsage gauge bytes DBInstanceIdentifier, Namespace, Region average 300 Amount of swap space used on the DB instance. This metric is not available for SQL Server.
TransactionLogsDiskUsage gauge bytes DBInstanceIdentifier, Namespace, Region average 300 Disk space used by transaction logs. This applies to PostgreSQL.
TransactionLogsGeneration gauge bytes_per_second DBInstanceIdentifier, Namespace, Region average 300 Size of transaction logs generated per second. This applies to PostgreSQL.
VolumeBytesUsed gauge bytes DBInstanceIdentifier, Namespace, Region average 300 Amount of storage used by your Aurora DB instance.
VolumeReadIOPs gauge DBInstanceIdentifier, Namespace, Region average 300 Number of billed read I/O operations from a cluster volume within a 5-minute interval.
VolumeWriteIOPs gauge DBInstanceIdentifier, Namespace, Region average 300 Number of write disk I/O operations to the cluster volume, reported at 5-minute intervals.
WriteIOPS gauge per_second DBInstanceIdentifier, Namespace, Region average 300 Average number of disk write I/O operations per second.
WriteLatency gauge seconds DBInstanceIdentifier, Namespace, Region average 300 Average amount of time taken per disk I/O operation.
WriteThroughput gauge bytes_per_second DBInstanceIdentifier, Namespace, Region average 300 Average number of bytes written to disk per second.
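
The CPU credit definition given for CPUCreditUsage (one credit equals one vCPU at 100 percent utilization for one minute) can be turned into simple arithmetic. The sketch below is illustrative only. The function name and the input values are assumptions, not part of the plugin or of any AWS API.

```python
def cpu_credits_used(vcpus, utilization_percent, minutes):
    """Estimate CPU credits spent for a given load.

    One CPU credit = one vCPU running at 100 percent for one minute,
    so credits scale linearly with vCPU count, utilization, and time.
    """
    if vcpus < 0 or minutes < 0 or not 0 <= utilization_percent <= 100:
        raise ValueError("inputs out of range")
    return vcpus * (utilization_percent / 100.0) * minutes


# Equivalent combinations that each spend exactly one credit.
one_vcpu_full = cpu_credits_used(vcpus=1, utilization_percent=100, minutes=1)
two_vcpus_half = cpu_credits_used(vcpus=2, utilization_percent=50, minutes=1)
one_vcpu_half_two_min = cpu_credits_used(vcpus=1, utilization_percent=50, minutes=2)
```

Under this definition, an instance with 2 vCPUs at 50 percent for 5 minutes would spend 5 credits.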

AWS/S3

The AWS/S3 service collects storage metrics and replication metrics (if any) from the S3 buckets.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
4xxErrors gauge BucketName, FilterId, Namespace, Region average 86400 Number of HTTP 4xx client error status code requests made to an Amazon S3 bucket with a value of either 0 or 1.
5xxErrors gauge BucketName, FilterId, Namespace, Region average 86400 Number of HTTP 5xx server error status code requests made to an Amazon S3 bucket with a value of either 0 or 1.
AllRequests gauge BucketName, FilterId, Namespace, Region sum 86400 Total number of HTTP requests made to an Amazon S3 bucket regardless of type.
BucketSizeBytes gauge bytes BucketName, StorageType, Namespace, Region average 86400 Amount of data in bytes stored in a bucket in the STANDARD storage class, INTELLIGENT_TIERING storage class, Standard-Infrequent Access (STANDARD_IA) storage class, OneZone-Infrequent Access (ONEZONE_IA) storage class, Reduced Redundancy Storage (RRS) class, S3 Glacier Instant Retrieval storage class, Deep Archive Storage (S3 Glacier Deep Archive) class, or S3 Glacier Flexible Retrieval (GLACIER) storage class.
BytesDownloaded gauge bytes BucketName, FilterId, Namespace, Region average 86400 Number of bytes downloaded for requests made to an Amazon S3 bucket.
BytesPendingReplication gauge bytes RuleId, Namespace, Region, DestinationBucket, SourceBucket max 300 Total number of bytes of objects pending replication for a given replication rule.
BytesUploaded gauge bytes BucketName, FilterId, Namespace, Region average 86400 Number of bytes uploaded that contain a request body.
CreationDate attribute BucketName, Namespace, Region Date the bucket was created.
DeleteRequests gauge BucketName, FilterId, Namespace, Region sum 86400 Number of HTTP DELETE requests made for objects in an Amazon S3 bucket.
FirstByteLatency gauge milliseconds BucketName, FilterId, Namespace, Region average 86400 Per-request time from the complete request being received by an Amazon S3 bucket to when the response starts to be returned.
GetRequests gauge BucketName, FilterId, Namespace, Region sum 86400 Number of HTTP GET requests made for objects in an Amazon S3 bucket.
HeadRequests gauge BucketName, FilterId, Namespace, Region sum 86400 Number of HTTP HEAD requests made to an Amazon S3 bucket.
ListRequests gauge BucketName, FilterId, Namespace, Region sum 86400 Number of HTTP requests that list the contents of a bucket.
NumberOfObjects gauge BucketName, StorageType, Namespace, Region average 86400 Total number of objects stored in a bucket for all storage classes.
OperationsPendingReplication gauge bytes RuleId, Namespace, Region, DestinationBucket, SourceBucket max 300 Number of operations pending replication for a given replication rule.
PostRequests gauge BucketName, FilterId, Namespace, Region sum 86400 Number of HTTP POST requests made to an Amazon S3 bucket.
PutRequests gauge BucketName, FilterId, Namespace, Region sum 86400 Number of HTTP PUT requests made for objects in an Amazon S3 bucket.
ReplicationLatency gauge seconds RuleId, Namespace, Region, DestinationBucket, SourceBucket max 300 Maximum number of seconds by which the replication destination Region is behind the source Region for a given replication rule.
SelectBytesReturned gauge bytes BucketName, FilterId, Namespace, Region sum 86400 Number of bytes of data returned with Amazon S3 SELECT Object Content requests in an Amazon S3 bucket.
SelectBytesScanned gauge bytes BucketName, FilterId, Namespace, Region sum 86400 Number of bytes of data scanned with Amazon S3 SELECT Object Content requests in an Amazon S3 bucket.
SelectRequests gauge BucketName, FilterId, Namespace, Region sum 86400 Number of Amazon S3 SELECT Object Content requests made for objects in an Amazon S3 bucket.
Status attribute RuleId, Namespace, Region Specifies if the rule is enabled.
TotalRequestLatency gauge milliseconds BucketName, FilterId, Namespace, Region average 86400 Elapsed per-request time from the first byte received to the last byte sent to an Amazon S3 bucket.

AWS/SDKUsage

The AWS/SDKUsage service collects AWS SDK usage metrics.

To get these metrics, you need to enable the AWS SDK Usage metrics collector configuration, AwsSdkUsageMetricsCollector, in the Collection Agent YAML file. See AWS SDK Usage metrics collector.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
avgApiCallDurationLast5Min gauge milliseconds Namespace, ServiceId, OperationName, Region average 300 Average API call duration from the last 5-minute window.
avgRetryCountLast5Min gauge Namespace, ServiceId, OperationName, Region average 300 Average retry count from the last 5-minute window.
failedApiCallLast5Min gauge Namespace, ServiceId, OperationName, Region sum 300 Total number of failed API calls from the last 5-minute window.
successfulApiCallLast5Min gauge Namespace, ServiceId, OperationName, Region sum 300 Total number of successful API calls from the last 5-minute window.

AWS/SNS

The AWS/SNS service collects metrics from SNS Topics.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
DisplayName attribute TopicName, Namespace, Region Human-readable name used in the From field for notifications to email and email-json endpoints.
NumberOfMessagesPublished gauge TopicName, Namespace, Region sum 300 Number of messages published to the Amazon SNS topics.
NumberOfNotificationsDelivered gauge TopicName, Namespace, Region sum 300 Number of messages successfully delivered from the Amazon SNS topics to subscribing endpoints.
NumberOfNotificationsFailed gauge TopicName, Namespace, Region sum 300 Number of messages that Amazon SNS failed to deliver.
NumberOfNotificationsFailedToRedriveToDlq gauge TopicName, Namespace, Region sum 300 Number of messages that could not be moved to a dead-letter queue.
NumberOfNotificationsFilteredOut gauge TopicName, Namespace, Region sum 300 Number of messages that were rejected by subscription filter policies. A filter policy rejects a message when the message attributes do not match the policy attributes.
NumberOfNotificationsFilteredOut-InvalidAttributes gauge TopicName, Namespace, Region sum 300 Number of messages that were rejected by subscription filter policies because the messages’ attributes are invalid. For example, the attribute JSON was formatted incorrectly.
NumberOfNotificationsFilteredOut-NoMessageAttribute gauge TopicName, Namespace, Region sum 300 Number of messages that were rejected by subscription filter policies because the messages have no attributes.
NumberOfNotificationsRedrivenToDlq gauge TopicName, Namespace, Region sum 300 Number of messages that have been moved to a dead-letter queue.
Owner attribute TopicName, Namespace, Region Amazon Web Services account ID of the topic’s owner.
PublishSize gauge bytes TopicName, Namespace, Region average 300 Size of messages being published.

AWS/TransitGateway

The AWS/TransitGateway service collects metrics from non-deleted Transit Gateways.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
BytesDropCountBlackhole gauge bytes TransitGateway, Namespace, Region average 300 Number of bytes dropped because they matched a blackhole route.
BytesDropCountBlackhole gauge bytes TransitGateway, TransitGatewayAttachment, Namespace, Region average 300 Number of bytes dropped because they matched a blackhole route on the transit gateway attachment.
BytesDropCountNoRoute gauge bytes TransitGateway, Namespace, Region average 300 Number of bytes dropped because they did not match a route.
BytesDropCountNoRoute gauge bytes TransitGateway, TransitGatewayAttachment, Namespace, Region average 300 Number of bytes dropped because they did not match a route on the transit gateway attachment.
BytesIn gauge bytes TransitGateway, Namespace, Region average 300 Number of bytes received by the transit gateway.
BytesIn gauge bytes TransitGateway, TransitGatewayAttachment, Namespace, Region average 300 Number of bytes received by the transit gateway from the attachment.
BytesOut gauge bytes TransitGateway, Namespace, Region average 300 Number of bytes sent from the transit gateway.
BytesOut gauge bytes TransitGateway, TransitGatewayAttachment, Namespace, Region average 300 Number of bytes sent from the transit gateway to the attachment.
Description attribute TransitGateway, Namespace, Region Description of the transit gateway.
OwnerId attribute TransitGateway, Namespace, Region ID of the Amazon Web Services account that owns the transit gateway.
PacketDropCountBlackhole gauge TransitGateway, Namespace, Region average 300 Number of packets dropped because they matched a blackhole route.
PacketDropCountBlackhole gauge TransitGateway, TransitGatewayAttachment, Namespace, Region average 300 Number of packets dropped because they matched a blackhole route on the transit gateway attachment.
PacketDropCountNoRoute gauge TransitGateway, Namespace, Region average 300 Number of packets dropped because they did not match a route.
PacketDropCountNoRoute gauge TransitGateway, TransitGatewayAttachment, Namespace, Region average 300 Number of packets dropped because they did not match a route on the transit gateway attachment.
PacketsIn gauge TransitGateway, Namespace, Region average 300 Number of packets received by the transit gateway.
PacketsIn gauge TransitGateway, TransitGatewayAttachment, Namespace, Region average 300 Number of packets received by the transit gateway from the attachment.
PacketsOut gauge TransitGateway, Namespace, Region average 300 Number of packets sent by the transit gateway.
PacketsOut gauge TransitGateway, TransitGatewayAttachment, Namespace, Region average 300 Number of packets sent by the transit gateway to the attachment.
State attribute TransitGateway, Namespace, Region State of the transit gateway.

AWS/Usage

The AWS/Usage service collects service quota usage metrics.

Metric name Metric type Dimension Statistic Period(s) Description
CallCount gauge Resource, Service, Type, Class, Namespace, Region sum 300 Number of specified operations performed in your account.
ErrorCount gauge Resource, Service, Type, Class, Namespace, Region sum 300
ResourceCount gauge Resource, Service, Type, Class, Namespace, Region sum 300
ThrottleCount gauge Resource, Service, Type, Class, Namespace, Region sum 300

AWS/VPN

The AWS/VPN service collects metrics from non-deleted Virtual Private Networks.

Metric name Metric type Unit name Dimension Statistic Period(s) Description
Category attribute VpnId, Namespace, Region Category of the VPN connection.
CustomerGatewayId attribute VpnId, Namespace, Region ID of the customer gateway at your end of the VPN connection.
State attribute VpnId, Namespace, Region Current state of the VPN connection.
TransitGatewayId attribute VpnId, Namespace, Region ID of the transit gateway associated with the VPN connection.
TunnelDataIn gauge bytes VpnId, Namespace, Region sum 300 Bytes received on the AWS side of the connection through the VPN tunnel from a customer gateway. Each metric data point represents the number of bytes received after the previous data point.
TunnelDataOut gauge bytes VpnId, Namespace, Region sum 300 Bytes sent from the AWS side of the connection through the VPN tunnel to the customer gateway. Each metric data point represents the number of bytes sent after the previous data point.
TunnelState gauge bytes VpnId, Namespace, Region average 300 State of the tunnels. For static VPNs, 0 indicates DOWN, while 1 indicates UP. For BGP VPNs, 1 indicates ESTABLISHED, while 0 is used for all other states. For both types of VPNs, values between 0 and 1 indicate at least one tunnel is not UP.
Type attribute VpnId, Namespace, Region Type of VPN connection.
VpnGatewayId attribute VpnId, Namespace, Region ID of the virtual private gateway at the Amazon Web Services side of the VPN connection.
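
Because TunnelState is collected with the average statistic, the value can be fractional. The following is a minimal sketch of how a consumer might interpret it, based only on the rules in the TunnelState description above. The function name and the returned labels are hypothetical.

```python
def describe_tunnel_state(average_value):
    """Map an averaged TunnelState value to a coarse status label.

    1   -> UP (static VPN) or ESTABLISHED (BGP VPN)
    0   -> DOWN (static VPN) or any non-established state (BGP VPN)
    0-1 -> at least one tunnel is not UP
    """
    if not 0.0 <= average_value <= 1.0:
        raise ValueError("TunnelState is expected to be between 0 and 1")
    if average_value == 1.0:
        return "up"
    if average_value == 0.0:
        return "down"
    return "degraded"


status_full = describe_tunnel_state(1.0)
status_partial = describe_tunnel_state(0.5)
status_none = describe_tunnel_state(0.0)
```

A threshold rule on this metric could, for example, alert on any value below 1.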

Required AWS plugin permissions

Service Permission
All Services cloudwatch:GetMetricData
All Services cloudwatch:ListMetrics
AWS/ApplicationELB elasticloadbalancingv2:DescribeLoadBalancers
AWS/ApplicationELB elasticloadbalancingv2:DescribeTags
AWS/AutoScaling autoscaling:DescribeAutoScalingGroups
AWS/Billing budgets:ViewBudget
AWS/Billing Billing Services
AWS/CertificateManager acm:ListCertificates
AWS/CertificateManager acm:DescribeCertificate
AWS/CertificateManager acm:ListTagsForCertificate
AWS/DynamoDB dynamodb:DescribeTable
AWS/EBS ebs:DescribeVolumes
AWS/EC2 ec2:DescribeInstances
AWS/ECS ecs:DescribeClusters
AWS/ECS ecs:ListClusters
AWS/ECS ecs:ListTagsForResource
AWS/EFS elasticfilesystem:DescribeFileSystems
AWS/EKS eks:DescribeClusters
AWS/EKS eks:DescribeNodegroup
AWS/EKS eks:ListClusters
AWS/EKS eks:ListNodegroups
AWS/ElastiCache elasticache:DescribeCacheClusters
AWS/ElastiCache elasticache:ListTagsForResource
AWS/ELB elasticloadbalancing:DescribeLoadBalancers
AWS/ELB elasticloadbalancing:DescribeTags
AWS/Events eventbridge:ListEventBuses
AWS/Events eventbridge:ListRules
AWS/Events eventbridge:ListTagsForResource
AWS/GatewayELB elasticloadbalancingv2:DescribeLoadBalancers
AWS/GatewayELB elasticloadbalancingv2:DescribeTags
AWS/Kinesis kinesis:ListStreams
AWS/Kinesis kinesis:DescribeStream
AWS/Kinesis kinesis:ListTagsForStream
AWS/KMS kms:ListKeys
AWS/KMS kms:ListAliases
AWS/KMS kms:ListResourceTags
AWS/Lambda lambda:ListFunctions
AWS/Lambda lambda:ListTags
AWS/Logs logs:DescribeLogGroups
AWS/Logs logs:DescribeSubscriptionFilters
AWS/Logs logs:ListTagsLogGroup
AWS/NATGateway ec2:DescribeNatGateways
AWS/NetworkELB elasticloadbalancingv2:DescribeLoadBalancers
AWS/NetworkELB elasticloadbalancingv2:DescribeTags
AWS/NetworkFirewall network-firewall:ListFirewalls
AWS/NetworkFirewall network-firewall:DescribeFirewall
AWS/RDS rds:DescribeDBInstances
AWS/S3 s3:GetBucketTagging
AWS/S3 s3:ListBucket
AWS/S3 s3:ListAllMyBuckets
AWS/S3 s3:GetBucketLocation
AWS/SNS sns:ListTopics
AWS/SNS sns:GetTopicAttributes
AWS/SNS sns:ListTagsForResource
AWS/TransitGateway ec2:DescribeTransitGateways
AWS/VPN ec2:DescribeVpnConnections
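
One way to apply the table above is to assemble an IAM policy document that contains only the permissions for the services being monitored. The sketch below is an illustration under stated assumptions: the `REQUIRED_ACTIONS` mapping holds a small subset of rows copied from the table, the helper name `build_policy` is hypothetical, and `"Resource": "*"` is an assumption that should be narrowed where your security policy requires it. Verify each action name against the IAM documentation before use.

```python
import json

# Subset of the permissions table, copied verbatim (illustrative only).
REQUIRED_ACTIONS = {
    "All Services": ["cloudwatch:GetMetricData", "cloudwatch:ListMetrics"],
    "AWS/RDS": ["rds:DescribeDBInstances"],
    "AWS/SNS": ["sns:ListTopics", "sns:GetTopicAttributes", "sns:ListTagsForResource"],
    "AWS/VPN": ["ec2:DescribeVpnConnections"],
}


def build_policy(services):
    """Build an IAM policy dict for the given monitored services."""
    # The CloudWatch permissions are required for every service.
    actions = list(REQUIRED_ACTIONS["All Services"])
    for service in services:
        actions.extend(REQUIRED_ACTIONS[service])
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": sorted(set(actions)),
                "Resource": "*",  # Assumption: restrict as needed.
            }
        ],
    }


example_policy = build_policy(["AWS/RDS", "AWS/SNS"])
policy_json = json.dumps(example_policy, indent=2)
```

The resulting `policy_json` string can be reviewed and attached to the role or user that the Collection Agent runs as.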

Endpoints accessed by AWS plugin

Service Endpoint Description
AWS/AutoScaling autoscaling.<region>.amazonaws.com
AWS/Billing budgets.amazonaws.com
AWS/Billing sts.amazonaws.com For identity purposes
AWS/CertificateManager acm.<region>.amazonaws.com
AWS/DynamoDB dynamodb.<region>.amazonaws.com
AWS/EC2 ec2.<region>.amazonaws.com
AWS/ECS ecs.<region>.amazonaws.com
AWS/EFS elasticfilesystem.<region>.amazonaws.com
AWS/EKS eks.<region>.amazonaws.com
AWS/ElastiCache elasticache.<region>.amazonaws.com
AWS/ELB, AWS/GatewayELB, AWS/NetworkELB, AWS/ApplicationELB elasticloadbalancing.<region>.amazonaws.com
AWS/Events events.<region>.amazonaws.com
AWS/Kinesis kinesis.<region>.amazonaws.com
AWS/KMS kms.<region>.amazonaws.com
AWS/Lambda lambda.<region>.amazonaws.com
AWS/Logs logs.<region>.amazonaws.com
AWS/NetworkFirewall network-firewall.<region>.amazonaws.com
AWS/RDS rds.<region>.amazonaws.com
AWS/S3 s3.<region>.amazonaws.com
AWS/S3 <bucket_name>.s3.<region>.amazonaws.com
AWS/SNS sns.<region>.amazonaws.com
CloudWatch Monitoring monitoring.<region>.amazonaws.com For accessing metrics via CloudWatch
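
When preparing firewall or proxy allowlists, the `<region>` placeholder in the table above needs to be replaced with each monitored region. The following is a small sketch of that substitution. The `ENDPOINT_TEMPLATES` mapping contains only a few rows from the table, and the function name and example regions are hypothetical.

```python
# A few endpoint templates copied from the table (illustrative subset).
ENDPOINT_TEMPLATES = {
    "AWS/RDS": "rds.<region>.amazonaws.com",
    "AWS/SNS": "sns.<region>.amazonaws.com",
    "AWS/Billing": "budgets.amazonaws.com",  # No region placeholder.
    "CloudWatch Monitoring": "monitoring.<region>.amazonaws.com",
}


def resolve_endpoints(services, regions):
    """Return the sorted set of hostnames for the services and regions."""
    hosts = set()
    for service in services:
        template = ENDPOINT_TEMPLATES[service]
        for region in regions:
            # Templates without a placeholder resolve to the same host.
            hosts.add(template.replace("<region>", region))
    return sorted(hosts)


allowlist = resolve_endpoints(
    ["AWS/RDS", "AWS/Billing", "CloudWatch Monitoring"],
    ["eu-west-1", "us-east-1"],
)
```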