Resource and hardware requirements
Make sure that you meet the following resource and hardware requirements for your configuration size before installing ITRS Analytics.
The size required by ITRS Analytics depends mainly on the message rate it needs to handle.
T-shirt sizing | Message rate (messages/sec) | ITRS Analytics entities | Indicative server range |
---|---|---|---|
Large | 100,000 | 250,000 | 3,000-9,000 |
Medium | 50,000 | 125,000 | 900-3,000 |
Small | 10,000 | 25,000 | 300-900 |
Warning
Beginning with ITRS Analytics 2.10.x, the Micro size is intended for development use only and is not suitable for production environments.
For current Geneos customers, you can find the message rate generated by any Gateway (version 5.14.0 and later) by configuring ITRS Analytics publishing in statistics-only mode. To determine the total required message rate, add up the message rates from all Gateways that share an ITRS Analytics instance.
If you do not have these statistics, you can initially use the sizing guidelines provided here. The estimated range of the number of servers that ITRS Analytics can handle is based on certain assumptions (see below) and an analysis of existing customer Gateways.
Indicative server range | Computation |
---|---|
Lower estimate | Based on conservative assumptions. |
Upper estimate | Actual message rates from various customer Gateways were used. Most of these Gateways use 20-second sampling and a wide range of plugins. |
You can use these estimates as a starting point, but validate them with actual statistics from your Gateways as soon as possible, since message rates can vary significantly between plugins.
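To pick a size from the table above, sum the message rates of all Gateways that will share the ITRS Analytics instance and choose the smallest tier whose limit covers the total. A minimal sketch of that selection logic (the `tshirt_size` helper is illustrative, not part of any ITRS tooling; the thresholds come from the "Messages per second limit" values):

```python
# Illustrative sketch: map an aggregate Gateway message rate to a t-shirt
# size. Thresholds mirror the sizing table; function names are hypothetical.

SIZES = [
    ("Small", 10_000),   # up to 10k messages/sec
    ("Medium", 50_000),  # up to 50k messages/sec
    ("Large", 100_000),  # up to 100k messages/sec
]

def tshirt_size(gateway_rates):
    """Sum per-Gateway message rates and pick the smallest size that fits."""
    total = sum(gateway_rates)
    for name, limit in SIZES:
        if total <= limit:
            return name, total
    raise ValueError(f"Total rate {total}/sec exceeds the Large limit")

# Three Gateways sharing one ITRS Analytics instance
print(tshirt_size([4_000, 12_000, 7_500]))
```

Remember that these statistics-based totals should replace the indicative server ranges as soon as real Gateway data is available.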
T-shirt sizing for HA-enabled
Large
Specification | Minimum requirement |
---|---|
ITRS Analytics entities | 250,000 |
Messages per second limit | 100,000 |
Messages per second target range | 50-100K |
Indicative server range | 3,000-9,000 |
Operating system | Linux |
CPU | |
RAM | |
Throughput | |
Disks (total estimate) | SSD required. See Sample configuration for AWS EC2 handling 100k metrics/sec (large). |
Note
The minimum resource and hardware requirements refer to the total requested amounts across all resources. However, the preflight checks verify total resource limits to ensure they are sufficient. For more information, see Kubernetes Limits vs. Requests.
Medium
Specification | Minimum requirement |
---|---|
ITRS Analytics entities | 125,000 |
Messages per second limit | 50,000 |
Messages per second target range | 10-50K |
Indicative server range | 900-3,000 |
Operating system | Linux |
CPU | |
RAM | |
Throughput | |
Disks (total estimate) | SSD required. See Sample configuration for AWS EC2 handling 50k metrics/sec (medium). |
Note
The minimum resource and hardware requirements refer to the total requested amounts across all resources. However, the preflight checks verify total resource limits to ensure they are sufficient. For more information, see Kubernetes Limits vs. Requests.
Small
Specification | Minimum requirement |
---|---|
ITRS Analytics entities | 25,000 |
Messages per second limit | 10,000 |
Messages per second target range | 0-10K |
Indicative server range | 300-900 |
Operating system | Linux |
CPU | 34 cores |
RAM | 80 GiB memory |
Throughput | 3,000 IOPS / 125 MB/s |
Disks (total estimate) | SSD required. See Sample configuration for AWS EC2 handling 10k metrics/sec (small). |
Note
The minimum resource and hardware requirements refer to the total requested amounts across all resources. However, the preflight checks verify total resource limits to ensure they are sufficient. For more information, see Kubernetes Limits vs. Requests.
T-shirt sizing for Non-HA
Warning
Beginning with ITRS Analytics 2.10.x, the Micro size is intended for development use only and is not suitable for production environments.
Small
Specification | Minimum requirement |
---|---|
ITRS Analytics entities | 25,000 |
Messages per second limit | 10,000 |
Messages per second target range | 0-10K |
Indicative server range | 300-900 |
Operating system | Linux |
CPU | 34 cores |
RAM | 80 GiB memory |
Throughput | 3,000 IOPS / 125 MB/s |
Note
The minimum resource and hardware requirements refer to the total requested amounts across all resources. However, the preflight checks verify total resource limits to ensure they are sufficient. For more information, see Kubernetes Limits vs. Requests.
Storage considerations
App installs
Important
Before installing the FIX Monitor app, consider the potential storage impact and ensure that sufficient storage is provisioned. Large session volumes can lead to a significant increase in PVC size, especially in a single-node setup where PVCs share the same storage.
Embedded cluster installs
The `/var/lib/embedded-cluster` directory, which is used for installation and data storage, should have at least 40 GiB of free space and should be no more than 80% full.
Note that PVCs are stored in the `/var/lib/embedded-cluster/openebs/local` subdirectory, which does not reserve a specific amount of space upfront: the folder uses almost no space until files are actually written, even when storage has been allocated to the PVCs. Make sure that the total volume of your PVCs will fit within this subdirectory. For example, if you have PVCs for Timescale (100 GB) and Kafka (100 GB), the directory needs to be at least 200 GB.
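The capacity rules above can be expressed as a single check: enough free space, not too full, and all PVC requests fitting within the remaining space. A minimal sketch (the `capacity_ok` helper is hypothetical; in practice you would feed it the values from `shutil.disk_usage("/var/lib/embedded-cluster")`):

```python
# Illustrative sketch: check the documented embedded-cluster requirements
# (>= 40 GiB free, <= 80% full) and that requested PVC sizes fit.
# Sizes are passed in explicitly so the logic is easy to verify.

GIB = 1024 ** 3

def capacity_ok(total_bytes, used_bytes, pvc_bytes):
    free = total_bytes - used_bytes
    if free < 40 * GIB:                  # at least 40 GiB must be free
        return False
    if used_bytes / total_bytes > 0.80:  # no more than 80% full
        return False
    return sum(pvc_bytes) <= free        # all PVCs must fit in free space

# Example: 500 GiB disk, 100 GiB used, Timescale + Kafka PVCs of 100 GiB each
print(capacity_ok(500 * GIB, 100 * GIB, [100 * GIB, 100 * GIB]))  # True
```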
When installing on an embedded cluster with limited space, you can relocate the data directory by passing the `--data-dir` flag to the install command. Specify the full path of the desired directory, since symlinks are currently not supported.
For example:
sudo ./[application-slug] install --license license.yaml --data-dir /log/lib/obcerv-data
Once the cluster is installed, the data directory can no longer be changed.
Preflight checks for Trident-based storage
These checks ensure that the assigned storage class meets best practices, avoiding potential issues that could impact stability and performance.
- Check the storage class associated with each workload. If the provisioner is `csi.trident.netapp.io`, validate the parameters for the presence of any of the following issues:
  - The `backendType` is set to `ontap-nas-economy`. According to the Trident documentation, `ontap-nas-economy` is not recommended for production.
  - The `provisioningType` is set to `thin`. When configured as `thin`, there is no strict guarantee that the requested storage will always be available.
  - The `snapshots` parameter is set to `true`. This setting can lead to rapid increases in disk utilization, potentially reaching 100% due to snapshot storage; it must be set to `false`.
- Recommended actions:
  - If any of these issues are detected, update the storage class parameters accordingly.
  - Consult the Trident documentation for best practices on configuring backend storage.
  - Monitor storage utilization closely, especially when enabling snapshots.
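The three Trident parameter rules can be sketched as a small validation function. This mirrors the documented checks only; it is not the actual preflight implementation, and the `trident_issues` helper and its inputs (a provisioner name plus the storage class `parameters` map, as you would read them from `kubectl get sc`) are illustrative:

```python
# Illustrative sketch: apply the three documented Trident checks to a
# storage class's parameters. Not the actual preflight code.

def trident_issues(provisioner, parameters):
    issues = []
    if provisioner != "csi.trident.netapp.io":
        return issues  # these checks only apply to Trident-provisioned classes
    if parameters.get("backendType") == "ontap-nas-economy":
        issues.append("backendType ontap-nas-economy is not recommended for production")
    if parameters.get("provisioningType") == "thin":
        issues.append("thin provisioning does not guarantee requested storage is available")
    if parameters.get("snapshots") == "true":
        issues.append("snapshots must be set to false to avoid rapid disk utilization growth")
    return issues

print(trident_issues("csi.trident.netapp.io",
                     {"backendType": "ontap-nas-economy", "snapshots": "true"}))
```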
Preflight checks for Disk I/O performance
The preflight Disk I/O performance check validates the performance of all configured storage classes using the `fio` (Flexible I/O Tester) benchmarking tool. The objective is to identify latency, throughput, or stability issues before they impact application workloads.
Note
This preflight check uses `fio` to measure disk sync latency and IOPS (input/output operations per second) by creating short-lived PVCs for each configured storage class. Enabling the Disk I/O performance test can considerably extend the overall runtime of the preflight checks.
For each storage class, the test suite runs four predefined `fio` profiles, designed to simulate a broad range of I/O workloads:
- Random reads
- Random writes
- Single-path access patterns
- Multi-path access patterns
Each `fio` profile runs for 60 seconds, giving a minimum test duration of 4 minutes per storage class (4 profiles × 60 seconds). The total runtime of the preflight check scales linearly with the number of configured storage classes in the environment.
Each storage class is evaluated against the following key Disk I/O performance metrics, along with their associated thresholds and severity levels.
Metric | Warning Threshold | Error Threshold | Interpretation |
---|---|---|---|
Data Sync Latency (99th percentile) | > 10 ms | > 100 ms | High latency indicates potential queuing or slow disk response |
IOPS (Read/Write Average) | < 3000 | < 2500 | Lower IOPS values suggest inadequate throughput |
Coefficient of Variation (IOPS) | ≥ 10% | ≥ 20% | High CV suggests unstable or erratic disk performance |
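The thresholds in the table can be applied mechanically to a set of `fio` results. A minimal sketch of that classification (the `disk_io_report` helper and its sample values are hypothetical, not the actual preflight code; note that latency and CV are worse when high, while IOPS is worse when low):

```python
# Illustrative sketch: classify fio results against the documented
# thresholds. Function names and inputs are hypothetical.

def disk_io_report(sync_p99_ms, iops_avg, iops_cv_pct):
    report = {}
    # Data sync latency (99th percentile): > 10 ms warns, > 100 ms errors
    report["sync_latency_p99"] = ("error" if sync_p99_ms > 100
                                  else "warning" if sync_p99_ms > 10 else "pass")
    # Average read/write IOPS: < 3000 warns, < 2500 errors
    report["iops"] = ("error" if iops_avg < 2500
                      else "warning" if iops_avg < 3000 else "pass")
    # Coefficient of variation of IOPS: >= 10% warns, >= 20% errors
    report["iops_cv"] = ("error" if iops_cv_pct >= 20
                         else "warning" if iops_cv_pct >= 10 else "pass")
    return report

print(disk_io_report(sync_p99_ms=8.2, iops_avg=3100, iops_cv_pct=12.0))
# → {'sync_latency_p99': 'pass', 'iops': 'pass', 'iops_cv': 'warning'}
```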
Non-strict preflight mode
Preflight checks at both the `Warning` and `Error` levels are currently configured in non-strict mode. This means they will not block the installation process, allowing for greater flexibility during deployment.
Enabled by default
This check is enabled by default but can be optionally disabled using a flag in Advanced Settings > Preflight Settings in the KOTS Admin Console.