Resource and hardware requirements
Make sure that you meet the following resource and hardware requirements for your configuration size before installing ITRS Analytics.
The size required by ITRS Analytics depends mainly on the message rate it needs to handle.
T-shirt sizing | Message rate (messages/sec) | ITRS Analytics entities | Indicative server range |
---|---|---|---|
Large | 100,000 | 250,000 | 3,000-9,000 |
Medium | 50,000 | 125,000 | 900-3,000 |
Small | 10,000 | 25,000 | 300-900 |
Warning
Beginning with ITRS Analytics 2.10.x, the Micro size is intended for development use only and is not suitable for production environments.
For current Geneos customers, you can find the message rate generated by any Gateway (version 5.14.0 and later) by configuring ITRS Analytics publishing in statistics-only mode. To determine the total required message rate, add up the message rates from all Gateways that share an ITRS Analytics instance.
If you do not have these statistics, you can initially use the sizing guidelines provided here. The estimated range of the number of servers that ITRS Analytics can handle is based on certain assumptions (see below) and an analysis of existing customer Gateways.
Indicative server range | Computation |
---|---|
Lower estimate | Based on conservative assumptions. |
Upper estimate | Actual message rates from various customer Gateways were used. Most of these Gateways use 20-second sampling and a wide range of plugins. |
You can use these estimates as a starting point, but validate them with actual statistics from your Gateways as soon as possible, since message rates can vary significantly between plugins.
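To pick a size from the table above, sum the message rates of all Gateways that will share the ITRS Analytics instance and choose the smallest tier whose limit covers the total. A minimal sketch of that selection logic (the `tshirt_size` helper is illustrative, not part of any ITRS tooling; the thresholds come from the "Messages per second limit" values):

```python
# Illustrative sketch: map an aggregate Gateway message rate to a t-shirt
# size. Thresholds mirror the sizing table; function names are hypothetical.

SIZES = [
    ("Small", 10_000),   # up to 10k messages/sec
    ("Medium", 50_000),  # up to 50k messages/sec
    ("Large", 100_000),  # up to 100k messages/sec
]

def tshirt_size(gateway_rates):
    """Sum per-Gateway message rates and pick the smallest size that fits."""
    total = sum(gateway_rates)
    for name, limit in SIZES:
        if total <= limit:
            return name, total
    raise ValueError(f"Total rate {total}/sec exceeds the Large limit")

# Three Gateways sharing one ITRS Analytics instance
print(tshirt_size([4_000, 12_000, 7_500]))
```

Remember that these statistics-based totals should replace the indicative server ranges as soon as real Gateway data is available.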
T-shirt sizing for HA-enabled
Large
Specification | Minimum requirement |
---|---|
ITRS Analytics entities | 250,000 |
Messages per second limit | 100,000 |
Messages per second target range | 50-100K |
Indicative server range | 3,000-9,000 |
Operating system | Linux |
CPU | |
RAM | |
Throughput | |
Disks (total estimate) | SSD required. See Sample configuration for AWS EC2 handling 100k metrics/sec (large). |
Note
The minimum resource and hardware requirements refer to the total requested amounts across all resources. However, the preflight checks verify total resource limits to ensure they are sufficient. For more information, see Kubernetes Limits vs. Requests.
Medium
Specification | Minimum requirement |
---|---|
ITRS Analytics entities | 125,000 |
Messages per second limit | 50,000 |
Messages per second target range | 10-50K |
Indicative server range | 900-3,000 |
Operating system | Linux |
CPU | |
RAM | |
Throughput | |
Disks (total estimate) | SSD required. See Sample configuration for AWS EC2 handling 50k metrics/sec (medium). |
Note
The minimum resource and hardware requirements refer to the total requested amounts across all resources. However, the preflight checks verify total resource limits to ensure they are sufficient. For more information, see Kubernetes Limits vs. Requests.
Small
Specification | Minimum requirement |
---|---|
ITRS Analytics entities | 25,000 |
Messages per second limit | 10,000 |
Messages per second target range | 0-10K |
Indicative server range | 300-900 |
Operating system | Linux |
CPU | 34 cores |
RAM | 80 GiB memory |
Throughput | 3,000 IOPS / 125 MB/s |
Disks (total estimate) | SSD required. See Sample configuration for AWS EC2 handling 10k metrics/sec (small). |
Note
The minimum resource and hardware requirements refer to the total requested amounts across all resources. However, the preflight checks verify total resource limits to ensure they are sufficient. For more information, see Kubernetes Limits vs. Requests.
T-shirt sizing for Non-HA
Warning
Beginning with ITRS Analytics 2.10.x, the Micro size is intended for development use only and is not suitable for production environments.
Small
Specification | Minimum requirement |
---|---|
ITRS Analytics entities | 25,000 |
Messages per second limit | 10,000 |
Messages per second target range | 0-10K |
Indicative server range | 300-900 |
Operating system | Linux |
CPU | 34 cores |
RAM | 80 GiB memory |
Throughput | 3,000 IOPS / 125 MB/s |
Note
The minimum resource and hardware requirements refer to the total requested amounts across all resources. However, the preflight checks verify total resource limits to ensure they are sufficient. For more information, see Kubernetes Limits vs. Requests.
Storage considerations
App installs
Important
Before installing the FIX Monitor app, consider the potential storage impact and ensure that sufficient storage is provisioned. Large session volumes can lead to a significant increase in PVC size, especially in a single-node setup where PVCs share the same storage.
Embedded cluster installs
The `/var/lib/embedded-cluster` directory, which is used for installation and data storage, should have at least 40 GiB of free space and should be no more than 80% full.
Note that PVCs are stored in the `/var/lib/embedded-cluster/openebs/local` subdirectory, which does not reserve a specific amount of space upfront: the folder uses almost no space until files are actually written, even when storage has been allocated to the PVCs. Make sure that the total volume of your PVCs will fit within this subdirectory. For example, if you have PVCs for Timescale (100 GB) and Kafka (100 GB), the directory needs to be at least 200 GB.
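The capacity rules above can be expressed as a single check: enough free space, not too full, and all PVC requests fitting within the remaining space. A minimal sketch (the `capacity_ok` helper is hypothetical; in practice you would feed it the values from `shutil.disk_usage("/var/lib/embedded-cluster")`):

```python
# Illustrative sketch: check the documented embedded-cluster requirements
# (>= 40 GiB free, <= 80% full) and that requested PVC sizes fit.
# Sizes are passed in explicitly so the logic is easy to verify.

GIB = 1024 ** 3

def capacity_ok(total_bytes, used_bytes, pvc_bytes):
    free = total_bytes - used_bytes
    if free < 40 * GIB:                  # at least 40 GiB must be free
        return False
    if used_bytes / total_bytes > 0.80:  # no more than 80% full
        return False
    return sum(pvc_bytes) <= free        # all PVCs must fit in free space

# Example: 500 GiB disk, 100 GiB used, Timescale + Kafka PVCs of 100 GiB each
print(capacity_ok(500 * GIB, 100 * GIB, [100 * GIB, 100 * GIB]))  # True
```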
When installing on an embedded cluster with limited space, you can relocate the data directory by passing the `--data-dir` flag to the install command. Specify the full path of the desired directory, since symlinks are currently not supported.
For example:
sudo ./[application-slug] install --license license.yaml --data-dir /log/lib/obcerv-data
Once the cluster is installed, the data directory can no longer be changed.
Preflight checks for Trident-based storage
These checks ensure that the assigned storage class meets best practices, avoiding potential issues that could impact stability and performance.
- Check the storage class associated with each workload. If the provisioner is `csi.trident.netapp.io`, validate the parameters for the presence of any of the following issues:
  - The `backendType` is set to `ontap-nas-economy`. According to the Trident documentation, `ontap-nas-economy` is not recommended for production.
  - The `provisioningType` is set to `thin`. When configured as `thin`, there is no strict guarantee that the requested storage will always be available.
  - The `snapshots` parameter is set to `true`. This setting can lead to rapid increases in disk utilization, potentially reaching 100% due to snapshot storage; it must be set to `false`.
- Recommended actions:
  - If any of these issues are detected, update the storage class parameters accordingly.
  - Consult the Trident documentation for best practices on configuring backend storage.
  - Monitor storage utilization closely, especially when enabling snapshots.
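The three Trident parameter rules can be sketched as a small validation function. This mirrors the documented checks only; it is not the actual preflight implementation, and the `trident_issues` helper and its inputs (a provisioner name plus the storage class `parameters` map, as you would read them from `kubectl get sc`) are illustrative:

```python
# Illustrative sketch: apply the three documented Trident checks to a
# storage class's parameters. Not the actual preflight code.

def trident_issues(provisioner, parameters):
    issues = []
    if provisioner != "csi.trident.netapp.io":
        return issues  # these checks only apply to Trident-provisioned classes
    if parameters.get("backendType") == "ontap-nas-economy":
        issues.append("backendType ontap-nas-economy is not recommended for production")
    if parameters.get("provisioningType") == "thin":
        issues.append("thin provisioning does not guarantee requested storage is available")
    if parameters.get("snapshots") == "true":
        issues.append("snapshots must be set to false to avoid rapid disk utilization growth")
    return issues

print(trident_issues("csi.trident.netapp.io",
                     {"backendType": "ontap-nas-economy", "snapshots": "true"}))
```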
Preflight checks for Disk I/O performance
The preflight Disk I/O performance check validates the performance of all configured storage classes using the `fio` (Flexible I/O Tester) benchmarking tool. The objective is to identify latency, throughput, or stability issues before they impact application workloads.
Note
This preflight check uses `fio` to measure disk sync latency and IOPS (input/output operations per second) by creating short-lived PVCs for each configured storage class. Enabling the Disk I/O performance test can considerably extend the overall runtime of the preflight checks.
For each storage class, the test suite runs four predefined `fio` profiles, designed to simulate a broad range of I/O workloads:
- Random reads
- Random writes
- Single-path access patterns
- Multi-path access patterns
Each `fio` profile runs for 60 seconds, giving a minimum test duration of 4 minutes per storage class (4 profiles × 60 seconds). The total runtime of the preflight check scales linearly with the number of configured storage classes in the environment.
Each storage class is evaluated against the following key Disk I/O performance metrics, along with their associated thresholds and severity levels.
Metric | Warning Threshold | Error Threshold | Interpretation |
---|---|---|---|
Data Sync Latency (99th percentile) | > 10 ms | > 100 ms | High latency indicates potential queuing or slow disk response |
IOPS (Read/Write Average) | < 3000 | < 2500 | Lower IOPS values suggest inadequate throughput |
Coefficient of Variation (IOPS) | ≥ 10% | ≥ 20% | High CV suggests unstable or erratic disk performance |
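The thresholds in the table can be applied mechanically to a set of `fio` results. A minimal sketch of that classification (the `disk_io_report` helper and its sample values are hypothetical, not the actual preflight code; note that latency and CV are worse when high, while IOPS is worse when low):

```python
# Illustrative sketch: classify fio results against the documented
# thresholds. Function names and inputs are hypothetical.

def disk_io_report(sync_p99_ms, iops_avg, iops_cv_pct):
    report = {}
    # Data sync latency (99th percentile): > 10 ms warns, > 100 ms errors
    report["sync_latency_p99"] = ("error" if sync_p99_ms > 100
                                  else "warning" if sync_p99_ms > 10 else "pass")
    # Average read/write IOPS: < 3000 warns, < 2500 errors
    report["iops"] = ("error" if iops_avg < 2500
                      else "warning" if iops_avg < 3000 else "pass")
    # Coefficient of variation of IOPS: >= 10% warns, >= 20% errors
    report["iops_cv"] = ("error" if iops_cv_pct >= 20
                         else "warning" if iops_cv_pct >= 10 else "pass")
    return report

print(disk_io_report(sync_p99_ms=8.2, iops_avg=3100, iops_cv_pct=12.0))
# → {'sync_latency_p99': 'pass', 'iops': 'pass', 'iops_cv': 'warning'}
```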
Non-strict preflight mode
Preflight checks at both the `Warning` and `Error` levels are currently configured in non-strict mode. This means they will not block the installation process, allowing for greater flexibility during deployment.
Enabled by default
This check is enabled by default but can be optionally disabled using a flag in Advanced Settings > Preflight Settings in the KOTS Admin Console.