Dynamic Thresholds issues
This documentation outlines common issues and troubleshooting steps for configuring Dynamic Thresholds with Results Exporter and ITRS Analytics. For configuration instructions, refer to Configuring Dynamic Thresholds.
Data is not being exported to ITRS Analytics
This can occur for several reasons, including the following:
- The host or port of the ITRS Analytics ingestion service is incorrect.
- The ITRS Analytics ingestion username or password is incorrect.
- The ITRS Analytics ingestion certificate is either in an incorrect format or not readable by the opsview user.
- The ITRS Analytics ingestion certificate the Results Exporter is configured to expect is not the same as the one used by the ITRS Analytics ingestion service. For example, if the certificate has changed since the Results Exporter was configured, it will recognize the mismatch and refuse to connect.
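Two of the causes above, an unreachable host or port and an unreadable or malformed certificate file, can be checked quickly from the Opsview host. The following is a minimal sketch rather than an official tool. The function names are hypothetical, the host, port, and certificate path must come from your own Results Exporter configuration, and the header check assumes the certificate is expected in PEM format, which this page does not state.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; none of these names come from Opsview.

# Succeeds if a TCP connection to HOST:PORT opens within 5 seconds.
# Uses bash's built-in /dev/tcp redirection, so it needs bash, not sh.
check_port() {
    local host="$1" port="$2"
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Succeeds if FILE is readable by the current user and contains a PEM
# certificate header (assumption: PEM is the expected format).
# Run it as the opsview user to test that user's access.
check_cert_file() {
    local file="$1"
    [ -r "$file" ] || { echo "not readable: $file" >&2; return 1; }
    grep -q -- '-----BEGIN CERTIFICATE-----' "$file" \
        || { echo "no PEM certificate header: $file" >&2; return 1; }
}
```

Call `check_port <host> <port>` and, as the opsview user, `check_cert_file <path>`. A non-zero exit status points at the corresponding cause. Neither check validates the username, the password, or whether the certificate matches the one used by the service.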
Note
We recommend investigating the underlying error using the Results Exporter logs. If the cause of the issue is still unclear, review and validate the listed possible causes. To ensure proper configuration of values, refer to the instructions at Configure Results Exporter in Opsview.
In some cases, datapoints are dropped instead of being exported to ITRS Analytics, even when the Results Exporter is configured correctly. This can happen when the result messages contain invalid data that ITRS Analytics would block. The Results Exporter logs provide additional details to help diagnose the issue.
Investigating issues using the Results Exporter logs
By default, the NOTICE log level in the Results Exporter shows high-level information about errors. For example:
[ERR] Worker-0 obcerv-dynamic_threshold_metrics task failed to send data to Obcerv
[ERR] OutputObcervTask obcerv-dynamic_threshold_metrics worker 0 - Exception: RpcError when sending data: failed to connect to all addresses.
To gain further information about the underlying cause of the errors, you can enable DEBUG logging. This will also explain the reasons for any datapoints being dropped during the processing of messages. For more details on how to modify the log level, see Results Exporter Logging Settings.
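Once DEBUG logging is enabled, the log becomes much noisier, so a small filter for the error lines can help. This is a sketch under assumptions: the function names are hypothetical, the helpers read log text from standard input because this page does not state the log file location, and they rely only on the `[ERR]` marker shown in the example above.

```shell
# Hypothetical helper: print only error-level lines from log text on stdin.
# -F treats "[ERR]" as a fixed string rather than a bracket expression.
resultsexporter_errors() {
    grep -F '[ERR]'
}

# Hypothetical helper: print how many error-level lines stdin contains.
resultsexporter_error_count() {
    grep -cF '[ERR]'
}
```

For example, `resultsexporter_errors < <log file>` prints the error lines, where `<log file>` stands for wherever your installation writes the Results Exporter log.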
Using sensitive trace logging
For gRPC connection errors, you may need to enable sensitive_trace logging for full investigation. For more details on how to modify the log level, see Results Exporter Logging Settings.
The DEBUG log level must also be enabled for this to take effect. Be aware that trace logging can print sensitive information from your connection parameters. To view the trace information, first stop the Results Exporter component completely:
/opt/opsview/watchdog/bin/opsview-monit stop opsview-resultsexporter
Then run the component in non-daemon mode as the opsview user:
/opt/opsview/resultsexporter/venv3/bin/resultsexporterlauncher -d
All logging, including the trace information, will then be printed to stdout. This can help diagnose certificate mismatch issues. For example:
Handshake failed with fatal error SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED.
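When the trace shows a verification failure like this, one way to confirm a mismatch is to compare the fingerprint of the certificate file the Results Exporter is configured with against the certificate the ingestion service actually presents. The sketch below uses the standard `openssl x509` and `openssl s_client` commands. The function names are hypothetical, the file path, host, and port are placeholders for your own values, and it assumes the service presents its certificate on an ordinary TLS handshake.

```shell
# Hypothetical helpers for comparing certificate fingerprints.

# Reduce openssl's "sha256 Fingerprint=AA:BB:..." line to a bare
# lowercase hex digest so two fingerprints can be compared as strings.
fingerprint_value() {
    sed -n 's/^.*Fingerprint=//p' | tr -d ':' | tr 'A-F' 'a-f'
}

# SHA-256 fingerprint of a local certificate file.
local_fingerprint() {
    openssl x509 -in "$1" -noout -fingerprint -sha256 | fingerprint_value
}

# SHA-256 fingerprint of the certificate presented by HOST:PORT.
remote_fingerprint() {
    openssl s_client -connect "$1:$2" -servername "$1" </dev/null 2>/dev/null \
        | openssl x509 -noout -fingerprint -sha256 | fingerprint_value
}
```

If `local_fingerprint <certificate file>` and `remote_fingerprint <host> <port>` print different values, the certificate has most likely changed since the Results Exporter was configured.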
Dynamic Thresholds configuration page is missing
The Dynamic Thresholds configuration page may not be accessible in the navigation bar if the Dynamic Thresholds feature is not included in your Opsview license.
Please check Managing your Subscription/Checking your Subscription Details for more information.
Host-services are missing in the Dynamic Thresholds search
Host-services must be compatible with the Dynamic Thresholds feature to appear in the configuration page search results. Compatibility is determined by the following criteria:
- Host-service must be returning performance metrics.
- Service check definition must contain one of the Dynamic Thresholds macros: $WARNINGDT or $CRITICALDT.
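The second criterion can be checked mechanically if you have the service check's argument string to hand. The following sketch is illustrative only: the function name is hypothetical, and it performs a plain substring match on the two macro names given above rather than parsing the check definition.

```shell
# Hypothetical helper: succeeds if the given service check argument string
# contains either Dynamic Thresholds macro named above.
# The single quotes keep the shell from expanding "$WARNINGDT" as a variable.
has_dt_macro() {
    case "$1" in
        *'$WARNINGDT'*|*'$CRITICALDT'*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `has_dt_macro '-w 80 -c 90'` fails because neither macro is present, while an argument string containing `$CRITICALDT` succeeds. Remember to single-quote the argument so the macro reaches the function unexpanded.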
My enabled host-services are in an Awaiting Data state
If a host-service’s dynamic threshold configuration is showing Awaiting Data, it’s possible that service metric data isn’t reaching ITRS Analytics. Use the ITRS Analytics Web Console to verify that the correct host-services are being exported, and check the Opsview log files for any related errors.
Log message details
These are common log messages you might see when Dynamic Thresholds configuration encounters an error or issue.
Nothing to do right now
INFO [opsview.opsview.processors.obcervimporter] ObcervImporterProcessor checking objects ...
INFO [opsview.opsview.processors.obcervimporter] Nothing to do right now
All host-services using Dynamic Thresholds have been refreshed within the last min_object_update_interval_secs seconds, and there have been no changes to which host-services are enabled or disabled for this feature. The obcervimporter will perform another check later.
Failed to extract result for 'Host.Service (oid=707) metric: response = {}
INFO [opsview.opsview.processors.obcervimporter] ObcervImporterProcessor checking objects ...
WARNING [opsview.obcervimporter.opsview.obcervimporter.obcervclient] Failed to extract result for 'hostA.'ServiceA' (oid=7) metricA': response={}
WARNING [opsview.obcervimporter.opsview.obcervimporter.obcervclient] Failed to extract result for 'hostB.'ServiceB' (oid=97) metricB': response={}
Note
As discussed in the Limitations section, Dynamic Thresholds work optimally with host-services that have at least a week’s worth of data in ITRS Analytics. For newly monitored host-services, sufficient data for threshold calculations might not yet be available. In such cases, the obcervimporter will periodically retry and log a warning until enough data is collected.
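Because these warnings repeat on every retry, it can be useful to reduce them to a unique list of affected metrics. The sketch below is an assumption-laden example: the function name is hypothetical, it reads log text from standard input, and its pattern is derived only from the two sample warnings shown above, so other message variants may not match.

```shell
# Hypothetical helper: list each host, service, object ID, and metric that
# appears in a "Failed to extract result" warning, one unique entry per line.
awaiting_data_metrics() {
    sed -n "s/^.*Failed to extract result for '\([^']*\)\.'\([^']*\)' (oid=\([0-9]*\)) \([^']*\)'.*/host=\1 service=\2 oid=\3 metric=\4/p" \
        | sort -u
}
```

Feeding the log through `awaiting_data_metrics` shows at a glance which host-services are still waiting for enough data, and whether the list shrinks over time.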