Data Sets (Time series)

Overview

Data sets represent a placeholder for different kinds of data collected by the Gateway. Time series is the only type available.

Time Series data sets can then be used as part of Anomaly Detection and Breach predictor functionality.

Create a Time Series

To create a Time Series:

Select Data sets from the navigation tree in the GSE.
Click New Time series.
Select the type of the time series. The type specifies whether the time series is database driven or gathers data from Gateway Hub.
Provide all required information. See Configuration reference for more information about available options.
Validate and click Save current document to save the changes.

Gateway Hub driven Time series

Geneos can use data sets generated by Gateway Hub. The specification for these data sets is set up using GSE, and the Gateway automatically manages their generation and retrieval.

To use Anomaly Detection rules with data sets generated from Gateway Hub, you should enable SSO authentication in Gateway Hub. For more information, see Connecting to Gateway Hub.

The Gateway uses SSO tokens to access the data set endpoints in Gateway Hub. It is possible to connect to a Gateway Hub without authentication, but this is insecure and should not be used in production environments. See Unauthenticated usage with an insecure Gateway Hub in the Centralised Gateways User Guide.

To use this method, select Gateway Hub driven as Type when setting up time series. For more information, see Type — Hub.

Database driven Time Series

The time series can be set up by a process external to the Gateway and stored in two tables in the database used for database logging. The tables are:

Table name	Description
`time_series_user_table`	Stores a set of names and unique IDs. The names map the time series defined in the Gateway setup to the IDs used in the `time_series_data_user_table`. There are two values per row: `name`: corresponds to the data set name defined in the Gateway and is unique; `time_series_id`: a unique ID used in the `time_series_data_user_table`.
`time_series_data_user_table`	Stores the time series data. There are three values per row: `time_series_id`: ID used to link the data back to the name in the `time_series_user_table`; `start_time`: start time for the `value` of a point in the time series data, in seconds since the start of the day; `value`: value of the time series at `start_time`.

The schema for these tables is available in the Gateway resources directory provided as part of the Gateway bundle. The data is read from the database at Gateway start time and at the reload time defined in the setup.

It is up to an external process to maintain and update the time series tables. This can be controlled using the Gateway scheduled command.
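As an illustration of such an external process, the following is a minimal sketch that fills the two tables with one day of hourly values. It uses an in-memory SQLite database purely as a stand-in so that it runs anywhere. The column names come from the table descriptions above, but the column types, the `load_series` helper, and the `cpu-baseline` name are assumptions. In a real deployment, create the tables from the schema file shipped with the Gateway and use the driver and SQL dialect of your logging database.

```python
import sqlite3

# Stand-in schema: column names follow the table descriptions above;
# the column types are assumptions, not the schema shipped with the Gateway.
SCHEMA = """
CREATE TABLE time_series_user_table (
    name           TEXT    NOT NULL UNIQUE,
    time_series_id INTEGER NOT NULL UNIQUE
);
CREATE TABLE time_series_data_user_table (
    time_series_id INTEGER NOT NULL,
    start_time     INTEGER NOT NULL,  -- seconds since the start of the day
    value          REAL    NOT NULL
);
"""


def load_series(conn, name, series_id, points):
    """Register a named time series and replace its stored points.

    points is an iterable of (start_time_seconds, value) pairs.
    """
    with conn:  # commits on success, rolls back on error
        # INSERT OR IGNORE is SQLite syntax; other databases differ.
        conn.execute(
            "INSERT OR IGNORE INTO time_series_user_table"
            " (name, time_series_id) VALUES (?, ?)",
            (name, series_id),
        )
        conn.execute(
            "DELETE FROM time_series_data_user_table WHERE time_series_id = ?",
            (series_id,),
        )
        conn.executemany(
            "INSERT INTO time_series_data_user_table"
            " (time_series_id, start_time, value) VALUES (?, ?, ?)",
            [(series_id, start, value) for start, value in points],
        )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# One value per hour: start times 0, 3600, ..., 82800 seconds since midnight.
hourly = [(hour * 3600, 10.0 + hour) for hour in range(24)]
load_series(conn, "cpu-baseline", 1, hourly)

# Read the series back by name, the way the name-to-ID mapping is meant to work.
rows = conn.execute(
    """
    SELECT d.start_time, d.value
    FROM time_series_data_user_table AS d
    JOIN time_series_user_table AS u ON u.time_series_id = d.time_series_id
    WHERE u.name = ?
    ORDER BY d.start_time
    """,
    ("cpu-baseline",),
).fetchall()
```

Replacing all points for a series inside one transaction means a Gateway reload never sees a half-written day of data.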

To use this method, select Database driven as Type when setting up the time series. For more information, see Configuration reference.

Prerequisites

Before you can configure the database driven time series, you need to configure the database tables:

Configure database logging. For more information, see MySQL configuration in Gateway Database Logging.
Ensure that the tables defined in <gateway directory>/resources/database/<database type>/time-series-schema-1.0.sql exist.
Insert data into those tables.

Configuration reference

Setting	Description	Mandatory	XML schema path
Time series	Time series model a day’s worth of data uploaded from the database.	No	timeSeries
Name	Specifies a name that you want to identify each time series with. If your time series are database driven, the name must correspond to one of the database tables you have configured.	Yes	timeSeries > name
Description	Specifies additional information about the time series. You can enter multi-line comments in the description field.	No	timeSeries > description
External	Specifies how the external data access is managed.		timeSeries > external
External > Reload Time	Specifies the time of day at which data is reloaded from the database or Gateway Hub each day.	No (defaults to the time at which the time series was created)	timeSeries > external > reloadTime
Type	Specifies whether the time series is database driven or gathers data from Gateway Hub. This setting has two options: Database driven and Gateway Hub driven. For the Gateway Hub options, see the section on the Gateway Hub driven type.	No	timeSeries > type

Type — Gateway Hub driven

This section provides more information about configuration options if you select your time series to be generated from Gateway Hub.

Algorithm — Seasonal-quick

You can specify the algorithm to use to generate the dataset. There is currently one algorithm available, Seasonal-quick:

Setting	Description
Entity query	Specifies the entity query using the entities filter syntax. For example, `user.COMPONENT=EMS`. This query finds all entities with the user-defined managed entity attribute `COMPONENT=EMS`. For more information on how to use the entity filter syntax, see Entity Filter Syntax. Note: In GSE you should only use quotes. Do not escape the quotes with backslashes because the JSON formatter in Gateway does that for you.
Metrics	Data you want to query. This is a sequence of raw metric names to be included in the resulting metric time series. Example: `cpu/cpu/Average_cpu/percentUtilisation`. For more information about retrieving metric data, see Metric Query Example. Note: In GSE you should only use quotes. Do not escape the quotes with backslashes because the JSON formatter in Gateway does that for you.
Aggregations	Aggregations are calculations you wish to perform on the data. The options are: `count`, `sum`, `min`, `max`, `stddev`, `avg`, `var`, `percentile`. For aggregations that can be set to a specified level (for example, `percentile`), you must also specify an `alias` to distinguish between multiple entries. Percentile levels are set between `0` and `100`. Aggregations are provided by Gateway Hub. For more information, see Aggregations in Retrieve data from Gateway Hub. Note: The percentile aggregation available here uses the `itrs.tdigest.percentile` operation provided by the Gateway Hub API.
Granularity	Sets the length of the interval that each value (bucket) in the time series represents. You can choose from the drop-down menu, or enter a number followed by a length of time. The options in the drop-down menu are: `1 minute`, `5 minutes`, `15 minutes`, `1 hour`, `3 hours`, `12 hours`, `1 day`.
Period	The length of the cycle that defines seasonality, that is, the period over which you expect the values to repeat. The options are: `Day`, `Week`.
Periods	Number of periods. It allows you to say how many recent periods (days or weeks) you want the data to be based on.
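To make the interaction between Granularity and Period more concrete, the following is a minimal sketch of one way a seasonal profile could be built from raw samples. It is not Gateway Hub's implementation of Seasonal-quick, which is not described here. The `seasonal_profile` function, its arguments, and the choice of `avg` as the aggregation are all assumptions made for illustration. Limiting the input to the most recent N periods would correspond to the Periods setting.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def seasonal_profile(samples, granularity, period):
    """Average samples into buckets by their offset within the period.

    samples: iterable of (naive datetime, float) pairs.
    granularity, period: timedelta values, for example 5 minutes and 1 day.
    Returns {bucket_start_in_seconds_from_period_start: average_value}.
    """
    # Reference point: a Monday at midnight, so daily periods start at
    # midnight and weekly periods start on Monday (an assumption).
    reference = datetime(1970, 1, 5)
    bucket_seconds = int(granularity.total_seconds())
    period_seconds = int(period.total_seconds())

    buckets = defaultdict(list)
    for timestamp, value in samples:
        offset = int((timestamp - reference).total_seconds()) % period_seconds
        bucket_start = offset - offset % bucket_seconds
        buckets[bucket_start].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


# Three days of data with two samples per day, at 00:02 and 00:07.
samples = []
first_day = datetime(2024, 1, 1)
for day in range(3):
    midnight = first_day + timedelta(days=day)
    samples.append((midnight + timedelta(minutes=2), 10.0 * (day + 1)))
    samples.append((midnight + timedelta(minutes=7), 20.0 * (day + 1)))

profile = seasonal_profile(samples, timedelta(minutes=5), timedelta(days=1))
```

With a 5 minute granularity and a period of one day, samples taken at the same time on different days fall into the same bucket, so `profile` holds one averaged value for the 00:00 bucket and one for the 00:05 bucket.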

Period settings example

Here’s an example of how to use period settings (granularity, period, and periods):

You want to write a rule which will compare the current value of an item with a typical value for this time of day, based on data from the last 60 days.

Granularity allows you to choose how long an interval each value in the time series represents. In this case, you should choose 5 minutes or 15 minutes.
Period allows you to choose the period over which you expect the values to repeat: does your time series represent a typical day or a typical week? In this use case, you want to choose a period of day. If the value you are monitoring varies a lot between working and non-working days, and you have enough historical data available, you should choose week.
Periods setting allows you to say how many recent days or weeks you want the data to be based on. In this use case, you want 60 days, so you should set this parameter to 60.
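The arithmetic behind these choices can be sketched as follows. The `series_shape` helper is hypothetical and only illustrates how the three settings relate: the number of values in the time series is the period divided by the granularity, and the amount of history used is the period multiplied by the number of periods.

```python
from datetime import timedelta


def series_shape(granularity, period, periods):
    """Return (values per period, history covered) for the given settings."""
    values_per_period = period // granularity  # timedelta // timedelta gives an int
    history = period * periods
    return values_per_period, history


DAY = timedelta(days=1)
WEEK = timedelta(weeks=1)

# The use case above: a typical day, based on the last 60 days.
five_minute = series_shape(timedelta(minutes=5), DAY, 60)
fifteen_minute = series_shape(timedelta(minutes=15), DAY, 60)

# A typical week at 15 minute granularity, based on the last 8 weeks.
weekly = series_shape(timedelta(minutes=15), WEEK, 8)
```

A daily period gives 288 values at 5 minutes or 96 values at 15 minutes, each drawn from 60 days of history. A weekly period at 15 minutes gives 672 values, so it needs correspondingly more historical data to fill every bucket.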

Time to live

This specifies the length of time the time series is valid for. The options are:

1 hour
3 hours
1 day
1 week

When the data is passed from Gateway to Gateway Hub, the default value is 1 day.

The data set is automatically deleted after the time to live value expires.
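As a small illustration, the expiry time implied by each option can be worked out as below. This sketch assumes the time to live is counted from the moment the data set is generated, which is not stated above, and the `expires_at` helper is hypothetical.

```python
from datetime import datetime, timedelta

# The four documented options, mapped to durations.
TIME_TO_LIVE = {
    "1 hour": timedelta(hours=1),
    "3 hours": timedelta(hours=3),
    "1 day": timedelta(days=1),
    "1 week": timedelta(weeks=1),
}


def expires_at(generated, ttl="1 day"):
    """Return when a data set generated at `generated` would expire.

    Assumes the time to live starts at generation time; "1 day" is the
    documented default.
    """
    return generated + TIME_TO_LIVE[ttl]


generated = datetime(2024, 1, 1, 9, 0)
default_expiry = expires_at(generated)
short_expiry = expires_at(generated, "3 hours")
```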
