Geneos

Gateway Log File

Overview

The Gateway writes all of its log messages to a log file, allowing quick access to past Gateway behaviour. These log messages include descriptions of what the Gateway is doing, as well as any errors that have occurred.

Each log message is written on a separate line, and starts with the date and time that it was generated.
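As a rough illustration of this line format, the sketch below splits one log line into its timestamp, severity, component, and message. The pattern is inferred from the sample entries shown later on this page (for example, <Tue Nov 24 10:59:34> INFO: ProbeManager ...); it is an assumption rather than a documented grammar, and the names LOG_LINE and parse_log_line are hypothetical.

```python
import re

# Pattern inferred from sample entries on this page; not an official grammar.
LOG_LINE = re.compile(
    r"^<(?P<timestamp>[^>]+)>\s*"   # e.g. <Tue Nov 24 10:59:34>
    r"(?P<severity>[A-Z]+):\s+"     # e.g. INFO:
    r"(?P<component>\S+)\s+"        # e.g. ProbeManager
    r"(?P<message>.*)$"             # remainder of the line
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.rstrip("\n"))
    return match.groupdict() if match else None


sample = (
    "<Tue Nov 24 10:59:34> INFO: ProbeManager "
    "Releasing licence for NP64 server-name 6370 (server-name:11108)"
)
print(parse_log_line(sample))
```

The timestamp is kept as a string because the sample entries do not include a year.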

Operation

Specify the log file name

To specify the log file name, either:

  • Use the LOG_FILENAME environment variable in the Gateway startup script or Gateway setup file. The example in How to create a start script for Gateway sets the log file name this way.
  • Use the -log command line option when starting the Gateway.

If both are set, the command line option overrides the LOG_FILENAME environment variable.
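The precedence rule can be sketched as follows. This is only an illustration of the rule stated above, not the Gateway's own argument handling; the function name resolve_log_filename is hypothetical, and the sketch assumes the file name follows -log as a separate argument.

```python
def resolve_log_filename(argv, env):
    """Pick the log file name: the -log option wins over LOG_FILENAME."""
    if "-log" in argv:
        index = argv.index("-log")
        if index + 1 < len(argv):
            return argv[index + 1]
    return env.get("LOG_FILENAME")  # None if neither is set


print(resolve_log_filename(["-log", "cli.log"], {"LOG_FILENAME": "env.log"}))  # cli.log
print(resolve_log_filename([], {"LOG_FILENAME": "env.log"}))  # env.log
```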

Use date and time specifiers in log file name

You can use date and time specifiers in the log file name. These specifiers are populated with the relevant date and time information during logging. By default, they are evaluated against the Gateway start time. If a roll time is set, they are evaluated against the roll time instead.

See Time Zones and Time Formats for a list of time specifiers.

Note: Using these specifiers can cause a number of log files to build up on disk. See Archive log files for more information.

For example, if you start a Gateway on 3 January 2013 at 09:30 with a command containing -log GatewayLogExample-%Y-%m-%d-%H-%M.log, the generated log file is GatewayLogExample-2013-01-03-09-30.log.
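The same expansion can be reproduced with Python's strftime, which happens to use the same codes for the specifiers in this example. This is a sketch for checking a file name pattern, assuming the Gateway's specifiers behave like strftime codes here; refer to Time Zones and Time Formats for the authoritative list.

```python
from datetime import datetime

pattern = "GatewayLogExample-%Y-%m-%d-%H-%M.log"
start_time = datetime(2013, 1, 3, 9, 30)  # Gateway start time from the example

log_name = start_time.strftime(pattern)
print(log_name)  # GatewayLogExample-2013-01-03-09-30.log
```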

Roll over due to maximum file size

To stop log files from getting excessively large, Gateway log files roll over when they reach a certain size limit.

Upon hitting this size limit, the active log file is archived. See Archive log files.

By default, the log file rolls over when it reaches 10485760 bytes (10 MB). However, it can be configured to roll over at a different size, up to a maximum of 2147483647 bytes (2 GB) with a 32-bit Gateway. To do this, either:

Roll over at a specified time of day

The Gateway log file can be set to roll over at a specified time of day via the -roll-time command line option. When this is set, the Gateway log file automatically rolls over after the specified time of day, at the point when a new log message comes in.

For example, if the roll time is set to 18:00 and the next log message arrives at 18:02, the current log file is closed and a new log file is created at 18:02. The log message is then written to the new file.

You may only set one roll time.

Note: When time-based rolling is active, any time specifiers used in the file name are evaluated against the roll time rather than the Gateway start time. This means that a new log file is not generated each time the Gateway is simply stopped and started. For example, if a Gateway is started at 09:00 on 02-Mar-2012 and the roll time is 10:00, the date and time used to generate the log file name is 10:00 01-Mar-2012.
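The rule in the note above can be sketched as "use the most recent roll time that is not later than the start time". The function name filename_timestamp is hypothetical, and the behaviour when the Gateway starts exactly at the roll time is an assumption of this sketch.

```python
from datetime import datetime, time, timedelta


def filename_timestamp(start, roll_time=None):
    """Return the moment used to expand specifiers in the log file name."""
    if roll_time is None:
        return start  # no time-based rolling: use the Gateway start time
    candidate = datetime.combine(start.date(), roll_time)
    if candidate > start:
        candidate -= timedelta(days=1)  # today's roll time has not passed yet
    return candidate


stamp = filename_timestamp(datetime(2012, 3, 2, 9, 0), time(10, 0))
print(stamp.strftime("%H:%M %d-%b-%Y"))  # 10:00 01-Mar-2012
```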

When rolling, if the new file name generated by the Gateway already exists, the existing file is archived. See Archive log files.

Archive log files

When the log file reaches its maximum size, or a roll time has been reached, a new log file is opened. When the Gateway starts a new log file, any existing file with the same name is renamed to <filename>.old.

To prevent a large number of log files being retained, only the latest .old file is kept. If <filename>.old already exists it is overwritten. Consequently, using a simple filename results in there only ever being two log files: the current log, and the old log.
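The effect of this rename rule can be illustrated with a short simulation. This is a sketch of the behaviour described above, not the Gateway's implementation; open_new_log is a hypothetical name.

```python
import os
import tempfile


def open_new_log(path):
    """Start a new log at `path`, keeping any previous file as `<path>.old`."""
    if os.path.exists(path):
        os.replace(path, path + ".old")  # overwrites an existing .old file
    return open(path, "w")


with tempfile.TemporaryDirectory() as workdir:
    log_path = os.path.join(workdir, "gateway.log")
    for generation in range(3):  # simulate three consecutive rollovers
        with open_new_log(log_path) as log:
            log.write(f"generation {generation}\n")
    print(sorted(os.listdir(workdir)))  # ['gateway.log', 'gateway.log.old']
```

However many times the log rolls, only the current file and one .old file remain.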

Using time specifiers in the filename usually results in a new filename that does not already exist, particularly if the full date is used. However, this may cause the number of files on disk to continue to increase.

Including only part of the date limits the number of possible file names. For example, including just the day of the week or the day of the month creates a weekly or monthly rotation of files. .old files may still be generated during a day if the maximum size is reached, so it may be advisable to increase the maximum size if the number of files on disk is limited by a rotation system.

A UNIX script can be called to move log files. The archive script can be used to:

  • Move or copy .old files into an archive elsewhere.
  • Prevent a large number of date/time based files building up by removing the older ones if the full date is specified.

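The UNIX script can be specified using either of the following:

  • The LOG_ARCHIVE_SCRIPT environment variable.
  • The operatingEnvironment > logArchiveScript setting.

Note: Using operatingEnvironment > logArchiveScript overrides LOG_ARCHIVE_SCRIPT (if set).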

The script is run when the Gateway switches to using a new log file. The name of the old log file is passed to the script, which is either:

  • <filename>.old, if the log file name did not change.
  • The old log <filename>, if the log file name did change.

If the log file name is changed to a file that already exists, the existing file is moved to <filename>.old. That file is not passed to the archive script.
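A minimal archive script might look like the sketch below, which moves the old log file into an archive directory and removes all but the most recent files. It assumes the old log file name arrives as the first command line argument; the archive directory, the retention count, and the function name are hypothetical and should be adapted to the local environment.

```python
#!/usr/bin/env python3
import shutil
import sys
from pathlib import Path

ARCHIVE_DIR = Path("/var/log/geneos/archive")  # hypothetical location
KEEP = 14  # hypothetical number of archived files to retain


def archive(old_log, archive_dir=ARCHIVE_DIR, keep=KEEP):
    """Move `old_log` into `archive_dir`, then delete all but the newest `keep` files."""
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(old_log), str(archive_dir / Path(old_log).name))
    by_age = sorted(archive_dir.iterdir(), key=lambda p: p.stat().st_mtime, reverse=True)
    for stale in by_age[keep:]:
        stale.unlink()


if __name__ == "__main__" and len(sys.argv) > 1:
    archive(sys.argv[1])  # assumes the file name is the first argument
```

Because the file keeps its name when moved, a repeated <filename>.old replaces the previous archived copy. Add a timestamp to the destination name if every copy should be kept.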

Log file connections

This section lists common log entries from a Gateway log file and errors that may occur when a component is disconnected from the Gateway.

When the Netprobe connection disconnects from the Gateway, these INFO messages might appear:

INFO messages and their descriptions:

  • Message: INFO: Translator ConManager Details: 'writeData()'; 'None'; 0; -808; 'getSockOpt() failed on fd 16 returns 32'
    Description: -808 means that a socket error has been identified, while "returns 32" means that the other end of the connection has been closed.
  • Message: INFO: ProbeManager Netprobe server-name 6370 (server:11108) Down
    <Tue Nov 24 10:59:34> INFO: ProbeManager Releasing licence for NP64 server-name 6370 (server-name:11108)
    Description: The server disconnects from the Gateway because it was detected as offline.
  • Message: INFO: ProbeManager Releasing sampler licences for NP64 server-name 6370 (server-name:11108)
    Description: The licence of the plug-in is revoked.
  • Message: INFO: ActionManager Action DataItem 'Print to a file' removed (variable=/geneos/gateway[(@name="GATEWAY_9370")]/directory/probe[(@name="NP64 server-name 6370")]/managedEntity[(@name="server-name")]/sampler[(@name="CPU")][(@type="")]/dataview[(@name="CPU")]/rows/row[(@name="Average_cpu")]/cell[(@column="percentUtilisation")])
    Description: The Action that fires on the disconnected Netprobe's dataview is disabled.

To resolve this, do the following:

  • Check if the Netprobe process is running.
  • Run PING and TRACEROUTE checks against the Netprobe server and port.

When the Active Console disconnects from the Gateway, these INFO messages might appear:

INFO messages and their descriptions:

  • Message: INFO: Translator ConManager Details: 'writeData()'; 'None'; 0; -808; 'getSockOpt() failed on fd 12 returns 104'
    Description: -808 means that a socket error has been identified, and "returns 104" means that the connection has been reset.
  • Message: INFO: UserManager User 'US\gban' from 192.168.100.23:56722 disconnected. Connection ID 1.
    Description: The Active Console user is disconnected from the Gateway.
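When scanning a log for disconnections, the ConManager detail lines can be decoded with a small helper such as the sketch below. The regular expression is inferred from the two sample messages on this page, and the meanings for 32 and 104 are those given in the descriptions above (they correspond to Linux errno numbering). The names SOCKET_ERRNO, DETAIL, and explain_disconnect are hypothetical.

```python
import re

# Meanings as described on this page (Linux errno numbering).
SOCKET_ERRNO = {
    32: "other end of the connection has been closed",
    104: "connection has been reset",
}

# Pattern inferred from the sample messages; not an official grammar.
DETAIL = re.compile(r"getSockOpt\(\) failed on fd (?P<fd>\d+) returns (?P<errno>\d+)")


def explain_disconnect(line):
    """Return (fd, errno, meaning) for a ConManager detail line, or None."""
    match = DETAIL.search(line)
    if match is None:
        return None
    code = int(match.group("errno"))
    return int(match.group("fd")), code, SOCKET_ERRNO.get(code, "unknown")


line = (
    "INFO: Translator ConManager Details: 'writeData()'; 'None'; 0; -808; "
    "'getSockOpt() failed on fd 12 returns 104'"
)
print(explain_disconnect(line))  # (12, 104, 'connection has been reset')
```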
   

Check if the Gateway is overloaded

If the disconnection occurs frequently, do any of the following:

Check if the Active Console is overloaded

If the Gateway is not overloaded, but the disconnection still occurs frequently, check the Active Console.

  1. Restart the Active Console.
  2. Create a new workspace.
  3. Connect to the affected Gateway and observe if the error still persists.

If the disconnection stops, then the original workspace is causing the issue.

To resolve this, do the following in the Active Console:

  • Remove the unused List views.
  • Remove the unused dashboards.
  • Only connect to the Gateways that are being monitored.
  • Optimise the XPaths on the List views and dashboards.