Geneos

Top Plug-in

Introduction

The Geneos Top plug-in monitors systems for resource-hungry tasks.

It works by displaying processes that breach a configurable threshold value (e.g. 20% CPU usage).

Processes are said to be in breach when they exceed the threshold value and, optionally, are among the top N processes to exceed the threshold, where N is a configurable value (maxRows). The plug-in can also display processes that were recently in breach. These are referred to as tailed off and are displayed for a tail off period.

View

The plug-in produces a single view. It displays two headline variables: a summary of the breach condition and the number of processes currently in breach. The table displays processes in breach and, if a Tail Off Period is specified, processes recently in breach. The columns are configurable except for the first column, name, which always displays the process name and PID.

The view can be configured on the Advanced tab > Process parameters to display any of the columns listed below.


Headline Legend

Name Description
summary A summary of what is being displayed
numBreached The number of breached processes being displayed

Table Legend

Name Description
name The name of the process and the PID. This creates a unique identifier for the process.
percentCPU % of recent cpu used by this process instance.
virtualMemory Virtual memory size of this process instance.
residentSetSize Resident set size of this process instance.
processId The Unix PID number of this process instance.
parentProcessId The PID number of the parent process.
arguments The command line arguments of this process instance.
percentMemory % of memory occupied by this process instance. (UNIX only)
startTime The time this process instance started. (UNIX only)
user The User that started this process instance. (UNIX only)
groupId The Unix Groupid of this process instance. (UNIX only)
numThreads Number of threads being used by this process instance. (UNIX only)
state The current state of this process instance, e.g. Running, Sleeping etc. (UNIX only)
ageHours The number of hours that the process has been running. (UNIX only)
ageDays The number of days that the process has been running. A process started at one minute to midnight will show as 1 day at one minute past midnight, even though ageHours will be 0. (UNIX only)
fileDescriptors The number of file descriptors held by the process. (UNIX only)
cpuTime Cumulative CPU time used by the process in a human readable format. (Solaris only)
cpuSeconds Cumulative CPU time used by the process in seconds. (Solaris only)
percentUser % of recent user time used by this process instance. (Windows only)
percentPrivilegedTime % of recent privileged time used by this process instance. (Windows only)
virtualMemPeak Peak virtual memory size of this process instance. (Windows only)
pageFaultsPerSec Page faults per second caused by this process instance. (Windows only)
workingSetPeak Peak resident set size of this process instance. (Windows only)
pageFileBytesPeak Peak page file bytes used by this process instance. (Windows only)
pageFileBytes Page file bytes used by this process instance. (Windows only)
priorityBase Base priority of process. (Windows only)
elapsedTime Elapsed time since process started. (Windows only)
poolPagedBytes Number of bytes in the Paged Pool. (Windows only)
poolNonPagesBytes Number of bytes in the Nonpaged Pool. (Windows only)
handleCount Number of handles held by process instance. (Windows only)
GDICount Number of GDI objects held by process instance. (Windows only)
userName The User that started this process instance. (Windows only)
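
The difference between ageDays and ageHours described in the table above can be illustrated with a short Python sketch. It assumes that ageDays counts calendar-date boundaries crossed and that ageHours counts whole elapsed hours; this matches the midnight example given for ageDays but is not taken from the plug-in's implementation. The function names are hypothetical.

```python
from datetime import datetime


def age_hours(start: datetime, now: datetime) -> int:
    """Whole hours elapsed since the process started (assumed behaviour)."""
    return int((now - start).total_seconds() // 3600)


def age_days(start: datetime, now: datetime) -> int:
    """Calendar-date boundaries crossed since the process started (assumed behaviour)."""
    return (now.date() - start.date()).days


# A process started at one minute to midnight, sampled two minutes later.
started = datetime(2024, 1, 1, 23, 59)
sampled = datetime(2024, 1, 2, 0, 1)
hours = age_hours(started, sampled)
days = age_days(started, sampled)
```

Under these assumptions, `hours` is 0 and `days` is 1 for the sample above.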

Plug-in Configuration

processParameters

Defines the parameters (process attributes) to be shown for each process monitored. To disable this mode, set the environment variable SOLARIS_OS_DIRECT to FALSE in the start_netprobe script.

Possible values:

The available process parameters for Unix are:

Parameter Description
pcpu % of recent cpu
pmem % of system memory
vsz Virtual memory size
rss Resident set size
time Uptime
user User name
group Group id
pid Process id
ppid Parent process id
args Command line args
nlwp Number of threads
state Process state, running, sleeping etc.
ageh Age of the process in hours
fd File descriptors
aged Age of the process in days
cputime The cumulative CPU time of the process in the form [[dd-]hh:]mm:ss (Solaris Only)
scputime The cumulative CPU time of the process in seconds (Solaris Only)

Additionally, the following process parameters are available on Linux:

Parameter Description
pagef Major Page Faults per second
nFLT Total Major Page Faults
nDRT Number of Dirty Memory Pages

The available process parameters for Windows are:

Parameter Description
pcpu % of recent cpu
user % user time
priv % privileged time
vsz Virtual memory size
vszp Virtual bytes peak
pagef Page faults per sec
rssp Working set peak
rss Working set
pfkbp Page file bytes peak
pfkb Page file bytes
prior Priority base
etime Elapsed time
pid Process id
poolp Pool paged bytes
poolnp Pool non paged bytes
handles Handle count
ppid Creating process id
prikb Private bytes
args Command line args
gdi GDI Count
uname User name

Note: For fd usage, if the NetProbe does not have root permissions, it can only read file descriptor counts for processes running as the same user as the NetProbe.

Mandatory: No
Default: pcpu, vsz

threshold

Set the threshold value and test type against which the thresholdParameter for each process is tested. Test type can be either Greater Than Or Equal To or Less Than Or Equal To.

Mandatory: Yes
Default: N/A

threshold > thresholdParameter

Defines the parameter which is tested against the threshold value to determine if processes are in breach. Can be defined as any numeric processParameter.

Mandatory: Yes
Default: N/A

threshold > greaterThanOrEqualTo

Sets threshold test type to Greater Than Or Equal To. Contains a decimal value.

Mandatory: No
Default: N/A

threshold > lessThanOrEqualTo

Sets threshold test type to Less Than Or Equal To. Contains a decimal value.

Mandatory: No
Default: N/A
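
As an illustration of how the two test types relate to the thresholdParameter, the following Python sketch applies a threshold test to sample readings. It is a minimal model written for this document, not the plug-in's code; `Comparison`, `in_breach` and the sample data are hypothetical names and values.

```python
from enum import Enum


class Comparison(Enum):
    # Values mirror the two threshold settings described above.
    GREATER_THAN_OR_EQUAL_TO = "greaterThanOrEqualTo"
    LESS_THAN_OR_EQUAL_TO = "lessThanOrEqualTo"


def in_breach(value: float, comparison: Comparison, threshold: float) -> bool:
    """Return True if a thresholdParameter value breaches the threshold."""
    if comparison is Comparison.GREATER_THAN_OR_EQUAL_TO:
        return value >= threshold
    return value <= threshold


# Hypothetical pcpu readings, keyed by process name and PID,
# tested against a threshold of 20 (% CPU).
readings = {"java 1234": 35.0, "bash 4321": 0.4, "httpd 2222": 20.0}
breached = [
    name
    for name, pcpu in readings.items()
    if in_breach(pcpu, Comparison.GREATER_THAN_OR_EQUAL_TO, 20.0)
]
```

Here `breached` contains "java 1234" and "httpd 2222"; the latter is included because the test is inclusive of the threshold value.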

tailOffPeriod

Defines the optional Tail Off Period for which processes remain displayed after they are no longer in breach of the threshold value. This aids logging cells to a database. If set to zero, processes cease to be displayed the moment they are out of breach.

Value is in seconds.

Mandatory: No
Default: 0
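
The tail off behaviour can be modelled roughly as follows. This Python sketch is an assumption-based illustration: it tracks the time each process was last seen in breach and keeps it visible until the period has elapsed. The function name, the data structures, and the exact boundary handling (strictly less than the period) are hypothetical.

```python
def visible_rows(current_breaches, last_breach_time, now, tail_off_period):
    """Return (in_breach, tailed_off) process names to display at time `now`.

    current_breaches: set of process names in breach in this sample.
    last_breach_time: dict of name -> time in seconds of the most recent
                      breach; updated in place.
    """
    for name in current_breaches:
        last_breach_time[name] = now
    tailed_off = sorted(
        name
        for name, seen in last_breach_time.items()
        if name not in current_breaches and now - seen < tail_off_period
    )
    return sorted(current_breaches), tailed_off


history = {}
at_0 = visible_rows({"java 1234"}, history, now=0, tail_off_period=60)
at_20 = visible_rows(set(), history, now=20, tail_off_period=60)
at_80 = visible_rows(set(), history, now=80, tail_off_period=60)
```

With a 60 second period, the process is in breach at time 0, shown as tailed off at time 20, and no longer shown at time 80. With a period of 0, nothing is ever reported as tailed off, which matches the default described above.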

maxRows

Defines an optional maximum number of breached processes to display.

Note: If this value is used in conjunction with a tailOffPeriod, then additional rows are displayed. The additional rows are tailed-off processes and are only displayed for the tailOffPeriod.

If this field is used in conjunction with Less Than Or Equal To, then the plug-in sorts the rows in the dataview according to the processes with the least HandleCount values, before limiting the number of rows accordingly.

Mandatory: No
Default: 1
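
The row limiting described for maxRows can be sketched in Python as below. The sketch assumes the sort is on whichever thresholdParameter is configured (the text above mentions HandleCount), ascending for Less Than Or Equal To and descending for Greater Than Or Equal To; the descending case is an assumption inferred from the "top N" wording in the introduction. `limit_rows` and the sample rows are hypothetical.

```python
def limit_rows(breached, threshold_parameter, less_than, max_rows):
    """Return at most max_rows breached rows, most extreme value first.

    breached: list of dicts, one per process in breach.
    less_than: True when the test type is Less Than Or Equal To.
    """
    ordered = sorted(
        breached,
        key=lambda row: row[threshold_parameter],
        reverse=not less_than,  # lowest first for <=, highest first for >=
    )
    return ordered[:max_rows]


rows = [
    {"name": "a 1", "handles": 50},
    {"name": "b 2", "handles": 10},
    {"name": "c 3", "handles": 30},
]
lowest_two = [r["name"] for r in limit_rows(rows, "handles", True, 2)]
highest_two = [r["name"] for r in limit_rows(rows, "handles", False, 2)]
```

In this sample, `lowest_two` is "b 2" then "c 3", and `highest_two` is "a 1" then "c 3". Tailed-off rows are not part of this limit, as noted above.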

adjustForLogicalCPUs

Adjusts the percentCPU of each process by dividing the percentage of CPU time used by the process by the number of cores/CPUs on the machine. For example, suppose a process is using 10% of 1 core on a 4 core box. If this setting is enabled, the dataview displays this process as using 2.5%. If this setting is not enabled, 10% is displayed as usual.

Mandatory: No
Default: False
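
The adjustment amounts to a single division, shown here as a small Python sketch that reproduces the example above. The function name is hypothetical and the code is illustrative only.

```python
def adjust_for_logical_cpus(percent_cpu: float, logical_cpus: int) -> float:
    """Scale a per-core CPU percentage to a share of the whole machine."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return percent_cpu / logical_cpus


# A process using 10% of one core on a 4 core box.
adjusted = adjust_for_logical_cpus(10.0, 4)
```

For the example values, `adjusted` is 2.5.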