Top Plug-in
Introduction
The GENEOS Top Plug-in monitors systems for resource-hungry tasks.
It works by displaying processes that breach a configurable threshold value (e.g. 20% CPU usage).
Processes are said to be in breach when they exceed the threshold value and, optionally, are among the top N processes to exceed the threshold, where N is a configurable value (maxRows). The plug-in can also display processes that were recently in breach. These are referred to as tailed off and are displayed for a configurable tail off period.
View
The plug-in produces a single view. It displays two headline variables: a summary of the breach condition and the number of processes currently in breach. The table displays processes in breach and, if a Tail Off Period is specified, processes recently in breach. Columns are configurable except for the first column, name, which always displays the process name and PID.
The view can be configured on the Advanced tab > Process parameters to display any of the columns listed below.
Headline Legend
Name | Description |
---|---|
summary | A summary of what is being displayed |
numBreached | The number of breached processes being displayed |
Table Legend
Name | Description |
---|---|
name | The name of the process and the PID. This creates a unique identifier for the process. |
percentCPU | % of recent cpu used by this process instance. |
virtualMemory | Virtual memory size of this process instance. |
residentSetSize | Resident set size of this process instance. |
processId | The Unix PID number of this process instance. |
parentProcessId | The PID number of the parent process. |
arguments | The command line arguments of this process instance. |
percentMemory | % of memory occupied by this process instance. (UNIX only) |
startTime | The time this process instance started. (UNIX only) |
user | The User that started this process instance. (UNIX only) |
groupId | The Unix Groupid of this process instance. (UNIX only) |
numThreads | Number of threads being used by this process instance. (UNIX only) |
state | The current state of this process instance, e.g. Running, Sleeping. (UNIX only) |
ageHours | The number of hours that the process has been running. (UNIX only) |
ageDays | The number of days that the process has been running. A process started at one minute to midnight will show as 1 day at one minute past midnight, even though ageHours will be 0. (UNIX only) |
fileDescriptors | The number of file descriptors held by the process. (UNIX only) |
cpuTime | Cumulative CPU time used by the process in a human readable format. (Solaris only) |
cpuSeconds | Cumulative CPU time used by the process in seconds. (Solaris only) |
percentUser | % of recent user time used by this process instance. (Windows only) |
percentPrivilegedTime | % of recent privileged time used by this process instance. (Windows only) |
virtualMemPeak | Peak virtual memory size of this process instance. (Windows only) |
pageFaultsPerSec | Page faults per second caused by this process instance. (Windows only) |
workingSetPeak | Peak resident set size of this process instance. (Windows only) |
pageFileBytesPeak | Peak page file bytes used by this process instance. (Windows only) |
pageFileBytes | Page file bytes used by this process instance. (Windows only) |
priorityBase | Base priority of process. (Windows only) |
elapsedTime | Elapsed time since process started. (Windows only) |
poolPagedBytes | Number of bytes in the Paged Pool. (Windows only) |
poolNonPagesBytes | Number of bytes in the Nonpaged Pool. (Windows only) |
handleCount | Number of handles held by process instance. (Windows only) |
GDICount | Number of GDI objects held by process instance. (Windows only) |
userName | The User that started this process instance. (Windows only) |
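The difference between ageHours and ageDays described in the table can be illustrated with a minimal Python sketch. It assumes that ageDays counts calendar-date boundaries crossed while ageHours counts whole elapsed hours; the plug-in's actual calculation is not documented here, and the function names are hypothetical.

```python
from datetime import datetime


def age_hours(start: datetime, now: datetime) -> int:
    """Whole hours elapsed since the process started (assumed definition)."""
    return int((now - start).total_seconds() // 3600)


def age_days(start: datetime, now: datetime) -> int:
    """Calendar-date boundaries crossed since the process started (assumed definition)."""
    return (now.date() - start.date()).days


# A process started at 23:59 and sampled at 00:01 the next day.
start = datetime(2024, 1, 1, 23, 59)
now = datetime(2024, 1, 2, 0, 1)
print(age_hours(start, now), age_days(start, now))  # 0 1
```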
Plug-in Configuration
processParameters
Defines the parameters (process attributes) to be shown for each process monitored. To disable this mode, set the environment variable SOLARIS_OS_DIRECT to FALSE in the start_netprobe script.
Possible values:
The available process parameters for Unix are:
Parameter | Description |
---|---|
pcpu | % of recent cpu |
pmem | % of system memory |
vsz | Virtual memory size |
rss | Resident set size |
time | Uptime |
user | User name |
group | Group id |
pid | Process id |
ppid | Parent process id |
args | Command line args |
nlwp | Number of threads |
state | Process state, running, sleeping etc. |
ageh | Age of the process in hours |
fd | File descriptors |
aged | Age of the process in days |
cputime | The cumulative CPU time of the process in the form [[dd-]hh:]mm:ss (Solaris Only) |
scputime | The cumulative CPU time of the process in seconds (Solaris Only) |
Additionally, the following process parameters are available on Linux:
Parameter | Description |
---|---|
pagef | Major Page Faults per second |
nFLT | Total Major Page Faults |
nDRT | Number of Dirty Memory Pages |
The available process parameters for Windows are:
Parameter | Description |
---|---|
pcpu | % of recent cpu |
user | % user time |
priv | % privileged time |
vsz | Virtual memory size |
vszp | Virtual bytes peak |
pagef | Page faults per sec |
rssp | Working set peak |
rss | Working set |
pfkbp | Page file bytes peak |
pfkb | Page file bytes |
prior | Priority base |
etime | Elapsed time |
pid | Process id |
poolp | Pool paged bytes |
poolnp | Pool non paged bytes |
handles | Handle count |
ppid | Creating process id |
prikb | Private bytes |
args | Command line args |
gdi | GDI Count |
uname | User name |
Note: For fd usage, if the NetProbe does not have root permissions, it can only read file descriptor counts for processes running as the same user as the NetProbe.
threshold
Set the threshold value and test type against which the thresholdParameter for each process is tested. Test type can be either Greater Than Or Equal To or Less Than Or Equal To.
threshold > thresholdParameter
Defines the parameter which is tested against the threshold value to determine if processes are in breach. Can be defined as any numeric processParameter.
threshold > greaterThanOrEqualTo
Sets threshold test type to Greater Than Or Equal To. Contains a decimal value.
threshold > lessThanOrEqualTo
Sets threshold test type to Less Than Or Equal To. Contains a decimal value.
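As an illustration of the two test types, the following Python sketch checks a single value of the thresholdParameter against the threshold. The function and argument names are hypothetical, and requiring exactly one test type is an assumption of this sketch, not a documented rule.

```python
from typing import Optional


def in_breach(value: float,
              greater_than_or_equal_to: Optional[float] = None,
              less_than_or_equal_to: Optional[float] = None) -> bool:
    """Return True if value breaches the configured threshold."""
    # Assumption: exactly one of the two test types is configured.
    if (greater_than_or_equal_to is None) == (less_than_or_equal_to is None):
        raise ValueError("set exactly one threshold test type")
    if greater_than_or_equal_to is not None:
        return value >= greater_than_or_equal_to
    return value <= less_than_or_equal_to


print(in_breach(25.0, greater_than_or_equal_to=20.0))  # True
print(in_breach(25.0, less_than_or_equal_to=20.0))     # False
```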
tailOffPeriod
Defines the optional Tail Off Period for which processes remain displayed after they are no longer in breach of the threshold value. This helps when logging cells to a database. If set to zero, processes cease to be displayed the moment they are out of breach.
Value is in seconds.
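The tail-off behaviour can be sketched in Python as follows. This is a simplified model, not the plug-in's implementation: the function name, the row labels, and the use of a dictionary of last-breach times are all assumptions made for illustration.

```python
def visible_rows(breached_now, last_breach, now, tail_off_period):
    """Return (name, status) rows to display at time `now`, in seconds.

    breached_now: set of process names currently in breach.
    last_breach: dict of name -> last time seen in breach (updated in place).
    """
    for name in breached_now:
        last_breach[name] = now
    rows = []
    for name, seen in sorted(last_breach.items()):
        if name in breached_now:
            rows.append((name, "breached"))
        elif now - seen < tail_off_period:
            # Recently in breach: still shown until the period expires.
            rows.append((name, "tailed off"))
    return rows


history = {}
print(visible_rows({"java:101"}, history, now=0, tail_off_period=30))
print(visible_rows(set(), history, now=10, tail_off_period=30))
print(visible_rows(set(), history, now=40, tail_off_period=30))
```

With a tail_off_period of zero, the second condition is never true, so a process disappears as soon as it leaves breach.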
maxRows
Defines an optional maximum number of breached processes to display.
Note: If this value is used in conjunction with a tailOffPeriod, then additional rows are displayed. The additional rows are tailed-off processes and are only displayed for the tailOffPeriod.
If this field is used in conjunction with Less Than Or Equal To, then the plug-in sorts the rows in the dataview according to the processes with the lowest HandleCount values, before limiting the number of rows accordingly.
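A rough Python sketch of the row limit is shown below. It assumes rows are ordered by the tested value (lowest first for Less Than Or Equal To, highest first otherwise) before truncation; the text above names HandleCount as the sort key, so treat the choice of key and the function name as assumptions of this sketch.

```python
def limit_rows(breached, max_rows, less_than_or_equal_to=False):
    """Keep at most max_rows of the (name, value) pairs already in breach."""
    # Assumption: most extreme values first, in the direction of the test.
    ordered = sorted(breached,
                     key=lambda row: row[1],
                     reverse=not less_than_or_equal_to)
    return ordered[:max_rows]


rows = [("a:1", 5), ("b:2", 1), ("c:3", 3)]
print(limit_rows(rows, 2, less_than_or_equal_to=True))  # [('b:2', 1), ('c:3', 3)]
print(limit_rows(rows, 2))                              # [('a:1', 5), ('c:3', 3)]
```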
adjustForLogicalCPUs
Adjusts the percentCPU of each process by taking the percentage of time being used by the process and then dividing by the number of cores/cpus on the machine. For example, suppose we have a process using up 10% of 1 core on a 4 core box. If this setting is enabled, then we would expect the dataview to display this process as using 2.5% instead. If this setting is not enabled then we would expect 10% to be displayed as usual.
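The arithmetic in the example above can be written as a short Python sketch. The function name is hypothetical; only the division by the number of cores comes from the description.

```python
def adjust_for_logical_cpus(percent_cpu: float, logical_cpus: int) -> float:
    """Scale a per-core CPU percentage to a whole-machine percentage."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return percent_cpu / logical_cpus


# 10% of one core on a 4-core machine.
print(adjust_for_logical_cpus(10.0, 4))  # 2.5
```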