Back to OP5 Monitor FAQ

How to monitor Microsoft SQL-Server

Microsoft SQL Server is one of the most used databases today. Because it often has an important role in your business, you need to make sure you get alerted if there is a problem with your databases. This article describes how OP5 Monitor can monitor both important environmental parts of the operating system and how to perform queries to the database. In this document, we assume that Microsoft SQL Server is the main application running on the server.

This will be done Copied

The suggested configuration components for monitoring Microsoft SQL Server are:

Concerning SQL 2012 Copied

To use it with Microsoft SQL server 2012 or later, you will have to change the TDS version in “/etc/freetds.conf” to “8.0”.

Check commands Copied

Add the required check commands, if they don’t already exist in your configuration, with the import functionality in OP5 Monitor (’Configure’ -> ‘Commands’ -> ‘Check Command Import’):

List of commands Copied

Plugin name check_command
check_nt check_mssql_db_file_size
check_mssql_log_file_size
check_nt_memory_page_per_sec
check_nt_physical_disk_time
check_mssql_num_deadlocks
check_mssql_num_user_connections
negate check_mssql_cache_hit_ratio
check_sql check_mssql_backup_job
check_mssql_query_string_regex
check_mssql_query_count
check_mssql_query_reponstime

Description of the commands Copied

In the table below are descriptions of the so-called performance check_commands used to monitor a Microsoft SQL Server.

command_name description
check_mssql_db_file_size and check_mssql_log_file_size - Disk usage is something we always check. With this check_command you will be able to monitor the size of the data and log files size. - If you use ‘_total’ as $ARG1$ you will get the sum of all data or log files. - Warning and Critical is measured in KB.
check_nt_memory_page_per_sec - This counter will measure pages per second that is fetched from RAM to the hard drive or vice versa. - Normally this value should be 0 - Values between 1 to 20 should not be a problem but as usual everything is relative - Values over 20, during a longer time period, mostly means that you need to add more RAM to the server
check_nt_physical_disk_time - This counter measures how busy a physical array is, not any logical partition or an individual disc in the array. - It will give you a pretty good knowledge about how busy your disk array is. - The value should not be over 55% on the server. If it is over 55% during several 10 minutes periods over 24 hours it will be an indication of I/O problems on the server
check_mssql_num_deadlocks - Deadlocks appear when two processes locked one “pieces of data” each, both try to lock the same “new piece” of data at the same time. Every involved process will be waiting indefinitely for the other process to release the lock. Microsoft SQL Server will detect and kill one of the processes which means that one process will lose data - OP5 Monitor will send a Critical alert as soon as a deadlock appears
check_mssql_num_user_connections - This check_command will count the number of connections. - If the number of connections is too high you might run into problems and so you may need to increase the number of threads on the running server.
check_mssql_cache_hit_ratio - Buffer Cache Hit Ratio counter indicates of how often your Microsoft SQL Server goes to the Buffer instead of the hard drive when searching for data. - The value should be over 90% - We may only use either Warning or Critical here so in the default config we have chosen Critical.

The table below describes the check_commands you may use when monitoring Microsoft SQL Server by queries.

command_name description
check_mssql_backup_job - This check_command is one way to monitor the internal backup jobs of your Microsoft SQL Server. State OK will be true if the string “The job succeeded.” is found in the query made to the MSDB database - State Critical will be returned if the string “The job succeeded” is not found or if the last backup job is older than 24 hours
check_mssql_query_string_regex - This check_command takes a given query and will search for a given search string in the result set and alert if not found. - You may use regular expressions when searching for a string in the first cell in the first row of the result.
check_mssql_query_count - This check_command takes a given query and returns the number of hits.
check_mssql_query_reponstime - This check_command will report the response time of a given query.

Adding the services Copied

Add the required services, (‘Configure’ -> ‘Host: ’ -> ‘Go’ -> ‘Services for host ’ -> ‘Add new service’ -> ‘Go’):

When you added the host you had probably already added services like CPU usage, mem usage, disk usage and so on. The table below describes services you probably want to add. (Arguments are just examples, you need to adjust them to suit your environment).

Performance services Copied

service_description check_command check_command_args
MSSQL Services check_nt_service SQLSERVERAGENT,MSSQLSERVER
MSSQL DB File Size OP5 check_mssql_db_file_size 1024!1256
MSSQL Log File Size OP5 check_mssql_log_file_size 1024!1256
NT Memory Page/SecP check_nt_memory_page_per_sec 20!30
NT Physical Disk Time check_nt_physical_disk_time 45!55
MSSQL Cache Hit Ratio check_mssql_cache_hit_ratio 90!80
MSSQL Num Deadlocks check_mssql_num_deadlocks 1
MSSQL Num User Connections check_mssql_num_user_connections 200!250

Query services Copied

service_description check_command check_command_args
MSSQL Backup job ? master check_mssql_backup_jobn user!passwd!job name
MSSQL Select String check_mssql_query_string_regex user!password!db!select field from table where?!.*The job succeeded..*
MSSQL query count check_mssql_query_count user!password!db!select field from table where?!10!20
MSSQL respons time check_mssql_query_reponstime user!password!db!select field from table where?!5!10

Use the “Test this service” button for the services to see if they work. Once they are correct and working as they should, you may add the services to all of your Microsoft SQL Servers with the clone function.

Monitoring a cluster Copied

It is not unusual that a Microsoft SQL Server is running in a clustered environment. Of course we can monitor your Microsoft SQL Server in that kind of environment too. Please read the how-to about Monitoring Microsoft clustered servers.

["Geneos"] ["FAQ"]

Was this topic helpful?