How to troubleshoot Naemon Configuration Server (Nachos)
Introduction
This article covers basic troubleshooting steps for when saving the configuration fails.
Check if the Nachos service is running
Log in to the Monitor server with SSH, and check if the nachos service is running.
systemctl status nachos
The above command is also run by the UI, and a red warning message should show up if it indicates a problem with the service. If the service is inactive, try to start it again:
systemctl start nachos
If the service fails to start, refer to the status output or logs for clues.
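The check-then-start sequence above can be sketched as a small shell function. The name ensure_running is hypothetical and not part of Monitor; it only wraps systemctl is-active and systemctl start.

```shell
# Sketch: start a systemd unit only if it is not already active.
# ensure_running is a hypothetical helper name, not part of Monitor.
ensure_running() {
    svc="${1:-nachos}"
    if systemctl is-active --quiet "$svc"; then
        echo "$svc is running"
        return 0
    fi
    echo "$svc is not running, trying to start it"
    if systemctl start "$svc"; then
        echo "$svc started"
    else
        echo "$svc failed to start, see: systemctl status $svc" >&2
        return 1
    fi
}
```

Run it as root with: ensure_running nachos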
Check if the Nachos virtual environment exists
This is only applicable for environments on Monitor 8.x
Environments running Monitor 9.x no longer use the Python venv.
Check whether this path exists:
/opt/monitor/nachos/venv/bin/
If not, reinstall nachos:
# yum reinstall op5-nachos -y
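As a minimal sketch, the path check can be scripted so that it prints the reinstall hint only when the directory is missing. check_venv is a hypothetical helper name; the default path is the one given above.

```shell
# Sketch: report whether the Nachos venv directory exists (Monitor 8.x only).
# check_venv is a hypothetical helper name.
check_venv() {
    venv_bin="${1:-/opt/monitor/nachos/venv/bin/}"
    if [ -d "$venv_bin" ]; then
        echo "venv present: $venv_bin"
    else
        echo "venv missing: $venv_bin (consider: yum reinstall op5-nachos -y)"
        return 1
    fi
}
```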
Check the web console when trying to save
When on the save page, open the browser's developer tools (Ctrl + Shift + C in Chrome) and select the Save button. Next, switch to the Console tab and click Save. The console log might show clues as to why the save fails. For example, the following console log indicates that the Nachos service might be down:
POST https://127.0.0.1/nachos/api/v1/exports/ 503 (Service Unavailable)
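The same status code can possibly be observed from the Monitor server itself with curl. The sketch below rests on several assumptions: it sends a GET rather than the POST the UI uses, it skips certificate verification, and an unauthenticated request may well be rejected with a different status code. Both function names are hypothetical.

```shell
# Sketch: print only the HTTP status code returned for the exports endpoint.
# nachos_api_status and explain_status are hypothetical helper names.
# -k skips certificate verification (assumes a self-signed certificate),
# -s silences progress output, -o /dev/null discards the response body.
nachos_api_status() {
    url="${1:-https://127.0.0.1/nachos/api/v1/exports/}"
    curl -k -s -o /dev/null -w '%{http_code}' "$url"
}

# Translate a status code into a rough hint.
explain_status() {
    case "$1" in
        503) echo "service unavailable: check that nachos is running" ;;
        000) echo "no HTTP response: check that httpd is running" ;;
        *)   echo "HTTP $1: the API answered, look at the response and nachos.log" ;;
    esac
}
```

Usage: explain_status "$(nachos_api_status)"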
Logs
The logs are located in:
/var/log/op5/nachos.log
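To pull out only the most recent problem lines, something like the following may help. It is a sketch that assumes the log lines contain the level name (ERROR or CRITICAL) as plain text; recent_log_problems is a hypothetical helper name.

```shell
# Sketch: show the last few ERROR/CRITICAL lines from the Nachos log.
# recent_log_problems is a hypothetical helper name. The match is
# case-sensitive, so Python exception names such as "OperationalError"
# are not matched on their own.
recent_log_problems() {
    log_file="${1:-/var/log/op5/nachos.log}"
    max_lines="${2:-20}"
    grep -E 'ERROR|CRITICAL' "$log_file" | tail -n "$max_lines"
}
```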
Change log level (debug logging)
Change default_log_levels in /opt/monitor/nachos/nachos.cfg:
default_log_levels = nachos=DEBUG, eventlet=ERROR
Valid levels are: CRITICAL, ERROR, INFO, WARNING, and DEBUG.
Then restart Nachos:
systemctl restart nachos
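The edit can also be scripted. The sketch below assumes the default_log_levels line already exists, is not commented out, and contains a nachos=LEVEL entry; it changes only that entry and leaves a .bak copy of the file. set_nachos_log_level is a hypothetical helper name.

```shell
# Sketch: change the nachos= level inside default_log_levels.
# set_nachos_log_level is a hypothetical helper name.
set_nachos_log_level() {
    level="$1"
    cfg="${2:-/opt/monitor/nachos/nachos.cfg}"
    # Accept only the levels listed as valid above.
    case "$level" in
        CRITICAL|ERROR|WARNING|INFO|DEBUG) ;;
        *) echo "invalid level: $level" >&2; return 1 ;;
    esac
    # Keep a backup as <file>.bak and replace only the nachos=... part.
    sed -i.bak -e "s/^\(default_log_levels[[:space:]]*=.*nachos=\)[A-Z]*/\1$level/" "$cfg"
}
```

Usage: set_nachos_log_level DEBUG && systemctl restart nachos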
Configuration Files
The configuration is located in:
/opt/monitor/nachos/nachos.cfg
Common problems
Stuck in “Export already in progress”
This is a Nacoma-related issue that might occur in the new service as well. When an export starts, an object lock entry is inserted into the objlocks table in the Nacoma database. This entry is valid for five minutes and is removed when the export has finished. In certain circumstances the entry might not be removed, for example when reaching PHP's max_execution_time limit in old Nacoma; this should not be an issue in the new service, as it does not use PHP. If a user encounters this problem and wants to unlock the database without waiting for the entry to expire, it is possible to truncate the objlocks table:
mysql nacoma -e "truncate objlocks"
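Before truncating, it can be useful to see whether the table holds any entries at all. The following is a sketch under the assumption that the mysql client can connect without extra options, as in the command above; clear_objlocks is a hypothetical helper name.

```shell
# Sketch: report how many lock entries exist, and truncate only if there are any.
# clear_objlocks is a hypothetical helper name.
clear_objlocks() {
    # -N suppresses the column header so only the number is printed.
    count="$(mysql nacoma -N -e "select count(*) from objlocks")" || return 1
    echo "objlocks entries: $count"
    if [ "$count" -gt 0 ]; then
        mysql nacoma -e "truncate objlocks" && echo "objlocks truncated"
    fi
}
```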
Access denied for user 'x'@'x'
This problem shows up as a line in the log file looking like this:
sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (1045, "Access denied for user 'user'@'localhost' (using password: YES)") (Background on this error at: http://sqlalche.me/e/e3q8)
There can be multiple reasons why this problem is encountered. The most common reason is having the “skip-networking” option set in /etc/my.cnf. The workaround is to use the MySQL Unix socket instead of TCP/IP.
Open /opt/monitor/nachos/nachos.cfg, and under [database] and [nacoma] add ‘?unix_socket=/var/lib/mysql/mysql.sock’ to the connection variable, and remove ‘:3306’.
From:
[database]
connection = mysql+pymysql://username:password@127.0.0.1/nachos
[nacoma]
connection = mysql+pymysql://root@127.0.0.1:3306/nacoma
To:
[database]
connection = mysql+pymysql://username:password@127.0.0.1/nachos?unix_socket=/var/lib/mysql/mysql.sock
[nacoma]
connection = mysql+pymysql://root@127.0.0.1/nacoma?unix_socket=/var/lib/mysql/mysql.sock
After changing the configuration, restart the service:
service nachos restart
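To confirm that “skip-networking” is the likely cause before editing nachos.cfg, a simple text match on the option file may be enough. This is a sketch: it only looks at the single file passed in (not at any included files) and does not interpret option values. has_skip_networking is a hypothetical helper name.

```shell
# Sketch: succeed if an uncommented skip-networking (or skip_networking)
# line is present in the given MySQL option file.
# has_skip_networking is a hypothetical helper name.
has_skip_networking() {
    cnf="${1:-/etc/my.cnf}"
    grep -Eq '^[[:space:]]*skip[-_]networking' "$cnf"
}
```

Usage: has_skip_networking && echo "TCP/IP is disabled, use the Unix socket". The socket path itself can be checked with: test -S /var/lib/mysql/mysql.sock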
Another reason could be incorrect user credentials. Make sure the user name and password in the connection string are correct, for example by changing:
connection = mysql+pymysql://user:wrongpassword@127.0.0.1/nachos
to
connection = mysql+pymysql://correctuser:correctpassword@127.0.0.1/nachos
Save the file and restart the nachos service:
service nachos restart
Missing exports table
This can happen when “skip-networking” is set in /etc/my.cnf during the installation of Nachos. This one-liner will recreate the table:
/opt/monitor/nachos/venv/bin/nachos-manage --config-file /opt/monitor/nachos/nachos.cfg db_sync
The Nachos service cannot start
If the output of systemctl status nachos returns the following:
# systemctl status nachos
● nachos.service - A Naemon configuration management service
Loaded: loaded (/usr/lib/systemd/system/nachos.service; enabled; vendor preset: disabled)
Active: activating (auto-restart) (Result: exit-code) since Wed 2023-02-22 09:15:19 CET; 4s ago
Process: 833178 ExecStartPre=/opt/monitor/nachos/init_db.sh (code=exited, status=1/FAILURE)
Then the service cannot be started due to an access issue with the database. Pay close attention to the “Process” line, which shows that the service fails while executing the “init_db.sh” script. This is a database initialization script, and an error may occur if the nachos database user does not have the correct access privileges to the nachos database.
Check that the database user defined in the /opt/monitor/nachos/nachos.cfg file corresponds to a user defined in the database. The following command can be run to list the Host and User entries held within the database:
[root@op5-system ]# mysql mysql -e " select Host, User from user;"
+------------+----------+
| Host       | User     |
+------------+----------+
| 127.0.0.1  | root     |
| ::1        | root     |
| localhost  |          |
| localhost  | magellan |
| localhost  | merlin   |
| localhost  | nachos   |
| localhost  | nacoma   |
| localhost  | root     |
| localhost  | trapper  |
| op5-system |          |
| op5-system | root     |
+------------+----------+
In /opt/monitor/nachos/nachos.cfg, ensure that the user and host details correspond to the user and host from the output of the command above:
[database]
connection = mysql+pymysql://nachos:NACHOS_DBPASS@localhost/nachos?unix_socket=/var/lib/mysql/mysql.sock
[nacoma]
connection = mysql+pymysql://nachos:NACHOS_DBPASS@localhost/nacoma?unix_socket=/var/lib/mysql/mysql.sock
Notice that in both connection strings the user is nachos and the host is localhost, which matches the localhost/nachos row in the MySQL output shown earlier.
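To compare the configured values against the MySQL output without reading the string by eye, the user and host can be extracted with plain parameter expansion. This is a sketch that assumes the scheme://user[:password]@host[:port]/db[?options] layout shown above and a password without a literal @ character; connection_user_host is a hypothetical helper name.

```shell
# Sketch: print "user host" for a connection string.
# connection_user_host is a hypothetical helper name.
connection_user_host() {
    rest="${1#*://}"            # drop the scheme (mysql+pymysql://)
    creds="${rest%%@*}"         # user[:password]
    after_at="${rest#*@}"       # host[:port]/db[?options]
    hostport="${after_at%%/*}"  # host[:port]
    echo "${creds%%:*} ${hostport%%:*}"
}
```

Usage: connection_user_host 'mysql+pymysql://nachos:NACHOS_DBPASS@localhost/nachos?unix_socket=/var/lib/mysql/mysql.sock'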
Database passwords can also be changed.
Reverting to old functionality
As a last resort in case of emergency, the following commands disable Nachos and should revert the product to using only Nacoma:
rpm -e --nodeps op5-nachos-ui
systemctl restart httpd