Internal documentation only

This page has been marked as draft.

Manage collector hosts files manually

Collectors need to communicate with the other collectors in their cluster. Each collector must also resolve its own hostname and FQDN to the 127.0.1.1 IP address; otherwise, opsview-messagequeue and other components running on the collector will not work.

Usually, opsview-deploy takes care of collector hosts files and no manual intervention is needed. However, if you have disabled changes to hosts files when running opsview-deploy, manual management of collector hosts files becomes necessary.

Prerequisites

Process

Follow these steps on each collector in your collector cluster.

  1. Log in to the collector as root.
  2. Check that the collector’s hostname and FQDN are set correctly using the hostname command:
hostname
hostname -f

These commands should return the collector’s hostname and FQDN respectively. The command output might look like this:

collector-1
collector-1.domain.name

but should not look like this:

localhost
localhost.localdomain

localhost hostname or .localdomain FQDN

If the collector’s hostname is localhost or its FQDN ends with .localdomain, update the hostname or FQDN to their expected values. The hostname and FQDN should match the details in your opsview_deploy.yml file.
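The hostname check above can be scripted. The following is a minimal sketch, not part of Opsview: the function name names_look_valid is hypothetical, and it only rejects the two cases named on this page (a localhost hostname or a .localdomain FQDN). It does not compare the values against opsview_deploy.yml.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not an Opsview tool): succeeds only when
# the short hostname and the FQDN do not look like unconfigured defaults.
names_look_valid() {
    short=$1
    fqdn=$2
    # Reject an empty or "localhost" short hostname.
    case "$short" in
        ""|localhost) return 1 ;;
    esac
    # Reject an empty FQDN, plain "localhost", or anything ending in .localdomain.
    case "$fqdn" in
        ""|localhost|*.localdomain) return 1 ;;
    esac
    return 0
}

# Usage on a collector:
#   names_look_valid "$(hostname)" "$(hostname -f)" || echo "Fix the hostname or FQDN first" >&2
```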

  3. Check if /etc/hosts has a 127.0.1.1 entry for the current collector’s hostname and FQDN using grep:
grep "127.0.1.1" /etc/hosts

You should see a result like this:

127.0.1.1       collector-1.domain.name  collector-1

No command output

If you get no output from the grep command, this means that /etc/hosts does not contain a 127.0.1.1 entry for the collector. Run this echo command to add the missing 127.0.1.1 entry:

echo "127.0.1.1 $(hostname -f) $(hostname)" >> /etc/hosts
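Running the echo command twice would append a duplicate entry. As a sketch of one way to avoid that, the hypothetical function below (ensure_loopback_entry is not an Opsview command) appends the entry only when the file has no line starting with 127.0.1.1. It writes the FQDN first and the short hostname second, matching the example entry shown earlier, and takes the file path as an argument so that it can be tried on a copy before touching /etc/hosts.

```shell
#!/bin/sh
# Hypothetical helper: append a 127.0.1.1 entry only if the hosts file has none.
# Arguments: hosts file path, FQDN, short hostname.
ensure_loopback_entry() {
    hosts_file=$1
    fqdn=$2
    short=$3
    # Dots are escaped so that only the literal address 127.0.1.1 matches.
    if grep -Eq '^127\.0\.1\.1[[:space:]]' "$hosts_file"; then
        return 0
    fi
    printf '127.0.1.1 %s %s\n' "$fqdn" "$short" >> "$hosts_file"
}

# Usage as root on a collector:
#   ensure_loopback_entry /etc/hosts "$(hostname -f)" "$(hostname)"
```

If hostname -f cannot yet resolve the FQDN, pass the expected FQDN as a literal string instead.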
  4. Check that the hostname and FQDN of the current collector resolve to a 127.0.1.1 address using ping:
ping -c 3 $(hostname)
ping -c 3 $(hostname -f)

Both commands should succeed. If either command fails, go back to step 1 and try again.
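A successful ping shows that a name resolves, but the resolved address still has to be read from the output. As an alternative sketch, assuming a glibc-based system where the getent command and its ahostsv4 database are available, the hypothetical function below compares the first IPv4 address returned for a name against 127.0.1.1.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the first IPv4 address that the system
# resolver returns for the given name is 127.0.1.1.
resolves_to_loopback() {
    # getent ahostsv4 prints lines of the form "ADDRESS SOCKTYPE [NAME]".
    addr=$(getent ahostsv4 "$1" | awk 'NR == 1 { print $1 }')
    [ "$addr" = "127.0.1.1" ]
}

# Usage on a collector:
#   resolves_to_loopback "$(hostname)"    || echo "hostname does not resolve to 127.0.1.1" >&2
#   resolves_to_loopback "$(hostname -f)" || echo "FQDN does not resolve to 127.0.1.1" >&2
```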

  5. Verify that the current collector can resolve the hostnames and FQDNs of the other collectors in its cluster using ping:
ping -c 3 collector-2
ping -c 3 collector-2.domain.name

You should be able to ping all the other collectors in the cluster by both hostname and FQDN.

Successful output from pinging a collector with its FQDN might look like this:

PING collector-2.domain.name (192.168.17.126) 56(84) bytes of data.
64 bytes from collector-2.domain.name (192.168.17.126): icmp_seq=1 ttl=64 time=0.585 ms
64 bytes from collector-2.domain.name (192.168.17.126): icmp_seq=2 ttl=64 time=0.821 ms
64 bytes from collector-2.domain.name (192.168.17.126): icmp_seq=3 ttl=64 time=0.744 ms

--- collector-2.domain.name ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 0.585/0.716/0.821/0.103 ms
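In a larger cluster, the ping checks can be run in a loop. This is a minimal sketch: ping_all is a hypothetical function name, and the collector names in the usage comment are the example names from this page, not values read from your deployment.

```shell
#!/bin/sh
# Hypothetical helper: ping every name given and report the result for each.
# Returns non-zero if at least one name could not be pinged.
ping_all() {
    failed=0
    for name in "$@"; do
        if ping -c 3 "$name" >/dev/null 2>&1; then
            echo "OK   $name"
        else
            echo "FAIL $name"
            failed=1
        fi
    done
    return "$failed"
}

# Usage from collector-1 (example names):
#   ping_all collector-2 collector-2.domain.name collector-3 collector-3.domain.name
```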

ping fails

If you cannot ping a collector by its hostname or FQDN, review your DNS or add entries for the unreachable collector to the /etc/hosts file of the collector from which ping fails. Such entries in the /etc/hosts file might look like this:

192.168.17.125 collector-1.domain.name collector-1
192.168.17.126 collector-2.domain.name collector-2
192.168.17.127 collector-3.domain.name collector-3
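Before editing the file, it can help to list which collector names have no entry yet. The sketch below is an illustration only: missing_from_hosts is a hypothetical function, and it checks for the presence of a name in a hosts file, not whether the IP address next to it is correct.

```shell
#!/bin/sh
# Hypothetical helper: print each name that has no entry in the given hosts file.
# Arguments: hosts file path, then one or more hostnames or FQDNs.
missing_from_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Strip comments, then look for the name in any field after the IP address.
        if ! awk -v n="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$hosts_file"; then
            echo "$name"
        fi
    done
}

# Usage (example names):
#   missing_from_hosts /etc/hosts collector-2 collector-2.domain.name collector-3 collector-3.domain.name
```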