Internal documentation only
This page has been marked as draft.
Manage collector hosts files manually
Collectors need to communicate with the other collectors in their cluster. Collectors also need to resolve themselves through the 127.0.1.1 IP address, otherwise the opsview-messagequeue and other components running on the collector will not work.
Usually, opsview-deploy takes care of collector hosts files and no manual intervention is needed. However, if you have disabled changes to hosts files when running opsview-deploy, manual management of collector hosts files becomes necessary.
Prerequisites
- None. You should only need to follow this process if you have disabled changes to hosts files when running opsview-deploy.
Process
Follow these steps on each collector in your collector cluster.
- Log in to the collector as root.
- Check that the collector’s hostname and FQDN are set correctly using the hostname command:
hostname
hostname -f
These commands should return the collector’s hostname and FQDN respectively. The command output might look like this:
collector-1
collector-1.domain.name
but should not look like this:
localhost
localhost.localdomain
localhost hostname or .localdomain FQDN
If the collector’s hostname is localhost or its FQDN ends with .localdomain, update the hostname or FQDN to their expected values. The hostname and FQDN should match the details in your opsview_deploy.yml file.
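The check described above can be sketched as a small shell function. This is only an illustration: names_look_valid is a hypothetical helper, not part of Opsview, and it tests only for the placeholder values named in this step. It does not compare the names against your opsview_deploy.yml file.

```shell
#!/bin/sh
# names_look_valid is a hypothetical helper, not an Opsview command.
# It returns non-zero when the short hostname is empty or "localhost",
# or when the FQDN is empty, "localhost", or ends with ".localdomain".
names_look_valid() {
    short_name=$1
    fqdn=$2
    case "$short_name" in
        ""|localhost) return 1 ;;
    esac
    case "$fqdn" in
        ""|localhost|*.localdomain) return 1 ;;
    esac
    return 0
}

# Apply the check to the names reported by this machine.
if names_look_valid "$(hostname 2>/dev/null)" "$(hostname -f 2>/dev/null)"; then
    echo "hostname and FQDN look plausible"
else
    echo "hostname or FQDN looks like a placeholder; update it before continuing"
fi
```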
- Check whether /etc/hosts has a 127.0.1.1 entry for the current collector’s hostname and FQDN using grep:
grep "127.0.1.1" /etc/hosts
You should see a result like this:
127.0.1.1 collector-1.domain.name collector-1
No command output
If you get no output from the grep command, /etc/hosts does not contain a 127.0.1.1 entry for the collector. Run this echo command to append the missing entry, with the FQDN first and the short hostname second so that it matches the expected result shown above:
echo "127.0.1.1 $(hostname -f) $(hostname)" >> /etc/hosts
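Running an append command twice would leave a duplicate line in the file. As a minimal sketch, assuming you want the grep check and the append combined into one repeatable step, the following function adds the entry only when no 127.0.1.1 line exists yet. ensure_loopback_entry is a hypothetical helper, and it writes the FQDN first, as in the example entry shown earlier. The hosts file path is a parameter so that you can try the function on a scratch file before pointing it at /etc/hosts.

```shell
#!/bin/sh
# ensure_loopback_entry is a hypothetical helper, not an Opsview command.
# Arguments: hosts file path, FQDN, short hostname.
ensure_loopback_entry() {
    hosts_file=$1
    fqdn=$2
    short_name=$3
    # Do nothing if a line starting with 127.0.1.1 is already present.
    if grep -q '^127\.0\.1\.1[[:space:]]' "$hosts_file"; then
        return 0
    fi
    printf '127.0.1.1 %s %s\n' "$fqdn" "$short_name" >> "$hosts_file"
}

# Demonstration against a scratch file rather than the live /etc/hosts.
scratch=$(mktemp)
printf '127.0.0.1 localhost\n' > "$scratch"
ensure_loopback_entry "$scratch" collector-1.domain.name collector-1
ensure_loopback_entry "$scratch" collector-1.domain.name collector-1  # second call changes nothing
cat "$scratch"
rm -f "$scratch"
```

On a collector, the equivalent call would be ensure_loopback_entry /etc/hosts "$(hostname -f)" "$(hostname)", run as root.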
- Check that the hostname and FQDN of the current collector resolve to a 127.0.1.1 address using ping:
ping -c 3 $(hostname)
ping -c 3 $(hostname -f)
Both commands should succeed, and the first line of each ping output should show 127.0.1.1 as the resolved address. If either command fails, go back to step 1 and try again.
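If you prefer to check the resolved address without reading ping output, the lookup can be sketched with getent. This assumes a glibc-based system where getent ahostsv4 is available. resolves_to is a hypothetical helper that compares only the first IPv4 address returned for a name.

```shell
#!/bin/sh
# resolves_to is a hypothetical helper, not an Opsview command.
# It assumes glibc's `getent ahostsv4 NAME`, which prints one line per
# address with the IPv4 address in the first column.
resolves_to() {
    lookup_name=$1
    expected=$2
    actual=$(getent ahostsv4 "$lookup_name" 2>/dev/null | awk 'NR == 1 { print $1 }')
    [ "$actual" = "$expected" ]
}

# Check both the short hostname and the FQDN of this machine.
for name in "$(hostname 2>/dev/null)" "$(hostname -f 2>/dev/null)"; do
    if resolves_to "$name" 127.0.1.1; then
        echo "OK: $name resolves to 127.0.1.1"
    else
        echo "CHECK: $name does not resolve to 127.0.1.1"
    fi
done
```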
- Verify that the current collector can resolve the hostnames and FQDNs of the other collectors in its cluster using ping:
ping -c 3 collector-2
ping -c 3 collector-2.domain.name
You should be able to ping all the other collectors in the cluster by both hostname and FQDN.
Successful output from pinging a collector with its FQDN might look like this:
PING collector-2.domain.name (192.168.17.126) 56(84) bytes of data.
64 bytes from collector-2.domain.name (192.168.17.126): icmp_seq=1 ttl=64 time=0.585 ms
64 bytes from collector-2.domain.name (192.168.17.126): icmp_seq=2 ttl=64 time=0.821 ms
64 bytes from collector-2.domain.name (192.168.17.126): icmp_seq=3 ttl=64 time=0.744 ms
--- collector-2.domain.name ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 0.585/0.716/0.821/0.103 ms
ping fails
If you cannot ping a collector by its hostname or FQDN, review your DNS or add entries for the unreachable collector to the /etc/hosts file of the collector from which ping fails. Such entries in the /etc/hosts file might look like this:
192.168.17.125 collector-1.domain.name collector-1
192.168.17.126 collector-2.domain.name collector-2
192.168.17.127 collector-3.domain.name collector-3
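When you rely on /etc/hosts rather than DNS, it can help to list which collector names have no entry yet. The following is a sketch under that assumption. missing_from_hosts is a hypothetical helper that prints every given name not found as a hostname or alias in a hosts-format file. It compares names exactly and does not consult DNS, so a name it reports may still resolve by other means.

```shell
#!/bin/sh
# missing_from_hosts is a hypothetical helper, not an Opsview command.
# Arguments: hosts file path, then one or more names to look for.
missing_from_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Strip comments, then compare every name field (2..NF) exactly.
        if ! awk -v want="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$hosts_file"; then
            printf '%s\n' "$name"
        fi
    done
}

# Demonstration on a scratch file that lacks the collector-3 entry.
scratch=$(mktemp)
cat > "$scratch" <<'EOF'
192.168.17.125 collector-1.domain.name collector-1
192.168.17.126 collector-2.domain.name collector-2
EOF
missing_from_hosts "$scratch" collector-2 collector-3 collector-3.domain.name
rm -f "$scratch"
```

In this demonstration the function prints collector-3 and collector-3.domain.name, the two names that would still need an entry.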