Nagios is a useful tool for monitoring many servers and their services. At a glance (or via email) you can see which services or hosts are experiencing problems.
Monitoring public services such as HTTP, FTP, LDAP and SSH is relatively easy, but to go a little further and check disk or swap usage on a remote machine, use NRPE.
The assumption below is that the Nagios server runs Debian/Ubuntu and the server to be monitored runs CentOS/Red Hat, possibly a bit weird but …
On the server to be monitored
On CentOS/Red Hat you must have the RPMforge repositories installed:
wget http://packages.sw.be/rpmforge-release/rpmforge-release-0.3.6-1.el5.rf.i386.rpm
rpm -Uhv rpmforge*
and then:
yum install nagios-nrpe nagios-plugins-nrpe
To ensure the service is started at boot use:
chkconfig nrpe on
To view the installed plugins:
ls /usr/lib/nagios/plugins
or, on 64-bit systems:
ls /usr/lib64/nagios/plugins
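Since the plugin directory differs between 32-bit and 64-bit installs, a small shell helper can pick whichever one exists. This is only a sketch; the function name first_existing_dir is made up for illustration.

```shell
# Print the first argument that is an existing directory.
# Returns non-zero if none of them exist.
first_existing_dir() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# The two locations mentioned above; the 64-bit path is tried first.
if PLUGIN_DIR=$(first_existing_dir /usr/lib64/nagios/plugins /usr/lib/nagios/plugins); then
    ls "$PLUGIN_DIR"
else
    echo "no nagios plugin directory found" >&2
fi
```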
Next add the nagios machine to the allowed hosts:
vi /etc/nagios/nrpe.cfg
allowed_hosts=127.0.0.1,192.168.0.100
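If you would rather script the change than edit the file by hand, something like the following should work. It is a sketch: set_allowed_hosts is a hypothetical helper, it assumes the file already contains an uncommented allowed_hosts= line, and it leaves a .bak copy next to the original.

```shell
# Usage: set_allowed_hosts CONFIG_FILE COMMA_SEPARATED_HOSTS
set_allowed_hosts() {
    cfg=$1
    hosts=$2
    # "|" is used as the sed delimiter so the replacement may contain "/".
    sed -i.bak "s|^allowed_hosts=.*|allowed_hosts=$hosts|" "$cfg"
}

# Example (run as root on the monitored server):
# set_allowed_hosts /etc/nagios/nrpe.cfg 127.0.0.1,192.168.0.100
```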
Then start the NRPE service:
service nrpe start
On the server doing the monitoring
Install the NRPE plugin on the monitoring server:
apt-get install nagios-nrpe-plugin
cd /usr/local/nagios/libexec
ln -s /usr/lib/nagios/plugins/check_nrpe ./check_nrpe
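Before wiring anything into Nagios it can help to run check_nrpe by hand and look at its exit status, which follows the usual plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The helper below is a sketch; nagios_status_name is a made-up name, the address 192.168.0.50 stands in for the monitored server, and mapping every other code to UNKNOWN is a simplification.

```shell
# Translate a plugin exit status into its conventional name.
nagios_status_name() {
    case $1 in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Example, run on the monitoring server (hypothetical host address):
# /usr/lib/nagios/plugins/check_nrpe -H 192.168.0.50 -c check_load
# nagios_status_name $?
```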
Edit the commands file:
vi /usr/local/nagios/etc/objects/commands.cfg
define command {
command_name check_nrpe_load
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_load
}
Edit the services file:
vi /usr/local/nagios/etc/objects/services.cfg
define service{
use generic-service
host_name clam1
service_description LOAD
check_command check_nrpe_load
}
Restart nagios:
/etc/init.d/nagios restart
Check the web interface and you should see the new service being monitored.
There is a security issue with passing command arguments to NRPE, so argument passing is disabled by default. In the logs I saw:
nrpe[12196]: Error: Request contained command arguments, but argument option is not enabled!
so rather than:
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_disk -w 5% -c 2%
we actually have to create a script on the monitored host with the arguments hardwired, i.e.
vi /usr/lib64/nagios/plugins/check_my_disk
#!/bin/sh
/usr/lib64/nagios/plugins/check_disk -w 5% -c 2%
chmod a+x /usr/lib64/nagios/plugins/check_my_disk
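If several hardwired checks are needed, the create-and-chmod steps can be wrapped in a small function. This is a sketch only: make_wrapper is a hypothetical name, and it assumes the path and arguments contain no spaces or shell metacharacters, since they are written into the script unquoted.

```shell
# Usage: make_wrapper PLUGIN WRAPPER [ARGS...]
# Writes an executable script WRAPPER that runs PLUGIN with ARGS hardwired.
make_wrapper() {
    plugin=$1
    wrapper=$2
    shift 2
    {
        printf '#!/bin/sh\n'
        printf 'exec %s' "$plugin"
        # Append one " ARG" per remaining argument.
        if [ "$#" -gt 0 ]; then
            printf ' %s' "$@"
        fi
        printf '\n'
    } > "$wrapper"
    chmod a+x "$wrapper"
}

# Example (run as root on the monitored server):
# make_wrapper /usr/lib64/nagios/plugins/check_disk \
#     /usr/lib64/nagios/plugins/check_my_disk -w 5% -c 2%
```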
Restart nrpe:
/etc/init.d/nrpe restart
and then from the nagios server call:
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_my_disk
The command also needs to be defined on the NRPE machine in /etc/nagios/nrpe.cfg, because check_nrpe -c only accepts names that have a command[...] entry there.
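NRPE resolves the name given to -c through command[name]=... lines in its config, so a definition along these lines is what is needed. The helper below is a sketch that appends the entry only if it is missing; define_nrpe_command is a hypothetical name and the config path in the example is the one used earlier.

```shell
# Usage: define_nrpe_command CONFIG_FILE NAME COMMAND_LINE
define_nrpe_command() {
    cfg=$1
    name=$2
    cmdline=$3
    # "\[" matches a literal "["; skip if an uncommented entry already exists.
    if grep -q "^command\[$name]=" "$cfg"; then
        return 0
    fi
    printf 'command[%s]=%s\n' "$name" "$cmdline" >> "$cfg"
}

# Example (run as root on the monitored server, then restart nrpe):
# define_nrpe_command /etc/nagios/nrpe.cfg check_my_disk \
#     /usr/lib64/nagios/plugins/check_my_disk
```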