I was bored tonight, so I wrote a wrapper around hddtemp for Nagios monitoring. My Nagios setup is a bit quirky: I run the local system checks on remote systems via netcat, ipsvd, and a script that handles the query. This lets me monitor remote drive space, current users, total processes, and current load. With hddtemp, I can now also monitor the temperature of the drives in those machines (which gives me an idea of how hot or cold the server room itself is).
This may need some tweaking to work with other Nagios setups, but it shouldn’t be too hard to adapt. One of these days I’ll do a writeup on my Nagios configuration. Anyway, the wrapper script is as follows. It could probably be optimized a bit more, but it works well enough. WordPress doesn’t handle the indents very well, so keep that in mind.
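#!/bin/sh
# Nagios wrapper for hddtemp: reports each drive's temperature and flags
# WARN/CRITICAL when any drive reaches the given thresholds.

usage() {
    echo "${0} -w [warn] -c [crit] [drives]"
}

# Plain /bin/sh: use "=" rather than "==" inside [ ] for portability.
if [ "${1}" = "-h" ] || [ "${1}" = "--help" ]; then
    usage
    exit 0
fi

# Thresholds must come in this order: -w first, then -c.
if [ "${1}" = "-w" ]; then
    shift
    warn="${1}"
    shift
else
    usage
    exit 1
fi

if [ "${1}" = "-c" ]; then
    shift
    crit="${1}"
    shift
else
    usage
    exit 1
fi

# Everything left on the command line is a drive to check.
while [ "${1}" != "" ]; do
    drives="${drives} ${1}"
    shift
done

if [ "${drives}" = "" ]; then
    usage
    exit 1
fi

status=0
smsg=""
htemp=0    # highest temperature seen, reported as performance data

for drive in ${drives}; do
    stats=`/usr/local/sbin/hddtemp ${drive}`
    # hddtemp prints "device: model: temperature"
    model=`echo ${stats} | cut -d ':' -f 2`    # parsed but not used in the output
    temp=`echo ${stats} | cut -d ':' -f 3 | cut -d ' ' -f 2`
    dev=`echo ${drive} | cut -d '/' -f 3`

    # A warning never downgrades an earlier critical.
    if [ "${temp}" -ge "${warn}" ]; then
        if [ "${status}" != "2" ]; then
            status=1
        fi
    fi
    if [ "${temp}" -ge "${crit}" ]; then
        status=2
    fi
    if [ "${temp}" -gt "${htemp}" ]; then
        htemp="${temp}"
    fi
    smsg="${smsg}${dev}=${temp}C; "
done

case "${status}" in
    2) wmsg="CRITICAL" ;;
    1) wmsg="WARN" ;;
    0) wmsg="OK" ;;
esac

echo "HDDTEMP ${wmsg} - ${smsg}|hddtemp=${htemp};${warn};${crit};0"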
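The script above only prints its verdict as text, which is all my query handler needs. A stock Nagios setup reads the plugin's exit code instead (0 for OK, 1 for WARNING, 2 for CRITICAL), so adapting it would probably mean exiting with the status as well. Here is a rough sketch of the threshold logic pulled out into two helpers; `temp_state` and `worst_state` are names I made up for illustration and are not part of the script above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the original wrapper.

# Print the Nagios state for one reading: 0 = OK, 1 = WARNING, 2 = CRITICAL.
# Arguments: temperature, warning threshold, critical threshold.
temp_state() {
    if [ "$1" -ge "$3" ]; then
        echo 2
    elif [ "$1" -ge "$2" ]; then
        echo 1
    else
        echo 0
    fi
}

# Print the worst (highest) state among the arguments.
worst_state() {
    worst=0
    for s in "$@"; do
        if [ "$s" -gt "$worst" ]; then
            worst="$s"
        fi
    done
    echo "$worst"
}
```

In the wrapper itself, the simplest adaptation is likely just adding `exit "${status}"` after the final echo, since `status` already holds 0, 1, or 2.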
In Nagios’ status view, the output looks like:
HDDTEMP OK - hda=22C; sda=24C; sdb=24C;
It’s called as “hddtemp-mon -w 30 -c 35 /dev/hda /dev/sda /dev/sdb”.
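The status view only shows the part of the line before the `|`. Everything after it is performance data in the usual Nagios form `label=value;warn;crit;min`, and here it carries the hottest drive's temperature. As a small sketch of how the line splits, assuming the three readings shown above: the helper names `status_text` and `perf_data` are made up for this example, and the sample line is what the script would print for that call.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Print the human-readable part of a plugin output line (before the first "|").
status_text() {
    printf '%s\n' "${1%%|*}"
}

# Print the performance data (after the first "|"), or nothing if there is none.
perf_data() {
    case "$1" in
        *"|"*) printf '%s\n' "${1#*|}" ;;
    esac
}

# What the wrapper would print for drives at 22, 24 and 24 degrees with -w 30 -c 35.
line='HDDTEMP OK - hda=22C; sda=24C; sdb=24C; |hddtemp=24;30;35;0'

status_text "$line"
perf_data "$line"
```

The second line of output, `hddtemp=24;30;35;0`, is what a graphing add-on would pick up: the value, then the warning and critical thresholds, then the minimum.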