I was bored tonight, so I wrote a wrapper around hddtemp for Nagios monitoring. I have a bit of a quirky Nagios setup where I run the local system checks on remote systems via netcat, ipsvd, and a script that handles the query. This lets me monitor remote drive space, current users, total processes, and current load. Using hddtemp, I can now also monitor the temperature of the drives in those machines (which gives me an idea of how hot or cold the server room itself is).

This may need some tweaking to work with other Nagios setups, but it shouldn’t be too hard to adapt. One of these days I’ll do a writeup on my Nagios configuration. Anyway, the wrapper script is as follows. It could probably be optimized a bit more, but it works well enough. WordPress doesn’t handle the indents very well, so keep that in mind.

#!/bin/sh

usage() {
    echo "${0} -w [warn] -c [crit] [drives]"
}

if [ "${1}" = "-h" ] || [ "${1}" = "--help" ]; then
    usage
    exit 0
fi
if [ "${1}" = "-w" ]; then
    shift
    warn="${1}"
    shift
else
    usage
    exit 1
fi
if [ "${1}" = "-c" ]; then
    shift
    crit="${1}"
    shift
else
    usage
    exit 1
fi
while [ "${1}" != "" ]; do
    drives="${drives} ${1}"
    shift
done
if [ "${drives}" = "" ]; then
    usage
    exit 1
fi

status=0
smsg=""
htemp=0

for drive in ${drives}; do
    msg=""
    stats=`/usr/local/sbin/hddtemp ${drive}`
    model=`echo ${stats} | cut -d ':' -f 2`
    temp=`echo ${stats} | cut -d ':' -f 3 | cut -d ' ' -f 2`
    dev=`echo ${drive}|cut -d '/' -f 3`

    if [ "${temp}" -ge "${warn}" ]; then
        if [ "${status}" != "2" ]; then
            status=1
        fi
    fi

    if [ "${temp}" -ge "${crit}" ]; then
        status=2
    fi

    if [ "${temp}" -gt "${htemp}" ]; then
        htemp="${temp}"
    fi

    smsg="${smsg}${dev}=${temp}C; "
done

case "${status}" in
    2)
        wmsg="CRITICAL"
        ;;
    1)
        wmsg="WARN"
        ;;
    0)
        wmsg="OK"
        ;;
esac

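echo "HDDTEMP ${wmsg} - ${smsg}|hddtemp=${htemp};${warn};${crit};0"

# A stock Nagios setup reads the state from the exit code (0 OK, 1 WARNING, 2 CRITICAL).
exit ${status}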

The output, in Nagios’ status view, looks like:

HDDTEMP OK - hda=22C; sda=24C; sdb=24C; 

It’s called as “hddtemp-mon -w 30 -c 35 /dev/hda /dev/sda /dev/sdb”.
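
If you want to adapt the wrapper, the parsing and threshold logic can be pulled out into two small functions that are easy to test without a real drive. This is only a rough sketch, not what the script above does: parse_temp and classify_temp are names I made up for illustration, and the sample line format ("/dev/sda: MODEL: 35 C") is an assumption about what hddtemp prints on your system. The sketch also returns the Nagios UNKNOWN state (3) when the reading isn't a number, which the wrapper above doesn't handle.

```shell
#!/bin/sh

# Hypothetical helper: extract the temperature from one line of hddtemp
# output. Takes the text after the last ": " and keeps its leading digits,
# so "35 C" yields "35". Prints an empty string if there is no number.
parse_temp() {
    printf '%s\n' "${1}" | sed -e 's/.*: //' -e 's/[^0-9].*$//'
}

# Hypothetical helper: map a temperature to a Nagios state code.
# Arguments: temperature, warning threshold, critical threshold.
# Prints 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN).
classify_temp() {
    t="${1}"; w="${2}"; c="${3}"
    case "${t}" in
        # Empty or containing anything other than digits: no usable reading.
        ''|*[!0-9]*) echo 3; return ;;
    esac
    if [ "${t}" -ge "${c}" ]; then
        echo 2
    elif [ "${t}" -ge "${w}" ]; then
        echo 1
    else
        echo 0
    fi
}
```

For example, classify_temp "$(parse_temp "/dev/sda: SOMEMODEL: 31 C")" 30 35 prints 1, since 31 is at or above the warning threshold but below the critical one.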
