problems on their corresponding physical devices (most commonly paper
supply) causing jobs to build up. Since I couldn't find one that
showed the status the way I wanted, I wrote a simple script called
"check_printq".
It's a crude one, but it works. You can find the script below.
tested on:
AIX 5.3/6.1 with Version 2.5.x
Nagios® Core™ 3.2.3
Icinga 1.3.1
NRPE Version: 2.12
WHAT THE SCRIPT DOES
It monitors specific queues on the AIX machine. Since our NRPE
setup was compiled without command argument processing, each monitored
queue needs its own command definition in the NRPE configuration.
The script can also be called with arguments passed from the Nagios
server if your NRPE daemon supports it.
DEFINING IT IN NRPE (AIX SIDE):
edit:
/opt/nagios/etc/nrpe_check_commands.cfg
add:
command[check_PTISAP11]=/opt/nagios/libexec/check_printq -Q PTISAP11 -w 3 -c 10
command[check_PTISAP05]=/opt/nagios/libexec/check_printq -Q PTISAP05 -w 4 -c 5
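Because every queue needs its own command line here, the entries can be generated rather than typed by hand. The following is only a rough sketch and not part of the original setup: the function name gen_nrpe_command is made up for illustration, and the paths and thresholds are the ones used above.

```shell
#!/bin/sh
# Hypothetical helper (not part of check_printq): print one NRPE
# command definition for a queue.
# Arguments: queue name, warning threshold, critical threshold.
gen_nrpe_command() {
    printf 'command[check_%s]=/opt/nagios/libexec/check_printq -Q %s -w %s -c %s\n' \
        "$1" "$1" "$2" "$3"
}

# Example queue list as "name warn crit" triples (values taken from above).
while read -r queue warn crit; do
    gen_nrpe_command "$queue" "$warn" "$crit"
done <<EOF
PTISAP11 3 10
PTISAP05 4 5
EOF
```

The output can be reviewed and then appended to nrpe_check_commands.cfg.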
reload nrpe daemon:
kill -1 $PID
or restart:
/opt/nagios/sbin/nrpe -n -c /opt/nagios/etc/nrpe.cfg -d
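The $PID in the reload step is the process ID of the running nrpe daemon. If your nrpe.cfg sets the pid_file directive, the PID can be read from that file instead of searching the process list. A small sketch, assuming pid_file is set; the helper name nrpe_pid_file is made up, and the demo uses a throwaway config file rather than the real one.

```shell
#!/bin/sh
# Hypothetical helper: print the pid_file path configured in an nrpe.cfg.
nrpe_pid_file() {
    # print whatever follows "pid_file=" on the first matching line
    sed -n 's/^pid_file=//p' "$1" | head -n 1
}

# Demo against a temporary config so nothing on the system is touched.
tmpcfg="${TMPDIR:-/tmp}/nrpe_demo.$$"
printf 'server_port=5666\npid_file=/var/run/nrpe.pid\n' > "$tmpcfg"
nrpe_pid_file "$tmpcfg"
rm -f "$tmpcfg"

# On the AIX host the reload could then look like this (not executed here):
#   kill -1 $(cat "$(nrpe_pid_file /opt/nagios/etc/nrpe.cfg)")
```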
NAGIOS SERVER SIDE DEFINITION
define service{
hostgroup_name AIX-TEST
service_description PTISAP11
check_command check_nrpe!check_PTISAP11
use template-service-test
}
define service{
hostgroup_name AIX-TEST
service_description PTISAP05
check_command check_nrpe!check_PTISAP05
use template-service-test
}
Save and do a syntax check:
sudo nagios -v ../nagios.cfg
and then reload Nagios.
SAMPLE RESULTS
qchk -WA
Queue                Dev            Status       Job Files              User         PP   %  Blks  Cp Rnk
-------------------- -------------- --------- ------ ------------------ ---------- ---- --- ----- --- ---
PTISAP11             hp@PTISAP11    DOWN
                                    QUEUED      1193 STDIN.426216       root                    1   1   1
PTISAP08             hp@PTISAP08    READY
PTISAP0801           hp@PTISAP08    READY
PTISAP05             hp@PTISAP05    READY
PTISAP0501           hp@PTISAP05    READY
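The two values the plugin reports, the queue status and the number of queued jobs, can both be read off output like the above. The following sketch feeds a few of the sample lines to grep and awk, so it can be tried on a machine without qchk. The function names are illustrative only, and the status lookup matches the queue name exactly in the first column, which is slightly stricter than a plain grep for the name.

```shell
#!/bin/sh
# A few lines copied from the sample qchk -WA output (whitespace collapsed).
sample_output() {
cat <<'EOF'
PTISAP11 hp@PTISAP11 DOWN
QUEUED 1193 STDIN.426216 root 1 1 1
PTISAP05 hp@PTISAP05 READY
EOF
}

# Count the lines that represent queued jobs.
count_queued() {
    grep -c QUEUED
}

# Print the status column for one queue, matching column 1 exactly.
queue_status() {
    awk -v q="$1" '$1 == q { print $3; exit }'
}

echo "queued jobs: $(sample_output | count_queued)"
echo "PTISAP11 status: $(sample_output | queue_status PTISAP11)"
echo "PTISAP05 status: $(sample_output | queue_status PTISAP05)"
```

Note that the count here runs over the whole sample; the plugin asks qchk for a single queue, so its count is per queue.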
THE SCRIPT
#! /bin/sh
#####
# check_printq - plugin to check AIX printer queues
# will show queue status and queue job number
# v1.2 - updated some queue checking codes
####
ECHO="/usr/bin/echo"
QCHK="/usr/bin/qchk"
GREP="/usr/bin/egrep"
PROGNAME=`/usr/bin/basename $0`
PROGPATH=`echo $0 | sed -e 's,[\\/][^\\/][^\\/]*$,,'`
REVISION=`echo '$Revision: 1.2 $' | sed -e 's/[^0-9.]//g'`
. $PROGPATH/utils.sh
ShowUsage() {
echo "Usage: $PROGNAME -Q QueueName -w # -c #"
echo "Usage: $PROGNAME --help"
echo "Usage: $PROGNAME --version"
}
ShowHelp() {
print_revision $PROGNAME $REVISION
echo ""
ShowUsage
echo ""
echo "AIX Qdaemon monitor for Nagios"
echo ""
support
}
# Make sure the correct number of command line
# arguments have been supplied
if [ $# -lt 1 ]; then
ShowUsage
exit $STATE_UNKNOWN
fi
# Grab the command line arguments
exitstatus=$STATE_WARNING #default
while test -n "$1"; do
case "$1" in
--help)
ShowHelp
exit $STATE_OK
;;
-h)
ShowHelp
exit $STATE_OK
;;
--version)
print_revision $PROGNAME $REVISION
exit $STATE_OK
;;
-V)
print_revision $PROGNAME $REVISION
exit $STATE_OK
;;
-Q)
qNAME=$2
shift
;;
-w)
qWarnCount=$2
shift
;;
-c)
qCritCount=$2
shift
;;
*)
echo "Unknown argument: $1"
ShowUsage
exit $STATE_UNKNOWN
;;
esac
shift
done
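# make sure the queue name and both thresholds were supplied
if [ -z "${qNAME}" -o -z "${qWarnCount}" -o -z "${qCritCount}" ]
then
ShowUsage
exit $STATE_UNKNOWN
fi
# check if valid printer queue
${QCHK} -WP${qNAME} > /dev/null 2>&1
qCheck=$?
#invalid queue will spit code 64
if [ ${qCheck} -ne 0 ]
then
#if not valid, say so and exit
$ECHO "UNKNOWN: ${qNAME} is not a valid print queue"
exit $STATE_UNKNOWN
fi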
# get jobcount and compare with threshold
qJobCount=`${QCHK} -P${qNAME} | ${GREP} -c QUEUED`
nSTATUS=$STATE_OK
# more than qWarnCount queued jobs = WARNING,
# qCritCount or more queued jobs = CRITICAL
if [ ${qJobCount} -ge ${qCritCount} ]
then
nSTATUS=$STATE_CRITICAL
elif [ ${qJobCount} -gt ${qWarnCount} ]
then
nSTATUS=$STATE_WARNING
fi
# check status if queue is DOWN, READY or RUNNING
qSTAT=`${QCHK} -WP${qNAME} | ${GREP} ${qNAME} | awk '{print $3}'`
case "${qSTAT}" in
READY)
$ECHO "$qNAME is ${qSTAT} with [ ${qJobCount} ] queued Jobs"
exit $nSTATUS
;;
RUNNING)
$ECHO "$qNAME is ${qSTAT} with [ ${qJobCount} ] queued Jobs"
exit $nSTATUS
;;
DOWN)
$ECHO "$qNAME is ${qSTAT} with [ ${qJobCount} ] queued Jobs"
exit $STATE_CRITICAL
;;
*)
# any other state = unknown
${ECHO} "$qNAME has [ ${qJobCount} ] queued Jobs with Unknown STATUS: ${qSTAT}"
exit $STATE_UNKNOWN
;;
esac
exit $nSTATUS
## EOF
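The threshold comparison is the part of the script most worth trying on its own. Below is a sketch of that logic pulled out into a function, assuming the critical threshold is larger than the warning threshold. The name job_state is made up, and the numeric codes are the standard Nagios plugin return codes that utils.sh normally provides.

```shell
#!/bin/sh
# Standard Nagios plugin return codes (normally set by utils.sh).
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2

# Hypothetical helper: job_state COUNT WARN CRIT
# Prints the state code for a given number of queued jobs.
job_state() {
    if [ "$1" -ge "$3" ]; then
        echo $STATE_CRITICAL
    elif [ "$1" -gt "$2" ]; then
        echo $STATE_WARNING
    else
        echo $STATE_OK
    fi
}

# Walk a few job counts through the -w 3 -c 10 thresholds used for PTISAP11.
for n in 0 3 4 9 10 25; do
    echo "jobs=$n state=$(job_state $n 3 10)"
done
```

A count equal to the warning threshold is still OK, while a count equal to the critical threshold is already CRITICAL.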
CHECKING THE SCRIPT
checking from AIX side (local to the script):
run it using:
sh -x /opt/nagios/libexec/check_printq -Q PTISAP11 -w 2 -c 10
This runs the script with tracing enabled (sh -x), so every command
is printed as it executes and you can see exactly what is happening.
from nagios server side:
[user@NAGIOS objects]$ /usr/lib64/nagios/plugins/check_nrpe -H UX0017 -n -p 5666 -c check_PTISAP11; echo $?
PTISAP11 is DOWN with [ 0 ] queued Jobs
2
[user@NAGIOS objects]$ /usr/lib64/nagios/plugins/check_nrpe -H UX0017 -n -p 5666 -c check_PTISAP05; echo $?
PTISAP05 is READY with [ 0 ] queued Jobs
0
The number printed by echo $? is the plugin's return code. It should match the return codes defined in the Nagios plugin guidelines (2 = CRITICAL and 0 = OK in the runs above).
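For reference, the plugin guidelines define four return codes: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. A tiny sketch that turns the number printed by echo $? into its name; state_name is an illustrative helper, not something shipped with Nagios.

```shell
#!/bin/sh
# Hypothetical helper: map a plugin return code to its state name.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "not a plugin return code: $1" ;;
    esac
}

# The two return codes seen in the check_nrpe runs above.
state_name 2
state_name 0
```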
Further reading:
http://nagiosplug.sourceforge.net/developer-guidelines.html#PLUGOUTPUT

Good work! Thank you.
This just made it to Nagios Exchange:
http://exchange.nagios.org/directory/Plugins/Printing/check_printq/details