In Nagios, when I force a host check, I can clearly see the command in the logs:

[1279640879] EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;random-host.domain.org;1279640878

However, it never actually runs the check-host-alive check, which is check_ping.

It doesn't appear in the scheduling queue, and even though the host is not pingable and running the command by hand fails, the host still shows as being up.

/usr/local/nagios/libexec/check_ping -H random-host.domain.org -w 3000.0,80% -c 5000.0,100% -p 1
CRITICAL - Host Unreachable (random-host.domain.org)

Host Status:      UP    
Status Information: PING OK - Packet loss = 0%, RTA = 1.10 ms
Performance Data:   
Current Attempt:    1/10
State Type: HARD
Last Check Type:    ACTIVE
Last Check Time:    07-19-2010 16:02:03
Status Data Age:    0d 0h 0m 30s
Next Scheduled Active Check:    N/A
Latency:    0.000 seconds
Check Duration: 0.053 seconds
Last State Change:  04-01-2009 15:58:50
Current State Duration: 474d 0h 3m 43s
Last Host Notification: N/A
Current Notification Number:    0  
Is This Host Flapping?    NO  
Percent State Change:   0.00%
In Scheduled Downtime?    NO  
Last Update:    07-19-2010 16:02:25

The host above was down for almost 30 minutes and we received no alert, yet when I grep back through my archive logs I can see where other hosts have been reported down:

HOST ALERT: some-other-random-host.org;DOWN;SOFT;1;CRITICAL - Host Unreachable (some-other-random-host.org)

My host and check-host-alive settings are:

define host{
        name                            linuxprod-server
        use                             generic-host
        check_period                    24x7
        max_check_attempts              10
        check_command                   check-host-alive
        notification_period             24x7
        notification_interval           120
        notification_options            d,u,r
        contact_groups                  linux admins
        register                        0
        }

and check-host-alive is defined here:

# 'check-host-alive' command definition
define command{
        command_name    check-host-alive
        command_line    $USER1$/check_ping -H $HOSTALIAS$ -w 3000.0,80% -c 5000.0,100% -p 1
        }

Thoughts?

asked 20 Jul '10, 16:04

rfelsburg ♦
The catch is that it only looks like the check isn't running: it turns out Nagios only logs failed checks unless you build it with --enable-DEBUG3.

It took me a couple of days of sorting through the logs to realize that I was only seeing SERVICE ALERT entries in the log files.
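
For anyone checking the same thing, here is a minimal Python sketch that tallies the entry types in a Nagios log, which makes it easy to see whether anything other than alerts and external commands is being written. It assumes the `[epoch] TYPE: details` layout of the lines quoted in the question; the function names and the timestamp on the second sample line are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Assumed layout, based on the lines quoted above: [epoch] ENTRY TYPE: details
LOG_LINE = re.compile(r"^\[(\d+)\] ([^:]+): (.*)$")

def parse_line(line):
    """Return (utc_datetime, entry_type, details), or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    timestamp = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
    return timestamp, match.group(2), match.group(3)

def count_entry_types(lines):
    """Tally how often each entry type (HOST ALERT, EXTERNAL COMMAND, ...) appears."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            counts[parsed[1]] += 1
    return counts

# The first line is from the question; the timestamp on the second is hypothetical.
sample = [
    "[1279640879] EXTERNAL COMMAND: SCHEDULE_FORCED_HOST_CHECK;random-host.domain.org;1279640878",
    "[1279640900] HOST ALERT: some-other-random-host.org;DOWN;SOFT;1;CRITICAL - Host Unreachable (some-other-random-host.org)",
]

for entry_type, count in count_entry_types(sample).items():
    print(entry_type, count)
```

To run it against a real log, pass an open file object to `count_entry_types` instead of `sample`. If successful host checks never show up in the tally, that matches the behaviour described above rather than indicating that the checks are not running.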

Still haven't figured out why it was reporting as up, but I'll get to that next.
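
One thing that may be worth ruling out, and this is only a guess from the config in the question: the command_line passes `$HOSTALIAS$` to check_ping, while the manual test pinged the hostname directly. If the alias resolves to something other than the host in question, the two could give different results. The sketch below only imitates macro substitution to show what each variant would expand to; it is not how Nagios itself does it, and the alias and address values are hypothetical.

```python
import re

# Matches $NAME$ style macros such as $USER1$ or $HOSTALIAS$.
MACRO = re.compile(r"\$([A-Z0-9_]+)\$")

def expand(command_line, macros):
    """Replace each $NAME$ with macros[NAME]; unknown macros are left untouched."""
    return MACRO.sub(lambda m: macros.get(m.group(1), m.group(0)), command_line)

# command_line as defined in the question.
command_line = "$USER1$/check_ping -H $HOSTALIAS$ -w 3000.0,80% -c 5000.0,100% -p 1"

# Hypothetical values; USER1 is taken from the path used in the manual test.
macros = {
    "USER1": "/usr/local/nagios/libexec",
    "HOSTALIAS": "some-alias",
    "HOSTADDRESS": "random-host.domain.org",
}

as_configured = expand(command_line, macros)
with_address = expand(command_line.replace("$HOSTALIAS$", "$HOSTADDRESS$"), macros)

print(as_configured)
print(with_address)
```

Comparing the two printed commands against what was run by hand shows whether the scheduled check and the manual test are actually pinging the same target.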

answered 21 Jul '10, 13:55

rfelsburg ♦

Last updated: 04 Aug '10, 14:24
