Exploring Nagios, Part 5: Configuring Service Monitoring

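The main thing Nagios monitors, and the thing we care about most, is services. A service is defined in the same way as a host was in the previous article, and most of the parameters are the same too. The parameters available when defining a service are listed below; most of them were explained in the previous article, so they are not described again here.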

define service {

host_name host_name

service_description service_description

servicegroups servicegroup_names

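is_volatile [0/1] # Whether to enable "volatile" mode. This marks the service as unstable, or risky: once its state changes, it does not recover by itself. This parameter is rarely used, so we will cover it in detail when we actually need it.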

check_command command_name

max_check_attempts #

normal_check_interval #

retry_check_interval #

active_checks_enabled [0/1]

passive_checks_enabled [0/1]

check_period timeperiod_name

parallelize_check [0/1]

obsess_over_service [0/1]

check_freshness [0/1]

freshness_threshold #

event_handler command_name

event_handler_enabled [0/1]

low_flap_threshold #

high_flap_threshold #

flap_detection_enabled [0/1]

process_perf_data [0/1]

retain_status_information [0/1]

retain_nonstatus_information [0/1]

notification_interval #

notification_period timeperiod_name

notification_options [w,u,c,r,f]

notifications_enabled [0/1]

contact_groups contact_groups

stalking_options [o,w,u,c]

}

OK, let's walk through an example.

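1. Monitor the HTTP service (port 80) on the host Web.TEST around the clock. The service is checked every 3 minutes. Once a check fails, it is rechecked every 2 minutes, and after 2 consecutive failed checks the service is treated as actually down and the contact group mygroup is notified. If the service has still not recovered 10 minutes later, the notification is sent again. (The intervals are in minutes as long as interval_length is left at its default of 60 seconds.)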

define service {

host_name Web.TEST

service_description check_tcp 80

check_period 24x7

max_check_attempts 2

normal_check_interval 3

retry_check_interval 2

contact_groups mygroup

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

check_command check_tcp!80

}
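The value check_tcp!80 only works if a command named check_tcp has been defined; the text after the "!" is passed to that command as $ARG1$. The fragment below is a minimal sketch of such a definition, modelled on the sample commands.cfg that ships with Nagios. It assumes that $USER1$ points at the plugin directory (normally set in resource.cfg) and that the check_tcp plugin is installed there; your own commands.cfg may already contain an equivalent definition.

```cfg
# Sketch of the command that "check_tcp!80" refers to.
# Assumption: $USER1$ is the plugin directory, as set in resource.cfg.
define command {
    command_name    check_tcp
    # $HOSTADDRESS$ is taken from the host definition of Web.TEST,
    # $ARG1$ is the first value after "!" in check_command (here: 80).
    command_line    $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$
}
```

With this in place, check_tcp!80 on the host Web.TEST expands to a check_tcp call against that host's address with -p 80.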

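To check a different service, only the service_description and check_command lines of the Web.TEST service definition need to change. For example, to check whether the default ssh service is up: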

define service {

host_name Web.TEST

service_description check_ssh

check_period 24x7

……

check_command check_ssh

}

When checking services, make full use of what the plugins can do.
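As one possible illustration: the check_tcp example only confirms that port 80 accepts connections, while the check_http plugin can request a page and apply response-time thresholds. The sketch below is not part of the original setup. The command name check_http_url, the path /index.html, and the 2 and 5 second thresholds are all hypothetical values chosen for illustration; the -H, -u, -w and -c options are standard check_http options.

```cfg
# Hypothetical command: fetch a URL path and apply response-time thresholds.
# Assumption: $USER1$ is the plugin directory, as set in resource.cfg.
define command {
    command_name    check_http_url
    # $ARG1$ = URL path, $ARG2$ = warning seconds, $ARG3$ = critical seconds
    command_line    $USER1$/check_http -H $HOSTADDRESS$ -u $ARG1$ -w $ARG2$ -c $ARG3$
}

# Same schedule as the check_tcp example, but checking HTTP itself.
define service {
    host_name               Web.TEST
    service_description     check_http index page
    check_period            24x7
    max_check_attempts      2
    normal_check_interval   3
    retry_check_interval    2
    contact_groups          mygroup
    notification_interval   10
    notification_period     24x7
    notification_options    w,u,c,r
    # Path and thresholds are illustrative values only.
    check_command           check_http_url!/index.html!2!5
}
```

Running a plugin by hand with --help lists the options it supports, which is a quick way to see what else a given check can verify.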