8. Nagios Installation

8.1 Download the required files

nagios-3.4.1.tar.gz

nagios-plugins-1.4.16.tar.gz

ndoutils-1.5.2.tar.gz

npc2.0.4.tar.gz (this one is hard to find)

nrpe-2.13.tar.gz

8.2 Installing Nagios and the Nagios plugins

tar zxvf nagios-3.4.1.tar.gz
cd nagios-3.4.1
./configure --prefix=/var/www/localhost/htdocs/nagios
make all
mkdir /var/www/localhost/htdocs/nagios
# Create the group first so that useradd can use it as the primary group
groupadd nagios
useradd -g nagios nagios
passwd nagios
# -a appends, so apache keeps its existing supplementary groups
usermod -a -G nagios apache
chown -R nagios:nagios /var/www/localhost/htdocs/nagios
make install
make install-init
make install-commandmode
make install-config
cd ..
tar zxvf nagios-plugins-1.4.16.tar.gz
cd nagios-plugins-1.4.16
./configure --prefix=/var/www/localhost/htdocs/nagios/
make
make install

8.3 Editing httpd.conf

vi /etc/apache2/httpd.conf

Add:

# setting for nagios 20120815
ScriptAlias /nagios/cgi-bin "/var/www/localhost/htdocs/nagios/sbin"
<Directory "/var/www/localhost/htdocs/nagios/sbin">
    Options ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "Nagios Access"
    AuthType Basic
    AuthUserFile /var/www/localhost/htdocs/nagios/etc/htpasswd
    # Authentication file that controls access to this directory
    Require valid-user
</Directory>

Alias /nagios "/var/www/localhost/htdocs/nagios/share"
<Directory "/var/www/localhost/htdocs/nagios/share">
    Options None
    AllowOverride None
    Order allow,deny
    Allow from all
    AuthName "Nagios Access"
    AuthType Basic
    AuthUserFile /var/www/localhost/htdocs/nagios/etc/htpasswd
    # Authentication file that controls access to this directory
    Require valid-user
</Directory>

 

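8.4 Adding an authentication user

htpasswd -c /var/www/localhost/htdocs/nagios/etc/htpasswd nagios

View the contents of the authentication file:

less /var/www/localhost/htdocs/nagios/etc/htpasswd

At this point the Nagios home page is reachable. After logging in, though, nothing except the home page opens yet; the remaining pages need the configuration in the following sections.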

8.5 Nagios configuration

cd /var/www/localhost/htdocs/nagios/etc
vi nagios.cfg

#cfg_file=/var/www/localhost/htdocs/nagios/etc/objects/localhost.cfg
cfg_file=/var/www/localhost/htdocs/nagios/etc/objects/contactgroups.cfg
cfg_file=/var/www/localhost/htdocs/nagios/etc/objects/contacts.cfg
cfg_file=/var/www/localhost/htdocs/nagios/etc/objects/hostgroups.cfg
cfg_file=/var/www/localhost/htdocs/nagios/etc/objects/hosts.cfg
cfg_file=/var/www/localhost/htdocs/nagios/etc/objects/services.cfg
cfg_file=/var/www/localhost/htdocs/nagios/etc/objects/timeperiods.cfg
check_external_commands=1
command_check_interval=10s
#command_check_interval=-1

vi cgi.cfg

authorized_for_system_information=nagiosadmin,nagios
authorized_for_configuration_information=nagiosadmin,nagios
authorized_for_system_commands=nagios
authorized_for_all_services=nagiosadmin,nagios
authorized_for_all_hosts=nagiosadmin,test,nagios
authorized_for_all_service_commands=nagiosadmin,nagios
authorized_for_all_host_commands=nagiosadmin,nagios

 

cd objects
ls

The following configuration files are present:

commands.cfg  services.cfg  windows.cfg  switch.cfg  contacts.cfg  localhost.cfg  templates.cfg  printer.cfg  timeperiods.cfg

Back up the stock files, then start editing:

mv contacts.cfg contacts.cfg.backup
mv timeperiods.cfg timeperiods.cfg.backup

vi timeperiods.cfg (optional; the stock timeperiods.cfg template is already very complete)

 

define timeperiod{
    timeperiod_name     24x7
    alias               24 Hours A Day, 7 Days A Week
    sunday              00:00-24:00
    monday              00:00-24:00
    tuesday             00:00-24:00
    wednesday           00:00-24:00
    thursday            00:00-24:00
    friday              00:00-24:00
    saturday            00:00-24:00
}

 

 

vi contacts.cfg

 

define contact{
    contact_name                    nagios
    alias                           nagios admin
    service_notification_period     24x7
    host_notification_period        24x7
    service_notification_options    w,u,c,r
    host_notification_options       d,u,r
    service_notification_commands   notify-service-by-email
    host_notification_commands      notify-host-by-email
    email                           aaa@abc.com
    pager                           137********
    address1                        CHN
    address2                        SHA
}

 

 

vi contactgroups.cfg

 

define contactgroup{
    contactgroup_name   nagios
    alias               nagios Administrators
    members             nagios
}

 

 

vi hosts.cfg

 

define host{
    host_name               localhost
    alias                   localhost
    address                 192.168.254.123
    check_command           check-host-alive
    max_check_attempts      5
    check_period            24x7
    contact_groups          nagios
    notification_interval   10
    notification_period     24x7
    notification_options    d,u,r
}

 

 

vi hostgroups.cfg

define hostgroup{
    hostgroup_name  hostgroups
    alias           hostgroups
    members         localhost
}

 

 

vi services.cfg

# service definition
define service{
    host_name               localhost
    service_description     check-host-alive
    check_command           check-host-alive
    max_check_attempts      5
    normal_check_interval   3
    retry_check_interval    2
    check_period            24x7
    notification_interval   10
    notification_period     24x7
    notification_options    w,u,c,r
    contact_groups          nagios
}

 

 

8.6 Test run

cd /var/www/localhost/htdocs/nagios/
bin/nagios -v etc/nagios.cfg

Output on success:

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check

If Total Errors is not 0, fix the configuration according to the messages. Once Total Errors is 0, run the command below. You can also put it in a script and add it to system startup; remember to use full paths there, because the command below uses paths relative to the current directory.

bin/nagios -d etc/nagios.cfg

Installation is complete; the web interface is now available under /nagios on the server.
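
The start-up script mentioned above could look roughly like this. It is only a sketch: preflight_errors and start_nagios are hypothetical helper names, and NAGIOS_HOME assumes the --prefix used in this guide. It runs the pre-flight check with full paths and starts the daemon only when Total Errors is 0.

```shell
#!/bin/sh
# Assumption: Nagios was installed with the --prefix used in this guide.
NAGIOS_HOME=${NAGIOS_HOME:-/var/www/localhost/htdocs/nagios}

# Read `nagios -v` output on stdin and print the number after "Total Errors:".
preflight_errors() {
    sed -n 's/^Total Errors:[[:space:]]*\([0-9][0-9]*\).*$/\1/p' | tail -n 1
}

# Validate the configuration, then start the daemon only if it is clean.
start_nagios() {
    errors=$("$NAGIOS_HOME/bin/nagios" -v "$NAGIOS_HOME/etc/nagios.cfg" | preflight_errors)
    if [ "$errors" = "0" ]; then
        "$NAGIOS_HOME/bin/nagios" -d "$NAGIOS_HOME/etc/nagios.cfg"
    else
        echo "pre-flight check failed (Total Errors: ${errors:-unknown})" >&2
        return 1
    fi
}
```

Source the file from a local start-up script and call start_nagios; nothing runs until that function is called.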

8.6.1 Problems during the Nagios test

Contact group 'admins' specified in service 'Hosts' for host 'windows server' is not defined anywhere!

Solution (either of the following):

8.6.1.1 Change the admins group in templates.cfg to the nagios group defined in contactgroups.cfg.

8.6.1.2 Or change contact_groups nagios to admins in objects/services.cfg.
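
To find this kind of mismatch before running the pre-flight check, the object files can be scanned for contact groups that are referenced but never defined. A rough sketch, assuming one directive per line as in the files above; undefined_contact_groups is a hypothetical helper, not a Nagios tool.

```shell
#!/bin/sh
# Usage: undefined_contact_groups FILE...
# Prints every group named in a contact_groups directive that has no
# matching contactgroup_name definition in the given files.
undefined_contact_groups() {
    defined=$(awk '$1 == "contactgroup_name" { print $2 }' "$@" | sort -u)
    awk '$1 == "contact_groups" {
             sub(/;.*/, "")                          # drop trailing comments
             sub(/^[ \t]*contact_groups[ \t]+/, "")  # keep only the value
             gsub(/[ \t]/, "")                       # tolerate "a, b" lists
             n = split($0, groups, ",")
             for (i = 1; i <= n; i++) print groups[i]
         }' "$@" |
        sort -u |
        while read -r group; do
            printf '%s\n' "$defined" | grep -qxF "$group" || echo "$group"
        done
}
```

For example, run it from the etc/objects directory as undefined_contact_groups *.cfg; with the setup in this guide it would report admins until one of the two fixes is applied.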

 

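8.7 Monitoring a Windows server with Nagios

Reference document

Goal: have Nagios monitor services on a Windows server, as shown in the figure below.

8.7.1 Installing NSClient++ (on the monitored client)

NSClient++ download: the version used here is 0.3.8 (x64).

Installation needs little explanation: double-click the installer on Windows. The installer asks for the Nagios server address; enter yours (192.168.254.123 here). The password is optional (left empty here). Select all the other options. The default installation path is c:\program files.

8.7.2 Configuring NSClient++ (on the monitored client)

Open nsc.ini and make the following changes.

8.7.3 Nagios configuration changes (on the monitoring server)

cd /var/www/localhost/htdocs/nagios
vi etc/objects/windows.cfg

This is essentially the stock configuration with minor changes.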

 

 

define host{
    use         windows-server
    host_name   windows-server
    alias       My Windows Server
    address     192.168.254.1
}

Host settings.

define hostgroup{
    hostgroup_name  windows-servers
    alias           Windows Servers
}

Host group.

define service{
    use                     generic-service
    host_name               windows-server
    service_description     NSClient++ 0.3.8 Version
    check_command           check_nt!CLIENTVERSION
}

Reports the NSClient++ version on the client.

define service{
    use                     generic-service
    host_name               windows-server
    service_description     Uptime
    check_command           check_nt!UPTIME
}

Uptime monitoring.

define service{
    use                     generic-service
    host_name               windows-server
    service_description     CPU Load
    check_command           check_nt!CPULOAD!-l 5,80,90
}

CPU monitoring over a 5-minute average: warning at 80%, critical at 90%.

define service{
    use                     generic-service
    host_name               windows-server
    service_description     Memory Usage
    check_command           check_nt!MEMUSE!-w 80 -c 90
}

Memory monitoring: warning at 80%, critical at 90%.

define service{
    use                     generic-service
    host_name               windows-server
    service_description     C:\ Drive Space
    check_command           check_nt!USEDDISKSPACE!-l c -w 90 -c 95
}

C: drive usage monitoring: -l is followed by the drive letter; warning at 90%, critical at 95%.
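
The -w and -c values in the memory and disk checks above follow the usual Nagios plugin convention: a reading at or above the critical threshold is CRITICAL, otherwise at or above the warning threshold it is WARNING. The sketch below only illustrates that mapping; classify_usage is a hypothetical helper, it assumes integer percentages, and it is not how check_nt itself is implemented.

```shell
#!/bin/sh
# Usage: classify_usage VALUE WARN CRIT   (all integer percentages)
# Prints the state a threshold check would report for VALUE.
classify_usage() {
    value=$1 warn=$2 crit=$3
    if [ "$value" -ge "$crit" ]; then
        echo CRITICAL
    elif [ "$value" -ge "$warn" ]; then
        echo WARNING
    else
        echo OK
    fi
}
```

With the disk thresholds above, classify_usage 92 90 95 prints WARNING and classify_usage 96 90 95 prints CRITICAL.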

 

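define service{
    use                     generic-service
    host_name               windows-server
    service_description     netlogon
    check_command           check_nt!SERVICESTATE!-d SHOWALL -l netlogon
}

The stock configuration monitors the W3SVC service (IIS). The test machine is my PC, which has no IIS, so I switched to the netlogon service instead. Every service check uses this same format.

Restart the Nagios service.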

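8.7.4 Problems encountered during testing

8.7.4.1 A critical error appears

I no longer remember the exact message. After a long investigation it turned out that the McAfee firewall on Windows was blocking the connection; switching to the IP address of a virtual machine fixed it. The same problem can also occur across network segments. In that case, edit commands.cfg and append -t 30 to the command; when the option is omitted the default timeout is 10 seconds.

8.7.4.2 The following messages appear

NSClient - ERROR: Could not get data for 5 perhaps we don't collect data this far back?
NSClient - ERROR: Could not get value

Solution:

Open CMD, change to the NSClient++ installation directory, and run:

nsclient++ /test
lodctr /R
nsclient++ /test

Reference

http://www.nsclient.org/nscp/wiki/FAQ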
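
For the timeout fix mentioned for the critical error, the -t 30 option goes into the check_nt command definition in commands.cfg. A sketch, assuming the stock check_nt definition from the sample commands.cfg is still in place (the port 12489 comes from that sample, not from this guide):

```
define command{
    command_name    check_nt
    command_line    $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -t 30 -v $ARG1$ $ARG2$
}
```

Run the pre-flight check again and restart Nagios after changing the definition.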

 

 

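8.8 Monitoring a remote Linux machine with Nagios

The principle is the same as monitoring the local host. The difference is that nrpe and the Nagios plugins must be installed on the monitored machine, where the checks actually run; the monitoring server then fetches the results and displays them.

Since I tested under VMware, I simply cloned the monitoring host for convenience and named the clone Testclient. The original monitoring host is Test (default IP address 192.168.254.123, running the full cacti + nagios + ntop monitoring suite). Boot Testclient.

8.8.1 On the monitored machine

Edit /etc/conf.d/net to change the IP address:

vi /etc/conf.d/net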

 

modules=("ifconfig")

config_eth0=("192.168.254.124 netmask 255.255.255.0 brd 192.168.254.255")

routes_eth0=("default via 192.168.254.2")

 

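Change the hostname (optional):

hostname Testclient

Add the nagios user and set its password (can be skipped on a cloned machine):

useradd nagios
passwd nagios

Install the Nagios plugins (can be skipped on a cloned machine):

cd nagios-plugins-1.4.16
./configure --prefix=/var/www/localhost/htdocs/nagios/
make && make install

Install the nrpe monitoring software:

cd nrpe-2.13
./configure --prefix=/var/www/localhost/htdocs/nagios/
make all

Install the check_nrpe plugin:

make install-plugin

As mentioned earlier, only the monitoring host needs the check_nrpe plugin; the monitored host does not. It is installed here for testing purposes.

Install the daemon:

make install-daemon

Install the configuration file:

make install-daemon-config

Install the xinetd script:

make install-xinetd

Edit the script:

vi /etc/xinetd.d/nrpe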

 

service nrpe
{
    flags           = REUSE
    socket_type     = stream
    port            = 5666
    wait            = no
    user            = nagios
    group           = nagios
    server          = /var/www/localhost/htdocs/nagios/bin/nrpe
    server_args     = -c /var/www/localhost/htdocs/nagios/etc/nrpe.cfg --inetd
    log_on_failure  += USERID
    disable         = no
    only_from       = 127.0.0.1 192.168.254.123
}

 

Start the nrpe service:

cd /var/www/localhost/htdocs/nagios/
ln -s etc/nrpe.cfg nrpe
vi nrpe

Modify allowed_hosts:

allowed_hosts=127.0.0.1,192.168.254.123

bin/nrpe -c etc/nrpe.cfg -d

Check the status:

netstat -at | grep nrpe
tcp        0      0 *:nrpe                  *:*                     LISTEN

netstat -an | grep :5666
tcp        0      0 0.0.0.0:5666            0.0.0.0:*               LISTEN

OK!
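
The same check can be scripted, for example to verify the listener after a reboot. A minimal sketch, assuming netstat -an style output on stdin; is_listening is a hypothetical helper name.

```shell
#!/bin/sh
# Usage: netstat -an | is_listening PORT
# Succeeds when the input contains a TCP socket in LISTEN state whose
# local address ends in ":PORT".
is_listening() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $NF == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}
```

Typical use: netstat -an | is_listening 5666 && echo "nrpe is listening".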

Add the monitoring commands (in nrpe.cfg):

 

command[check_users]=/var/www/localhost/htdocs/nagios/libexec/check_users -w 5 -c 10

command[check_load]=/var/www/localhost/htdocs/nagios/libexec/check_load -w 15,10,5 -c 30,25,20

command[check_hda1]=/var/www/localhost/htdocs/nagios/libexec/check_disk -w 20% -c 10% -p /

command[check_zombie_procs]=/var/www/localhost/htdocs/nagios/libexec/check_procs -w 5 -c 10 -s Z

command[check_total_procs]=/var/www/localhost/htdocs/nagios/libexec/check_procs -w 150 -c 200

command[check_free_swap]=/var/www/localhost/htdocs/nagios/libexec/check_swap -w 20% -c 10%

 

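For the exact command syntax, run /var/www/localhost/htdocs/nagios/libexec/check_nrpe -h. Note the command name in square brackets: that is the name the monitoring host uses.

Restart the nrpe service.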
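
Because check_nrpe was also installed on the monitored machine, each command can be tried locally before the monitoring host is configured. A small sketch; probe_nrpe is a hypothetical helper, and the CHECK_NRPE default assumes the --prefix used in this guide.

```shell
#!/bin/sh
# Assumption: check_nrpe lives under the prefix used in this guide.
CHECK_NRPE=${CHECK_NRPE:-/var/www/localhost/htdocs/nagios/libexec/check_nrpe}

# Usage: probe_nrpe HOST COMMAND...
# Runs each named NRPE command against HOST and prints "name: plugin output".
probe_nrpe() {
    host=$1
    shift
    for name in "$@"; do
        printf '%s: %s\n' "$name" "$("$CHECK_NRPE" -H "$host" -c "$name" 2>&1)"
    done
}
```

For example, probe_nrpe 127.0.0.1 check_users check_load check_free_swap prints one line per command; a connection or "not defined" error here usually points at allowed_hosts or a misspelled command name in nrpe.cfg.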

8.8.2 On the monitoring host

This part is simple; it works the same way as monitoring the local host.

Define the host:

define host{
    host_name               LinuxClient
    alias                   ZhengzhouPC
    address                 192.168.254.124
    check_command           check-host-alive
    max_check_attempts      5
    check_period            24x7
    contact_groups          nagios
    notification_interval   10
    notification_period     24x7
    notification_options    d,u,r
}

Define the services; one of them is shown here:

define service{
    host_name               LinuxClient
    service_description     checkfreeswap
    check_command           check_nrpe!check_free_swap
    max_check_attempts      5
    normal_check_interval   3
    retry_check_interval    2
    check_period            24x7
    contact_groups          nagios
    notification_interval   10
    notification_period     24x7
    notification_options    w,u,c,r
}
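
The service above calls check_nrpe!check_free_swap, which only resolves if a check_nrpe command is defined on the monitoring host. The sample commands.cfg usually does not ship one, so a definition along the lines of the NRPE documentation is typically added (a sketch; skip it if your commands.cfg already defines check_nrpe):

```
define command{
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
```

Here $ARG1$ receives the part after the exclamation mark, so it must match a command[...] name in nrpe.cfg on the monitored machine.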

 

Finally, a screenshot of the finished setup.