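I. Preparing for installation
1.1 Resolve the Nagios build dependencies
- # yum groupinstall -y "Development Tools" # install the compiler toolchain
- # yum -y install httpd gd gd-devel php mysql-devel # packages for the web interface and the later NDOUtils build; not all are strictly required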
1.2 Add the user and groups that Nagios runs as
- # groupadd nagcmd
- # useradd -G nagcmd nagios
- # passwd nagios
- # usermod -a -G nagcmd apache # add apache to the nagcmd group; -a keeps its existing groups
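To confirm that step 1.2 took effect, the group membership can be checked from the shell. The helper below is a minimal sketch: the function name in_group is hypothetical, and it relies only on the standard id command.

```shell
# Hypothetical helper (not part of Nagios): succeed if USER belongs to GROUP.
in_group() {
    # usage: in_group USER GROUP
    # id -nG prints every group name of the user, separated by spaces.
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx "$2"
}
```

After the commands above, both `in_group nagios nagcmd` and `in_group apache nagcmd` should succeed. A running httpd only picks up the new group after it is restarted.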
1.3 Configure Apache to work with PHP
- # vim /etc/httpd/conf/httpd.conf
- DirectoryIndex index.html index.html.var index.php # append index.php to this line
- # vim /var/www/html/index.php # create a test page
- <?php
- phpinfo();
- ?>
- # service httpd start # start httpd
If the PHP information page appears when you enter the server's IP address in a browser, Apache and PHP are working together.
II. Compiling and installing Nagios
2.1 All required source packages are placed in /usr/local/src:
- # cd /usr/local/src
- # tar zxf nagios-3.3.1.tar.gz
- # cd nagios-3.3.1
- # ./configure --with-command-group=nagcmd --enable-event-broker
- # make all && make install
- # make install-init # install the init script so that Nagios can be controlled with service
- /usr/bin/install -c -m 755 -d -o root -g root /etc/rc.d/init.d
- /usr/bin/install -c -m 755 -o root -g root daemon-init /etc/rc.d/init.d/nagios
- *** Init script installed ***
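- # make install-commandmode # set up the external command directory and its permissions
- # make install-config # install the sample configuration files into /usr/local/nagios/etc
2.2 Nagios installs under /usr/local/nagios/ by default. The directories it creates are:
- bin etc include libexec sbin share var
- /bin: the main Nagios program and other executables
- /etc: Nagios configuration files
- /include: header files
- /libexec: check commands installed by the plugins
- /sbin: CGI files
- /share: web interface files
- /var: log files, lock files, and so on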
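2.3 Set the e-mail address that should receive Nagios alerts; the default is the nagios user on the local machine:
- # vi /usr/local/nagios/etc/objects/contacts.cfg
- email nagios@localhost # the default; change it to suit your environment
2.4 Create the configuration file for the Nagios web interface in httpd's configuration directory (conf.d):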
- # make install-webconf
- /usr/bin/install -c -m 644 sample-config/httpd.conf /etc/httpd/conf.d/nagios.conf
- *** Nagios/Apache conf file installed ***
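2.5 Create a user for logging in to the Nagios web interface; this account is used for authentication whenever you open Nagios in a browser:
- # htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
2.6 After the steps above, restart httpd: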
- # service httpd restart
III. Compiling and installing nagios-plugins
- Nagios does all of its work through plugins, so install the plugins before starting Nagios
- # tar zxf nagios-plugins-1.4.15.tar.gz
- # cd nagios-plugins-1.4.15
- # ./configure --with-nagios-user=nagios --with-nagios-group=nagios
- # make && make install
IV. Configuring and starting Nagios
4.1 Register Nagios as a system service and add it to the services started at boot:
- # chkconfig --add nagios
- # chkconfig nagios on
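4.2 Check the main configuration file for syntax errors:
- # /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If the check finishes without reporting errors, the configuration is valid and Nagios can be started. If it reports errors, fix them by following the messages, which are detailed enough to locate each problem easily.
4.3 You can also define an alias for the check command above, which is convenient:
- # alias check-nagios='/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg'
4.4 If the syntax check passes, start the Nagios service: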
- # service nagios start
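The check from 4.2 and the service command from 4.4 can be combined so that a broken configuration never takes down a running Nagios. The wrapper below is only a sketch: safe_restart_nagios and the two variables are hypothetical names, and it assumes the nagios binary returns a non-zero exit status when the pre-flight check finds errors.

```shell
# Hypothetical wrapper: restart Nagios only when the pre-flight check passes.
# The two variables are overridable defaults, not settings Nagios itself reads.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

safe_restart_nagios() {
    # Run the same check as in 4.2, discarding its output.
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1; then
        service nagios restart
    else
        echo "Configuration check failed; Nagios was not restarted." >&2
        return 1
    fi
}
```

Run `safe_restart_nagios` after every configuration change; when it fails, run the command from 4.2 directly to read the detailed error messages.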
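4.5 Configure SELinux
If SELinux is enabled on your system, it blocks the Nagios web CGI programs by default. Check whether it is enabled with getenforce; the two commands after it relax SELinux and stop the firewall:
- # getenforce
- # setenforce 0 # switch SELinux to permissive mode (lasts until the next reboot)
- # /etc/rc.d/init.d/iptables stop # stop iptables
4.6 View Nagios through the web interface
http://ip/nagios
Enter the user name and password.
This is the Nagios home page, which is not pretty. By default it already monitors the local host, covering 8 services such as system load and current users.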
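V. Installing and configuring NDOUtils
5.1 NDOUtils stores Nagios configuration information and event data in a database so that the data can be queried and processed quickly, so install MySQL first:
- # yum -y install php-mysql mysql-server
5.2 Compile and install NDOUtils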
- # tar zxf ndoutils-1.4b9.tar.gz
- # cd ndoutils-1.4b9
- # ./configure --prefix=/usr/local/nagios --enable-mysql --disable-pgsql --with-mysql-inc=/usr/include \
- --with-mysql-lib=/usr/lib
- # make
- # cp ./src/ndomod-3x.o /usr/local/nagios/bin
- # cp ./src/ndo2db-3x /usr/local/nagios/bin
- # cp ./src/log2ndo /usr/local/nagios/bin
- # cp ./src/file2sock /usr/local/nagios/bin
- # chown nagios:nagios /usr/local/nagios/bin/*
- # cp ./config/ndo* /usr/local/nagios/etc/ # copy the sample configuration files
- # chown nagios:nagios /usr/local/nagios/etc/*
5.3 Create the database for NDOUtils
5.3.1 MySQL initializes itself the first time it is started
- # service mysqld start
- Initializing MySQL database: Installing MySQL system tables...
- OK
- Filling help tables...
- OK
- ... # middle of the output omitted
- Support MySQL by buying support/licenses at http://shop.mysql.com
- [ OK ]
- Starting MySQL: [ OK ] # MySQL started successfully
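5.3.2 Grant privileges to a user
- # mysql # open the MySQL client
- mysql> create database ndodb; # create the database NDOUtils needs
- mysql> GRANT CREATE,SELECT,INSERT,UPDATE,DELETE ON ndodb.* TO ndouser@localhost
- IDENTIFIED BY '123456'; # grant privileges to ndouser with the password "123456"; CREATE is included so that installdb can create the tables as this user
- mysql> flush privileges;
- mysql> \q
5.3.3 Create the tables NDOUtils needs
- # cd db
- # ./installdb -undouser -p123456 -hlocalhost -d ndodb
Once the database has been initialized you can connect to MySQL to confirm it; the ndodb database now contains 59 tables.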
5.4 Edit the configuration files
- # cp -p /usr/local/nagios/etc/ndo2db.cfg-sample /usr/local/nagios/etc/ndo2db.cfg
- # vim /usr/local/nagios/etc/ndo2db.cfg
- socket_type=tcp # line 33
- db_servertype=mysql # line 78
- db_host=localhost # line 86; database host, here the local machine
- db_port=3306 # line 95; database port
- db_name=ndodb # database name
- db_prefix=nagios_ # line 111
- db_user=ndouser # line 121; the user that was granted privileges above
- db_pass=123456 # line 122; password
- # cp -p /usr/local/nagios/etc/ndomod.cfg-sample /usr/local/nagios/etc/ndomod.cfg
- # vim /usr/local/nagios/etc/ndomod.cfg
- output_type=tcpsocket # line 26
- output=127.0.0.1 # line 39
- # vim /usr/local/nagios/etc/nagios.cfg
- Copy the following line and paste it below the #broker_module=... lines:
- broker_module=/usr/local/nagios/bin/ndomod-3x.o config_file=/usr/local/nagios/etc/ndomod.cfg
- Also make sure /usr/local/nagios/etc/nagios.cfg contains the following line, and add it if it is missing:
- event_broker_options=-1 # enable the event broker for Nagios
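The key=value edits in 5.4 can also be applied without opening an editor, which helps when the same setup is repeated on several machines. The function below is a minimal sketch; set_cfg is a hypothetical helper name, and it assumes each setting sits on its own uncommented key=value line, as in the settings listed above.

```shell
# Hypothetical helper: set KEY=VALUE in a key=value style configuration file.
set_cfg() {
    # usage: set_cfg FILE KEY VALUE
    file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        # Replace the existing assignment, going through a temp file for portability.
        # Values containing "|" or "&" would need escaping; the ones in 5.4 do not.
        sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    else
        # The key is missing (or only present as a comment): append it.
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}
```

For example, `set_cfg /usr/local/nagios/etc/ndo2db.cfg db_name ndodb` followed by `set_cfg /usr/local/nagios/etc/ndo2db.cfg db_user ndouser` applies two of the settings above.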
5.5 Start ndo2db
- # /usr/local/nagios/bin/ndo2db-3x -c /usr/local/nagios/etc/ndo2db.cfg
- # echo '/usr/local/nagios/bin/ndo2db-3x -c /usr/local/nagios/etc/ndo2db.cfg' >> /etc/rc.local # start automatically at boot
5.6 Restart Nagios
- # /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg # check the configuration
- # service nagios restart
VI. Verifying the installation
- # tail -20 /usr/local/nagios/var/nagios.log
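The last 20 lines may not include the broker module's messages on a busy system. One way to narrow the log down, assuming the module's log lines contain the string "ndomod" (the helper name ndomod_log_lines is hypothetical):

```shell
# Hypothetical helper: print the log lines that mention the ndomod broker module.
ndomod_log_lines() {
    # usage: ndomod_log_lines [LOGFILE]; defaults to the log path used in this guide
    grep -i 'ndomod' "${1:-/usr/local/nagios/var/nagios.log}"
}
```

If the function prints nothing after a restart, check the broker_module line in nagios.cfg and make sure ndo2db is running.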
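VII. Monitoring remote Linux hosts with NRPE
7.1 About NRPE
Nagios can monitor remote hosts in several ways, including SNMP, NRPE, SSH and NSCA. This section covers monitoring remote Linux hosts through NRPE. NRPE (Nagios Remote Plugin Executor) is a daemon that runs check commands on the remote server: it lets the Nagios monitoring server securely trigger check commands on the remote host and returns the results to the monitoring server. Its overhead is much lower than that of SSH-based checks, and because the checks need no system account on the remote host, no login credentials have to be shared as they are with SSH.
7.2 Install and configure the monitored host
- 7.2.1 Add the nagios user first
- # useradd -s /sbin/nologin nagios
- 7.2.2 NRPE depends on nagios-plugins, so install them first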
- # tar zxf nagios-plugins-1.4.15.tar.gz
- # cd nagios-plugins-1.4.15
- # ./configure --with-nagios-user=nagios --with-nagios-group=nagios
- # make all && make install
- 7.2.3 Install NRPE
- # yum -y install openssl openssl-devel
- # tar -zxvf nrpe-2.12.tar.gz
- # cd nrpe-2.12
- (--enable-command-args allows arguments to be passed to commands; --enable-ssl requires the OpenSSL packages)
- # ./configure --with-nrpe-user=nagios \
- --with-nrpe-group=nagios \
- --with-nagios-user=nagios \
- --with-nagios-group=nagios \
- --enable-command-args \
- --enable-ssl
- # make all
- # make install-plugin
- # make install-daemon
- # make install-daemon-config
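- 7.2.4 Configure NRPE
- # vim /usr/local/nagios/etc/nrpe.cfg
- log_facility=daemon
- pid_file=/var/run/nrpe.pid
- server_address=172.16.100.11 # address to listen on; if unset, the default is 0.0.0.0
- server_port=5666
- nrpe_user=nagios
- nrpe_group=nagios
- allowed_hosts=172.16.100.1 # IP addresses of the monitoring servers allowed to connect
- command_timeout=60 # timeout for command execution, in seconds
- connection_timeout=300 # connection timeout, in seconds
- debug=0
- The directive names above are self-explanatory, so adjust them as needed. Note in particular that allowed_hosts defines the IP addresses of the monitoring servers this host accepts connections from.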
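A monitoring server that is missing from allowed_hosts is a common reason for refused NRPE connections, so it is worth checking the setting before starting the daemon. The sketch below uses a hypothetical helper, is_allowed_host, and assumes allowed_hosts is a comma-separated list of plain IP addresses as in the example above; it does exact matching only.

```shell
# Hypothetical helper: succeed if IP appears in the allowed_hosts list of an nrpe.cfg.
is_allowed_host() {
    # usage: is_allowed_host CFG_FILE IP
    cfg=$1 ip=$2
    # Take the last allowed_hosts line and strip the key and any trailing comment.
    hosts=$(sed -n 's/^allowed_hosts=//p' "$cfg" | tail -n 1 | sed 's/[[:space:]]*#.*$//')
    # Remove whitespace, wrap the list in commas, then look for ",IP,".
    case ",$(printf '%s' "$hosts" | tr -d '[:space:]')," in
        *",$ip,"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_allowed_host /usr/local/nagios/etc/nrpe.cfg 172.16.100.1 || echo "monitoring server not allowed"`.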
- 7.2.5 Start NRPE
- 7.2.5.1 Method 1:
- # cp init-script /etc/init.d/nrped # copy the NRPE init script
- # chmod +x /etc/init.d/nrped # make it executable
- # service nrped start
- # chkconfig --add nrped
- # chkconfig nrped on
- 7.2.5.2 Method 2:
- # /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
- To make NRPE easier to start, you can save the following as the script /etc/init.d/nrped:
- #!/bin/bash
- # chkconfig: 2345 88 12
- # description: NRPE DAEMON
- NRPE=/usr/local/nagios/bin/nrpe
- NRPECONF=/usr/local/nagios/etc/nrpe.cfg
- case "$1" in
- start)
- echo -n "Starting NRPE daemon..."
- $NRPE -c $NRPECONF -d
- echo " done."
- ;;
- stop)
- echo -n "Stopping NRPE daemon..."
- pkill -u nagios nrpe
- echo " done."
- ;;
- restart)
- $0 stop
- sleep 2
- $0 start
- ;;
- *)
- echo "Usage: $0 start|stop|restart"
- ;;
- esac
- exit 0
- # chmod +x /etc/init.d/nrped # make the NRPE init script executable
- # service nrped start
- # chkconfig --add nrped
- # chkconfig nrped on
- 7.2.5.3 Method 3:
- Alternatively, create a file named nrpe in /etc/xinetd.d so that NRPE runs as an xinetd-managed (non-standalone) service.
- The file contents are:
- service nrpe
- {
- flags = REUSE
- socket_type = stream
- wait = no
- user = nagios
- group = nagios
- server = /usr/local/nagios/bin/nrpe
- server_args = -c /usr/local/nagios/etc/nrpe.cfg -i
- log_on_failure += USERID
- disable = no
- }
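- In this case NRPE is started by restarting xinetd. xinetd finds the port through the service name, so /etc/services also needs an entry such as "nrpe 5666/tcp".
- # service xinetd restart # xinetd must be installed first; it is not installed by default
- 7.2.6 Define the objects that remote hosts may monitor
- On the monitored host, every service or resource that can be checked through NRPE must be defined as a command in nrpe.cfg. The syntax is: command[<command_name>]=<command_to_execute>. For example: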
- command[check_rootdisk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /
- command[check_swap]=/usr/local/nagios/libexec/check_swap -w 40% -c 20%
- command[check_sensors]=/usr/local/nagios/libexec/check_sensors
- command[check_users]=/usr/local/nagios/libexec/check_users -w 10 -c 20
- command[check_load]=/usr/local/nagios/libexec/check_load -w 10,8,5 -c 20,18,15
- command[check_zombies]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
- command[check_all_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
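The names inside the square brackets are what the monitoring server passes to check_nrpe with -c, so a quick listing of them helps keep both sides in sync. A minimal sketch follows; list_nrpe_commands is a hypothetical helper that reads only uncommented command[...] lines.

```shell
# Hypothetical helper: print the command names defined in an nrpe.cfg, one per line.
list_nrpe_commands() {
    # usage: list_nrpe_commands CFG_FILE
    # Keep only the text between "command[" and "]" on matching lines.
    sed -n 's/^command\[\([^]]*\)\]=.*/\1/p' "$1"
}
```

Running `list_nrpe_commands /usr/local/nagios/etc/nrpe.cfg` on the monitored host with the definitions above would list check_rootdisk, check_swap and the other names, each of which can then be requested from the monitoring server.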
7.3 Configure the monitoring server
- 7.3.1 Install NRPE
- # tar -zxvf nrpe-2.12.tar.gz
- # cd nrpe-2.12
- # ./configure --with-nrpe-user=nagios \
- --with-nrpe-group=nagios \
- --with-nagios-user=nagios \
- --with-nagios-group=nagios \
- --enable-command-args \
- --enable-ssl
- # make all
- # make install-plugin
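- 7.3.2 Define how to monitor the remote hosts and services:
- Remote Linux hosts are monitored through NRPE with the check_nrpe plugin, whose syntax is:
- check_nrpe -H <host> [-n] [-u] [-p <port>] [-t <timeout>] [-c <command>] [-a <arglist...>]
- 7.3.2.1 Example 1:
- 1) Define a command that checks swap on a remote Linux host: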
- define command
- {
- command_name check_swap_nrpe
- command_line $USER1$/check_nrpe -H "$HOSTADDRESS$" -c "check_swap"
- }
- 2) Define the swap service for the remote Linux hosts:
- define service
- {
- use generic-service
- host_name linuxserver1,linuxserver2
- hostgroup_name linux-servers
- service_description SWAP
- check_command check_swap_nrpe
- normal_check_interval 30
- }
- 7.3.2.2 Example 2:
- To make the command definition more general, it can be rewritten as follows:
- 1) Define a generic command for checking remote Linux hosts:
- define command
- {
- command_name check_nrpe
- command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
- }
- 2) Define the swap service for the remote Linux hosts:
- define service
- {
- use generic-service
- host_name linuxserver1,linuxserver2
- hostgroup_name linux-servers
- service_description SWAP
- check_command check_nrpe!check_swap
- normal_check_interval 30
- }
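- 7.3.2.3 Example 3:
- To also pass arguments to the remote host when checking it, use something like the following. This only works if NRPE on the remote host was built with --enable-command-args, dont_blame_nrpe=1 is set in its nrpe.cfg, and the remote check_swap command definition uses $ARG1$ and $ARG2$.
- 1) Define a command that checks swap on a remote Linux host:
- define command
- {
- command_name check_swap_nrpe
- command_line $USER1$/check_nrpe -H "$HOSTADDRESS$" -c "check_swap" -a $ARG1$ $ARG2$
- }
- 2) Define the swap service for the remote Linux hosts: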
- define service
- {
- use generic-service
- host_name linuxserver1,linuxserver2
- hostgroup_name linux-servers
- service_description SWAP
- check_command check_swap_nrpe!20!10
- normal_check_interval 30
- }
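The three examples differ only in how the check_command value is turned into a command line: Nagios splits it on "!" and substitutes the pieces for $ARG1$, $ARG2$ and so on. The function below imitates that substitution so the resulting command can be inspected before Nagios runs it. It is an illustration, not Nagios code: expand_command_line is a hypothetical name, only two arguments are handled, and USER1 is assumed to be the plugin directory used in this guide.

```shell
# Illustration only: mimics how Nagios fills in $USER1$, $HOSTADDRESS$, $ARG1$ and $ARG2$.
# USER1 is an assumed default for the plugin directory behind the $USER1$ macro.
USER1=${USER1:-/usr/local/nagios/libexec}

expand_command_line() {
    # usage: expand_command_line CHECK_COMMAND COMMAND_LINE HOST_ADDRESS
    spec=$1 line=$2 addr=$3
    rest=${spec#*!}                 # everything after the command name
    if [ "$rest" = "$spec" ]; then  # no "!" at all means no arguments
        rest=""
    fi
    arg1=${rest%%!*}
    arg2=""
    case $rest in
        *'!'*) arg2=${rest#*!}; arg2=${arg2%%!*} ;;
    esac
    # Values containing "|" or "&" would confuse sed; acceptable for this sketch.
    printf '%s\n' "$line" | sed \
        -e 's|\$USER1\$|'"$USER1"'|g' \
        -e 's|\$HOSTADDRESS\$|'"$addr"'|g' \
        -e 's|\$ARG1\$|'"$arg1"'|g' \
        -e 's|\$ARG2\$|'"$arg2"'|g'
}
```

Calling it with the check_command from example 2, the matching command_line and a host address shows the check_nrpe invocation with check_swap in place of $ARG1$; with 'check_swap_nrpe!20!10' from example 3, the values 20 and 10 end up after -a.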
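VIII. Monitoring Windows hosts with NSClient++
NSClient++ communicates with the Nagios server mainly through the server's check_nt plugin.
(1) Install NSClient++ on the Windows server
1. Download and unpack NSClient++ (download: http://sourceforge.net/projects/nscplus)
2. Install and configure it (installation is simple: keep clicking Next and check the settings on each screen)
Edit the NSC.ini file in the NSClient++ installation directory; after changing it, restart NSClient++.
(2) Monitoring with NSClient++
8.1 Using check_nt:
- 8.1.1 Modules to enable on the Windows side: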
- [modules]
- CheckSystem.dll
- CheckDisk.dll
- FileLogger.dll
- NSClientListener.dll
- [settings]
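- allowed_hosts=10.15.62.199 # IP of the monitoring server
- password=123456 # password for communicating with the monitoring server (optional)
- 8.1.2 Restart the service after changing the configuration by running these commands in a Command Prompt window:
- C:\Program Files\NSClient++>nsclient++ /stop # stop the nsclient++ service
- C:\Program Files\NSClient++>nsclient++ /start # start the nsclient++ service
- 8.1.3 Test from the Nagios server with a command like:
- check_nt -H <client ip> -p <port> -v <command> ...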
- # check_nt -H 172.16.100.66 -p 12489 -s 123456 -v CPULOAD -w 80 -c 90 -l 5,80,90
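When a plugin such as check_nt is run by hand, its result is carried by the exit status as well as by the printed text: 0 means OK, 1 WARNING, 2 CRITICAL and 3 UNKNOWN. The small function below translates the status into a name; plugin_state is a hypothetical helper, not part of the plugins package.

```shell
# Hypothetical helper: translate a Nagios plugin exit status into its state name.
plugin_state() {
    # usage: plugin_state EXIT_STATUS
    case $1 in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "out of range ($1)" ;;
    esac
}
```

Run the test command above and then `plugin_state $?` immediately afterwards to see which state Nagios would record for that check.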
- 8.1.4 Define the host and service objects:
- # vim /usr/local/nagios/etc/objects/windows.cfg
- define host{
- use windows-server
- host_name winserver
- alias My Windows machine
- address 172.16.100.66
- }
- define service{
- use generic-service
- host_name winserver
- service_description NSClient++ Version
- check_command check_nt!CLIENTVERSION
- }
- define service {
- use generic-service
- host_name winserver
- service_description Uptime
- check_command check_nt!UPTIME
- }
- define service {
- use generic-service
- host_name winserver
- service_description CPU Load
- check_command check_nt!CPULOAD!-l 5,80,90
- }
- define service{
- use generic-service
- host_name winserver
- service_description Memory Usage
- check_command check_nt!MEMUSE!-w 80 -c 90
- }
- Note: if a password was set during the NSClient++ installation, pass it to check_nt in the command definition:
- # vim /usr/local/nagios/etc/objects/commands.cfg
- define command{
- command_name check_nt
- command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -s "PASSWORD" -v $ARG1$ $ARG2$
- }
8.2 Using NRPE:
- 8.2.1 Configuration on the Windows side:
- 8.2.1.1 NSClient++ must have these modules enabled:
- [modules]
- CheckSystem.dll
- CheckDisk.dll
- CheckExternalScripts.dll
- FileLogger.dll
- NRPEListener.dll
- [NRPE]
- use_ssl
- allow_arguments
- allow_nasty_meta_chars
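- 8.2.1.2 Restart the service after changing the configuration by running these commands in a Command Prompt window:
- C:\Program Files\NSClient++>nsclient++ /stop # stop the nsclient++ service
- C:\Program Files\NSClient++>nsclient++ /start # start the nsclient++ service
- 8.2.2 Configuration on the Nagios side:
- 8.2.2.1 Define a template: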
- define host{
- name tpl-windows-servers
- use generic-host
- check_period 24x7
- check_interval 5
- retry_interval 1
- max_check_attempts 10
- check_command check-host-alive
- notification_period 24x7
- notification_interval 30
- notification_options d,r
- contact_groups admins
- register 0
- }
- 8.2.2.2 Define the host:
- define host{
- use tpl-windows-servers
- host_name windowshost
- alias My First Windows Server
- address 172.16.100.66
- }
- 8.2.2.3 Define the services:
- define service{
- use generic-service
- host_name windowshost
- service_description CPU Load
- check_command check_nrpe!alias_cpu
- }
- define service{
- use generic-service
- host_name windowshost
- service_description Free Space
- check_command check_nrpe!alias_disk
- }
- 8.2.3 check_nrpe syntax:
- check_nrpe -H <host> [-n] [-u] [-p <port>] [-t <timeout>] -c <command> [-a <argument> <argument> <argument>]
- Built-in NSClient++ commands that can be called through check_nrpe:
- • CheckAlwaysCRITICAL (check)
- • CheckAlwaysOK (check)
- • CheckAlwaysWARNING (check)
- • CheckCPU (check)
- • CheckCRITICAL (check)
- • CheckCounter (check)
- • CheckEventLog (check)
- • CheckFile (check)
- • CheckFileSize (check)
- • CheckMem (check)
- • CheckMultiple (check)
- • CheckOK (check)
- • CheckProcState (check)
- • CheckServiceState (check)
- • CheckTaskSched (check)
- • CheckUpTime (check)
- • CheckVersion (check)
- • CheckWARNING (check)
- • CheckWMI (check)
- • CheckWMIValue (check)
- Usage example:
- # check_nrpe -H <host> -c CheckCPU -a warn=80 crit=90 time=20m time=10s time=4
8.3 Using NSCA
- 8.3.1 Configuration on the Windows side:
- 8.3.1.1 NSClient++ must have these modules enabled:
- [modules]
- CheckSystem.dll
- CheckDisk.dll
- CheckExternalScripts.dll
- CheckHelpers.dll
- FileLogger.dll
- NSCAAgent.dll
- 8.3.1.2 NSCA Agent settings:
- interval
- encryption_method
- password
- nsca_host
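- 8.3.1.3 Restart the service after changing the configuration by running these commands in a Command Prompt window:
- C:\Program Files\NSClient++>nsclient++ /stop # stop the nsclient++ service
- C:\Program Files\NSClient++>nsclient++ /start # start the nsclient++ service
- 8.3.2 Configuration on the Nagios side:
- 8.3.2.1 Define a template: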
- define host{
- name tpl-windows-servers
- use generic-host
- check_period 24x7
- check_interval 5
- retry_interval 1
- max_check_attempts 10
- check_command check-host-alive
- notification_period 24x7
- notification_interval 30
- notification_options d,r
- contact_groups admins
- register 0
- }
- 8.3.2.2 Host configuration:
- define host{
- use tpl-windows-servers
- host_name windowshost
- alias My First Windows Server
- address 172.16.100.66
- active_checks_enabled 0 ; active checks: 0 disables, 1 enables
- passive_checks_enabled 1 ; passive checks: 0 disables, 1 enables
- }
- 8.3.2.3 Service configuration:
- define service{
- use generic-service
- host_name windowshost
- service_description CPU Load
- check_command check_nrpe!alias_cpu
- active_checks_enabled 0
- passive_checks_enabled 1
- }
- define service{
- use generic-service
- host_name windowshost
- service_description Free Space
- check_command check_nrpe!alias_disk
- active_checks_enabled 0
- passive_checks_enabled 1
- }
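With active checks disabled, these services only change state when a passive result arrives. To test the Nagios side without waiting for the Windows agent, a result can be submitted by hand with the send_nsca client, which reads tab-separated lines of host name, service description, return code and plugin output. The sketch below only builds such a line; passive_result is a hypothetical helper, and it assumes an NSCA daemon is already running on the Nagios server, which this guide does not cover.

```shell
# Hypothetical helper: build one passive service check result in send_nsca's input format.
passive_result() {
    # usage: passive_result HOST SERVICE RETURN_CODE OUTPUT
    # Fields are separated by tabs and the record ends with a newline.
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}
```

For example, `passive_result windowshost "CPU Load" 0 "OK: test result" | send_nsca -H <nagios server> -c <path to send_nsca.cfg>`. The host name and service description must match the definitions above exactly, otherwise Nagios ignores the result.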