server_monitoring
Differences
This shows you the differences between two versions of the page.
server_monitoring [2009/11/17 08:15] – created 172.26.0.166 | server_monitoring [2010/06/21 22:41] – 172.26.14.218 | ||
---|---|---|---|
Line 1: | Line 1: | ||
===== Server Monitoring ===== | ===== Server Monitoring ===== | ||
- | * [[#ganglia|Ganglia]] - Monitors cluster CPU, disk, network usage | + | * [[server_monitoring: |
- | * [[#monit|Monit]] - Monitors specific | + | * [[server_monitoring: |
- | * [[#nagios|Nagios]] - Monitors services | + | * [[server_monitoring: |
+ | * [[server_monitoring: | ||
- | ===== Ganglia | + | ===== Monit ===== |
- | [[http:// | + | Monit is a free open source utility |
+ | Monit can start a process if it is not running, restart a process if it does not respond, and stop a process if it uses too many resources. It logs to syslog or to its own log file and notifies you about error conditions and recovery status via customizable alerts. | ||
+ | Monit provides | ||
- | You can see the Ganglia installation here: http://hpc.ilri.cgiar.org/ | + | M/Monit expands upon Monit' |
- | For some reason sometimes | + | Get the latest version at: http://mmonit.com/monit/download |
- | {{:ganglia_diagram_smaller.gif|}} | + | < |
+ | $ tar xfz monit-5.0.3.tar.gz | ||
+ | $ cd monit-5.0.3 | ||
+ | $ ./configure && make && make install</ | ||
+ | Accessing monit: | ||
+ | http:// | ||
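The start/restart/stop behaviour described above can be sketched as a small shell function. This is not Monit's implementation or its configuration syntax; it only illustrates the decision rule. The function name `monit_like_action`, its arguments, the CPU threshold, and the order in which the checks are applied are all assumptions made for the example.

```shell
#!/bin/sh
# Illustrative only: mirrors the rule "start if not running, restart if not
# responding, stop if using too many resources". Not part of Monit.
# Arguments (all hypothetical):
#   $1 running    - "yes" or "no"
#   $2 responding - "yes" or "no"
#   $3 cpu        - current CPU usage as an integer percentage
#   $4 limit      - maximum allowed CPU percentage
monit_like_action() {
    running=$1
    responding=$2
    cpu=$3
    limit=$4
    if [ "$running" != "yes" ]; then
        echo start
    elif [ "$cpu" -gt "$limit" ]; then
        echo stop
    elif [ "$responding" != "yes" ]; then
        echo restart
    else
        echo none
    fi
}

monit_like_action no no 0 80      # process is down
monit_like_action yes yes 95 80   # process exceeds the CPU limit
monit_like_action yes no 10 80    # process is up but unresponsive
```

In a real deployment these rules would be written in Monit's own control file rather than in shell; the sketch only makes the three cases explicit.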
- | Interesting documentation: | + | ===== Nagios ===== |
- | ==== Notes ==== | + | Nagios is a powerful monitoring system that enables organizations to identify and resolve IT infrastructure problems before they affect critical business processes. http:// |
- | It appears as if other CGIAR clusters have been configured to query our '' | + | === Installation === |
- | < | + | |
- | # Gmond config file for Cluster Cluster. | + | |
- | # Generated by ganglia.xml node without aid from the database. | + | |
- | # | + | |
- | name " | + | |
- | owner " | + | |
- | url " | + | |
- | latlong " | + | |
- | mcast_channel " | + | |
+ | ---- | ||
- | # | + | Download the latest version |
- | # Increase size of gmond user (gmetric) hash table. | + | < |
- | # | + | $ cd nagios-3.2.0 |
- | num_custom_metrics 2048 | + | $ ./ |
+ | $ make all | ||
+ | $ useradd nagios | ||
+ | $ make install | ||
+ | $ make install-init | ||
+ | $ make install-commandmode | ||
+ | $ make install-config | ||
+ | $ make install-webconf | ||
+ | </ | ||
+ | === Configuration === | ||
- | # Uncomment | + | ---- |
- | trusted_hosts 220.227.242.214 # hpc.icrisat.cgiar.org | + | Running |
- | trusted_hosts 202.123.56.187 | + | < |
- | trusted_hosts 216.244.151.133 # ? | + | |
- | trusted_hosts 200.62.229.37 | + | |
- | # Listen only on the private cluster interface. | + | Download and install plugins |
- | mcast_if eth0</ | + | < |
+ | $ wget http:// | ||
+ | $ tar xfz nagios-plugins-1.4.14.tar.gz | ||
+ | $ cd nagios-plugins-1.4.14 | ||
+ | $ ./configure && make && make install | ||
+ | </file> | ||
+ | Edit the configuration files to add the hosts and services to be monitored: | ||
+ | < | ||
- | ==== Connections refused ==== | + | Check remote services http://wiki.nagios.org/index.php/Howtos: |
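As a rough illustration of adding a host and a service, the sketch below prints a host definition and a PING service that could be saved to a configuration file. It assumes the `linux-server` and `generic-service` templates and the `check_ping` command from the Nagios sample configuration are present; the function name `nagios_host_stanza`, the host name `web01`, and the address `192.0.2.10` are hypothetical.

```shell
#!/bin/sh
# Print a Nagios host definition plus a PING service for it.
# Assumes the sample templates "linux-server" and "generic-service" and the
# "check_ping" command from the sample configuration are available.
nagios_host_stanza() {
    name=$1
    address=$2
    cat <<EOF
define host {
    use        linux-server
    host_name  $name
    alias      $name
    address    $address
}

define service {
    use                  generic-service
    host_name            $name
    service_description  PING
    check_command        check_ping!100.0,20%!500.0,60%
}
EOF
}

# Hypothetical host; redirect the output into a .cfg file that nagios.cfg includes.
nagios_host_stanza web01 192.0.2.10
```

After changing the configuration, it is worth verifying it before restarting Nagios, for example with `nagios -v` pointed at the main `nagios.cfg` (the exact paths depend on the install prefix).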
- | I kept seeing this error in ''/ | + | === Accessing Nagios === |
- | < | + | |
- | Apparently that is the Potato Center' | + | |
- | ===== Monit ===== | + | ---- |
+ | http:// | ||
- | Monit is a free open source utility for managing | + | with username = " |
- | Monit can start a process if it is not running, restart a process if it does not respond, and stop a process if it uses too many resources. It logs to syslog or to its own log file and notifies you about error conditions and recovery status via customizable alerts. | + | ==== Zabbix ==== |
- | Monit provides a built-in HTTP(S) interface and you can use a browser to access the Monit server. | + | ---- |
+ | Installation: | ||
- | M/Monit expands upon Monit's capabilities to provide monitoring and management of all Monit enabled | + | RHEL-compatible Linux: Ref: http:// |
+ | < | ||
+ | name=Andrew Farley RPM Repository | ||
+ | baseurl=http:// | ||
+ | enabled=1 | ||
+ | gpgcheck=0' | ||
+ | |||
+ | |||
+ | |||
+ | You can then install the Zabbix agent, server, get utility, or proxy with: | ||
+ | < | ||
+ | sudo yum install zabbix-agent | ||
+ | sudo yum install zabbix-server | ||
+ | sudo yum install zabbix-get | ||
+ | sudo yum install zabbix-proxy </ | ||
+ | |||
+ | If the installation fails, you may need to clean the yum metadata with the following command: | ||
+ | |||
+ | sudo yum clean metadata | ||
+ | |||
+ | |||
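The install-then-clean-metadata advice above can be wrapped in a small helper that retries once after cleaning. This is a minimal sketch, not an official Zabbix or yum procedure; the function name `install_with_metadata_retry` and the `YUM` override variable are hypothetical.

```shell
#!/bin/sh
# Try to install a package; on failure, clean the yum metadata and retry once.
# YUM is a hypothetical override hook so the command can be swapped out
# (for example in a dry run); it defaults to "sudo yum".
YUM=${YUM:-sudo yum}

install_with_metadata_retry() {
    pkg=$1
    if $YUM install "$pkg"; then
        return 0
    fi
    echo "install of $pkg failed; cleaning yum metadata and retrying" >&2
    $YUM clean metadata && $YUM install "$pkg"
}

# Example (needs a RHEL-compatible host with the repository configured):
# install_with_metadata_retry zabbix-agent
```

The helper retries only once, so a failure that is not caused by stale metadata still surfaces as a non-zero exit status.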
+ | Debian-Based Linux: | ||
+ | ---- | ||
+ | < | ||
+ | root@simple: | ||
+ | zabbix-agent - network monitoring solution - agent | ||
+ | zabbix-frontend-php - network monitoring solution - PHP front-end | ||
+ | zabbix-proxy-mysql - network monitoring solution - proxy (using MySQL) | ||
+ | zabbix-proxy-pgsql - network monitoring solution - proxy (using PostgreSQL) | ||
+ | zabbix-server-mysql - network monitoring solution - server (using MySQL) | ||
+ | zabbix-server-pgsql - network monitoring solution - server (using PostgreSQL) | ||
+ | root@simple: | ||
+ | </ | ||
+ | |||
+ | === Accessing Zabbix === | ||
+ | |||
+ | http://172.26.12.29/ | ||
+ | username: Admin | ||
+ | password: zabbix | ||
- | Get the latest version at: http:// | ||
- | < | ||
- | $ tar xfz monit-5.0.3.tar.gz | ||
- | $ cd monit-5.0.3 | ||
- | $ ./configure && make && make install</ | ||
- | Accessing monit: | ||
- | http:// |