server_monitoring

Differences

This shows you the differences between two versions of the page.

--- server_monitoring [2009/11/27 12:10] alan
+++ server_monitoring [2010/01/28 11:29] 172.26.0.166
Line 13: Line 13:
  
 Interesting documentation: http://www.ibm.com/developerworks/wikis/display/WikiPtype/ganglia
- 
 ==== Troubleshooting ====
 From time to time there are problems with Ganglia's web interface.  You can restart the needed services following this basic procedure:
 
   - Stop data collection daemon on HPC: ''service gmetad stop''
-  - Stop monitoring daemon on HPC: ''service gmond stop''
   - Stop monitoring daemon on compute nodes: ''rocks run host compute %%'%%service gmond stop%%'%%''
   - Start data collection daemon on HPC: ''service gmetad start''
-  - Star monitoring daemon on HPC: ''service gmond start''
+  - Wait a minute or two
   - Start monitoring daemon on compute nodes: ''rocks run host compute %%'%%service gmond start%%'%%''
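
The restart procedure in the newer revision could be wrapped in a small script along these lines. This is only a sketch: the ''service'' and ''rocks run host compute'' commands are taken from the steps above, while the names ''run'', ''restart_ganglia'', ''DRY_RUN'' and ''WAIT_SECONDS'' are hypothetical, and the 90-second default is just one reading of "a minute or two".

```shell
#!/bin/sh
# Sketch of the Ganglia restart steps from the newer revision of this page.
# Hypothetical helpers (not from the page): run, restart_ganglia,
# DRY_RUN (set to 1 to print commands without executing them),
# WAIT_SECONDS (assumed default of 90 for "a minute or two").

# Print the command (flattened, so inner quoting is not shown),
# then execute it unless DRY_RUN is 1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

restart_ganglia() {
    # Steps run in order without aborting on failure, since a daemon
    # may already be stopped when the web interface misbehaves.
    run service gmetad stop
    run rocks run host compute 'service gmond stop'
    run service gmetad start
    run sleep "${WAIT_SECONDS:-90}"
    run rocks run host compute 'service gmond start'
}
```

Calling ''restart_ganglia'' on HPC performs the five steps in order; setting ''DRY_RUN=1'' first only prints what would be run, which may be useful for checking the sequence before touching the daemons.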
  
Line 86: Line 84:
  
 with  username = "nagiosadmin" and password = "nagios"
- 
-