news

Differences

This shows you the differences between two versions of the page.

news [2022/08/31 05:50] aorth
news [2023/05/02 12:47] (current) aorth
Line 4:
 ----
  
-===== August 28, 2022: power issues in Nairobi =====
+===== May 2, 2023: planned maintenance =====
 +ILRI ICT is planning some major maintenance on our Nairobi infrastructure during the upcoming weekend.
  
-Unfortunately we had major power issues at the ILRI, Nairobi campus this weekend and the entire cluster was powered off several times. ILRI ICT is doing emergency maintenance on the UPS power backup system to hopefully mitigate issues like this in the future.
+Due to the invasive nature of the maintenance, the HPC will be inaccessible for several hours on the mornings of Saturday, May 6th and Sunday, May 7th, beginning at 8:00 AM. Please refrain from submitting long-running jobs as they will be suspended or may be interrupted unexpectedly.
+
+We will notify you when the maintenance is finished, after which time you can re-submit your jobs.
+
 + 
 +===== March 23, 2023: planned maintenance/upgrade ===== 
 +I am planning to upgrade the HPC's backend storage systems to a new version of their operating system (CentOS)¹ on Monday, March 27th at 9AM (EAT). Due to the invasive nature of these upgrades I will need to stop existing SLURM jobs and restrict access to the cluster. The work will probably take the entire morning.
 + 
 +--- 
 +¹ Technically speaking, we are upgrading from CentOS 7 to CentOS Stream 8, which is the same upgrade I've performed on several compute nodes over the past few months. This is a major upgrade that we only perform every five years or so, and it will modernize the HPC and improve its security.
 + 
 + 
 +===== November 15, 2022: planned maintenance/upgrade ===== 
 + 
 +We are planning to upgrade the operating system on the HPC head node. This will bring the OS from CentOS 7 to CentOS Stream 8, which is the same upgrade we did on all compute nodes several months ago. The maintenance will take place on: 
 + 
 +                         Friday, November 25th @ 2–5PM 
 + 
 +Any SLURM jobs that are still running will be re-queued and held until after the maintenance is over. The cluster will be inaccessible during this time.
 + 
 +Please bear with us for this important update to our infrastructure. 
 + 
 +Thank you! 
 + 
 +===== August 28, 2022: power issues in Nairobi =====
  
-If you had any jobs running they may have crashed or been resubmitted. Please check carefully to see the state of your jobs and let me know if you have any questions.
+Unfortunately we had major power issues at the ILRI, Nairobi campus this weekend and the entire cluster was powered off several times. ILRI ICT is doing emergency maintenance on the UPS power backup system to hopefully mitigate issues like this in the future.
  
-Apologies
+If you had any jobs running they may have crashed or been resubmitted. Please check carefully to see the state of your jobs and let me know if you have any questions.
  
-2022-08-30: Power went off again at 3:30PM.
-2022-08-29: Power went off again at 8:00AM.
+  * **2022-08-29**: Power went off again at 8:00AM.
+  * **2022-08-30**: Power went off again at 3:30PM.
  
 ===== August 22, 2022: updated syntax for SBATCH scripts =====
news.1661925042.txt.gz · Last modified: 2022/08/31 05:50 by aorth