====== Research computing storage infrastructure, 2012 ======
Brainstorming the current status and future needs of the ILRI research computing storage infrastructure.

===== Current situation =====
  * HPC (June, 2011)
    * ~6TB of usable disk space, ~2.5TB in use right now
  * boran (database + VM server, January, 2012)
    * ~1.5TB usable disk space, ~20GB in use right now
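
As a quick sanity check, a minimal Python sketch of the utilization implied by the figures above (server names and sizes are just the ones listed; nothing else is assumed):

<code python>
# Current utilization implied by the figures above (sizes in GB).
servers = {
    "hpc": (6000, 2500),  # ~6TB usable, ~2.5TB used (June 2011)
    "boran": (1500, 20),  # ~1.5TB usable, ~20GB used (January 2012)
}

for name, (usable_gb, used_gb) in servers.items():
    pct = 100 * used_gb / usable_gb
    print(f"{name}: {used_gb}/{usable_gb} GB used ({pct:.0f}%), "
          f"{usable_gb - used_gb} GB free")
</code>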

===== Timeline =====
  * **May 18**: Alan, Isaac, Etienne, and Mark meet to discuss projects and upcoming storage requirements. Notable:
    * very real possibility of getting an Illumina MiSeq (shorter reads, but lots and lots of overlapping data)
    * Cassava genome project?
  * **May 24**: Alan and Isaac meet with NetApp storage representative ("GK" <gkumawat@techno-associates.co.ke>), facilitated by George Ogoti from ICT
    * The existing NetApp is expandable; there are various options we can explore
    * GK is going to get us a quote for the following infrastructure:
      * RAID-DP (NetApp's version of RAID6; tolerates two disk failures)
      * Site redundancy (storage syncs nightly via fiber to ICRAF)
      * ILRI site will have two controllers for high availability
      * ICRAF site will have one controller (less critical storage)
      * Capacity of 12TB or 24TB, with usable space roughly half of each figure (see the capacity sketch after this timeline)
  * **June 13, 2012**:
    * Got the quote back from GK at Techno Associates, two options:
      * 12TB dual configuration: $40,000
      * 24TB dual configuration: $48,000
    * We need to talk to Ian Moore to see what he thinks
    * It's possible we now use ICT's NetApp to provide 1-2TB for the GIS server, then build some custom solution for the DMZ
  * **June 14, 2012**:
    * Brainstorming raw storage costs vs the NetApp quote (see the cost sketch after this timeline):
    * <file>NetApp quote 1TB x 24 = 29000 USD
NetApp quote 2TB x 24 = 39000 USD

scan.co.uk 1TB Seagate 75 GBP x 24 = 1800 (~2800 USD)
amazon.co.uk 1TB Seagate 65 GBP x 24 = 1560 (~2500 USD)

scan.co.uk 2TB Seagate 70 GBP x 24 = 1680 (~2600 USD)
amazon.co.uk 2TB Seagate 83 GBP x 24 = 1992 (~3100 USD)

scan.co.uk 3TB Hitachi 130 GBP x 24 = 3120 (~4900 USD)
amazon.co.uk 3TB Seagate 120 GBP x 24 = 2880 (~4500 USD)</file>
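
For a rough comparison, a minimal Python sketch of the drive math above (per-drive GBP prices and the NetApp quote figures come from the notes; the ~1.56 USD/GBP rate is the one implied by the GBP/USD pairs above, and 24 drives per option is assumed throughout):

<code python>
# Raw-drive cost comparison from the notes above.
# Assumes 24 drives per option and ~1.56 USD per GBP (the rate implied
# by the GBP/USD pairs in the notes).
USD_PER_GBP = 1.56
DRIVES = 24

options = [
    # (source/model, size in TB, GBP per drive)
    ("scan.co.uk 1TB Seagate", 1, 75),
    ("amazon.co.uk 1TB Seagate", 1, 65),
    ("scan.co.uk 2TB Seagate", 2, 70),
    ("amazon.co.uk 2TB Seagate", 2, 83),
    ("scan.co.uk 3TB Hitachi", 3, 130),
    ("amazon.co.uk 3TB Seagate", 3, 120),
]

netapp_quotes_usd = {"1TB x 24": 29000, "2TB x 24": 39000}

for name, tb, gbp_each in options:
    usd = gbp_each * DRIVES * USD_PER_GBP
    print(f"{name} x {DRIVES} = {tb * DRIVES}TB raw, ~{usd:,.0f} USD")

for config, usd in netapp_quotes_usd.items():
    print(f"NetApp quote {config}: {usd:,} USD")
</code>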
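
And a rough sketch of why usable space lands near half of raw on a NetApp (the two RAID-DP parity disks are real; the right-sizing, hot-spare, WAFL, and snapshot-reserve figures are guesses for illustration, not numbers from GK):

<code python>
# Estimate usable capacity after stacking up typical NetApp overheads.
# All percentages below are assumptions for illustration, not from the quote.
def usable_tb(raw_tb, disk_tb=1.0, parity_disks=2, spares=1,
              right_sizing=0.90, wafl_overhead=0.10, snap_reserve=0.20):
    disks = int(raw_tb / disk_tb)
    data_disks = disks - parity_disks - spares  # RAID-DP parity + hot spare
    tb = data_disks * disk_tb * right_sizing    # right-sized disk capacity
    tb *= (1 - wafl_overhead)                   # WAFL/filesystem overhead
    tb *= (1 - snap_reserve)                    # default snapshot reserve
    return tb

for raw in (12, 24):
    print(f"{raw}TB raw -> ~{usable_tb(raw):.1f}TB usable")
</code>
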
===== Proposed NetApp architecture =====
The proposed architecture assumes we expand ICT's existing NetApp rack with extra controllers and storage.

{{ :brainstorms:research_computing_storage_2012.png?nolink |}}

**Key points**:
  * Raw storage is sliced into several chunks and shared appropriately (see the slicing sketch after this list)
  * NetApp exports CIFS shares to corporate clients and servers (users authenticate with Active Directory credentials)
  * NetApp exports iSCSI block devices to Linux servers so they can manage their own storage, access, and users directly in the OS
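
As a planning aid, a minimal sketch of the slicing idea (the slice names, sizes, and the 13TB usable figure are hypothetical placeholders, not decisions):

<code python>
# Sanity-check a proposed slicing of the usable space (sizes in TB).
# Slice names and sizes are hypothetical placeholders.
USABLE_TB = 13.0  # assumed usable space on the 24TB option

slices = {
    "hpc-home (iSCSI -> Linux)": 6.0,
    "boran-vms (iSCSI -> Linux)": 2.0,
    "corporate-shares (CIFS, AD auth)": 3.0,
    "gis-server (CIFS)": 1.5,
}

allocated = sum(slices.values())
assert allocated <= USABLE_TB, "slices exceed usable space"
print(f"allocated {allocated}TB of {USABLE_TB}TB usable "
      f"({USABLE_TB - allocated}TB headroom)")
</code>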

===== Alternatives =====
  * Build our own storage, a la Backblaze "pods": http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/
  * Use FreeBSD + ZFS?