====== Research computing storage infrastructure, 2012 ======

Brainstorming the current status and future needs of the ILRI research computing storage infrastructure.

===== Current situation =====

  * HPC (June, 2011)
    * ~6TB of usable disk space, ~2.5TB in use right now
  * boran (database + VM server, January, 2012)
    * ~1.5TB usable disk space, ~20GB in use right now

===== Timeline =====

  * **May 18**: Alan, Isaac, Etienne, and Mark meet to discuss projects and upcoming storage requirements. Notable:
    * very real possibility of getting an Illumina MiSeq (shorter reads, but lots and lots of overlapping data)
    * Cassava genome project?
  * **May 24**: Alan and Isaac meet with NetApp storage representative ("GK"), facilitated by George Ogoti from ICT
    * Existing NetApp is expandable, there are various options we can explore
    * GK is going to get us a quote for the following infrastructure
      * RAID-DP (NetApp's version of RAID6, tolerates 2 disk failures)
      * Site redundancy (storage syncs nightly via fiber to ICRAF)
      * ILRI site will have two controllers for high availability
      * ICRAF site will have one controller (less critical storage)
      * Capacity 12TB or 24TB (with usable space roughly half of each figure)
  * **June 13, 2012**:
    * Got the quote back from GK at Techno Associates, two options:
      * 12TB dual configurations: $40,000
      * 24TB dual configurations: $48,000
    * We need to talk to Ian Moore to see what he thinks
    * It's possible we could use ICT's NetApp to provide 1-2TB for the GIS server, then build some custom solution for the DMZ
  * **June 14, 2012**:
    * Brainstorming raw storage costs vs NetApp quote:

^ Source ^ Disk ^ Unit price (GBP) ^ Quantity ^ Total (GBP) ^ Total (USD) ^
| NetApp quote | 1TB | - | 24 | - | 29000 |
| NetApp quote | 2TB | - | 24 | - | 39000 |
| scan.co.uk | 1TB Seagate | 75 | 24 | 1800 | ~2800 |
| amazon.co.uk | 1TB Seagate | 65 | 24 | 1560 | ~2500 |
| scan.co.uk | 2TB Seagate | 70 | 24 | 1680 | ~2600 |
| amazon.co.uk | 2TB Seagate | 83 | 24 | 1992 | ~3100 |
| scan.co.uk | 3TB Hitachi | 130 | 24 | 3120 | ~4900 |
| amazon.co.uk | 3TB Seagate | 120 | 24 | 2928 | ~4600 |

  * **June 18, 2012**:
    * Had a meeting with Ian Moore and Isaac Kahugu about storage
    * Ian said he'd support us building our own, and suggested we talk to Tor at ICRAF (GIS, MySQL, Drobo) and consider Dell EqualLogic for storage
    * Another point was that we could possibly buy storage from KENET (to sync off site), or maybe colocate a box there
  * **July 10, 2012**:
    * GK from Techno Associates called again with a new offer for a single-site, single-controller NetApp solution:
      * He said he can give us 12TB raw for $9,000, or 24TB for $11,000 (one controller only, excludes pricing for replication licenses)
  * **July 22, 2012**:
    * Begin compiling report about current situation, options, and recommendations
    * https://docs.google.com/document/d/123VL6l5xt1AspzqTaW2tpJ_XFUZGDIXUCsaFK2_EwUY/edit#
  * **July 23, 2012**:
    * George Ogoti provided us with an iSCSI target on their NetApp so we can test configuration and performance, but we're still waiting for a password to authenticate to the iSCSI target.

===== Proposed NetApp architecture =====

Proposed architecture assuming we expand ICT's existing NetApp rack with extra controllers and storage.

{{ :brainstorms:research_computing_storage_2012.png?nolink |}}

**Key points**:

  * Raw storage is sliced into several chunks and shared appropriately
  * NetApp exports CIFS shares to corporate clients and servers (users authenticate with Active Directory credentials)
  * NetApp exports iSCSI block devices to Linux servers so they can manage their own storage/access/users directly in the OS

===== Alternatives =====

  * Build our own storage, à la Backblaze "pods": http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/
    * Using FreeBSD + ZFS?
  * FreeNAS storage based on AMD Fusion APUs: http://the.only.ipnextgen.net/fnas/doku.php

===== Links =====

  * Growing a ZFS pool (with good background on pools vs filesystems in ZFS): http://www.itsacon.net/?p=158
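
The June 14 cost brainstorm in the timeline can be restated as cost per raw TB, which makes the gap between the NetApp quote and bare disks easier to see. The sketch below is only a rough aid: the figures are copied from that entry, the ''Option'' class and helper names are hypothetical, and the retail rows cover drives only (no chassis, controllers, or redundancy overhead). It also cross-checks each listed GBP total against unit price times quantity, which flags the amazon.co.uk 3TB row (120 x 24 is 2880, not the listed 2928).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Option:
    """One row of the June 14 comparison (hypothetical helper type)."""
    source: str
    disk_tb: int                      # capacity of a single disk, in TB
    count: int                        # number of disks
    total_usd: int                    # total as listed (approximate for retail rows)
    unit_gbp: Optional[int] = None    # retail unit price, where listed
    total_gbp: Optional[int] = None   # retail total, where listed

    @property
    def raw_tb(self) -> int:
        return self.disk_tb * self.count

    @property
    def usd_per_raw_tb(self) -> float:
        return self.total_usd / self.raw_tb

    def total_is_consistent(self) -> bool:
        # Rows without a unit price (the NetApp quote) cannot be checked.
        if self.unit_gbp is None or self.total_gbp is None:
            return True
        return self.unit_gbp * self.count == self.total_gbp


# Figures copied from the June 14 timeline entry.
OPTIONS: List[Option] = [
    Option("NetApp quote 1TB", 1, 24, 29000),
    Option("NetApp quote 2TB", 2, 24, 39000),
    Option("scan.co.uk 1TB Seagate", 1, 24, 2800, 75, 1800),
    Option("amazon.co.uk 1TB Seagate", 1, 24, 2500, 65, 1560),
    Option("scan.co.uk 2TB Seagate", 2, 24, 2600, 70, 1680),
    Option("amazon.co.uk 2TB Seagate", 2, 24, 3100, 83, 1992),
    Option("scan.co.uk 3TB Hitachi", 3, 24, 4900, 130, 3120),
    Option("amazon.co.uk 3TB Seagate", 3, 24, 4600, 120, 2928),
]


def inconsistent_rows(options: List[Option]) -> List[str]:
    """Names of rows whose GBP total is not unit price x quantity."""
    return [o.source for o in options if not o.total_is_consistent()]


def cheapest_per_raw_tb(options: List[Option]) -> Option:
    """Row with the lowest USD cost per raw TB."""
    return min(options, key=lambda o: o.usd_per_raw_tb)


if __name__ == "__main__":
    for o in sorted(OPTIONS, key=lambda o: o.usd_per_raw_tb):
        print(f"{o.source:26s} {o.raw_tb:3d} TB raw  {o.usd_per_raw_tb:9.2f} USD/TB")
    print("Totals not matching unit price x quantity:", inconsistent_rows(OPTIONS))
```

Raw TB is used as the denominator because the notes only give raw disk counts; usable capacity after RAID and spares would be lower for every option.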