

  1. Solexa & Lab IT Infrastructure
  January 8, 2008: Nelson Lab Meeting
  Jordan Mendler, Nelson Lab, UCLA Human Genetics Department
  jmendler@ucla.edu

  2. Cluster Configuration: Queues
  • all.q: The main production queue.
    • Six compute-0-xx nodes: 2 x hyper-threaded 32-bit Intel processors, 6GB RAM, CentOS4-i386, 4 SGE slots per node
    • Eight compute-1-xx nodes: 2 x dual-core 64-bit AMD processors, 16GB RAM, CentOS4-x86_64, 4 SGE slots per node
  • old.q: The legacy queue.
    • Several machines from the old cluster have kept their old configuration, in case someone needs to use an application that is not yet available on the new cluster.
    • Eight compute-0-xx nodes: 2 x hyper-threaded 32-bit Intel processors, 6GB RAM, Fedora Core 2 i386, 4 SGE slots per node
    • As the new cluster proves stable, machines will gradually be pulled from old.q and reconfigured for all.q.
  • celsius.q: The queue for the celsius pipeline.
    • Users should not submit jobs here, as this queue is intended only for jobs automatically submitted by celsius.
    • This queue is subordinate to old.q, meaning that it is suspended when old.q has active jobs, giving users computing priority.
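
For reference, a minimal sketch of how jobs are submitted to and monitored on these queues with Sun Grid Engine; the script and job names below are hypothetical placeholders:

    # Submit a hypothetical alignment script to the main production queue
    qsub -q all.q -cwd -N maq_align align_lane.sh

    # Summarize load and free slots per cluster queue (all.q, old.q, celsius.q)
    qstat -g c

    # List your own pending and running jobs
    qstat -u $USER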

  3. Cluster Configuration: Software
  • all.q
    • Currently installed:
      • R-2.6.1
      • R-Bioconductor-2.1: hopach, limma, affy
      • Bioperl
      • Blat, Maq, Solexa_pipeline
      • simwalk2, plink, mendel, whap, genehunter, merlin
      • Perl Ensembl API, compara and variation packages
    • Requested, but not yet installed:
      • R libraries: celsius, randomForest, heatmap.plus, Hmisc, sn, MASS, cluster, sma, impute, splines, dynamicTreeCut, moduleColor
      • All of R-Bioconductor
      • SAGE
  • See the Cluster HOWTO on the Wiki for more details and examples
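
A quick, hedged way to check whether one of the installed R libraries actually loads on a compute node, run as an ordinary batch job (the file names are placeholders):

    # Does limma load under R-2.6.1 on an all.q node?
    echo 'library(limma); sessionInfo()' > check_limma.R
    qsub -q all.q -cwd -b y R CMD BATCH check_limma.R

    # Inspect the result once the job has finished
    cat check_limma.Rout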

  4. Cluster Infrastructure: Setup
  • TFTP: Boots nodes over the network and points them to a kickstart
  • DHCP: Assigns nodes an IP address
  • Kickstart: Automated, unattended installation over HTTP
    • CentOS4, minimal install, install and start puppet, etc.
  • Puppet: Configuration management (http://reductivelabs.com/projects/puppet/)
    • compute-1-xx: Base config, SGE client, solexa_pipeline and blat are installed
    • Can also provision web, database, and all other lab servers
  • RPM: Consistent deployment of packages, installed via puppet
    • Biopackages.net: Share our packages with the bioinformatics community
  • Shmux: Execute commands on multiple nodes in parallel
    • shmux -c "yum -y upgrade" compute-1-0{0,3,7}
  • Power Distribution Units: Allow for remote 'hard' rebooting of crashed machines
  • Invest the time doing it the right way, so we can rapidly grow in the future
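
As a sketch of how these pieces fit together in day-to-day use (node names match the slide; the solexa_pipeline package name is assumed to match our RPM):

    # Trigger an immediate one-off puppet run on a freshly kickstarted node
    # (puppetd is the client daemon in the Puppet 0.2x series used here)
    shmux -c "puppetd --test" compute-1-00

    # Install one of our RPMs on several nodes in parallel
    shmux -c "yum -y install solexa_pipeline" compute-1-0{0,3,7}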

  5. Solexa: Image Compression
  • Methods (size, percent of uncompressed):
    • None: 418280
    • Lossless
      • tar.bz2: 266316 (63.7%)
      • lzw.tar: 89864 (21.5%)
      • lzw.tar.bz2: 89344 (21.4%)
    • Lossy
      • jpeg2000.tar: 21568 (5.2%)
      • jpeg2000.tar.bz2: 22224 (5.3%)
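
Roughly how these variants can be produced with stock tools (ImageMagick for the TIFF rewrites, tar/bzip2 for bundling); the file names are placeholders, and the JPEG 2000 step assumes ImageMagick was built with JP2 support:

    # Lossless: rewrite each raw tile TIFF with LZW compression, then bundle
    convert s_8_1_a.tif -compress LZW s_8_1_a.lzw.tif
    tar cf cycle.lzw.tar *.lzw.tif

    # Lossy: convert each TIFF to JPEG 2000, then bundle and bzip2 the archive
    convert s_8_1_a.tif s_8_1_a.jp2
    tar cjf cycle.jp2.tar.bz2 *.jp2

    # Compare the resulting sizes (in KB)
    du -k cycle.lzw.tar cycle.jp2.tar.bz2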

  6. Solexa: Image Compression [side-by-side example images: LZW (lossless) vs. jpeg2000 (lossy)]

  7. Solexa Pipeline Results
  • Original compared to LZW
    • Identical: 15797   Not identical: 7595   Only in LZW: 4522   Only in Orig: 4739
  • Original compared to tiff-to-tiff
    • Identical: 15797   Not identical: 7595   Only in tiff-to-tiff: 4522   Only in Orig: 4739
  • tiff-to-tiff compared to LZW
    • Identical: 27914   Not identical: 0   Only in tiff-to-tiff: 0   Only in LZW: 0
  • Original compared to jpeg2000
    • Identical: 10541   Not identical: 9244   Only in jpeg2000: 7673   Only in Orig: 8346
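
One simplified way to get counts of this kind from two pipeline output directories, assuming the results can be keyed and sorted by read identifier (file names are placeholders, and this sketch does not separate "not identical" from "only in" the way the slide does):

    # Sort the per-read output of each run
    sort orig/s_8_1_results.txt > orig.sorted
    sort lzw/s_8_1_results.txt  > lzw.sorted

    # Count records that are byte-for-byte identical in both runs,
    # then records present only in one run (differing or missing in the other)
    comm -12 orig.sorted lzw.sorted | wc -l    # identical
    comm -23 orig.sorted lzw.sorted | wc -l    # only in original
    comm -13 orig.sorted lzw.sorted | wc -l    # only in LZW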

  8. ImageMagick Differences
  • identify output           Original                 LZW
    • Class:                  DirectClass              PseudoClass
    • Depth:                  8 bits                   16 bits
    • Gray:                   8-bits                   16-bits
    • Min:                    4 (0.0156863)            0 (0)
    • Max:                    54 (0.211765)            65535 (1)
    • Mean:                   10.7039 (0.0419759)      2750.89 (0.0419759)
    • Standard deviation:     4.40634 (0.0172798)      1132.43 (0.0172798)
    • Colors:                 51                       65536
    • User Time:              0.130u                   0.090u
    • Elapsed Time:           0:02                     0:01
    • Pixels per second:      673kb                    2.5mb
  • Only in LZW:
    • Software: ImageMagick 6.2.0 03/17/05 Q16 http://www.imagemagick.org
    • Document: C1.1/s_8_1_a.tif.none
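
These fields come from ImageMagick's identify output; a quick sketch of how to reproduce the comparison and, if needed, force the LZW copy back to the original 8-bit representation (the .lzw.tif name is assumed):

    # Dump image class, depth and statistics for the original and the LZW copy
    identify -verbose C1.1/s_8_1_a.tif     | egrep 'Class|Depth|Colors|Mean'
    identify -verbose C1.1/s_8_1_a.lzw.tif | egrep 'Class|Depth|Colors|Mean'

    # If the 16-bit PseudoClass form confuses downstream tools,
    # convert can force the LZW copy back to 8-bit grayscale
    convert C1.1/s_8_1_a.lzw.tif -depth 8 -type Grayscale C1.1/s_8_1_a.lzw8.tif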

  9. Cluster Infrastructure: Storage
  • With 2 Solexa machines we project about 1.6TB of Solexa data each month, but it can be as much as 10TB/month at full capacity
  • Commodity storage servers
    • Linux/FreeBSD servers that are 5U/24-drive or 3U/16-drive
    • Includes lots of CPU power, so heavy compression is feasible
    • 20TB redundant at ~$14,000, or $700/TB
    • Individual servers offer more freedom
    • Requires either a distributed filesystem or managing multiple volumes
    • ZFS is rich in features and is currently being ported to a distributed platform
  • One large storage server
    • Nexsan Satabeasts offer 42 drives in 4U
    • 34TB redundant at ~$46.5k, or $1367/TB
    • Amazingly dense if space is an issue
    • Requires a large upfront cost, and is more expensive than above
  • Commercial products
    • Our Apple XSAN costs about $1700/TB to expand
    • Has lots of problems, and is very slow
    • Locks you into a particular proprietary vendor
    • Makes the above options more appealing
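
As one illustration of the commodity-server option, a hypothetical ZFS layout for a 16-drive box (disk and dataset names are placeholders; ZFS here assumes a Solaris or FreeBSD host, or the Linux FUSE port):

    # Two 8-disk raidz2 vdevs in one pool: any two disks per vdev can fail
    zpool create solexa \
        raidz2 da0 da1 da2  da3  da4  da5  da6  da7 \
        raidz2 da8 da9 da10 da11 da12 da13 da14 da15

    # A separate dataset for image archives, compressed and exported over NFS
    zfs create solexa/images
    zfs set compression=on solexa/images
    zfs set sharenfs=on    solexa/images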

  10. Cluster Infrastructure: Storage
  • Current plan:
    • XSAN for home directories, Solexa sequence data and user data
    • Commodity storage servers for Solexa images, backups, and other archival data
  • Future:
    • Get LZW working
    • Hack solexa_pipeline to read directly from compressed images
    • Experiment with distributed filesystems
      • One large volume composed from many small/cheap servers
      • Cheaper, faster and more reliable than individual storage servers
      • Still early in development, so usability is unknown
      • Potential to replace XSAN for all data, or at least Solexa data

  11. Solexa: IT Costs per Run
  • Assumptions:
    • 8-lane run -> 600GB of storage (1G+ machine)
    • 150GB stored on $1700/TB storage (XSAN)
    • Images stored on $700/TB storage (commodity)
    • Estimating 60 runs/year with both machines (double our historic rate with one machine)
    • Network, rack space, personnel time and other expenses are not taken into account
    • Cluster is 32 slots @ $25K, replaced every 2 years
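
The next four slides instantiate these assumptions for different storage back ends; the underlying arithmetic is roughly the following (the awk snippet is only a sketch of the XSAN case, and the slide figures are rounded):

    # Per-run cost with sequence data and images both on $1700/TB XSAN storage
    awk 'BEGIN {
        data    = 0.150 * 1700         # 150GB of sequence/analysis data
        images  = 0.600 * 1700         # 600GB of raw images
        cluster = 25000 / (2 * 60)     # 32-slot cluster over 2 years, 60 runs/year
        printf "storage %.0f  cluster %.0f  total %.0f\n", data + images, cluster, data + images + cluster
    }'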

  12. Solexa: IT Costs per Run (XSAN)
  • Costs:
    • Storage: $1250
      • Data: 150GB -> $250
      • Images: 600GB -> $1000
    • Cluster: $200
  • Total: $1450/run ($156/lane)

  13. Solexa: IT Costs per Run (bz2)
  • Costs:
    • Storage: $500
      • Data: 150GB -> $250
      • Images: 450GB -> $250
    • Cluster: $200
  • Total: $700/run ($87/lane)

  14. Solexa: IT Costs per Run (LZW)
  • Costs:
    • Storage: $350
      • Data: 150GB -> $250
      • Images: 150GB -> $100
    • Cluster: $200
  • Total: $550/run ($68/lane)

  15. Solexa: IT Costs per Run (ZFS?)
  • Costs:
    • Storage: $150
      • Data: 75GB -> $50
      • Images: 150GB -> $100
    • Cluster: $200
  • Total: $350/run ($44/lane)

  16. Solexa: IT Savings per Run
  • $750: Move images off of the XSAN and bzip2 them
  • $150: Use LZW instead of bzip2
  • $200: Move all data off of the XSAN (distant future)

  17. Thanks to Brian for some images, slides and compression help
  • Thanks to Dmitriy for work on compression
  • Thanks to Louis for miscellaneous help
  • Links:
    • http://genome.ucla.edu/wiki/index.php/Cluster_HOWTO
    • http://sourceforge.net/projects/solexatools/
