A.R.M.S. Active Resource Management Services

Presentation Transcript


  1. A.R.M.S. Active Resource Management Services
     Presentation One

  2. Outline, Introductions, Societal Issue Examined
     Michael Rajs

  3. Outline
     • Group Members and Roles: slide 4
     • Introduce Mentor: slide 5
     • Societal Issue: slide 6
     • History: slides 7-11
     • Case Study: slides 12-16
     • Problem Statement: slide 18
     • Computer Components Identified: slides 19-21
     • Major Functional Component Diagram: slide 22
     • Current Process Flow: slide 23
     • Solution Statement: slide 25
     • Objectives: slide 26
     • Improved Process Flow: slide 27
     • Competition Identified: slides 28-30
     • Benefits of Solution: slide 32
     • Problems with Solution: slide 33
     • Conclusion: slide 34
     • References: slides 35-36

  4. Group Members and Roles
     • Michael Rajs (Group Manager)
     • Adam Willis (Research Specialist)
     • Sybil Acotanza (Visualization Engineer)
     • Scott Pardue (Team Leader)
     • Jordan Heinrichs (Marketing Analyst)
     • David Crook (Documentation Specialist)

  5. Mentor: Yaohang Li
     • Associate Professor in the Department of Computer Science at Old Dominion University.
     • Research interests: computational biology, Markov chain Monte Carlo (MCMC) methods, and parallel distributed grid computing.

  6. What is the societal issue being faced?
     How do researchers handle the massive amounts of data they are collecting?

  7. Historical Background
     Adam Willis

  8. Collection of Data
     • 1890: Census recorded with an electric machine [1]
     • 1935: Social Security Act [2]
     • 1974: Privacy Act [3]
     • 1989: World Wide Web [4]
     • 1997: Big Data [5]
     • 2011: IBM’s Watson [6]
     • Now: “Every day, we create 2.5 quintillion bytes of data — so much that 90% of the data in the world today has been created in the last two years alone.” [7]

  9. Examples of Big Data
     • Large Hadron Collider [8]
       • 150 million sensors report 40 million times per second
     • Facebook, per day [9]
       • 2.5 billion content items shared
       • 2.7 billion “Likes”
       • 300 million photos uploaded
     • Walmart [8]
       • 1 million customer transactions every hour
       • 2.5 petabytes of data stored

  10. Big Data Analysis Hardware
     • Cluster Computing [10]
       • A cluster consists of many nodes (computers).
       • Big data can be generated and analyzed more quickly by spreading the workload among the nodes.
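The idea of spreading a workload among nodes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: worker threads stand in for cluster nodes, and `split_into_chunks`, `analyze_chunk`, and `analyze_on_cluster` are hypothetical names, not part of Dinosolve or any cluster software. In CPython, threads will not speed up CPU-bound work; the point is the split, process, and combine pattern.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_nodes):
    """Divide items into n_nodes contiguous chunks of nearly equal size."""
    base, extra = divmod(len(items), n_nodes)
    chunks, start = [], 0
    for node in range(n_nodes):
        # The first `extra` nodes each take one additional item.
        size = base + (1 if node < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def analyze_chunk(chunk):
    """Hypothetical per-node analysis: here, a sum of squares."""
    return sum(x * x for x in chunk)


def analyze_on_cluster(items, n_nodes=4):
    """Process each chunk concurrently, then combine the partial results."""
    chunks = split_into_chunks(items, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(analyze_chunk, chunks))
    return sum(partials)
```

On a real cluster, each chunk would be sent to a separate machine, and the cost of moving data between nodes would also have to be considered.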

  11. Managing the Cluster
     • Distributed Resource Management Systems (D-RMS)
       • Job management subsystem
       • Physical resource management subsystem
       • Scheduling and queuing subsystem
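To make the three subsystems concrete, the toy model below keeps one small data structure for each of them. It is a sketch under simplifying assumptions (first-in, first-out scheduling, CPU slots as the only resource) and does not reflect how Sun Grid Engine or any other D-RMS is implemented; `Job` and `ToyResourceManager` are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    slots: int  # CPU slots requested


class ToyResourceManager:
    def __init__(self, node_slots):
        # Physical resource management: free slots per node.
        self.free = dict(node_slots)
        # Scheduling and queuing: FIFO queue of waiting jobs.
        self.queue = deque()
        # Job management: node and slot count of each running job.
        self.running = {}

    def submit(self, job):
        self.queue.append(job)

    def schedule(self):
        """Start queued jobs in FIFO order; stop at the first job that does not fit."""
        started = []
        while self.queue:
            job = self.queue[0]
            node = next((n for n, s in self.free.items() if s >= job.slots), None)
            if node is None:
                break
            self.queue.popleft()
            self.free[node] -= job.slots
            self.running[job.name] = (node, job.slots)
            started.append((job.name, node))
        return started

    def finish(self, job_name):
        """Release the slots held by a finished job."""
        node, slots = self.running.pop(job_name)
        self.free[node] += slots
```

A production scheduler would add priorities, backfilling, and further resource types such as memory and disk.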

  12. Case Study
     Sybil Acotanza

  13. Dinosolve Case Study
     • Bioinformatics
     • Disulfide bond prediction program (Cronk, 2012)

  14. Dinosolve Users
     • Who will use it?
       • Drug and antibody design
       • Bio-energy development
       • Genetic mapping [11]
     • Why will they use it?
       • 2% accuracy improvement [12]

  15. Dinosolve Web Site (Li & Yaseen, http://hpcr.cs.odu.edu/dinosolve/)

  16. Dinosolve Possible Problems
     • Hardware resources needed for computation
       • CPU cycles
       • Memory
       • Disk space
       • Network bandwidth
     • Server crashes

  17. Problem Statement, Components of Hardware and Software, Current Process Flow
     Scott Pardue

  18. What is the problem?
     Processing big data sets is computationally expensive, and as the volume of queries grows, system performance will degrade progressively until the system fails.
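The degradation described above can be illustrated with a deliberately simple backlog model. The numbers are made up, and `simulate_backlog` is a hypothetical helper rather than a measurement of the Dinosolve cluster: once queries arrive faster than the system can process them, the queue grows without bound.

```python
def simulate_backlog(arrivals_per_step, capacity_per_step, steps):
    """Return the number of waiting queries after each time step."""
    backlog, history = 0, []
    for _ in range(steps):
        backlog += arrivals_per_step                 # new queries arrive
        backlog -= min(backlog, capacity_per_step)   # the server works off what it can
        history.append(backlog)
    return history
```

With 3 arrivals and a capacity of 5 per step the backlog stays at zero, while 7 arrivals against the same capacity leaves 2 more queries waiting after every step.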

  19. What are the components of our current system?
     The current system uses the following software and hardware.

  20. Software
     • Unix operating system installed on the Dinosolve cluster
     • Dinosolve algorithm
     • Sun Grid Engine, installed on the cluster, which will serve as our Distributed Resource Management System (D-RMS)
     • MySQL (database software)
     • Web-based user interface (website)

  21. Hardware
     • MySQL database server
     • A computer cluster to run the Dinosolve algorithm
     • Web server for our web-based user interface

  22. Major Functional Component Diagram

  23. Solution Statement, Objectives, Improved Process Flow, Competition Identified
     Jordan Heinrichs

  24. How will we correct the problem?
     We aim to configure a distributed resource management system (D-RMS), in this case Sun Grid Engine (SGE), to handle resource allocation on the Dinosolve cluster.
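As a rough illustration of handing resource allocation to SGE, the helper below assembles a `qsub` submission command in which the job states what it needs and the scheduler decides where it runs. Everything specific here is an assumption: the script name, job name, the parallel environment `smp` (parallel environments are defined per site), and the memory value are placeholders, and the function only builds the command without running it.

```python
import shlex


def build_qsub_command(script, job_name, slots, memory_limit, pe_name="smp"):
    """Assemble an SGE qsub command line as a list of arguments.

    pe_name is a placeholder; the parallel environments that actually
    exist depend on how the cluster administrator configured SGE.
    """
    return [
        "qsub",
        "-N", job_name,                  # job name shown in the queue
        "-cwd",                          # run in the current working directory
        "-pe", pe_name, str(slots),      # request slots in a parallel environment
        "-l", f"h_vmem={memory_limit}",  # request a hard virtual memory limit
        script,
    ]


def format_command(cmd):
    """Render the argument list as a shell-quoted string for logging."""
    return shlex.join(cmd)
```

For example, `format_command(build_qsub_command("run_dinosolve.sh", "dinosolve_query", 4, "2G"))` yields a line that could be reviewed before it is submitted on the cluster.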

  25. Objectives
     • Interpret and visualize current usage statistics
     • Configure, utilize, and optimize the SGE
     • Deliver an aesthetically pleasing and professional user interface

  26. Process Flow with Solution

  27. Competing Distributed Resource Management Systems
     • Sun Grid Engine (SGE)
     • Portable Batch System (PBS)
     • Load Sharing Facility (LSF)

  28. Competing Resource Management Systems [31]

  29. Competing Protein Prediction Servers [19], [20], [21]

  30. Benefits of Solution, Problems with Solution, Conclusion
     David Crook

  31. What benefits will come from attaining our goals?
     • Efficient utilization of available resources
     • Increased throughput of the cluster
     • An intuitive and professional user interface
     • Rise in popularity due to excellent accuracy, efficiency, and professional design

  32. Problems with Solution
     • Improper synchronization of cluster resources can lead to a deadlock in the system
     • Race conditions between the HPCR cluster and the MySQL database
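Both risks can be shown in miniature with in-process locks. The sketch below is an analogy, not the cluster's actual synchronization mechanism: `ResourcePool` and `move_slots` are hypothetical names. Holding both locks while updating the counters prevents a race on the shared values, and always acquiring the locks in the same global order prevents two transfers from deadlocking on each other.

```python
import threading


class ResourcePool:
    def __init__(self, name, slots):
        self.name = name
        self.slots = slots
        self.lock = threading.Lock()


def move_slots(src, dst, count):
    """Move slots from src to dst; return False if src has too few."""
    if src is dst:
        raise ValueError("source and destination must differ")
    # Acquire locks in a fixed global order (by name, then object id), so
    # two opposite transfers can never each hold one lock and wait on the other.
    first, second = sorted((src, dst), key=lambda p: (p.name, id(p)))
    with first.lock, second.lock:
        # Both counters are read and updated only while both locks are held.
        if src.slots < count:
            return False
        src.slots -= count
        dst.slots += count
        return True
```

If each transfer instead locked its source first and its destination second, a transfer from A to B and another from B to A could each take one lock and wait forever for the other.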

  33. Conclusion
     With the updated user interface and a correctly configured Sun Grid Engine, we hope to establish a reputable disulfide bonding prediction server.

  34. References for history
     1. http://www.columbia.edu/cu/computinghistory/hh/index.html
     2. http://query.nytimes.com/gst/abstract.html?res=F50C11FE385D13728DDDAE0A94DA415B868FF1D3
     3. http://www.census.gov/history/pdf/kraus-natdatacenter.pdf
     4. http://www.bbc.co.uk/history/historic_figures/berners_lee_tim.shtml
     5. http://dl.acm.org/citation.cfm?id=266989.267068&coll=DL&dl=GUIDE
     6. http://www.nytimes.com/2012/08/12/business/how-big-data-became-so-big-unboxed.html?_r=1
     7. http://www-01.ibm.com/software/data/bigdata/
     8. http://en.wikipedia.org/wiki/Big_data
     9. http://techcrunch.com/2012/08/22/how-big-is-facebooks-data-2-5-billion-pieces-of-content-and-500-terabytes-ingested-every-day/
     10. http://en.wikipedia.org/wiki/Computer_cluster

  35. References for case study
     11. Li, Y. (2010, September 1). CAREER: Novel Sampling Approaches for Protein Modeling Applications [Abstract]. National Science Foundation Award Abstract #1066471.
     12. Li, Y., & Yaseen, A. (2012). Enhancing Protein Disulfide Bonding Prediction Accuracy with Context-based Features. Biotechnology and Bioinformatics Symposium.
     13. Bioinformatics. (2011). In Merriam-Webster.com. Retrieved February 15, 2013, from http://www.merriam-webster.com/dictionary/bioinformatics
     14. Cronk, J. D. (2012). Disulfide Bond. Retrieved February 15, 2013, from Biochemistry Dictionary: http://guweb2.gonzaga.edu/faculty/cronk/biochem/D-index.cfm?definition=disulfide_bond
     15. Yan, Y., & Chapman, B. (2008). Comparative Study of Distributed Resource Management Systems: SGE, LSF, PBS Pro, and LoadLeveler. Technical Report, CiteSeerX.
     16. Li, Y., & Yaseen, A. (2012). Dinosolve. Retrieved from http://hpcr.cs.odu.edu/dinosolve/

  36. References for competition
     17. Arvind Krishna, “Why Big Data? Why Now?”, IBM, 2011. URL: http://almaden.ibm.com/colloquium/resources/Why%20Big%20Data%20Krishna.PDF
     18. Yonghong Yan, Barbara M. Chapman, Comparative Study of Distributed Resource Management Systems - SGE, LSF, PBS Pro, and LoadLeveler, Department of Computer Science, University of Houston, May 2005 (pdf)
     19. Dr. Li’s site: http://hpcr.cs.odu.edu/dinosolve/
     20. Scratch Predictor: http://scratch.proteomics.ics.uci.edu/
     21. DiANNA server: http://clavius.bc.edu/~clotelab/DiANNA/
     Portable Batch System (PBS)
     22. http://resources.altair.com/pbs/documentation/support/PBSProUserGuide12-2.pdf
     23. http://www.pbsworks.com/SupportDocuments.aspx?AspxAutoDetectCookieSupport=1
     24. http://resources.altair.com/pbs/documentation/support/PBSProRefGuide12-2.pdf
     25. http://resources.altair.com/pbs/documentation/support/PBSProAdminGuide12-2.pdf
     26. http://www.pbsworks.com/(S(tykrsyqbemmlf3o5zwrmjrgf))/images/solutions-en-US/PBS-Pro_Datasheet-USA_WEB.pdf
     27. http://agendafisica.files.wordpress.com/2011/05/pbs.pdf
     Moab HPC Suite
     28. http://www.adaptivecomputing.com/publication/420/wppa_open/
     IBM Platform LSF
     29. http://public.dhe.ibm.com/common/ssi/ecm/en/dcd12354usen/DCD12354USEN.PDF
     Apache Hadoop with Zookeeper
     30. http://zookeeper.apache.org/doc/current/zookeeperOver.html
     31. http://www.cloud-net.org/~swsellis/tech/solaris/performance/doc/blueprints/0102/jobsys.pdf
