
Distributed Computing


cailean

Presentation Transcript


  1. Distributed Computing • Utilize unused PC resources • Processing • Complex calculations • Load distribution • 25% of storage is unused • SANs • 100 computers with 80 GB drives = 6 TB unused
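The storage figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 1000 GB); under that assumption, 100 drives of 80 GB hold 8 TB in total, so 25% unused would be 2 TB, while the quoted 6 TB corresponds to 75% unused.

```python
# Back-of-the-envelope check of the slide's storage figures.
# Assumption: decimal units (1 TB = 1000 GB), as drive vendors use.
computers = 100
drive_gb = 80

total_tb = computers * drive_gb / 1000     # total raw capacity in TB
unused_at_25_percent = total_tb * 0.25     # idle space if 25% is unused
fraction_for_6_tb = 6 / total_tb           # idle share implied by "6 TB unused"

print(total_tb, unused_at_25_percent, fraction_for_6_tb)
```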

  2. Process Sharing Applications • For large-scale computations • Data analysis, data mining, scientific computing • Research problems • SETI@Home • Folding@Home • distributed.net • Genome@Home • FightAIDS@Home • Climate simulation • Economics • Medicine

  3. Distributed Computing • P2P is not distributed computing, but the two share challenges and issues: sharing and taking advantage of resources available at endpoints, and harnessing their power for computationally intensive problems • SETI@home, FightAIDS@home, Genome@home • Grid computing and e-science • Computational grids to solve or simulate real-life problems • E-science • Commercial applications • United Devices, Entropia, Avaki, etc.

  4. Distributed Computing • A central coordinator schedules tasks on volunteer computers (master-worker paradigm, cycle stealing) • Dedicated applications: SETI@Home, distributed.net, Décrypthon (France) • Production applications: Folding@home, Genome@home, Xpulsar@home, Folderol, Exodus, Peer review • Research platforms: Javelin, Bayanihan, JET, Charlotte (based on Java) • Commercial platforms: Entropia, Parabon, United Devices, Platform (AC) • [Figure: a client application exchanges parameters and results with the coordinator; the coordinator sends parameters over the Internet to volunteer PCs, each of which downloads and executes the application.]

  5. Cycle Sharing Model • Chunks of data are sent to clients in suspend mode • Data is processed by a client when the client machine is not in use, then returned to the master • Internet-based (master-slave) computing • Example: SETI@Home scans radio telescope images • [Figure: the master sends raw data to clients Alice, Bob, Carol and Ted; each crunches its data and returns processed data to the master.]
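The cycle sharing model can be sketched in a few lines. This is a minimal in-process illustration, not the code of any project named here: `run_master`, `reliable` and `FlakyVolunteer` are hypothetical names, workers are plain callables rather than remote PCs, and a `ConnectionError` stands in for a volunteer that goes offline before returning its chunk.

```python
from collections import deque

def run_master(chunks, workers, process):
    """Hand chunks to workers in turn; requeue a chunk if its worker fails.

    Assumes at least one worker eventually succeeds, otherwise this loops forever.
    """
    pending = deque(enumerate(chunks))
    results = {}
    turn = 0
    while pending:
        index, chunk = pending.popleft()
        worker = workers[turn % len(workers)]
        turn += 1
        try:
            results[index] = worker(process, chunk)
        except ConnectionError:
            pending.append((index, chunk))  # volunteer went away, retry later
    return [results[i] for i in range(len(chunks))]

def reliable(process, chunk):
    """Hypothetical volunteer that always finishes its chunk."""
    return process(chunk)

class FlakyVolunteer:
    """Hypothetical volunteer that drops its first assignment."""
    def __init__(self):
        self.calls = 0

    def __call__(self, process, chunk):
        self.calls += 1
        if self.calls == 1:
            raise ConnectionError("volunteer went offline")
        return process(chunk)

raw_data = list(range(12))
chunks = [raw_data[i:i + 3] for i in range(0, len(raw_data), 3)]
processed = run_master(chunks, [reliable, FlakyVolunteer()], sum)
print(processed)
```

The master keeps results keyed by chunk index, so the processed data comes back in the original order even when a chunk has to be reassigned.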

  6. SETI@HOME • Launched in 1999 • Scientific experiment that uses Internet-connected computers in the Search for Extraterrestrial Intelligence (SETI) • Distributes a screen saver–based application to users • Applies signal analysis algorithms to different data sets to process radio-telescope data • Has more than 3 million users, who have contributed over a million years of CPU time to date • [Figure, labelled Client/Server and P2P: 1. Install the screen saver. 2. The SETI client (screen saver) starts. 3. The SETI client gets radio-telescope data from the SETI@Home main server and runs. 4. The client sends results back to the server.]

  7. Distributed Computing: SETI@home • The Search for Extraterrestrial Intelligence project has over two million computers downloading and crunching data gathered from the Arecibo radio telescope in Puerto Rico • The SETI@Home project is widely regarded as the fastest computer in the world • In fact, the project has already performed the single largest cumulative computation to date • Architecturally, SETI@Home is client-server • The centralised servers hold enormous amounts of data gathered from the Arecibo radio telescope "listening" to the skies • That data needs to be analysed for distinct or unusual radio waves that might suggest extraterrestrial communications • http://setiathome.ssl.berkeley.edu
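To give a feel for what "analysing for distinct or unusual radio waves" can mean on a single work unit, here is a toy sketch. It is not SETI@home's actual analysis; it is just a plain discrete Fourier transform followed by a threshold test, and the function names, the threshold and the synthetic signal are all assumptions made for illustration.

```python
import cmath
import math

def power_spectrum(samples):
    """Power per frequency bin, using a plain discrete Fourier transform."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2):  # bins below the Nyquist frequency
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum

def find_candidates(samples, threshold=10.0):
    """Return the bins whose power exceeds `threshold` times the mean power."""
    spectrum = power_spectrum(samples)
    mean_power = sum(spectrum) / len(spectrum)
    return [k for k, p in enumerate(spectrum) if p > threshold * mean_power]

# Synthetic work unit: a strong tone in bin 5 plus a weak tone in bin 12.
n = 64
work_unit = [math.sin(2 * math.pi * 5 * t / n)
             + 0.1 * math.sin(2 * math.pi * 12 * t / n)
             for t in range(n)]
candidates = find_candidates(work_unit)
print(candidates)
```

Only the strong tone is reported; the weak one stays below the threshold. Each work unit can be analysed this way without reference to any other, which is what lets the server hand units out freely.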

  8. SETI@Home • Search for Extraterrestrial Intelligence

  9. Processing • Intel’s Netbatch • 10,000 workstations over 25 locations • Chip design • Shortened time for chip development • Reduced outlay for new mainframes • $500 million savings

  10. Processing • Amerada Hess • Connects 200 Dell PCs to handle complex seismic data interpretation • Allowed them to replace a pair of IBM supercomputers. “We’re running seven times the throughput at a fraction of the cost.” Richard Ross, CIO

  11. Storage • Intel • Distribution of computer-based training • Prevents large downloads from central servers • Preserves bandwidth • Preserves expensive network storage

  12. P2P Distributed Computing • Allows any node to play different roles (client, server, system infrastructure) • A request may be related to computations or data • An accept concerns computation or data • A very simple problem statement, but one that leads to many research issues: scheduling, security, message passing, data storage • Large scale adds further problems: volatility, confidence, etc. • [Figure: PCs in a P2P system; a client PC sends a request, a server PC accepts it and provides the result; other links between PCs show potential communications for parallel applications.]
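The request / accept / provide exchange in the figure can be sketched as a single class whose instances act as client or server depending on who is calling. This is an in-memory illustration only: the `Peer` class and its method names are assumptions taken from the figure labels, and real systems would add networking, scheduling and security on top.

```python
class Peer:
    """A node that can act as both client and server."""

    def __init__(self, name):
        self.name = name
        self.data = {}        # items this peer can serve
        self.neighbours = []  # peers this node knows about

    # Server role
    def accept(self, key):
        """Say whether this peer is able to serve the request."""
        return key in self.data

    def provide(self, key):
        """Return the requested item."""
        return self.data[key]

    # Client role
    def request(self, key):
        """Ask each neighbour in turn; return (provider name, result) or None."""
        for peer in self.neighbours:
            if peer.accept(key):
                return peer.name, peer.provide(key)
        return None

alice, bob = Peer("alice"), Peer("bob")
alice.data["a"] = 1
bob.data["b"] = 2
alice.neighbours.append(bob)
bob.neighbours.append(alice)

print(bob.request("a"))    # bob is the client, alice the server
print(alice.request("b"))  # roles reversed
```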

  13. “Three Obstacles to Making P2P Distributed Computing Routine” • New approaches to problem solving: Data Grids, distributed computing, peer-to-peer, collaboration grids, … • Structuring and writing programs (the programming problem): abstractions, tools • Enabling resource sharing across distinct institutions (the systems problem): resource discovery, access, reservation, allocation; authentication, authorization, policy; communication; fault detection and notification; … • Credit: Ian Foster

  14. P2P for Distributed Computing or Web Computing • The distributed computing P2P applications are highlighted by the use of millions of Internet clients to analyze data looking for extraterrestrial life (SETI@home http://setiathome.ssl.berkeley.edu/ ) and by the newer project examining the folding of proteins ( Folding@home http://www.stanford.edu/group/pandegroup/Cosm/ ). • These are building distributed computing solutions for a special class of applications: those that can be divided into a huge number of essentially independent computations, with a central server system doling out separate work chunks to each participating client. • In the parallel computing community, these problems are called "pleasingly parallel" or "embarrassingly parallel". • This approach is included in the P2P category because the computing is peer based, even though it does not have the "peer-only communication" characteristic that Gnutella and Napster have for information transfer. • SETI@home and Folding@home are elegantly implemented as screen savers that you download.
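A small example of an embarrassingly parallel problem: counting primes below 1000 by splitting the range into independent work units. The sketch assumes a local thread pool as a stand-in for participating clients (in CPython, threads will not speed up this CPU-bound work; the point is only that no unit needs anything from another). `count_primes` and the chunk size are illustrative choices.

```python
from concurrent.futures import ThreadPoolExecutor

def count_primes(bounds):
    """Count primes in the half-open range [lo, hi); one independent work unit."""
    lo, hi = bounds

    def is_prime(n):
        if n < 2:
            return False
        return all(n % d for d in range(2, int(n ** 0.5) + 1))

    return sum(1 for n in range(lo, hi) if is_prime(n))

# The "server" doles out four separate work chunks.
work_units = [(start, start + 250) for start in range(0, 1000, 250)]

with ThreadPoolExecutor(max_workers=4) as pool:
    partial_counts = list(pool.map(count_primes, work_units))

total = sum(partial_counts)  # the only step that combines results
print(partial_counts, total)
```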

  15. P2P space: Distributed Computing • Distributed Collaboration • Use underutilized Internet and/or network resources to improve computation and data analysis • MetaComputing, CareScience, DataSynapse, Distributed.net, DistributedScience, Entropia, Parabon, The Open Lab • Distributed Search Engines • Used to easily look up and share files and offer content management • BearShare, Filetopia, Hotline Connect, InfraSearch, Plebio, Jibe, LimeWire, MusicBrainz.org, NeuroGrid, NextPage, Redfoot, Opencola, Project Pandango

  16. Entropia Financial Modeling I

  17. Entropia Financial Modeling II • Each basic financial instrument can be calculated independently • Central server interprets the total simulation • Make money, or learn what causes market swings, or ….
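The split between independent per-instrument calculations and a central aggregation step might look like the sketch below. It does not reflect Entropia's actual models: the random-walk price model, the `simulate_instrument` function and the portfolio values are all made up for illustration.

```python
import random

def simulate_instrument(seed, start_price, volatility, steps=250):
    """Toy random-walk price path for one instrument; independent of all others."""
    rng = random.Random(seed)  # own generator, so runs are reproducible
    price = start_price
    for _ in range(steps):
        price *= 1 + rng.gauss(0, volatility)
    return price

# Hypothetical portfolio: (quantity, seed, start price, per-step volatility).
portfolio = [
    (10, 1, 100.0, 0.01),
    (5, 2, 40.0, 0.02),
    (20, 3, 12.5, 0.0),   # zero volatility: the price never moves
]

# Each call could run on a different PC.
final_prices = [simulate_instrument(seed, price, vol)
                for _, seed, price, vol in portfolio]

# Central server step: combine the per-instrument results.
total_value = sum(qty * final
                  for (qty, *_), final in zip(portfolio, final_prices))
print(final_prices, total_value)
```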

  18. Drug Structure Simulations

  19. United Devices also does Drug Simulation • Parameter study: do billions of simulations, each with different parameters • Search-engine-like interface to the simulation • Works because each calculation fits on a PC; a detailed molecular model usually would not
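A parameter study reduces to evaluating one function over a grid of parameter combinations and then searching the results. The sketch below is a toy: `binding_score` is a hypothetical scoring function standing in for one simulation, and the two parameters and their ranges are invented, not taken from United Devices.

```python
from itertools import product

def binding_score(params):
    """Hypothetical toy score standing in for one simulation run."""
    angle, distance = params
    return -((angle - 30) ** 2) - (distance - 2.0) ** 2

angles = range(0, 91, 10)                 # degrees, 10 values
distances = [1.0, 1.5, 2.0, 2.5, 3.0]     # arbitrary units, 5 values

# Every parameter combination is a separate, independent job.
grid = list(product(angles, distances))
results = {params: binding_score(params) for params in grid}

# "Search engine" step: look up the best-scoring combination.
best = max(results, key=results.get)
print(len(grid), best)
```

Each job needs only its own parameters, so the grid can grow to billions of entries without changing the structure.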

  20. Performance of Entropia Network

  21. Peer to Peer • P2P “illusion” among collaborating clients • For Napster-like services or collaboration • [Figure: six servers.]
