myGrid


  1. myGrid Architectural issues in a bioinformatics Grid http://www.mygrid.org.uk Luc Moreau, University of Southampton, UK

  2. Overview • Bioinformatics background • myGrid facts • Service oriented architecture • Architectural issues • Notification service • Grid component model • Service directory • Conclusions

  3. Bioinformatics & Genomics • Large amounts of data • Highly heterogeneous • Data types • Data forms • Community • Highly complex and inter-related • Volatile

  4. Bioinformatics Data • Descriptive as well as numeric • Literature • Analogy/knowledge-based Text Extraction

  5. Bioinformatics Analysis • Different algorithms • BLAST, FASTA, pSW • Different implementations • WU-BLAST, NCBI-BLAST • Different service providers • NCBI, EBI, DDBJ

  6. The Human Genome Project • The HGP will make available potentially thousands of targets for • Understanding biology & genetics • Drug discovery • Diagnostics • Many genes will be linked with diseases • Cancer • HIV • Parkinson’s • Asthma • Malaria • Autoimmune (arthritis) • Cardiovascular • Antibacterial & antifungal

  7. Drug Discovery

  8. In silico experimentation • Discovery of resources and tools, staging of operations, sharing of results • Process is as important as outcome • Science is dynamic – change happens • Scientific discovery is personal & global • Provenance and history

  9. Overview • Bioinformatics background • myGrid facts • Service oriented architecture • Architectural issues • Notification service • Grid component model • Service directory • Conclusions

  10. myGrid • EPSRC-funded pilot project • Generic middleware within application setting • 36 months within a 42-month performance period • Start 1st October • 16 full-time post docs altogether • 6 DTA studentships • 1 technical project manager • 1 system manager • 1 secretarial post

  11. myGrid consortium • Scientific Team • Biologists and Bioinformaticians • GSK, AZ, Merck KGaA, Manchester, EBI • Technical Team • Manchester, Southampton, Newcastle, Sheffield, EBI, Nottingham • IBM, SUN • GeneticXchange • Network Inference, Epistemics Ltd

  12. myGrid outcomes • e-Scientists • Bioinformatics demonstrator (on cold carp) • Developers • myGrid-in-a-Box developers kit • Integrating some existing bioinformatics tools with myGrid

  13. Overview • Bioinformatics background • myGrid facts • Service oriented architecture • Architectural issues • Notification service • Grid component model • Service directory • Conclusions

  14. Overview • Bioinformatics background • myGrid facts • Service oriented architecture • Architectural issues • Notification service • Grid component model • Service directory • Conclusions

  15. Architectural Issues

  16. Architectural Issues • Notification service

  17. Vision • Asynchronous delivery and persistence of messages • Topics can be created and discovered on the fly • Subscribers can subscribe to topics, publishers can publish messages on a given topic • Peer to peer network of notification services • Topology can be re-organized to enhance reliability • Subscribers and publishers can negotiate over QoS
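
The publish/subscribe part of this vision can be illustrated with a minimal in-memory sketch. It is not the myGrid implementation: the class `NotificationService`, its method names, and the topic string are hypothetical, and persistence is modelled only as a per-subscriber mailbox that holds messages until the subscriber collects them. Peer-to-peer topology and QoS negotiation are left out.

```python
from collections import deque


class NotificationService:
    """Hypothetical in-memory stand-in for one notification service instance."""

    def __init__(self):
        self._topics = {}      # topic name -> list of subscriber ids
        self._mailboxes = {}   # subscriber id -> deque of stored (topic, message)

    def create_topic(self, topic):
        # Topics can be created on the fly; creating one twice is harmless.
        self._topics.setdefault(topic, [])

    def discover_topics(self):
        # Topics can be discovered on the fly.
        return sorted(self._topics)

    def subscribe(self, subscriber, topic):
        if topic not in self._topics:
            raise KeyError(f"unknown topic: {topic}")
        if subscriber not in self._topics[topic]:
            self._topics[topic].append(subscriber)
        self._mailboxes.setdefault(subscriber, deque())

    def publish(self, topic, message):
        # Asynchronous delivery: the message is stored for each subscriber
        # and the publisher does not wait for anyone to read it.
        if topic not in self._topics:
            raise KeyError(f"unknown topic: {topic}")
        for subscriber in self._topics[topic]:
            self._mailboxes[subscriber].append((topic, message))
        return len(self._topics[topic])

    def collect(self, subscriber):
        # Stored messages are handed over once, in arrival order.
        mailbox = self._mailboxes.get(subscriber, deque())
        messages = list(mailbox)
        mailbox.clear()
        return messages
```

A publisher calls `publish("blast/results", ...)` without knowing who is subscribed; a subscriber that was offline at that moment still finds the message on its next `collect`.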

  18. A notification service instance [Figure labels: publisher, publisher stub, publisher delegator, QoS, subscriber delegator, subscriber stub, subscriber, notifications]

  19. Federated notification services [Figure: hubs Hub-1 to Hub-3; notification services NS-1-1 to NS-1-3, NS-2-1, NS-2-2, NS-3-1, NS-3-2; attached publishers (P-…) and subscribers (S-…)] • Strong communication links between hubs • Efficient data replication • Simple notification routing
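
One way to read "simple notification routing" in a hub-based federation is sketched below. It rests on two assumptions that the slide does not state: that a service named `NS-h-n` is attached to `Hub-h`, and that every hub is directly linked to every other hub. The functions `hub_of` and `route` are hypothetical names used only for illustration.

```python
def hub_of(service_name):
    """Return the hub a notification service hangs off.

    Assumes the naming pattern NS-<hub>-<index>, e.g. "NS-1-3" -> "Hub-1".
    """
    prefix, hub, _index = service_name.split("-")
    if prefix != "NS":
        raise ValueError(f"not a notification service name: {service_name}")
    return f"Hub-{hub}"


def route(source, destination):
    """List the hops a notification takes between two notification services."""
    if source == destination:
        return [source]
    path = [source, hub_of(source)]
    # Cross to the destination's hub only when it differs from the source's hub.
    if hub_of(destination) != hub_of(source):
        path.append(hub_of(destination))
    path.append(destination)
    return path
```

Under these assumptions no route needs more than two hub hops, which is what keeps the routing simple.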

  20. QoS Negotiation Protocol

  21. Current status • Push and pull messaging • Topic, message and publisher filters • WSDL interface • Workflow interaction • Integration with MySQL, OpenJMS, Tomcat and Axis • Federated service (undergraduate project) • QoS negotiation (PhD work underway) • OGSA compliance
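
The interplay of the three filters with push and pull delivery can be sketched as follows. This is an illustrative model, not the service's WSDL interface: `Notification`, `Subscription`, and all field names are hypothetical. A subscription with a callback receives notifications by push; one without a callback stores them until the subscriber pulls.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Notification:
    topic: str
    publisher: str
    body: str


@dataclass
class Subscription:
    topic: str                                                  # topic filter
    publisher: Optional[str] = None                             # publisher filter
    message_filter: Optional[Callable[[str], bool]] = None      # message filter
    push_to: Optional[Callable[[Notification], None]] = None    # set for push mode
    pending: List[Notification] = field(default_factory=list)   # store for pull mode

    def accepts(self, n: Notification) -> bool:
        if n.topic != self.topic:
            return False
        if self.publisher is not None and n.publisher != self.publisher:
            return False
        if self.message_filter is not None and not self.message_filter(n.body):
            return False
        return True

    def deliver(self, n: Notification) -> bool:
        """Push the notification or store it for a later pull; False if filtered out."""
        if not self.accepts(n):
            return False
        if self.push_to is not None:
            self.push_to(n)
        else:
            self.pending.append(n)
        return True

    def pull(self) -> List[Notification]:
        """Return and clear everything stored since the last pull."""
        out, self.pending = self.pending, []
        return out
```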

  22. Experimentation • Windows and Unix platforms with Tomcat 4.0.5, Axis beta 3.0, OpenJMS 0.7.2 and MySQL 3.23.51 • Aggregation test with 500 topics, 2,000 subscribers, 2,000 publishers, 10,000 registered subscriptions and 10,000 notifications • 72 hours non-stop subscribing/publishing with the above populations

  23. Architectural Issues • Notification service

  24. Architectural Issues • Notification service • Grid component model

  25. Grid Component Model The myGrid framework is a component model for flexible, simple and future-proof deployment and use of services on the Grid.

  26. Problems Addressed • For service developers and deployers: • Ease of development of sophisticated services by separation of concerns and re-use of third party functionality. • Consistent distribution of functionality over a set of services, e.g. access control, support for fault-tolerance. • Application of solutions to the above to services deployed using technologies such as OGSA Grid Services, Web Services and Enterprise JavaBeans.

  27. Problems Addressed • For service clients: • Development of service clients that are not limited by the range of standards known at deployment time. • Control over how service operations are invoked, so that they can make use of the most suitable protocols supported by a service. • Provision of a standard client interface hiding the differences in deployment philosophy that each middleware technology brings. • Application of solutions to the above to services deployed using technologies such as OGSA Grid Services, Web Services and Enterprise JavaBeans.

  28. Nested Component Model

  29. Framework

  30. Current Status • Startpoints for Web Services • Deployment within nested containers • Facades for exposing EJBs as Web Services • Performance tests

  31. Current Implementation

  32. Current Work • Automated deployment in nested containers • Definition of containers for deployment-time configuration • Using containers to provide minimal functionality of OGSA Grid Services • Startpoints for EJBs, Grid Services

  33. Experimentation • Our experiments have shown that nesting in our containers is not costly compared to method invocation and nested inner classes • The cost of calling EJBs via the Web Service façade comes mostly from the use of SOAP, and the consequential requirement for conversion to/from objects

  34. Architectural Issues • Notification service • Grid component model

  35. Architectural Issues • Notification service • Grid component model • Service directory

  36. Service Directory Views • Multiple service directories will co-exist (IBM, Microsoft, EBI, local institutions) • Need to attach metadata to service directory entries • Metadata is personal to the scientist: trust, perceived QoS, ontological description • Need for a mechanism to allow scientists to add their metadata and to make it available to other users as a “regular service directory”.
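
The idea of a view can be sketched as a wrapper that adds a scientist's own metadata to entries from existing directories while offering the same lookup operation, so that it can be used as a regular service directory itself. The design was still in progress when this talk was given, so everything below is a hypothetical illustration: the classes `ServiceDirectory` and `DirectoryView`, the `find` and `annotate` methods, and the example endpoint are assumptions.

```python
class ServiceDirectory:
    """Hypothetical plain directory: maps a service name to an endpoint."""

    def __init__(self, entries):
        self._entries = dict(entries)

    def find(self, name):
        endpoint = self._entries.get(name)
        if endpoint is None:
            return None
        return {"name": name, "endpoint": endpoint}


class DirectoryView:
    """Hypothetical view: same find() interface, plus personal metadata."""

    def __init__(self, directories):
        self._directories = list(directories)  # searched in order
        self._metadata = {}                    # service name -> dict of annotations

    def annotate(self, name, **metadata):
        # Metadata such as trust or perceived QoS is kept in the view,
        # not written back to the underlying directories.
        self._metadata.setdefault(name, {}).update(metadata)

    def find(self, name):
        for directory in self._directories:
            entry = directory.find(name)
            if entry is not None:
                # Keep metadata from an underlying view, then add our own.
                merged = {**entry.get("metadata", {}), **self._metadata.get(name, {})}
                return {**entry, "metadata": merged}
        return None
```

Because a view exposes `find` like any directory, one scientist's view can be layered over another's, and each layer contributes its own annotations.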

  37. Views: status • Currently in design phase • Use cases in the process of being finalized • Preliminary specification of interfaces • More work is needed on policy languages • Design to be finalized by end of January • First prototype of core functionality 4 months later

  38. Overview • Bioinformatics background • myGrid facts • Service oriented architecture • Architectural issues • Notification service • Grid component model • Service directory • Conclusions

  39. Conclusions • More architectural issues being addressed • Security (GSI, RBAC), but where is the community going? • Fault tolerance

  40. Workflow enactment • WSFL compatible enactment engine • Support for fault tolerance, checkpointing, migration • Editor
