Patterns for E-Research. Dave Berry, Research Manager E-Research within the University of Edinburgh, 2nd March 2005. E-Research. “The invention and application of computing methods to extend our capabilities in any research discipline”
Patterns for E-Research Dave Berry, Research Manager E-Research within the University of Edinburgh, 2nd March 2005
E-Research
“The invention and application of computing methods to extend our capabilities in any research discipline”
“Research in any discipline which benefits from and often depends on the use of advanced facilities and methods for computation, data curation, digital communication and visualisation”
Technology Growth
[Figure: Performance per Dollar Spent against Number of Years (0 to 5), with Doubling Time in months]
• Optical Fibre (bits per second): doubling time 9 months, Gilder’s Law (32X in 4 yrs)
• Data Storage (bits per sq. inch): doubling time 12 months, Storage Law (16X in 4 yrs)
• Chip capacity (# transistors): doubling time 18 months, Moore’s Law (5X in 4 yrs)
Source: “Triumph of Light”, Scientific American, George Stix, January 2001
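The link between a doubling time and a growth multiple is factor = 2^(period / doubling time). The following is a minimal Python sketch of that arithmetic, assuming steady exponential growth. The function name is illustrative, and the pairing of the 9, 12 and 18 month doubling times with fibre, storage and chips is an assumption read off the figure labels. Pure doubling gives values near, but not identical to, the rounded multiples quoted on the slide.

```python
def growth_factor(doubling_months: float, period_months: float = 48.0) -> float:
    """Growth multiple over period_months for a quantity that doubles every doubling_months."""
    return 2.0 ** (period_months / doubling_months)


# Assumed pairing of the figure's doubling times (in months) with its three curves.
doubling_times = {"optical fibre": 9, "data storage": 12, "chip capacity": 18}

for name, months in doubling_times.items():
    print(f"{name}: x{growth_factor(months):.1f} over 4 years")
```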
Pattern 1: Distributed Collaboration
• Groups in different sites working together
• Sharing knowledge and ideas
• Technologies:
  • Shared repositories
  • Wikis, SourceForge/NeSCForge, Forums, …
  • Videoconferencing
  • Computer Supported Cooperative Work (CSCW)
Technology: Access Grid
[Picture labels: Cameras, Microphones]
Pattern 2: Simulation & Modelling
• Large variety of topics, e.g.
  • Protein folding
  • Position of atoms in semiconductors
  • Human heart
  • Ecology of ice sheets
• Multiple scales
• Remote visualisation and control
Example: The TeraGyroid Scientific Experiment
High-density isosurface of the late-time configuration in a ternary amphiphilic fluid as simulated on a 64³ lattice by LB3D. Gyroid ordering coexists with defect-rich, sponge-like regions. The dynamical behaviour of such defect-rich systems can only be studied with very large scale simulations, in conjunction with high-performance visualisation and computational steering.
See http://www.realitygrid.org/workshop-2004/presentations/blake.ppt
Pattern 3: Data archives
• Data archives maintain data for widespread use, e.g.
  • UK Borders, Go-Geo, … (EDINA)
  • ArkDB (Roslin)
  • Mouse Atlas (HGU)
  • EMBL, UniProt, … (EBI)
  • Census, … (MIMAS)
• Client-server access
• Schemas defined centrally
  • Often subject to change…
  • … if they’re defined at all!
Infrastructure: Digital Curation Centre
[Diagram labels: communities of practice: users; curation organisations, e.g. DPC; community support & outreach; Collaborative Associates Network of Data Organisations; service definition & delivery; management & admin support; research collaborators; research; development co-ordination; testbeds & tools; Industry; standards bodies]
Pattern 4: Federated data
• Sites maintain their own data
  • Remote access to other sites
  • Control access to your site
• Integrated views
  • Community-defined schemas
  • Translation between schemas
• Distributed algorithms
  • Run jobs remotely
  • Distributed data mining
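The federated pattern can be illustrated with a small Python sketch: each site keeps its own rows in its own schema, controls who may query it, and translates results into a community schema so that a client can build an integrated view. Everything named here (the `Site` class, the field names, the users and the sample rows) is hypothetical and chosen only for illustration; a real federation would sit behind remote services rather than in-process objects.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


def from_site_a(row: Record) -> Record:
    """Translate site A's local schema into the (hypothetical) community schema."""
    return {"gene": row["gene_symbol"], "species": row["organism"]}


def from_site_b(row: Record) -> Record:
    """Translate site B's local schema into the same community schema."""
    return {"gene": row["name"].upper(), "species": row["taxon"]}


class Site:
    """A site maintains its own data and decides who may access it."""

    def __init__(self, rows: List[Record], translate: Callable[[Record], Record],
                 allowed_users: Iterable[str]) -> None:
        self._rows = rows
        self._translate = translate
        self._allowed = set(allowed_users)

    def query(self, user: str) -> List[Record]:
        if user not in self._allowed:
            return []  # access is controlled by the owning site
        return [self._translate(row) for row in self._rows]


def integrated_view(sites: Iterable[Site], user: str) -> List[Record]:
    """Union of every record the user may see, expressed in the community schema."""
    records: List[Record] = []
    for site in sites:
        records.extend(site.query(user))
    return records


site_a = Site([{"gene_symbol": "BRCA1", "organism": "human"}], from_site_a, {"alice", "bob"})
site_b = Site([{"name": "pax6", "taxon": "mouse"}], from_site_b, {"alice"})

alice_view = integrated_view([site_a, site_b], "alice")
bob_view = integrated_view([site_a, site_b], "bob")
print(alice_view)
print(bob_view)
```

Keeping the translation function with the site that owns the data means a local schema change only affects that site's translator, not every client.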
Pattern 5: Parameter Search
• Run the same algorithm on different data, e.g.
  • Finding local minima
  • Combinatorial search
• Allows the use of multiple machines, e.g.
  • A cluster
  • Multiple clusters
  • Desktop PCs
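A parameter search runs one function over many independent inputs, which is why it spreads so easily across machines. The sketch below is a minimal single-machine Python illustration: a thread pool stands in for a cluster or a pool of desktop PCs, and the objective function `cost` and the grid are hypothetical examples, not taken from any project named in these slides.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def cost(params):
    """Hypothetical objective: a simple bowl whose minimum lies at (1.0, -2.0)."""
    x, y = params
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


def parameter_search(objective, grid, max_workers=4):
    """Evaluate objective at every grid point in parallel; return (best_params, best_value)."""
    points = list(grid)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each evaluation is independent, so the work can be farmed out freely.
        values = list(pool.map(objective, points))
    best_value, best_params = min(zip(values, points))
    return best_params, best_value


xs = [i * 0.5 for i in range(-6, 7)]  # -3.0 to 3.0 in steps of 0.5
ys = [i * 0.5 for i in range(-6, 7)]
best_params, best_value = parameter_search(cost, product(xs, ys))
print(best_params, best_value)
```

Because no evaluation depends on another, swapping the thread pool for a batch scheduler or a volunteer-computing client changes where the work runs but not the structure of the search.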
Example: ClimatePrediction.net
See www.climateprediction.net
Composing Patterns
• Patterns that compose…
  • Complex problems require many inputs and many processes
  • Shared contributions compose indefinitely, accumulating knowledge
• … and how to compose them
  • A common infrastructure
    • Technologies, naming, schemas, …
  • Workflow languages
  • Portals and “problem-solving environments”
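One way to picture composition is a workflow built from steps that share a common data format, so that a finished workflow can itself be used as a step in a larger one. The Python sketch below assumes a deliberately simple convention (every step takes and returns a list of numbers); the step names `fetch`, `simulate` and `summarise` are hypothetical stand-ins for an archive lookup, a model run and an analysis, and this is not a real workflow language.

```python
from functools import reduce
from typing import Callable, List

Step = Callable[[List[float]], List[float]]


def workflow(*steps: Step) -> Step:
    """Compose steps left to right; the result is itself a Step, so workflows nest."""
    def run(data: List[float]) -> List[float]:
        return reduce(lambda acc, step: step(acc), steps, data)
    return run


def fetch(_: List[float]) -> List[float]:
    """Stand-in for reading from a data archive."""
    return [3.0, 1.0, 4.0, 1.0, 5.0]


def simulate(values: List[float]) -> List[float]:
    """Stand-in for a simulation or modelling step."""
    return [v * v for v in values]


def summarise(values: List[float]) -> List[float]:
    """Stand-in for an analysis step: reduce to the mean."""
    return [sum(values) / len(values)]


pipeline = workflow(fetch, simulate, summarise)
result = pipeline([])

# A composed workflow can be reused as a single step in a larger workflow.
rounded = workflow(pipeline, lambda values: [float(round(values[0]))])([])
print(result, rounded)
```

The shared list-of-numbers convention plays the role of the common infrastructure: steps written by different people compose only because they agree on it.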
Example: BRIDGES (BioInformatics)
[Diagram labels: SyntenyGrid Service; blast +; Authorisation]
Example: FireGrid (proposal)
[Diagram labels: 1000s of sensors & gateway processing; Emergency Responders; KBS and Planning; Maps, models, scenarios; Super-real-time simulation (HPC)]
[Picture labels: Mont Blanc, WTC, Kobe, Kings Cross, Piper Alpha]
Practical Challenges
• Technical
  • A variety of partial answers
  • Standardisation work is long and political
• Social
  • Sharing of resources means sharing YOUR resources
  • Contributor recognition and IPR
  • Defining common schemas and ontologies
  • Training, funding for software developers and sysadmins
• Responsibility of data publishers
  • Cost, dependability, trustworthiness, capability, flexibility, …
• Management of infrastructure
  • Operation – NGS (national), ACF (local)
  • Funding