Scientific Workflow Requirements

Scientific Workflow Requirements. Carole Goble, University of Manchester, UK; Bertram Ludaescher, SDSC, USA. Attendees included: Bob Mann, Anthony Mayer, Austin Tate, Bertram Ludäscher, Geoffrey Fox, Jeffrey Grethe, Matthew Shields, Mike Wilde, Simon Cox, Carole Goble, Antoon Goderis


Presentation Transcript


  1. Scientific Workflow Requirements • Carole Goble, University of Manchester, UK • Bertram Ludaescher, SDSC, USA

  2. Attendees included • Bob Mann • Anthony Mayer • Austin Tate • Bertram Ludäscher • Geoffrey Fox • Jeffrey Grethe • Matthew Shields • Mike Wilde • Simon Cox • Carole Goble • Antoon Goderis • Earl Ecklund • Alan Bundy • Albert Burger • Jessica Chen-Burger • And a bunch more whose names we didn’t get

  3. Scientific Workflow Requirements • characterise scientific workflows • identify their requirements • compare/contrast with business workflow requirements • Some science stakeholders: neuroscience, astronomy, engineering • Few business stakeholders

  4. A Scientist Writes • “Work in my problem solving environment so that I don’t need to change the way I work.”

  5. User facing • Reflect the modelling paradigm of the scientist. • Varies between experiments, disciplines • Which user would that be then? • Creators, users, auditors, validators (I know it's right if I see it, but I can't write it) • Biologists compared to bioinformaticians, and transitioning between • Different users, different environments • Appropriate levels of abstraction. • User models -> workflow models • Simple to use & intuitive creation, deployment, execution and debugging environments

  6. Supporting Scientific Practice • Incrementally exploratory, prototypical: TYPE A • Got the data, now get the Nature paper before the next guy • Large scale production: TYPE B • Got the idea, get the data for very many experiments, and even many teams, communities, blah blah • Migration from TYPE A to TYPE B. • Capture of TYPE A for later non-interactive replay in a parameterised fashion. • Workflow creation paradigms • by example, plagiarism, drag and drop • Provenance tracking
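
The "capture TYPE A for later non-interactive replay in a parameterised fashion" point, together with provenance tracking, can be sketched roughly as below. This is only an illustration, not anything shown at the session: the step registry, the `Recorder` class and the `replay` function are all hypothetical names, and the sketch assumes a strictly linear workflow in which each step's result feeds the next step.

```python
import json
from typing import Any, Callable, Dict, List

# Hypothetical step registry: maps a step name to a plain Python function.
STEPS: Dict[str, Callable[..., Any]] = {}

def step(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function as a named workflow step (illustrative only)."""
    STEPS[fn.__name__] = fn
    return fn

@step
def load(n: int) -> List[int]:
    # Stand-in for "got the data": produce n synthetic readings.
    return list(range(1, n + 1))

@step
def scale(data: List[int], factor: int) -> List[int]:
    return [x * factor for x in data]

@step
def total(data: List[int]) -> int:
    return sum(data)

class Recorder:
    """Logs each interactive call so the session can be replayed later."""

    def __init__(self) -> None:
        self.log: List[Dict[str, Any]] = []

    def run(self, name: str, data: Any = None, **params: Any) -> Any:
        fn = STEPS[name]
        result = fn(**params) if data is None else fn(data, **params)
        # Provenance record: which step ran, and with which parameters.
        self.log.append({"step": name, "params": params})
        return result

def replay(log: List[Dict[str, Any]], overrides: Dict[str, Dict[str, Any]] = None) -> Any:
    """Re-run a recorded session without interaction, overriding parameters per step."""
    overrides = overrides or {}
    data = None
    for entry in log:
        params = {**entry["params"], **overrides.get(entry["step"], {})}
        fn = STEPS[entry["step"]]
        data = fn(**params) if data is None else fn(data, **params)
    return data

# TYPE A: the scientist explores interactively, one step at a time.
rec = Recorder()
d = rec.run("load", n=3)
d = rec.run("scale", d, factor=2)
first = rec.run("total", d)

# The log is plain data, so it can be saved and shared as a provenance trail.
saved = json.dumps(rec.log)

# TYPE B: replay the captured session, unchanged or with new parameters.
again = replay(json.loads(saved))
bigger = replay(json.loads(saved), {"load": {"n": 4}, "scale": {"factor": 10}})
```

Keeping the log as plain JSON is a deliberate choice in this sketch: the same record serves as the replay script and as the provenance trail.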

  7. Cool tools, right tools • I love my VI editor • Diagramming tools, text tools • Works on all workflows, use which you like when you like. • Good tools! Easy tools! Friendly tools! For the domain user (which user?) not the computer scientist • Cat skinning • Multiple scripting language support • Multiple ways to write a workflow

  8. One size does not fit all

  9. Transparency and control • Looking under the hood and inside the box • observe, trace, compare, muse, fettle & fiddle. • What should be transparent? • Do users need to know what format data is in or just that it is an image? • Unveil at different levels of detail, through the wedding cakes, stacks • Opaque to some users some of the time, drillable by others some of the time • Role, authorisation, policy • Scientist knows best

  10. User interaction • Creation, Discovery, Enactment • Single User interaction with workflow execution • Choice between paths of execution in specific states • Parameter modification mid-run • Collaborative multi-user interaction in creation • Reusing workflows -> Modularisation • Reusing wfs with different parameters and datasets • Joining up wfs from different areas, different disciplines and across scales • E-science crosses disciplines!! • No support for “extreme team wf creation” • Collaborative multi-user interaction in execution?

  11. Legacy and Extensibility • Ingesting legacy and external applications & services • May not run on every platform, may need an emulator. • Heterogeneity – of types, platforms etc. • Include arbitrary services available within the user's domain or hacked up by the users. • Simon’s piece of Matlab hackery – dark matter services. • On the fly development and assimilation • Suspending the workflow, or prompting the user • For the prototypical exploratory workflows largely. • Massaging, lubrication, facilitating, gluing without programming! • Easy to extend to meet specific or unique requirements

  12. More on workflow sorts • Batch vs interactive • Dataflow vs control flow vs state driven • Incrementally exploratory prototypical vs large scale production (and migration from former to latter).

  13. Workflow lifecycles • Prototypical workflow development to production run • Different parts of the lifecycle might need different environments and policies • Different sorts of users will interact at different points in the lifecycle.

  14. Security, trust and validation • Guarantees • That a provisioned service is what it says it is and follows all notification mandates. • Models of soundness at different levels, well-behavedness • 500 lambs follow 10-15 shepherds (or wolves?) • Validate at the right time, not every time. • Confidence in someone else’s stuff • I can look at it to check it but I can’t write it.

  15. Business vs Scientific • It's all the same and it's all different • Use cases and scenarios needed. • Classify business and scientific workflow against Matthew’s Stack • Drivers • Science workflow driven by scientific questions, outcomes and vanity. • Business workflow driven by business processes & goals and $£€ • Granularity • Business languages for coarse grain of swf • Scientists hack at fine grain level

  16. Business vs Scientific • Individualism vs Corporations • Ratios -- more creators than users in science? • What is the Scientific Business Process?

  17. A techy writes • Formal underpinning in CS theory • What is the underlying formal theoretic model? What is the natural scripting language? • Dataflow is functional & parallel • Control flow is imperative & sequential? • SWF creation as programming. • What are the languages?
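
To make the dataflow versus control flow contrast concrete, here is a small sketch of the same computation written both ways. It is only one possible reading of the slide: the node table `NODES` and the functions `control_flow` and `dataflow` are hypothetical, and the dataflow version uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+) to derive an execution order from declared dependencies.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # Python 3.9+

def control_flow(x: int) -> int:
    # Imperative and sequential: the programmer fixes the order of the steps.
    a = x + 1
    b = x * 2
    return a + b

# Dataflow: each node is a pure function plus the names of its inputs.
# Nothing here says in which order "a" and "b" run.
NODES = {
    "a": (lambda x: x + 1, ["x"]),
    "b": (lambda x: x * 2, ["x"]),
    "out": (lambda a, b: a + b, ["a", "b"]),
}

def dataflow(x: int) -> int:
    values = {"x": x}
    # Map each node to the set of nodes it depends on.
    graph = {name: set(deps) for name, (_, deps) in NODES.items()}
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            ready = sorter.get_ready()
            # Nodes that become ready together do not depend on each other,
            # so they can be submitted to the pool side by side.
            futures = {
                name: pool.submit(NODES[name][0], *[values[d] for d in NODES[name][1]])
                for name in ready
                if name in NODES  # "x" is an input, not a computed node
            }
            for name in ready:
                if name in futures:
                    values[name] = futures[name].result()
                sorter.done(name)
    return values["out"]

sequential_result = control_flow(3)
dataflow_result = dataflow(3)
```

Both versions compute the same value. The difference is where the ordering lives: in the program text for control flow, in the dependency graph for dataflow. Threads are used here only to show that independent nodes may overlap; in CPython they would not speed up CPU-bound steps.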

  18. Next Steps • Write this up! • Harvest some business use cases from Forrester report style sources (and get Tony Hey to pay) • Collect scientific workflow examples • Develop matrices of system, functional and language requirements against these examples. • Er … that’s it!

  19. Fin
