The Red Team
Gwen Jacobs, Ed Lazowska
What biologists want …
• Can I evaluate an experimental design?
• Can I store the results?
• Can I visualize the results?
• Can I reproduce the results?
• Can I make inquiries?
• Can I share and build upon data, tools, results?
Can I store the results?
• Data validation / quality control
• Partial data
• Errors in the data
• Flamingly false data
• Synonyms and homonyms
• Context in which the data was gathered
• Storing/retrieving combinatorial structures
• Shared repositories
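As a rough illustration of the "synonyms and homonyms" bullet, here is a minimal Python sketch of term normalization before storage. The lookup tables, the `normalize` function, and the `context` parameter are all hypothetical and invented for this example; a real repository would draw on a curated ontology rather than hard-coded dictionaries.

```python
# Toy synonym table (assumption): several spellings map to one canonical name.
SYNONYMS = {
    "p53": "TP53",
    "tp53": "TP53",
    "tumor protein p53": "TP53",
}

# Toy homonym table (assumption): one spelling has several meanings,
# and only the context in which the data was gathered can disambiguate it.
HOMONYMS = {
    "cat": {"enzyme": "catalase", "organism": "Felis catus"},
}


def normalize(term, context=None):
    """Return a canonical name for `term`, or raise if it is ambiguous."""
    key = term.strip().lower()
    if key in HOMONYMS:
        senses = HOMONYMS[key]
        if context not in senses:
            raise ValueError(
                f"ambiguous term {term!r}: context must be one of {sorted(senses)}"
            )
        return senses[context]
    # Unknown terms pass through unchanged so that partial data is not lost.
    return SYNONYMS.get(key, term.strip())
```

Refusing to guess on a homonym, instead of silently picking one sense, is one way to keep the gathering context attached to the data.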
Can I visualize the results?
• Multi-dimensional data visualization is the challenge
• Need time as a variable
Can I reproduce the results?
• Jill’s talk goes here
Can I make inquiries?
• Data mining
• Non-parametric statistics
• Content-based image retrieval
• Standards: yes or no?
Can I share data, tools, results?
• Ontologies / semantics
• Dealing with synonyms/homonyms
• Standards: yes or no?
  • Yes: Can’t we all just get along?
  • No: Standards impede innovation; what we need are technologies that allow ontologies to interoperate (schema mapping, etc.)
• (cont’d …)
Can I share data, tools, results? (cont’d)
• How to make the best algorithms known
• How to make tools that are usable by people other than the developer and that can interoperate
• Data integration / federation
• Searching the intergalactic knowledge base
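To make the "schema mapping" and "data integration / federation" bullets concrete, here is a minimal Python sketch that translates records from two differently named schemas into one shared schema. The lab schemas, field names, and the `to_shared` helper are hypothetical, invented only for illustration. They are not taken from any real repository or standard.

```python
# Hypothetical mappings: source field -> (shared field, converter function).
LAB_A_MAP = {
    "gene": ("gene_symbol", str.upper),
    "temp_c": ("temperature_k", lambda c: float(c) + 273.15),  # Celsius to kelvin
}
LAB_B_MAP = {
    "symbol": ("gene_symbol", str.upper),
    "temp_kelvin": ("temperature_k", float),
}


def to_shared(record, mapping):
    """Translate one record into the shared schema.

    Returns (shared, unmapped): fields with no mapping are kept aside
    rather than dropped, so nothing is silently lost during integration.
    """
    shared, unmapped = {}, {}
    for field, value in record.items():
        if field in mapping:
            target, convert = mapping[field]
            shared[target] = convert(value)
        else:
            unmapped[field] = value
    return shared, unmapped


# Usage: two labs describe the same measurement with different schemas.
a, extra_a = to_shared({"gene": "tp53", "temp_c": 25.0, "note": "rerun"}, LAB_A_MAP)
b, extra_b = to_shared({"symbol": "Tp53", "temp_kelvin": "298.15"}, LAB_B_MAP)
```

Under this approach each group keeps its own schema and only the mappings are shared, which is the "No" position on standards in the slide above.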
What can we do?
• Fundamentally change the structure of the biomedical enterprise
• Make computing explicit
• Improve the peer review of computational work
• Adequately fund the Roadmap
• Fund algorithm and tool development where there is a clear biological driver
• Create alternative funding models for hardening software
• New panels, new panelists
Define “challenge problems”
• “Here are 3 large databases, here are 3 tough questions, whoever’s first wins”
• Use your own tools
  • Tests tool capabilities
• Have someone else use your tools
  • Tests tool usability
Support training
• Hourglass model
  • Broad at the undergraduate level
  • Narrow and deep at the graduate level
  • Broadening again post-graduate
• Undergraduate
  • Less specialized
  • More concept-focused
  • CS students should have a serious minor (e.g., biology)
  • Bio students should have lots of computation (programming, data structures, algorithms, statistics, a smattering of databases and visualization)
Support creation of robust software by a non-R01 process
• The need for software and algorithm development must be explicitly recognized in R01s
• A separate mechanism is needed to fund the hardening of software tools that are of value to the community
• May also need to explicitly support algorithm and tool development (community infrastructure)
Focus on tools usable by others
• Figure out how to mandate reproducible research: openly publish
  • data
  • tools
  • papers
Dangling observations
• Need progress on simulation
  • Hierarchical / multi-level
  • Hybrid
• Computer scientists and biologists have mismatched goals
  • CS people seek a general solution
  • Biologists want a specific application addressed