Transparency, applications, and ab-stuff – effect on tools for e-science: it’s all about Informatics. June 21, 2010, IATUL 2010. Peter Fox (RPI and WHOI), pfox@cs.rpi.edu, Tetherless World Constellation.
Working premise
Scientists (actually ANYONE) should be able to access a global, distributed knowledge base of scientific data that:
• appears to be integrated
• appears to be locally available
But… data and information are obtained by multiple means (instruments, models, analysis) using various (often opaque) protocols, in differing vocabularies, using (sometimes unstated) assumptions, with inconsistent (or non-existent) metadata. They may be inconsistent, incomplete, evolving, and distributed, AND created in a form that facilitates generation, not use (except by accident).
And… there exist(ed) significant levels of semantic heterogeneity, large-scale data, complex data types, legacy systems, inflexible and unsustainable implementation technology…
Data has Lots of Audiences
[Figure labels: "More Strategic", "Tools", "Less Strategic", "SCIENTISTS!". From “Why EPO?”, a NASA internal report on science education, 2005]
So what about abduction?
• No, not the criminal meaning…
• It is a method of logical inference, introduced by Peirce, which comes prior to induction and deduction; the colloquial name for it is to have a "hunch".
• Abductive reasoning starts when an inquirer considers a set of seemingly unrelated facts, armed with an intuition that they are somehow connected.
• The term abduction is commonly presumed to mean the same thing as hypothesis; however, an abduction is actually the process of inference that produces a hypothesis as its end result.
Abductive Information System?
• What would this look like in application tools?
• If you accept that induction is fundamentally part of how an information system is developed, then how can the system allow for abduction before induction?
• Design factors? Architecture factors? Library factors? Cognitive factors?
Modern informatics enables a new scale-free** framework approach
• Use cases
• Requirements
• Stakeholders
• Distributed authority
• Access control
• Ontologies
• Maintaining identity
Marine habitat change
• Several disciplines: biology, geology, chemistry, oceanography
• Several applications: science, fishing, habitat change, climate and environmental change, data integration
• Complex inter-relations, questions
• Use case: What is the temperature and salinity of the water, and are these marine specimens usual or part of an ecosystem change?
[Image annotations: Rock · Scallop, shell fragment · Scallop, number, density · Flora or fauna? · What is this? · Scallop, size, shape, color, place · Dirt/mud; one person’s noise is another person’s signal]
Src: WHOI and the HabCam group
Multi-tiered interoperability used by
But back to reality
• Fragmentation
• Disconnection
• Encapsulation
… all are bad for … transparency
20080602 Fox VSTO et al.
What is the ecosystem?
• Just a few elements and they are scattered:
Accountability, Explanation, Justification, Verifiability, Proof, Trust, Transparency
Access Control Essential For Establishing Trust
• Licensing
• Intellectual property
• Security/defence
• Endangered species
• Sensitive data
• Full life cycle data, information and knowledge management and stewardship
Provenance
• Origin or source from which something comes, intention for use, who or what it was generated for, manner of manufacture, history of subsequent owners, sense of place and time of manufacture, production or discovery, documented in detail sufficient to allow reproducibility
• Knowledge provenance: enrich with semantics (especially the relations between previously isolated concepts, while retaining context) and semantically-aware tools
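The elements of this definition can be made concrete as a small record type. This is a minimal sketch for illustration only: the class name `ProvenanceRecord`, its field names, and the sample values are hypothetical and are not taken from any provenance standard or from the tools discussed in the talk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProvenanceRecord:
    """Hypothetical record mirroring the elements of the definition above."""
    source: str         # origin or source from which the item comes
    intended_use: str   # intention for use
    generated_for: str  # who or what it was generated for
    method: str         # manner of manufacture or production
    place: str          # place of manufacture, production or discovery
    time: str           # time of manufacture, production or discovery
    owners: List[str] = field(default_factory=list)  # history of subsequent owners

    def missing_fields(self) -> List[str]:
        """Return names of empty fields, as a rough completeness check."""
        return [name for name, value in vars(self).items() if not value]


# Made-up sample values, for illustration only.
record = ProvenanceRecord(
    source="towed camera survey",
    intended_use="",
    generated_for="habitat study",
    method="automated image capture",
    place="N. Atlantic",
    time="2010-06-21",
)
print(record.missing_fields())  # lists the fields still to be documented
```

A record with empty fields is not detailed enough to support reproducibility in the sense of the definition, which is what the completeness check is meant to flag.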
MODIS Terra & Aqua vs. AIRS Cloud Top Pressure
[Figure: correlation maps for Jan 1 – 16, 2008. Panels: AIRS vs. MODIS Terra; AIRS vs. MODIS Aqua; MODIS Aqua vs. MODIS Terra]
Impact: Findings using aerosol data apply to other geophysical parameters!
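A correlation map of this kind can be sketched as a Pearson correlation computed over time, independently at each grid cell. The sketch assumes both datasets are already on the same grid with the same time steps; the function names and the `Grid` alias are hypothetical, and this is not the processing used to produce the maps on the slide.

```python
from math import sqrt
from typing import List, Optional

Grid = List[List[float]]  # one 2-D field (rows x columns) for a single time step


def pearson(xs: List[float], ys: List[float]) -> Optional[float]:
    """Pearson correlation of two equal-length series; None if undefined."""
    n = len(xs)
    if n < 2 or n != len(ys):
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant series has no defined correlation
    return cov / sqrt(var_x * var_y)


def correlation_map(a: List[Grid], b: List[Grid]) -> List[List[Optional[float]]]:
    """Correlate datasets a and b over time at every grid cell."""
    rows, cols = len(a[0]), len(a[0][0])
    return [
        [pearson([g[r][c] for g in a], [g[r][c] for g in b]) for c in range(cols)]
        for r in range(rows)
    ]
```

Cells where either series is constant are returned as `None` rather than a number, so they can be masked out when the map is drawn.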
Semantic Advisor
Your Selected Options:
• Spatial Area: Longitude (-30, 150), Latitude (-10, 60)
• Parameters: A: MYD08_D3.005 Aerosol Optical Depth at 550 nm; B: MOD08_D3.005 Aerosol Optical Depth at 550 nm
• Temporal Range: Begin Date: Jan 01 2008; End Date: Jan 31 2008
• Visualization Function: Lat-Lon map, Time-averaged
About your selected parameters:
• Known Issues: The difference of EQCT and Day Time Node, modulated by data-day definition, caused the included overpass time difference, which makes the artifact difference.
• See sample images: MODIS Terra vs. MODIS Aqua AOD Correlation; Included Overpass Time Difference
[Continue process to display image] [Return to selection page]
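The advisory behaviour shown on this screen can be sketched as a lookup from a pair of selected parameters to their known issues. Everything structural here is hypothetical (the `KNOWN_ISSUES` table, the `advise` function, and the keying scheme); only the two product names and the gist of the issue text come from the slide, and the sketch does not describe how the actual tool is implemented.

```python
from typing import Dict, FrozenSet, List

# Hypothetical knowledge base: an unordered pair of products -> known issues.
KNOWN_ISSUES: Dict[FrozenSet[str], List[str]] = {
    frozenset({"MYD08_D3.005", "MOD08_D3.005"}): [
        "Overpass time difference between the two products can produce "
        "an artifact in the comparison.",
    ],
}


def advise(param_a: str, param_b: str) -> List[str]:
    """Return known issues for comparing two parameters, in either order."""
    return KNOWN_ISSUES.get(frozenset({param_a, param_b}), [])


# Show any warnings before the user continues to the visualization.
for issue in advise("MOD08_D3.005", "MYD08_D3.005"):
    print("Known issue:", issue)
```

Keying on a `frozenset` makes the lookup independent of which product the user picked as A and which as B.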
Tetherless World Constellation (tw.rpi.edu)
Themes
Hendler: • Future Web • Web Science • Policy • Social
Fox: • Xinformatics • Data Science • Semantic eScience • Data Frameworks
McGuinness: • Semantic Foundations • Knowledge Provenance • Ontology Engineering Environments • Inference, Trust
Multiple depts/schools/programs
~30 (Post-doc, Staff, Grad, Ugrad)
Partitioning
• Data Science
• Semantic eScience
• Xinformatics
Use cases
• Do you have any data online from Hutchins from award number OCE-0423418?
• I want to download (temperature, biological, ...) data in the following areas (N. Atlantic, bounding box, where the JGOFS survey was done, ...).
• What new data has been added to this repository since last year (and organize it by project)?
• Show me all the places where the surface temperature in the North Atlantic is 25 degrees during June.
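The first and third use cases can be sketched as simple filters over dataset metadata. The `Dataset` record, its fields, the helper functions, and the sample catalog entries are hypothetical placeholders, not the repository's actual schema or contents; only the investigator name and award number are taken from the use case text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class Dataset:
    """Hypothetical metadata record for one dataset in the repository."""
    title: str
    investigator: str
    award: str
    project: str
    added: date


def by_award(catalog: List[Dataset], investigator: str, award: str) -> List[Dataset]:
    """Use case 1: datasets from one investigator under one award number."""
    return [d for d in catalog if d.investigator == investigator and d.award == award]


def added_since(catalog: List[Dataset], cutoff: date) -> Dict[str, List[Dataset]]:
    """Use case 3: datasets added on or after cutoff, grouped by project."""
    grouped: Dict[str, List[Dataset]] = {}
    for d in catalog:
        if d.added >= cutoff:
            grouped.setdefault(d.project, []).append(d)
    return grouped


# Made-up catalog entries, for illustration only.
SAMPLE_CATALOG = [
    Dataset("example temperature profiles", "Hutchins", "OCE-0423418",
            "project-a", date(2009, 3, 1)),
    Dataset("example biological counts", "Hutchins", "OCE-0423418",
            "project-a", date(2010, 2, 15)),
    Dataset("example salinity series", "placeholder-investigator",
            "placeholder-award", "project-b", date(2010, 5, 20)),
]

for project, items in added_since(SAMPLE_CATALOG, date(2010, 1, 1)).items():
    print(project, [d.title for d in items])
```

The spatial and temperature use cases would need the data values themselves, not just metadata, so they are left out of this sketch.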
Quick prototype of use case 1
Current version
Current version