Preserving Cloud Information
Bruce R. Barkstrom & John J. Bates, NCDC
Outline
• Fundamental Preservation Commandments
• Questions
• Variability Quantification
• Error Analysis and Physics
• Costs
• What Can We Do Now?
Four Commandments for Preserving Information
• Thou shalt not be forced to preserve information before it is ready
• Thou shalt not lose information – if possible
• Thou shalt not cost more than necessary
• Thou must make data accessible and valuable
  • To current users
  • To future users
When Is Data Ready for Preservation?
• When we have a good model of the underlying “natural variability” and “expected climate change” of the fields being measured
  • Not just mean and standard deviation – current applications need a description of extreme events
  • Need regional time variations
• When we have a physical basis for estimating errors and their impact on climate change detectability
  • Need more than just measurement statistics
  • Must include the probability distribution of possible biases
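To illustrate why measurement statistics alone understate the error in a detected trend, here is a minimal Monte Carlo sketch. It is not taken from the presentation: the 20-year record length, the unit noise level, the Gaussian distribution of an unknown linear calibration drift, and the names `ls_slope` and `trend_spread` are all assumptions chosen for illustration.

```python
import random
import statistics

def ls_slope(t, y):
    """Ordinary least-squares slope of y against t."""
    t_mean = statistics.fmean(t)
    y_mean = statistics.fmean(y)
    num = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    den = sum((ti - t_mean) ** 2 for ti in t)
    return num / den

def trend_spread(n_years, noise_sd, drift_sd, n_trials, seed):
    """Standard deviation of fitted trends when the true trend is zero.

    Each trial draws one unknown calibration drift (hypothetical bias model:
    Gaussian with standard deviation drift_sd, in units per year) and adds
    independent measurement noise to every yearly value.
    """
    rng = random.Random(seed)
    t = list(range(n_years))
    slopes = []
    for _ in range(n_trials):
        drift = rng.gauss(0.0, drift_sd)
        y = [drift * ti + rng.gauss(0.0, noise_sd) for ti in t]
        slopes.append(ls_slope(t, y))
    return statistics.pstdev(slopes)

# Spread of the estimated trend from measurement noise alone ...
noise_only = trend_spread(20, 1.0, 0.0, 2000, seed=0)
# ... and when an unknown drift with a 0.1-per-year spread is also possible.
with_bias = trend_spread(20, 1.0, 0.1, 2000, seed=0)

print(f"trend spread, noise only:      {noise_only:.4f}")
print(f"trend spread, noise plus bias: {with_bias:.4f}")
```

Under these assumptions the possible drift dominates the trend uncertainty, which is why a bias distribution has to accompany the measurement statistics before a trend can be called detectable.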
Quantification of Field Variability
• The “variability Turing test”:
  • Can you generate an ensemble of computer-generated fields with statistics that are indistinguishable from those of the real field?
• The “climate Turing test”:
  • Can you generate a model of “trends” whose statistics are indistinguishable from those of the expected climate changes?
Current State
• Measurement “Requirements” for Climate are usually stated as global values of means and standard deviations
  • Corresponding statistics can be generated by appropriate white noise
• Is this adequate?
  • Probably not – cloud variations are more complex than a global mean and simple latitudinal variations
• Can we come up with a common basis for stating variability across Earth science?
  • Regional?
  • Regional with moving systems?
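The point that white noise can satisfy a mean-and-standard-deviation requirement yet fail the “variability Turing test” can be sketched as follows. This is an illustrative assumption, not the authors' method: a first-order autoregressive series stands in for a real field with persistence, and the names `ar1_series`, `white_noise_like`, and `lag1_autocorr` are hypothetical.

```python
import random
import statistics

def ar1_series(n, phi, seed):
    """Stand-in for a 'real' field: x[k] = phi * x[k-1] + Gaussian noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x

def white_noise_like(series, seed):
    """White noise drawn with the same mean and standard deviation as series."""
    rng = random.Random(seed)
    mu = statistics.fmean(series)
    sigma = statistics.pstdev(series)
    return [rng.gauss(mu, sigma) for _ in series]

def lag1_autocorr(series):
    """Sample lag-1 autocorrelation, a simple measure of persistence."""
    mu = statistics.fmean(series)
    num = sum((a - mu) * (b - mu) for a, b in zip(series, series[1:]))
    den = sum((a - mu) ** 2 for a in series)
    return num / den

real = ar1_series(5000, 0.8, seed=1)
fake = white_noise_like(real, seed=2)

for name, s in (("real", real), ("fake", fake)):
    print(name,
          f"mean={statistics.fmean(s):.3f}",
          f"std={statistics.pstdev(s):.3f}",
          f"lag1={lag1_autocorr(s):.3f}")
```

The two series agree closely in mean and standard deviation, but only the stand-in field shows persistence from one step to the next, so any statistic sensitive to temporal structure tells them apart. A requirement stated only as a mean and a standard deviation cannot capture that difference.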
No Preservation Without Understandable Error Assessments
• Error assessments for climate data records are difficult
  • Need a physical basis for estimating uncertainties, not just internally consistent measurement statistics
  • Error assessments must be tied to algorithm code – data editing is as important as coefficients or outlines of algorithms
• Errors are not believable if the entire data production process is not publicly understandable
Current State
• Algorithm Theoretical Basis Documents do not necessarily represent the “as-built” algorithms with their data editing
• EOS data production systems are “overwhelmingly complex”
  • May need new documentation tools to provide understanding – 100,000 lines of code is not readable in a Sunday afternoon
• As Science Teams disperse, community knowledge will be lost unless we take steps to prevent it
  • May need to develop “data scholars”
Action Items
• Can this workshop produce an understandable, quantitative description of cloud variability – and of expected cloud property changes?
• Is it possible to develop a community-accepted standard checklist of errors for cloud properties?
Sample Error Checklist
• Are the “as-built” instrument drawings available?
• Are the ground calibration data available?
• Is there a computational math model of the instrument that includes all of the physics of the measurement?
  • How was the gain determined?
  • How was the spectral response determined?
  • How was the Point Spread Function measured?
  • …
Models for Preservation Funding
• The Cemetery Model:
  • Pay when the body is deposited; live off the interest
• The Advanced Cemetery Model:
  • Pay for the previous bodies, as well as the one you’re depositing; make sure to add new bodies (the Cemetery as Pyramid)
• The Cemetery as Theme Park:
  • Make the cemetery interesting to visit; charge admission
• The Public Broadcasting Approach:
  • Beg for support annually – and ask for volunteers
Actions That Can Reduce Preservation Costs and Risk
• Arrange a “Submission Agreement” (data will) with your designated archive
• Gather required original documents and make sure your archive can accept them
  • Drawings
  • Calibration plans and procedures
  • Science Team minutes
  • Source Code
• Arrange peer review of documentation
Summary
• Our data will not survive without careful thought to ensure
  • Physical insight into the measured variables and the measurement process
  • Adequate public access to the measurement process
  • Cost-effective archival
• Archives know less than you do about your data; if you don’t act to preserve that information, archives can’t preserve it!