Gulf Breeze 1 June 11, 2003
What Is Metadata? • Metadata is data about data, like the information contained in a library’s card catalog • The Who, What, Where, When, Why, etc. • The short, informative form of documentation used to evaluate the usefulness of data • Metadata describe the content, quality, condition, and other characteristics of data
Why Is Metadata Important? • To provide information to process and interpret data received from an external source • Lack of knowledge about other organizations' data can lead to duplication of effort • A small amount of time invested at the beginning of a project may save money in the future • Metadata standards will increase the value of such data by facilitating data sharing through time and space
Metadata “20-Year Rule” “Will someone 20 years from now, not familiar with the data or how they were obtained, be able to find data sets of interest and then fully understand and use the data solely with the aid of the documentation archived with the data set?” • Committee on Geophysical Data, National Research Council, Solving the Global Change Puzzle, National Academy Press, 1991. NOTE: Improperly documented information is of sharply limited value, regardless of the probable high quality of the original research effort.
Metadata Types • Metadata can be divided into 3 types: • Descriptive – Describes the information object, its contents, context, etc. Examples include: object identifier, title, author, language, keywords, abstract, etc. • Administrative – Statistics about the technical creation and maintenance of the information object: availability (who has access), file format, file compression status, review status, accessibility (date made active / retired), etc. • Structural – Information about associations within or among information objects: related information objects (parent/child, subject matter), table of contents, index, etc.
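The three types can be pictured as separate groups of fields on a single record. The following is a minimal sketch, assuming hypothetical class names (`Descriptive`, `Administrative`, `Structural`, `MetadataRecord`); the field names come from the examples on the slide, and the sample values are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Descriptive:
    """Describes the information object: its contents, context, etc."""
    identifier: str
    title: str
    author: str
    keywords: List[str] = field(default_factory=list)


@dataclass
class Administrative:
    """Technical creation and maintenance details."""
    file_format: str
    compressed: bool
    review_status: str


@dataclass
class Structural:
    """Associations within or among information objects."""
    parent_id: Optional[str] = None
    child_ids: List[str] = field(default_factory=list)


@dataclass
class MetadataRecord:
    """Hypothetical container grouping the three metadata types."""
    descriptive: Descriptive
    administrative: Administrative
    structural: Structural


# Sample values are invented for illustration only.
record = MetadataRecord(
    Descriptive("obj-001", "Example water quality data set", "J. Doe", ["water", "estuary"]),
    Administrative(file_format="CSV", compressed=False, review_status="Pending"),
    Structural(parent_id="proj-010"),
)
```

Keeping the groups separate makes it easy to see which fields help a reader find the object (descriptive), manage it (administrative), or relate it to other objects (structural).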
Metadata Standards • CSDGM – Content Standard for Digital Geospatial Metadata (FGDC-STD-001-1998): Provides a common set of terminology and definitions for the documentation of digital geospatial data and establishes the following: • names of data elements and compound elements (groups of data elements) to be used for these purposes, • the definitions of these compound elements and data elements, and • information about the values that are to be provided for the data elements • The following extensions of CSDGM increase the utility of the standard: • Biological Data Profile (FGDC-STD-001.1-1999) • Metadata Profile for Shoreline Data (FGDC-STD-001.2-2001) • Remote Sensing Metadata (FGDC-STD-012-2002)
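To make the idea of data elements nested inside compound elements concrete, the sketch below builds a tiny record fragment with Python's standard `xml.etree.ElementTree`. The short tag names (`idinfo`, `citation`, `citeinfo`, `origin`, `pubdate`, `title`, `descript`, `abstract`) follow the short names commonly seen in CSDGM XML encodings, but treat them as an assumption to check against the standard itself. The helper name `build_fragment` and all values are invented, and the result is far from a complete or valid CSDGM record.

```python
import xml.etree.ElementTree as ET


def build_fragment(origin, pubdate, title, abstract):
    """Build a small, incomplete CSDGM-style fragment (illustration only)."""
    metadata = ET.Element("metadata")
    idinfo = ET.SubElement(metadata, "idinfo")  # compound element

    # Compound elements group other elements; data elements hold values.
    citation = ET.SubElement(idinfo, "citation")
    citeinfo = ET.SubElement(citation, "citeinfo")
    for tag, value in (("origin", origin), ("pubdate", pubdate), ("title", title)):
        ET.SubElement(citeinfo, tag).text = value

    descript = ET.SubElement(idinfo, "descript")
    ET.SubElement(descript, "abstract").text = abstract
    return metadata


# Invented sample values.
fragment = build_fragment(
    origin="Example Laboratory",
    pubdate="20030611",
    title="Example shoreline survey",
    abstract="Illustrative abstract text.",
)
xml_text = ET.tostring(fragment, encoding="unicode")
```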
FGDC The Federal Geographic Data Committee is a 19-member interagency committee, organized in 1990 under OMB (Office of Management and Budget) Circular A-16, that promotes the coordinated use, sharing, and dissemination of geospatial data on a national basis. It is composed of representatives from the Executive Office of the President, Cabinet-level agencies, and independent agencies. In cooperation with organizations from State, local, and tribal governments, the academic community, and the private sector, the FGDC is developing the National Spatial Data Infrastructure (NSDI): policies, standards, and procedures for organizations to cooperatively produce and share geographic data.
Z39.50 – ISO and ANSI/NISO Standard "Z39.50" refers to the International Standard, ISO 23950: "Information Retrieval (Z39.50): Application Service Definition and Protocol Specification", and to ANSI/NISO Z39.50. It is a standard that specifies a client/server-based protocol for searching and retrieving information from remote databases. It is currently implemented for searches over TCP/IP. Z39.50 Search, Present, Browse, Resource, and Access Facilities control • which fields are searchable • which styles of searching (e.g. free-text, proximity) will be used • what kinds of records are returned • how to • browse summary information • control various accounting issues • control access
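The split between the Search and Present facilities can be illustrated with a toy, in-memory model: a search creates a named result set on the server and reports a hit count, and a later present request retrieves records from that set by position. This is only a conceptual sketch. It does not speak the Z39.50 wire protocol, and the class `ToyServer`, its methods, and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyServer:
    """In-memory stand-in for a remote database; not a Z39.50 implementation."""
    records: List[Dict[str, str]]
    searchable_fields: frozenset = frozenset({"title", "keywords"})
    result_sets: Dict[str, List[int]] = field(default_factory=dict)

    def search(self, set_name: str, field_name: str, term: str) -> int:
        """Search: build a named result set and return the number of hits."""
        if field_name not in self.searchable_fields:
            raise ValueError(f"field not searchable: {field_name}")
        hits = [
            i for i, rec in enumerate(self.records)
            if term.lower() in rec.get(field_name, "").lower()
        ]
        self.result_sets[set_name] = hits
        return len(hits)

    def present(self, set_name: str, start: int, count: int) -> List[Dict[str, str]]:
        """Present: return `count` records from a result set, 1-based `start`."""
        positions = self.result_sets[set_name][start - 1:start - 1 + count]
        return [self.records[i] for i in positions]


# Invented sample records.
server = ToyServer(records=[
    {"title": "Estuary sediment survey", "keywords": "sediment"},
    {"title": "Coastal wetlands map", "keywords": "wetlands"},
    {"title": "Estuary nutrient model", "keywords": "nutrients"},
])
hit_count = server.search("default", "title", "estuary")
first_hit = server.present("default", start=1, count=1)
```

Separating the two steps lets a client learn how many records matched before deciding how many to retrieve and in what form.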
GIS Standards • GIS (Geographic Information System): GIS data and metadata are currently managed redundantly with duplicate repositories and uncoordinated development of software and administrative support. The EIMS team has proposed hardware and software options for coordinating and consolidating GIS data over the next 7 to 20 years (ORD GIS Data Mgmt v1.0.pdf): • EIMS/ESRI Integration • Compusult MetaManager Update with EIMS Search Integration • WME Takeover of GIS Searching and Display • Decentralized ESRI and Search of Metadata
GIS Vision • All of the proposed GIS options share the following characteristics: • Involve Commercial Off-the-Shelf (COTS) software • Allow for CSDGM file input (plain files) • Enable standard EIMS searches • Facilitate the potential integration of search functionality for consolidated GIS metadata
VISION: Geospatial One-Stop • Geospatial One-Stop will use the existing nodes for NSDI • EIMS with Z39.50 server will require no changes to comply with Geospatial One-Stop
EIMS (Meta)Data Standards • Use of Environmental Data Registry (EDR) and National Technical Information Service (NTIS) among other sources to constrain reference tables • Z39.50 access to Federal Geographic Data Committee (FGDC) materials • Z39.50 complies with Section 508 • Production of EXtensible Markup Language (XML) output • Proposed utilization of Ecological Metadata Language (EML) • Proposed coordination of Geographic Information System (GIS) data sets
Gulf Breeze 2 June 11, 2003
EIMS Overview • Information Objects: Data Sets, Databases, Documents, Meetings, Models, Multimedia, Projects, Spatial Data, Web Sites • Metadata (data about data)
What is EIMS? • A system to capture, store, manage, and distribute information about environmental resources collected, developed, and used by the Agency and its regional and state partners • A system that facilitates environmental assessment activities by providing metadata and access information for the information objects it describes • The EPA Federal Geographic Data Committee (FGDC) node, with a working Z39.50 server that enables it to serve valid FGDC records • A relational database that can enforce data integrity (e.g., ensuring that data entered for fields is of the appropriate type/format, preventing the deletion of data that would cause broken relationships) • Metadata in EIMS enable people to find and understand environmental resources
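The two integrity examples on this slide (rejecting values of the wrong type or format, and preventing deletions that would break relationships) can be demonstrated with any relational database. The sketch below uses Python's built-in `sqlite3` purely as a stand-in; the slides do not say which database engine EIMS uses, and the table and column names here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE entry (
    entry_id  INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    info_type TEXT NOT NULL CHECK (info_type IN ('Data Set', 'Document', 'Model'))
);
CREATE TABLE contact (
    contact_id INTEGER PRIMARY KEY,
    entry_id   INTEGER NOT NULL REFERENCES entry(entry_id),
    name       TEXT NOT NULL
);
""")
conn.execute("INSERT INTO entry VALUES (1, 'Example data set', 'Data Set')")
conn.execute("INSERT INTO contact VALUES (1, 1, 'J. Doe')")


def try_sql(sql):
    """Run a statement and report whether the database rejected it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


# A value outside the allowed list violates the CHECK constraint.
bad_type = try_sql("INSERT INTO entry VALUES (2, 'Bad entry', 'Spreadsheet')")
# Deleting an entry that a contact still references would break the relationship.
orphaning_delete = try_sql("DELETE FROM entry WHERE entry_id = 1")
```

Both statements are rejected by the database itself, so the rules hold no matter which application or bulk load submits the data.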
What does EIMS do? • Provide a managed directory of environmental data (Information Objects) housed within the EPA LAN • Cross-Silo (can search for records throughout the EPA, from multiple divisions and groups) • Multiple Information Types (Data Sets, Databases, Documents, Meetings, Models, Multimedia, Projects, Spatial Data, Web Sites) • Provide controlled access (registration of roles and enforcement of permissions) • Provide system management including support and back-ups for data stored directly within EIMS • Enable partners to build the inventory using interactive web forms • Employ web technology for search and retrieval of descriptive information (metadata) about environmental Information Objects • Potential enhancement to EIMS would enable search and retrieval of the data using EML (for Data Sets, Databases, some Documents, and Spatial Data) • Offer controlled Information Object access via web link, file download, and/or instructions for obtaining access via external system • Accept feeds from other systems (e.g., TIMS, SI) • Potential enhancement to EIMS would enable pushing feeds to external systems
EIMS Archival Functionality • Metadata Archival – EIMS primarily serves as a metadata repository, housing metadata about information objects (including links to the actual information object and/or instructions for accessing it) • Information Object (Data) Archival – Since EIMS allows users to upload files directly into its database (as large objects – BLOBs or CLOBs), users can store the information object files within EIMS • Potential for storing data (which has been uploaded as EML) so that users can upload reports from EIMS into their own tools for multi-source searching and handling • Potential for storing data translated into EML and browsed/searched via EaGLe portal
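The difference between metadata archival and information object archival can be sketched as two kinds of rows in one table: one that only points to data held elsewhere, and one that carries the uploaded file as a large object. This sketch again uses `sqlite3` as a stand-in for the real database; the table layout, column names, URL, and file contents are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE entry (
    entry_id   INTEGER PRIMARY KEY,
    title      TEXT NOT NULL,
    access_url TEXT,   -- link or instructions for data held elsewhere
    file_name  TEXT,
    file_body  BLOB    -- uploaded information object, stored in the database
)
""")

# Metadata archival: the record only points to the information object.
conn.execute(
    "INSERT INTO entry (entry_id, title, access_url) VALUES (?, ?, ?)",
    (1, "Externally hosted data set", "https://example.org/data"),
)

# Information object archival: the file itself is stored as a BLOB.
payload = b"site,value\nA,1.5\n"
conn.execute(
    "INSERT INTO entry (entry_id, title, file_name, file_body) VALUES (?, ?, ?, ?)",
    (2, "Uploaded data set", "sample.csv", payload),
)

stored = conn.execute("SELECT file_body FROM entry WHERE entry_id = 2").fetchone()[0]
external = conn.execute("SELECT file_body FROM entry WHERE entry_id = 1").fetchone()[0]
```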
Customized Metadata for Information Types • EIMS provides customized sets of metadata elements based upon the Information Type of the Information Object (data): • Data Sets • Databases • Documents • Meetings • Models • Multimedia • Projects • Spatial Data Sets • Web Sites
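One simple way to picture customized element sets is a lookup from Information Type to the extra metadata components it adds on top of a common core. The sketch below is hypothetical: the component names are borrowed from the metadata components listed elsewhere in this presentation, but which components apply to which type is an assumption made only for illustration.

```python
from typing import Dict, List

# Components assumed (for illustration) to apply to every Information Type.
COMMON_COMPONENTS: List[str] = ["Contacts", "Keywords", "Access Policy"]

# Hypothetical mapping; the real per-type element sets are not given in the slides.
TYPE_SPECIFIC_COMPONENTS: Dict[str, List[str]] = {
    "Data Sets": ["Data Elements", "Time Frame"],
    "Documents": ["Document Details", "Citations"],
    "Projects": ["Project Tracking", "Project Attributes"],
    "Spatial Data Sets": ["Geographic Area", "Data Elements"],
}


def components_for(info_type: str) -> List[str]:
    """Return the common components plus any extras for the given type."""
    return COMMON_COMPONENTS + TYPE_SPECIFIC_COMPONENTS.get(info_type, [])


document_components = components_for("Documents")
```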
EIMS Data Standards • EIMS is a relational database and therefore rejects data entry or updates that would impair data integrity • Interchange standards • Output in XML • Input in CSDGM, XML, and HTML compliant format • Accept bulk loads from relational data stores • Interactive web-based input • Protocols • Standard procedures for data input and output • User registration (access control)
Gulf Breeze 3 June 11, 2003
EIMS Users • Agency Staff • State and Regional Stakeholders • Other Federal Agencies • The public
EIMS Security • Access Control • Access to EIMS contents controlled according to assigned security and privileges (so not everyone can see everything) • Access controls exist for metadata records and for the information objects described by them • EIMS enforces access controls only for information objects uploaded into EIMS; the system cannot enforce access controls for information objects housed elsewhere within the Agency LAN • Partners determine what is distributed and to whom, consistent with Agency and Office policy • System Back-up • Off-site backup of data by National Technology Services Division (NTSD) • NTSD database administration and server maintenance
Who can search for EIMS records? • Anyone with access to the internet can search for records in EIMS • Records requiring confidentiality can be restricted to a subset of users • EPA Only – accessible to only EPA registered users • Group – accessible to only a specifically defined group of system users • Owner – accessible only by the designated owner of the EIMS record
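The restriction levels above amount to a small visibility check per record. The following is a minimal sketch of that check, not the EIMS implementation: the `User` and `Record` classes and the `can_view` function are hypothetical, "Public" is an assumed label for unrestricted records, and the sketch treats the levels as independent (it does not assume, for example, that an owner automatically sees a Group record).

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class User:
    name: str
    epa_registered: bool = False
    groups: Set[str] = field(default_factory=set)


@dataclass
class Record:
    title: str
    owner: str
    restriction: str = "Public"  # "Public", "EPA Only", "Group", or "Owner"
    group: Optional[str] = None  # used only when restriction == "Group"


def can_view(user: Optional[User], record: Record) -> bool:
    """Return True if the user may see the record; None means an anonymous visitor."""
    if record.restriction == "Public":
        return True
    if user is None:
        return False
    if record.restriction == "EPA Only":
        return user.epa_registered
    if record.restriction == "Group":
        return record.group in user.groups
    if record.restriction == "Owner":
        return user.name == record.owner
    raise ValueError(f"unknown restriction: {record.restriction}")


def visible_records(user, records):
    """Filter search results down to what this user is allowed to see."""
    return [r for r in records if can_view(user, r)]
```

Applying the filter to search results, rather than only to the record page, keeps restricted records from appearing in result lists at all.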
How do you search for a record in EIMS? • Via the EIMS web-based application • Simple Search • Advanced Search • Via proposed EaGLe Portal • Browse records based upon content groupings • Search
Search Via EIMS Application – Simple • Enter desired criteria and select the Search button
Search Via EIMS Application – Advanced • Enter desired criteria and select the Search button NOTE: The Advanced Search allows you to search by Entry ID; if you choose to do so, no other selection criteria are necessary
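The two search modes can be sketched as functions over a list of records: the simple search matches one piece of text, while the advanced search combines optional criteria and returns immediately when an Entry ID is supplied. The function names, record fields, and sample entries below are hypothetical and only illustrate the behavior described in the note.

```python
from typing import Dict, List, Optional

# Invented sample entries.
ENTRIES: List[Dict] = [
    {"entry_id": 101, "title": "Estuary sediment survey", "info_type": "Data Sets"},
    {"entry_id": 102, "title": "Estuary nutrient model", "info_type": "Models"},
    {"entry_id": 103, "title": "Coastal wetlands report", "info_type": "Documents"},
]


def simple_search(entries: List[Dict], text: str) -> List[Dict]:
    """Case-insensitive substring match on the title."""
    return [e for e in entries if text.lower() in e["title"].lower()]


def advanced_search(
    entries: List[Dict],
    entry_id: Optional[int] = None,
    title: Optional[str] = None,
    info_type: Optional[str] = None,
) -> List[Dict]:
    """Combine optional criteria; an Entry ID alone is enough to identify a record."""
    if entry_id is not None:
        # Entry ID identifies the record, so other criteria are ignored.
        return [e for e in entries if e["entry_id"] == entry_id]
    results = entries
    if title is not None:
        results = [e for e in results if title.lower() in e["title"].lower()]
    if info_type is not None:
        results = [e for e in results if e["info_type"] == info_type]
    return list(results)


by_id = advanced_search(ENTRIES, entry_id=102)
estuary_models = advanced_search(ENTRIES, title="estuary", info_type="Models")
```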
How do you view a record in EIMS? • Select desired record from search results to view the EIMS Metadata Report
EIMS Metadata Report • Links to each heading in report • Report section header
Who Should Create Metadata? • The Bench Scientists who are the owners of the data (Information Object) should create the metadata entry in EIMS • Stewardship of the metadata records by the source of the information • Need to know the scientific information in order to properly document it
How do you add a record to EIMS? • Log in to the database • Select the Create Directory Entry menu option • Populate the Metadata Entry Main Page • Populate as many of the other components as desired • The ORD encourages you to enter complete metadata to facilitate effective re-use of the actual Information Object NOTE: The record will not be available to the specified user base until a Metadata Librarian has reviewed and released it
Log in to the database • Log in as EPA-Registered or Self-Registered NOTE: EPA-Registered users will use their TSSMS ID and EIMS-specific password
Select the Create Directory Entry option NOTE: EIMS will display customized menu options based upon the application privileges of the user
Populate the other metadata components • Related Entries • Administrative Details • Access Policy • References • Acronyms • Quality Assurance • Citations • Project Tracking • Contacts • Project Attributes • Data Elements • Objectives • Document Details • Methods • Time Frame • Downloads / URL Links • Keywords • Geographic Area • GPRA
When is the metadata record released to the specified user base? • DEP sets Metadata Status to “Pending” • Record reviewed per EPA Partner’s Business Process • Metadata Librarian sets Metadata Status to “Reviewed”
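The release steps above behave like a small state machine in which each status change is reserved for one role. The sketch below models that idea under stated assumptions: "Draft" is a hypothetical starting status that the slides do not name, the role labels "DEP" and "Metadata Librarian" are taken from the slide as-is, and the function names are invented.

```python
# (current status, new status) -> role allowed to make the change.
# "Draft" is an assumed initial status; it does not appear in the slides.
TRANSITIONS = {
    ("Draft", "Pending"): "DEP",
    ("Pending", "Reviewed"): "Metadata Librarian",
}


def advance(status: str, new_status: str, role: str) -> str:
    """Return the new status if the role may make this change, else raise."""
    required_role = TRANSITIONS.get((status, new_status))
    if required_role is None:
        raise ValueError(f"no transition from {status!r} to {new_status!r}")
    if role != required_role:
        raise PermissionError(f"{role!r} may not set status to {new_status!r}")
    return new_status


def is_released(status: str) -> bool:
    """A record reaches its specified user base only once it is Reviewed."""
    return status == "Reviewed"


status = "Draft"
status = advance(status, "Pending", role="DEP")
pending_released = is_released(status)
status = advance(status, "Reviewed", role="Metadata Librarian")
final_released = is_released(status)
```

Encoding the allowed changes in one table makes it clear that a record cannot skip review and that only the librarian role can complete the release.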
EaGLe Prototype • Browse Metadata and Data • Use predefined groupings to browse entries • Search Metadata and Data • Use web form to enter selection criteria and search for matching records • View Metadata and Data • Select metadata report and/or actual data from search results • Enter Metadata and Data • Use web form(s) to enter metadata and data into EIMS (bulk loading also available) • Release Metadata and Data • Use web form to indicate that the entry should be made available to the designated user base
Browse Via Proposed EaGLe Portal • Links to pages for specified research program • Links to specified content in EIMS