This document outlines considerations, plans, and discussions on cataloguing data within the Belle Analysis Data Grid, focusing on the move away from an LDAP-based replica catalog. It notes the advantages LDAP offered (referrals and master/slave operation) and describes a proposed Replica Location Service (RLS) model that pairs a Logical File Name Directory with a meta-data repository to support data-oriented queries. Example queries include retrieving simulation data, checking job status, and identifying the tools in use. The aim is to make data easier for analysts and researchers to find and use.
Data Cataloguing
thoughts, plans, discussions for the Belle Analysis Data Grid
Lyle Winton
winton@physics.unimelb.edu.au
Replica Catalog
[Diagram: catalog tree with nodes labelled Catalog, Collection, Sub-collection, File, Location]
• Currently implemented using LDAP, central service
• Looks like development is moving away from this!
LDAP Advantages - Referrals
[Diagram: catalog trees with nodes labelled Catalog, Collection, Sub-collection, File, Location]
LDAP Advantages - Master/Slave
[Diagram: catalog trees with nodes labelled Catalog, Collection, Sub-collection, File, Location]
An RLS model for data cataloguing
[Diagram labels: Logical File Name Directory (Catalog, Collection, Sub-collection, File); File GUID (marked "1 1"); Replica Location Service (RLS); Physical File Name (PFN); Meta-Data Repository, with its connections marked "?"]
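The lookup chain named on this slide (logical file name, then file GUID, then physical file names) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `ReplicaCatalogue`, its methods, and the example names and URLs are hypothetical, it is not the Globus RLS API, and it assumes each logical file maps to exactly one GUID while a GUID may have several physical copies.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ReplicaCatalogue:
    """Toy in-memory stand-in for an LFN directory plus a replica location service.

    Hypothetical illustration only; not the real RLS interface.
    """
    lfn_to_guid: dict = field(default_factory=dict)   # logical name -> GUID (assumed one-to-one)
    guid_to_pfns: dict = field(default_factory=dict)  # GUID -> set of physical file names

    def register(self, lfn, pfn):
        """Record that a physical copy `pfn` exists for the logical file `lfn`."""
        if lfn not in self.lfn_to_guid:
            self.lfn_to_guid[lfn] = str(uuid.uuid4())
        guid = self.lfn_to_guid[lfn]
        self.guid_to_pfns.setdefault(guid, set()).add(pfn)
        return guid

    def locate(self, lfn):
        """Return every known physical location of a logical file, sorted."""
        guid = self.lfn_to_guid.get(lfn)
        return sorted(self.guid_to_pfns.get(guid, set()))


# Example with made-up names: one logical file replicated at two sites.
catalogue = ReplicaCatalogue()
catalogue.register("sim/signal-001", "gsiftp://site-a.example.org/store/signal-001")
catalogue.register("sim/signal-001", "gsiftp://site-b.example.org/store/signal-001")
print(catalogue.locate("sim/signal-001"))
```

Keeping the GUID between the logical and physical names means a file could be renamed or moved between collections without touching the replica entries, which is one plausible reason for the indirection shown in the diagram.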
Meta-Data Repository
[Diagram labels: Data, Simulation, Skim, Histogram]
Meta-Data Structure
[Diagram labels: High Level Name, Version]
Meta-Data Structure
• Data-oriented queries…
  • List the files resulting from job X.
  • Retrieve the list of all simulation data of X type of event.
  • How can file X be regenerated? (if lost or expired)
• Other queries that we can imagine…
  • What is the status of job X?
  • What analyses similar to X have been undertaken?
  • What tools are being used for X analysis?
  • Who else is doing analysis X or using tool Y?
  • What are the typical parameters used for tool X? And for analysis Y?
  • Search for data skims (filtered sets) that are supersets of my analysis criteria.
• Do we have enough information so that future apps & agents could propose data, tools, and parameters given a high level job/analysis description?
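To make a few of these queries concrete, here is a minimal sketch of a meta-data repository using Python's built-in `sqlite3` module. Everything in it is an assumption for illustration: the two-table layout (`job`, `file`), the column names, and the sample rows are hypothetical and are not the schema proposed in the talk. It only shows that recording which job produced each file, together with the tool, version, and parameters, is enough to answer the first group of questions.

```python
import sqlite3


def build_demo_repository():
    """Create an in-memory repository with a hypothetical schema and sample rows."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE job (
            job_id     TEXT PRIMARY KEY,
            tool       TEXT NOT NULL,
            version    TEXT NOT NULL,
            parameters TEXT NOT NULL,
            status     TEXT NOT NULL
        );
        CREATE TABLE file (
            lfn        TEXT PRIMARY KEY,
            kind       TEXT NOT NULL,   -- data, simulation, skim or histogram
            event_type TEXT,
            job_id     TEXT REFERENCES job(job_id)
        );
    """)
    db.executemany("INSERT INTO job VALUES (?, ?, ?, ?, ?)", [
        ("job-1", "generator", "1.0", "events=1000 type=signal", "done"),
        ("job-2", "skimmer", "2.3", "cut=loose", "running"),
    ])
    db.executemany("INSERT INTO file VALUES (?, ?, ?, ?)", [
        ("sim/signal-001", "simulation", "signal", "job-1"),
        ("sim/signal-002", "simulation", "signal", "job-1"),
        ("skim/loose-001", "skim", None, "job-2"),
    ])
    return db


def files_from_job(db, job_id):
    """List the files resulting from job X."""
    rows = db.execute("SELECT lfn FROM file WHERE job_id = ? ORDER BY lfn", (job_id,))
    return [row[0] for row in rows]


def simulation_files(db, event_type):
    """Retrieve all simulation data of a given type of event."""
    rows = db.execute(
        "SELECT lfn FROM file WHERE kind = 'simulation' AND event_type = ? ORDER BY lfn",
        (event_type,),
    )
    return [row[0] for row in rows]


def how_to_regenerate(db, lfn):
    """Return (tool, version, parameters) of the job that produced a file, or None."""
    return db.execute(
        "SELECT j.tool, j.version, j.parameters "
        "FROM file f JOIN job j ON j.job_id = f.job_id WHERE f.lfn = ?",
        (lfn,),
    ).fetchone()


def job_status(db, job_id):
    """What is the status of job X? Returns None for an unknown job."""
    row = db.execute("SELECT status FROM job WHERE job_id = ?", (job_id,)).fetchone()
    return row[0] if row else None


db = build_demo_repository()
print(files_from_job(db, "job-1"))
print(how_to_regenerate(db, "skim/loose-001"))
```

The more open-ended questions in the list (similar analyses, typical parameters, skims that are supersets of given criteria) would need richer structure than this, for example parameters stored as individual key/value rows rather than a single string.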