SAM middleware components. Stefan Stonjek, University of Oxford. 7th GridPP Meeting, 2nd July 2003, Oxford. Outline: Introduction to SAM; Internals of a SAM station; Design of a SAM station; SAM-Grid Architecture; D0 reconstruction effort; Outlook and Summary.
SAM middleware components
Stefan Stonjek, University of Oxford
7th GridPP Meeting, 2nd July 2003, Oxford
Outline
• Introduction to SAM
• Internals of a SAM station
• Design of a SAM station
• SAM-Grid Architecture
• D0 reconstruction effort
• Outlook and Summary
Introduction to SAM
• SAM: Sequential data Access via Meta-data
• SAM is a distributed data handling system
• One SAM station per processing node, cluster, or site
• D0: RAL, IC, Manchester, Lancaster
• CDF: RAL, Oxford, Glasgow, ScotGrid, UCL, Liverpool
SAM – central vs. decentralized
• Each SAM station has a local file cache
• Files are transferred from station to station (peer to peer, no central storage)
• A central database keeps track of all files, metadata, users, etc. in the SAM system
• Not yet fully peer to peer: peer-to-peer transfers with a central database
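The split between a central catalogue and peer-to-peer transfers can be sketched as follows. This is a minimal illustration only: the class names `CentralCatalogue` and `Station`, their methods, and the station and file names are hypothetical and are not SAM's real interfaces.

```python
class CentralCatalogue:
    """Hypothetical stand-in for the central database: file -> caching stations."""

    def __init__(self):
        self._locations = {}

    def register(self, file_name, station_name):
        self._locations.setdefault(file_name, set()).add(station_name)

    def locate(self, file_name):
        # Sorted only to make the sketch deterministic.
        return sorted(self._locations.get(file_name, ()))


class Station:
    """Hypothetical station with a local cache; bulk data never passes through the catalogue."""

    def __init__(self, name, catalogue):
        self.name = name
        self.catalogue = catalogue
        self.cache = set()

    def store(self, file_name):
        self.cache.add(file_name)
        self.catalogue.register(file_name, self.name)

    def fetch(self, file_name, peers):
        """Return the name of the station that served the file, or None."""
        if file_name in self.cache:
            return self.name
        for source in self.catalogue.locate(file_name):
            if source != self.name and file_name in peers[source].cache:
                self.store(file_name)  # copy peer to peer, then register the new replica
                return source
        return None


catalogue = CentralCatalogue()
station_a = Station("station_a", catalogue)
station_b = Station("station_b", catalogue)
peers = {"station_a": station_a, "station_b": station_b}
station_a.store("raw_0001.dat")
served_by = station_b.fetch("raw_0001.dat", peers)
```

Only the lookup goes to the central component; the copy itself happens between the two stations, which is the "peer to peer with central database" arrangement named on the slide.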
The SAM Station
• Each station runs one station master process
• The station master communicates with the outside world
• Local SAM processes talk to the station master
• The station master talks to the central database
A SAM Analysis Project
• A new project is created for every new analysis job
• A project corresponds to a list of files
• A project-master process keeps track of the status of each file in the project
• A project can have multiple consumers
• Every file is delivered to only one consumer
• This allows easy processing on farms
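The rule that each file goes to exactly one consumer can be sketched as a simple hand-out queue. The `ProjectMaster` class, its method names, and the status strings below are assumptions made for illustration; they are not the actual SAM project-master interface.

```python
from collections import deque


class ProjectMaster:
    """Hypothetical project master: tracks per-file status and hands each file out once."""

    def __init__(self, files):
        self._pending = deque(files)
        self.status = {name: "pending" for name in files}
        self.owner = {}  # file name -> consumer that received it

    def next_file(self, consumer_id):
        """Give the next undelivered file to a consumer, or None when the list is exhausted."""
        if not self._pending:
            return None
        name = self._pending.popleft()
        self.status[name] = "delivered"
        self.owner[name] = consumer_id
        return name

    def release(self, name):
        """Mark a file as fully processed by its consumer."""
        self.status[name] = "consumed"


project = ProjectMaster(["f1", "f2", "f3"])
handed_out = [
    project.next_file("worker_1"),
    project.next_file("worker_2"),
    project.next_file("worker_1"),
    project.next_file("worker_2"),
]
project.release("f1")
```

Because consumers pull files rather than being assigned a fixed share, worker nodes on a farm can run at different speeds without any file being processed twice.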
SAM File transfers
• The station initiates file transfers
• The station keeps track of the needs of all projects and transfers files accordingly
• The stager can use different transfer protocols, depending on local and remote configuration
• The cache content of each station is recorded in the central database
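One way to picture a station combining the needs of all its projects is the sketch below. The function name `plan_transfers` and the policy of fetching the most-requested files first are assumptions for illustration; the slides do not say how the station orders its transfers.

```python
from collections import Counter


def plan_transfers(project_needs, cache):
    """Return files missing from the local cache, most-requested first.

    project_needs maps a project name to the files it wants.
    The ordering policy is an assumption, not SAM's documented behaviour.
    """
    demand = Counter()
    for files in project_needs.values():
        demand.update(set(files))  # count each project at most once per file
    missing = [name for name in demand if name not in cache]
    return sorted(missing, key=lambda name: (-demand[name], name))


needs = {
    "project_1": ["a", "b"],
    "project_2": ["b", "c"],
    "project_3": ["b", "c", "d"],
}
to_fetch = plan_transfers(needs, cache={"a"})
```

A file wanted by several projects is transferred once and then served from the local cache.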
SAM Station to database communication
• The station talks to a db-server (a CORBA-to-SQL translator)
• The database is Oracle
• The db-server is the only client of the database
• This reduces the load on the database
Station to Station Transfer
• File transfer is done station to station
• Several transfer protocols are possible
• The protocol is negotiated between stations
• Each station has its own cache
• Location information comes from the central database
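Protocol negotiation between two stations could look roughly like the sketch below: the requesting station walks its own preference list and takes the first protocol the remote side also supports. The function name and the protocol labels are illustrative assumptions; the slides do not specify the negotiation rule or which protocols are configured.

```python
def negotiate_protocol(local_preferences, remote_supported):
    """Return the first locally preferred protocol the remote station supports, or None."""
    remote = set(remote_supported)
    for protocol in local_preferences:
        if protocol in remote:
            return protocol
    return None


# Protocol names are examples only, chosen for illustration.
chosen = negotiate_protocol(["gridftp", "bbftp", "rcp"], ["rcp", "bbftp"])
no_match = negotiate_protocol(["gridftp"], ["rcp"])
```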
Grid Job and Information Management (JIM)
• Counterpart to the data handling system (SAM)
• Based on existing tools (Globus, Condor, etc.)
• Allows brokering based on information from the data handling system
SAM-Grid Architecture
Job Handling
• Condor for submission and brokering
• Decision making is based on resource information (general and job specific) and job information
• Decision making is interfaced with the data handling middleware: it uses more than static resource information and allows brokering to include data handling considerations
• Decision making is entirely within the Condor framework: strong promotion of standards, interoperability
• GRAM protocol to transfer the job to the execution site
• Authentication via GSI (Grid Security Infrastructure)
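Brokering that takes data handling into account can be illustrated with a small ranking function: sites with free slots are ordered by how many of the job's input files they already cache. This is a plain-Python sketch under assumed names (`rank_sites`, `free_slots`, `cached_files`); the real system performs its matching inside the Condor framework, which is not reproduced here.

```python
def rank_sites(job_files, sites):
    """Order usable sites by cached input files, then by free slots.

    sites maps a site name to {"free_slots": int, "cached_files": set}.
    Field names and the ranking rule are assumptions for illustration.
    """
    needed = set(job_files)
    scored = []
    for name, info in sites.items():
        if info["free_slots"] <= 0:
            continue  # a site without free capacity is not a candidate
        cached = len(needed & info["cached_files"])
        scored.append((cached, info["free_slots"], name))
    scored.sort(key=lambda entry: (-entry[0], -entry[1], entry[2]))
    return [name for _, _, name in scored]


sites = {
    "site_a": {"free_slots": 4, "cached_files": {"f1"}},
    "site_b": {"free_slots": 2, "cached_files": {"f1", "f2"}},
    "site_c": {"free_slots": 0, "cached_files": {"f1", "f2", "f3"}},
}
ranking = rank_sites(["f1", "f2", "f3"], sites)
```

Here `site_c` holds every input file but has no free slots, so the job goes to `site_b`, which needs the fewest additional transfers among the sites that can actually run it.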
Job Management
JIM Monitoring
• Information management
• Resource description for brokering
• Infrastructure for monitoring
• Monitors sites, resources and jobs
• Distributed knowledge
• Web-based information retrieval
SAM-Grid Logistics
Outlook: D0 Reprocessing Challenge
• D0 will reprocess all Run II data
• 1st Sep 2003 – 25th Nov 2003 (86 days), conference deadline
• Lion's share at D0 remote computing facilities, including:
• RAL, IC, Manchester, Lancaster
• Karlsruhe, Wuppertal, Lyon, Michigan, NIKHEF, etc.
• SAM to move data, runjob for site job management
• JIM for submission and monitoring
Outlook: D0 Reprocessing Challenge (2)
• 150 million events / 22.5 TByte input data
• Second level to second level
• 25 TByte output data
• SAM routinely handles this data volume
• Currently mainly on site at Fermilab
• First large-scale, large-volume "real" data challenge
• First HEP experiment to reprocess data in a distributed fashion
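The quoted volumes can be turned into per-event sizes and average rates with a few lines of arithmetic. The sketch assumes decimal units (1 TByte = 10^12 bytes) and that the work is spread evenly over the 86-day window given on the previous slide; both are assumptions, not statements from the talk.

```python
EVENTS = 150_000_000
INPUT_TBYTE = 22.5
OUTPUT_TBYTE = 25.0
DAYS = 86

BYTES_PER_TBYTE = 1e12  # decimal units assumed
SECONDS = DAYS * 86_400

# Average event sizes in kilobytes.
input_kb_per_event = INPUT_TBYTE * BYTES_PER_TBYTE / EVENTS / 1e3
output_kb_per_event = OUTPUT_TBYTE * BYTES_PER_TBYTE / EVENTS / 1e3

# Sustained averages if processing is spread evenly over the whole window.
events_per_second = EVENTS / SECONDS
input_mb_per_second = INPUT_TBYTE * BYTES_PER_TBYTE / SECONDS / 1e6

print(f"input:  {input_kb_per_event:.0f} kB/event")
print(f"output: {output_kb_per_event:.0f} kB/event")
print(f"rate:   {events_per_second:.1f} events/s, {input_mb_per_second:.2f} MB/s input")
```

Under these assumptions an input event averages about 150 kB, and the collaboration as a whole has to sustain roughly 20 events per second and about 3 MB/s of input across all participating sites.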
Summary
• SAM is a distributed data handling system
• It is used in production
• JIM allows jobs to be brokered based on job-specific information and dynamic resources
• GridPP plays a vital role in the development of SAM-Grid