The Xrootd Proxy Service, developed at the Stanford Linear Accelerator Center, addresses the growing need for scalable and high-performance data access in high-energy physics experiments, such as the BaBar experiment. With over 500 physicists collaborating worldwide, the service facilitates access to vast data sets (300 TB/year) generated from particle collisions aimed at exploring fundamental questions in physics, like the mystery of missing antimatter. It combines secure, efficient file serving with advanced distribution architectures, ensuring seamless scalability and fault tolerance across numerous servers.
Xrootd Proxy Service
Andrew Hanushevsky, Heinz Stockinger
Stanford Linear Accelerator Center
SAG 2004, 20-September-04
http://xrootd.slac.stanford.edu
The BaBar Experiment
• Use big-bang energies to create B meson particles
• Look at collision decay products
• Answer the question "where did all the anti-matter go?"
• 500 physicists collaborating from >70 sites in 10 countries
  • USA, Canada, China, France, Germany, Italy, Norway, Russia, UK, Taiwan
• The experiment produces large quantities of data
  • 300 TBytes/year for 10 years
  • Most data stored as objects using the Root persistency framework
  • Some data stored in an Objectivity/DB database
  • Expected to double every year as detector luminosity increases
• Heavy computational load
  • 5,000 1-2 GHz CPUs spread over 35 sites world-wide
  • Work is distributed across the collaboration
BaBar is the Forerunner
• LHC at CERN
  • The Large Hadron Collider
  • Due to start in 2007
  • Will generate several orders of magnitude more data
  • Will require even more compute cycles
• Example: ATLAS
  • Probe the Higgs boson energy range
  • Explore the more exotic reaches of physics
The Data Access Need
• Scalable, high-performance access to data
  • Must scale to 100s, if not 1000s, of data servers
• Most data is read-only
  • Data is written only once
  • Versioned
• Secondary access to distributed data
  • As a backup strategy
Solution Fundamentals
• Extensible base server architecture
  • Allows for a high-performance implementation
• Rich but efficient server protocol
  • Combines file serving with P2P elements
  • Allows client hints for improved performance
• Administrative security
  • Implies a structured peer-to-peer framework
The Implementation
• High-performance file-based access
  • Fluidly scalable
  • Works well in single-server environments
  • Scales beyond 32,000 cooperative data servers
• Naively extensible
  • A requirement for this level of scaling
  • Servers can be added at any time without disruption
• Fully fault-tolerant
  • Servers can be removed at any time without disruption
• Flexible security
  • Allows use of almost any protocol
Entities & Relationships
[Diagram: clients on the data network contact xrootd redirectors, which steer them to xrootd data servers. On the control network, olbd managers (M) and olbd servers (S) exchange resource information and file locations; each redirector pairs an xrootd with an olbd manager, and each data server pairs an xrootd with an olbd server.]
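The redirection scheme above can be sketched in miniature. The Python toy below is an illustration only, with invented class and method names rather than the real xrootd/olbd interfaces: data servers advertise their files to a redirector's location table, and the redirector steers each client to a server holding the requested file (the real olbd also weighs server load when choosing).

```python
class Redirector:
    """Toy stand-in for an xrootd redirector with an olbd location table."""

    def __init__(self):
        # file path -> data servers that have advertised the file
        self.locations = {}

    def subscribe(self, server, paths):
        # a data server reports which files it can serve
        for path in paths:
            self.locations.setdefault(path, []).append(server)

    def open(self, path):
        # steer the client to a server; the real olbd load-balances
        servers = self.locations.get(path)
        if not servers:
            return ("error", "file not found in cluster")
        return ("redirect", servers[0])

rdr = Redirector()
rdr.subscribe("kan01", ["/store/run1.root"])
rdr.subscribe("kan02", ["/store/run1.root", "/store/run2.root"])

print(rdr.open("/store/run1.root"))  # ('redirect', 'kan01')
print(rdr.open("/store/run2.root"))  # ('redirect', 'kan02')
```

Note that adding a server is just another subscribe call, which mirrors the "servers can be added at any time without disruption" property above.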
Example: SLAC Configuration
[Diagram: client machines contact the redirectors bbr-rdr-a, bbr-rdr03, and bbr-rdr04, which steer them to the data servers kan01, kan02, kan03, kan04, …, kanxx.]
Data Growth & More Fault Tolerance
• BaBar data is replicated
  • Backup strategy
  • Processing strategy
• Some data is only available at one site
  • Use grid techniques to make the data accessible
  • But when things go wrong, we would still like access
  • The proxy solution
The 10,000 Foot View
[Diagram: collaborating sites SLAC (US), INFN (IT), FZK (DE), RAL (UK), and IN2P3 (FR), connected via the Internet.]
The Reality
• Sites fear hosting…
  • Distributed Denial of Service attacks
  • Massive illegal file sharing
• Only certain hosts are allowed to reach the outside
  • Rarely the batch worker machines
  • The machines that need remote data most
• The firewall issue
A Closer Look
[Diagram: SLAC's xrootd servers sit behind the SLAC firewall; IN2P3 and RAL each sit behind their own firewalls and are reached through the IN2P3proxy and RALproxy servers.]
Firewalls require proxy servers
Proxy Service
• Attempts to address competing goals
• Security
  • Deal with firewalls
• Scalability
  • Administrative
  • Configuration
• Performance
  • Ad hoc forwarding for near-zero wait time
  • Intelligent caching in the local domain
Proxy Implementation
• Uses the capabilities of olbd and xrootd
  • Simply an extension of local load balancing
• Implemented as a special file system type
  • Interfaces in the logical file system layer (ofs)
  • Functions in the physical file system layer (oss)
• Primary developer is Heinz Stockinger
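The layering just described can be sketched as a minimal Python analogue. All class and method names below are invented for illustration (the real ofs/oss are C++ interfaces inside xrootd): the logical layer delegates reads to the physical layer beneath it, and the proxy variant of the physical layer forwards local misses to a remote site.

```python
class LocalOss:
    """Toy physical file system layer (oss): serves only local files."""

    def __init__(self, files):
        self.files = files  # path -> bytes

    def read(self, path):
        if path in self.files:
            return self.files[path]
        raise FileNotFoundError(path)

class ProxyOss:
    """Proxy physical layer: forwards local misses to a remote site."""

    def __init__(self, local, remote_fetch):
        self.local = local
        self.remote_fetch = remote_fetch  # stands in for a cross-firewall request

    def read(self, path):
        try:
            return self.local.read(path)
        except FileNotFoundError:
            return self.remote_fetch(path)

class Ofs:
    """Toy logical file system layer (ofs): fronts whatever oss it is given."""

    def __init__(self, oss):
        self.oss = oss

    def open_read(self, path):
        return self.oss.read(path)

remote_site = {"/ral/data.root": b"remote bytes"}
fs = Ofs(ProxyOss(LocalOss({"/slac/local.root": b"local bytes"}),
                  remote_site.__getitem__))
print(fs.open_read("/ral/data.root"))  # b'remote bytes'
```

Because the proxy behavior lives entirely in the swapped-in physical layer, the logical layer (and the client) is unchanged, which is the point of implementing the proxy as "a special file system type."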
Proxy Interactions
[Diagram: (1) a client contacts the SLAC redirector red01; (2) the client is steered to the local proxy server proxy01; (3) the SLAC proxy olb queries the RAL proxy olb across the domain boundary; (4) the RAL proxy olb consults its local olb; (5) the request is served from one of the RAL data servers (data01-data04).]
Why This Arrangement?
• Minimizes cross-domain knowledge
• Necessary for scalability in all areas
  • Security
  • Configuration
  • Fault tolerance & recovery
Scalable Proxy Security
[Diagram: the SLAC proxy olbd and its data servers face the RAL proxy olbd and its data servers across a firewall.]
1. Authenticate & develop a session key
2. Distribute the session key to authenticated subscribers
3. Data servers can log into each other using the session key
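The three steps can be illustrated with a small Python sketch. This is a hedged analogy only: the actual xrootd security protocols and message formats differ, and every name below is invented. An HMAC over the server's identity stands in for the "login" proof that possession of the shared session key enables.

```python
import hashlib
import hmac
import os

# Step 1: the two proxy olbd managers authenticate each other (assumed
# done out of band here) and develop a shared session key.
session_key = os.urandom(32)

# Step 2: each manager distributes the key to its authenticated,
# subscribed data servers.
slac_servers = {"kan01": session_key, "kan02": session_key}
ral_servers = {"ral01": session_key}

def login_token(server, key):
    """Step 3: a server proves key possession via an HMAC over its name."""
    return hmac.new(key, server.encode(), hashlib.sha256).hexdigest()

def verify(server, token, key):
    # constant-time comparison to avoid leaking the token through timing
    return hmac.compare_digest(token, login_token(server, key))

tok = login_token("kan01", slac_servers["kan01"])
print(verify("kan01", tok, ral_servers["ral01"]))  # True: key is shared
```

The payoff is scalability: only the two managers ever run the expensive mutual authentication, while any pair of data servers can log into each other cheaply with the distributed key.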
Proxy Performance
• Introduces minimal latency overhead
  • Virtually undetectable on US/Europe links
  • Negligible on faster links
  • 2% slower on fast US/US links
  • 10% slower on a LAN
• Can be further improved
  • Parallel streams
  • Better window-size calculation
  • Asynchronous I/O
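The "parallel streams" improvement can be sketched generically: split a transfer into byte ranges, fetch the ranges concurrently, and reassemble them in order. The Python below is illustrative only, with invented names; a real client would issue ranged reads over the xrootd protocol rather than slice a local buffer.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_range(data, start, end):
    # stands in for one network stream reading bytes [start, end)
    return data[start:end]

def parallel_fetch(data, streams=4):
    """Fetch `data` as `streams` concurrent byte-range reads."""
    if not data:
        return b""
    size = len(data)
    chunk = -(-size // streams)  # ceiling division
    ranges = [(i, min(i + chunk, size)) for i in range(0, size, chunk)]
    with ThreadPoolExecutor(max_workers=streams) as pool:
        parts = pool.map(lambda r: fetch_range(data, *r), ranges)
    # parts come back in submission order, so reassembly is a simple join
    return b"".join(parts)

payload = bytes(range(256)) * 40
assert parallel_fetch(payload) == payload
```

On a high-latency WAN link, several streams in flight at once keep the pipe full even when any single TCP window is too small, which is why parallel streams and better window-size calculation are listed together above.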
Proxy Study Conclusion
• The Proxy Service integrates easily into xrootd
  • Largely due to the peer-to-peer architecture
• Provides enhanced service at minimal cost
  • Allows access to additional data sources
  • Increases fault tolerance
  • Covers for grid transfer mistakes
• Scalable in all aspects
  • Security, number of servers, administration
Overall Conclusion
• xrootd provides high-performance file access
  • Improves over afs, ams, nfs, etc.
  • Unique performance, usability, scalability, security, compatibility, and recoverability characteristics
  • Should scale to tens of thousands of clients
• Will be distributed as part of CERN's root package
• Open software, supported by
  • SLAC (server)
  • INFN-Padova (client)
  • CERN (security, packaging)