Future developments: storage Wahid Bhimji
Xrootd testing
• Xrootd is a file access protocol used by HEP that offers both high-performance file access and failover / redirection. DPM support for it has recently improved.
• ECDF (and Glasgow) now use xrootd copying instead of DPM’s legacy protocol, rfio.
• We are also involved in testing the redirection aspects for ATLAS (“FAX”).
• http / WebDAV offers a more widely used alternative.
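The failover / redirection idea above can be sketched in a few lines. This is a toy model, not the real xrootd protocol: the endpoint names and the `try_open` callback are invented for illustration; a real client would speak to an xrootd redirector, which hands back the address of a replica server.

```python
# Toy sketch of failover / redirection (NOT the real xrootd protocol):
# try each replica endpoint in turn and fall back to the next one
# when an open fails. All endpoint names here are invented.

def open_with_failover(endpoints, try_open):
    """Return (endpoint, handle) for the first endpoint that opens
    successfully; raise IOError if every replica fails."""
    errors = {}
    for ep in endpoints:
        try:
            return ep, try_open(ep)
        except IOError as exc:
            errors[ep] = str(exc)  # remember the failure, move on
    raise IOError("all replicas failed: %s" % errors)

# Simulated backends: the first replica is down, the second works.
def fake_open(ep):
    if ep == "root://down.example//data/f.root":
        raise IOError("connection refused")
    return "handle-for-" + ep

ep, handle = open_with_failover(
    ["root://down.example//data/f.root",
     "root://up.example//data/f.root"],
    fake_open)
```

In the real setup the redirection happens server-side (the redirector points the client at a working replica), but the client-visible effect is the same: a failed replica does not fail the job.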
Federation traffic
• Modest levels now; will grow when in production.
• In fact, including local traffic, UK sites dominate.
• Oxford and ECDF switched to xrootd for local traffic.
Systematic FDR load tests in progress
• EU cloud results (slide from I. Vukotic).
• Absolute values not important (affected by CPU / hyper-threading etc., and by setup).
• The point is that remote read can perform well, but performance varies.
Other stuff
• Puppet: testing DPM modules for ECDF storage
• But: we don’t use it for WNs or anything else …
• S3 (with the Imperial Swift instance, not ECDF):
• DPM integration: some problems accessing Swift storage - new development version to test …
• Access of files from the cluster via ROOT - not done yet
• SRM: making ECDF a non-SRM site for ATLAS
• As part of the WLCG “Storage Interfaces” Working Group
• Stage-out, FTS3 copies, space reporting - all in progress
“HEPDOOP” – a proposal
• “Big data” – not a buzzword: plenty of industry activity
• HEP uses little of the same tooling
• HEPDOOP bridges the divide
1st Phase (1 year): technical review via demonstrators
• Workshops with interspersed development activities
• Use-case focused: deliver an ATLAS Higgs analysis with non-HEP tools
• Milestones:
• BigData Workshop, Imperial, 28th June
• CHEP2013 (poster + possible birds-of-a-feather session)
2nd Phase: possible ongoing activity providing a technical-level bridge between GridPP and the wider Big Data community:
• Continuing interoperability where there are common aims
• Delivering advanced data processing and management tools for HEP, wider academia, and industry
Initial development areas
• Typical HEP analysis flow: ntuple making → data filtering (skimming / slimming) → data mining (cuts, multivariate analyses) → statistical analysis → visualisation
• Principle: focus on ease of use and access for a wide community, not (just) performance
• Starting with skimming and mining: a Python (scikit) version of the H->bb analysis has been implemented
• Next step: map / reduce skimming code on a local Hadoop cluster (or cloud resources)
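The "map / reduce skimming" step above can be sketched in plain Python. This is a hedged illustration, not the project's actual code: the event fields (`n_jets`, `met`, `channel`) and the cut values are invented; the point is only the shape of a Hadoop-streaming-style mapper/reducer pair applied to a skim.

```python
# Sketch of map / reduce skimming: the mapper applies the skim cut
# event by event, the reducer counts what passed per channel.
# Event fields and cut values are invented for illustration.

from collections import defaultdict

def mapper(event):
    """Emit (channel, event) only for events passing the skim cut."""
    if event["n_jets"] >= 2 and event["met"] > 25.0:
        yield event["channel"], event

def reducer(pairs):
    """Count selected events per channel."""
    counts = defaultdict(int)
    for channel, _event in pairs:
        counts[channel] += 1
    return dict(counts)

events = [
    {"channel": "bb", "n_jets": 2, "met": 40.0},
    {"channel": "bb", "n_jets": 1, "met": 60.0},   # fails jet cut
    {"channel": "tt", "n_jets": 3, "met": 30.0},
    {"channel": "bb", "n_jets": 4, "met": 10.0},   # fails MET cut
]

# Locally this is just a chain of generators; on a Hadoop cluster the
# same mapper/reducer pair would run via streaming over HDFS splits.
selected = reducer(pair for ev in events for pair in mapper(ev))
```

Because the mapper is stateless per event, the same function can be moved unchanged from a laptop to Hadoop streaming or a cloud batch service, which is the portability the slide is after.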