Present and Future Networks: an HENP Perspective
Harvey B. Newman, Caltech
HENP WG Meeting, Internet2 Headquarters, Ann Arbor, October 26, 2001
http://l3www.cern.ch/~newman/HENPWG_Oct262001.ppt
Next Generation Networks for Experiments
• Major experiments require rapid access to event samples and subsets from massive data stores: up to ~500 Terabytes in 2001, Petabytes by 2002, ~100 PB by 2007, to ~1 Exabyte by ~2012.
  • Across an ensemble of networks of varying capability
• Network backbones are advancing rapidly to the 10 Gbps range: Gbps end-to-end requirements for data flows will follow
• Advanced integrated applications, such as Data Grids, rely on seamless “transparent” operation of our LANs and WANs
  • With reliable, quantifiable (monitored), high performance
  • They depend in turn on in-depth, widespread knowledge of expected throughput
• Networks are among the Grid’s basic building blocks
  • Where Grids interact by sharing common resources
  • To be treated explicitly, as an active part of the Grid design
• Grids are interactive; based on a variety of networked apps
  • Grid-enabled user interfaces; Collaboratories
LHC Computing Model: Data Grid Hierarchy (ca. 2005)
[Figure: tiered Data Grid diagram]
• CERN/Outside Resource Ratio ~1:2; Tier0/( Tier1)/( Tier2) ~1:1:1
• Experiment; Online System
• Tier 0 +1: Offline Farm, CERN Computer Ctr, ~25 TIPS; HPSS
• Tier 1: IN2P3 Center, INFN Center, RAL Center, FNAL Center; HPSS
• Tier 2: Tier2 Centers
• Tier 3: Institutes (~0.25 TIPS); physics data cache
• Tier 4: Workstations
• Link rates shown (in order of appearance): ~PByte/sec, ~100 MBytes/sec, ~2.5 Gbits/sec, ~2.5 Gbps, ~2.5 Gbps, 100 - 1000 Mbits/sec
• Physicists work on analysis “channels”. Each institute has ~10 physicists working on one or more channels.
Baseline BW for the US-CERN Transatlantic Link: TAN-WG (DOE+NSF) Plan: Reach OC12 Baseline in Spring 2002; then 2X Per Year
Transatlantic Net WG (HN, L. Price) Bandwidth Requirements [*]
[*] Installed BW. Maximum Link Occupancy 50% Assumed
The Network Challenge is Shared by Both Next- and Present-Generation Experiments
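The baseline plan (OC12 in Spring 2002, then 2X per year) and the 50% maximum link occupancy can be turned into a rough year-by-year projection. The sketch below is illustrative only: the function names are hypothetical, the OC12 rate is taken as the standard 622.08 Mbps, and Spring 2002 is simply treated as "year 2002" with whole-year steps.

```python
OC12_MBPS = 622.08  # standard SONET OC-12 line rate (assumed baseline value)


def installed_bw_mbps(year, base_year=2002, base_mbps=OC12_MBPS, growth_per_year=2.0):
    """Installed bandwidth if the baseline doubles every year after base_year."""
    return base_mbps * growth_per_year ** (year - base_year)


def usable_bw_mbps(installed_mbps, max_occupancy=0.5):
    """Bandwidth available to data flows under a maximum link occupancy."""
    return installed_mbps * max_occupancy


if __name__ == "__main__":
    for year in range(2002, 2008):
        installed = installed_bw_mbps(year)
        print(f"{year}: installed {installed:9.1f} Mbps, "
              f"usable at 50% occupancy {usable_bw_mbps(installed):9.1f} Mbps")
```

Under these assumptions the installed bandwidth passes 2.5 Gbps in 2004 and roughly 10 Gbps in 2006; the actual requirement figures are those of the TAN-WG table, which is not reproduced here.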
Total U.S. Internet Traffic
[Figure: U.S. Internet traffic, 1970-2010, log scale from 10 bps to 100 Pbps. Curves and annotations: ARPA & NSF data to 96; new measurements; growth of 2.8/Year and 4/Year; projected at 4/Year; voice crossover: August 2000; limit of same % GDP as voice.]
Source: Roberts et al., 2001
AMS-IX Internet Exchange Throughput: Accelerated Growth in Europe (NL)
[Figure: monthly traffic, 4X growth from 2000-2001; hourly traffic on 8/23/01, scale 0 to 3.0 Gbps]
GriPhyN iVDGL Map Circa 2002-2003: US, UK, Italy, France, Japan, Australia
[Figure: map; legend: Tier0/1 facility, Tier2 facility, Tier3 facility; 10 Gbps link, 2.5 Gbps link, 622 Mbps link, other link]
• International Virtual-Data Grid Laboratory
  • Conduct Data Grid tests “at scale”
  • Develop Common Grid infrastructure
  • National, international scale Data Grid tests, leading to managed ops (GGOC)
• Components
  • Tier1, Selected Tier2 and Tier3 Sites
  • Distributed Terascale Facility (DTF)
  • 0.6 - 10 Gbps networks: US, Europe, transoceanic
Possible New Partners
• Brazil T1
• Russia T1
• Pakistan T2
• China T2
• …
Abilene and Other Backbone Futures
• Abilene partnership with Qwest extended through 2006
• Backbone to be upgraded to 10 Gbps in three phases: complete by October 2003
  • Detailed design being completed now
  • GigaPoP upgrade to start in February 2002
• Capability for flexible provisioning in support of future experimentation in optical networking
  • In a multi-λ infrastructure
• Overall approach to the new technical design and business plan is for an incremental, non-disruptive transition
• Also: GEANT in Europe; Super-SINET in Japan; Advanced European national networks (DE, NL, etc.)
TEN-155 and GEANT: European A&R Networks 2001-2002
• Project: 2000 - 2004
• TEN-155: OC12 Core
• GEANT: from 9/01, 10 & 2.5 Gbps
European A&R Networks are Advancing Rapidly
National Research Networks in Japan
SuperSINET
• Start of operation January 2002
• Support for 5 important areas: HEP, Genetics, Nano Technology, Space/Astronomy, GRIDs
• Provides
  • 10 Gbps IP connection
  • Direct inter-site GbE links
  • Some connections to 10 GbE in JFY2002
HEPnet-J
• Will be re-constructed with MPLS-VPN in SuperSINET
IMnet
• Will be merged into SINET/SuperSINET
[Figure: SuperSINET topology; legend: WDM path, IP router, OXC; labels: Tokyo, Nagoya, Osaka, Tohoku U, KEK, NII Chiba, NIFS, Nagoya U, NIG, Osaka U, Kyoto U, NII Hitotsubashi, ICR Kyoto-U, ISAS, U Tokyo, IMS, NAO, Internet]
STARLIGHT: The Next Generation Optical STARTAP
StarLight, the Optical STAR TAP, is an advanced optical infrastructure and proving ground for network services optimized for high-performance applications. In partnership with CANARIE (Canada), SURFnet (Netherlands), and soon CERN.
• Started this Summer
• Existing Fiber: Ameritech, AT&T, Qwest; MFN, Teleglobe, Global Crossing and Others
• Main distinguishing features:
  • Neutral location (Northwestern University)
  • 40 racks for co-location
  • 1/10 Gigabit Ethernet based
  • Optical switches for advanced experiments
    • GMPLS, OBGP
  • 2*622 Mbps ATM connections to the STAR TAP
• Developed by EVL at UIC, iCAIR at NWU, ANL/MCS Div.
DataTAG Project
[Figure: network map; labels: New York, ABILENE, ESNET, MREN, STAR-TAP, STARLIGHT, GENEVA, GEANT, UK SuperJANET4, NL SURFnet, It GARR-B, Fr Renater]
• EU-Solicited Project. CERN, PPARC (UK), Amsterdam (NL), and INFN (IT)
• Main Aims:
  • Ensure maximum interoperability between US and EU Grid Projects
  • Transatlantic Testbed for advanced network research
• 2.5 Gbps wavelength-based US-CERN Link 7/2002 (Higher in 2003)
Daily, Weekly, Monthly and Yearly Statistics on 155 Mbps US-CERN Link
• BW Upgrades Quickly Followed by Upgraded Production Use
• 20 - 60 Mbps Used Routinely
Throughput Changes with Time
• Link, route upgrades, factors 3-16 in 12 months
• Improvements in steps at times of upgrades
• 8/01: 105 Mbps reached with 30 Streams: SLAC-IN2P3
• 9/1/01: 102 Mbps reached in One Stream: Caltech-CERN
See http://www-iepm.slac.stanford.edu/monitoring/bulk/
• Also see the Internet2 E2E Initiative: http://www.internet2.edu/e2e
Caltech to SLAC on CALREN2: A Shared Production OC12 Network
• SLAC: 4 CPU Sun; Caltech: 1 GHz PIII; GigE Interfaces
• Need Large Windows; Multiple streams help
• Bottleneck bandwidth ~320 Mbps; RTT 25 msec; Window > 1 MB needed for a single stream
• Results vary by a factor of up to 5 over time; sharing with campus traffic
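The "Window > 1 MB" figure follows from the bandwidth-delay product: a single TCP stream can have at most one window of data in flight per round trip. A minimal sketch of that arithmetic, using the slide's ~320 Mbps and 25 msec values (the function names are hypothetical):

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to fill the path."""
    return bandwidth_bps * rtt_s / 8.0


def max_single_stream_bps(window_bytes, rtt_s):
    """Upper bound on one stream's throughput for a given window size."""
    return window_bytes * 8.0 / rtt_s


if __name__ == "__main__":
    window = bdp_bytes(320e6, 0.025)
    print(f"Window needed for 320 Mbps at 25 ms RTT: {window / 1e6:.2f} MB")
    # A 64 KB window (the largest possible without TCP window scaling)
    # caps a single stream far below the bottleneck bandwidth.
    capped = max_single_stream_bps(64 * 1024, 0.025)
    print(f"Throughput cap with a 64 KB window: {capped / 1e6:.1f} Mbps")
```

With a 64 KB window the cap is about 21 Mbps on this path, which is why either large windows or many parallel streams are needed to approach the bottleneck rate.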
Max. Packet Loss Rates for Given Throughput [Mathis: BW < MSS/(RTT*Loss^0.5)]
• 1 Gbps LA-CERN Throughput Means Extremely Low Packet Loss
  • ~1E-8 with standard packet size
• According to the Equation, a single stream with 10 Gbps throughput requires a packet loss rate of 7 x 1E-11 with standard size packets
  • 1 packet lost per 5 hours!
• LARGE Windows
  • 2.5 Gbps Caltech-CERN: 53 Mbytes
• Effects of Packet Drop (Link Error) on a 10 Gbps Link: MDAI
  • Halve the Rate: to 5 Gbps
  • It will take ~4 Minutes for TCP to ramp back up to 10 Gbps
• Large Segment Sizes (Jumbo Frames) Could Help, Where Supported
• Motivation for exploring TCP Variants; Other Protocols
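The loss-rate figures can be approximately reproduced by inverting the Mathis bound. The sketch below rests on several assumptions that are not stated on the slide: a 1460-byte MSS as the "standard" packet size, an RTT of about 170 ms (inferred from the 53 Mbyte window quoted for 2.5 Gbps), and the constant sqrt(3/2) ≈ 1.22 from the Mathis et al. model, which the slide's simplified formula omits. Function names are hypothetical.

```python
import math

MATHIS_C = math.sqrt(3.0 / 2.0)  # ~1.22; model constant, assumed (not on the slide)
MSS_BYTES = 1460                 # assumed "standard" segment size
RTT_S = 0.170                    # assumed RTT, inferred from 53 MB at 2.5 Gbps


def max_loss_rate(throughput_bps, rtt_s=RTT_S, mss_bytes=MSS_BYTES, c=MATHIS_C):
    """Largest loss rate compatible with BW <= C * MSS / (RTT * sqrt(loss))."""
    return (c * mss_bytes * 8.0 / (rtt_s * throughput_bps)) ** 2


def hours_between_losses(throughput_bps, loss_rate, mss_bytes=MSS_BYTES):
    """Mean time between lost packets at a given rate and loss probability."""
    packets_per_s = throughput_bps / (mss_bytes * 8.0)
    return 1.0 / (loss_rate * packets_per_s) / 3600.0


def window_bytes(throughput_bps, rtt_s=RTT_S):
    """Window (bandwidth-delay product) needed to sustain the throughput."""
    return throughput_bps * rtt_s / 8.0


if __name__ == "__main__":
    for gbps in (1, 10):
        bw = gbps * 1e9
        loss = max_loss_rate(bw)
        print(f"{gbps:2d} Gbps: loss <= {loss:.1e}, "
              f"one loss per {hours_between_losses(bw, loss):.2f} h")
    print(f"Window for 2.5 Gbps: {window_bytes(2.5e9) / 1e6:.1f} MB")
```

Under these assumptions the 10 Gbps case gives a loss rate close to 7E-11 and roughly one lost packet every 4.6 hours, and the 2.5 Gbps window comes out near 53 MB. A different RTT or segment size shifts all three numbers, and a larger MSS (jumbo frames) relaxes the loss requirement quadratically.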
Key Network Issues & Challenges
Net Infrastructure Requirements for High Throughput
• Careful Router configuration; monitoring
• Enough Router “Horsepower” (CPUs, Buffer Space)
• Server and Client CPU, I/O and NIC throughput sufficient
• Packet Loss must be ~Zero (well below 0.1%)
  • I.e. No “Commodity” networks
• No Local infrastructure bottlenecks
  • Gigabit Ethernet “clear path” between selected host pairs
  • To 10 Gbps Ethernet by ~2003
• TCP/IP stack configuration and tuning is Absolutely Required
  • Large Windows
  • Multiple Streams
• End-to-end monitoring and tracking of performance
• Close collaboration with local and “regional” network engineering staffs (e.g. router and switch configuration).
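As one illustration of the "Large Windows" tuning item, an application can request larger socket buffers before connecting. This is a minimal sketch using the standard sockets API, not the configuration used in the measurements; the helper names are hypothetical, and the operating system may clamp or adjust the requested size according to its own limits, so the granted value should always be read back.

```python
import socket


def open_tuned_socket(window_bytes):
    """Create a TCP socket and request send/receive buffers of window_bytes.

    Buffers are requested before connect(), because the TCP window scale
    option is negotiated during the connection handshake.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, window_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, window_bytes)
    return sock


def granted_buffers(sock):
    """Return the (send, receive) buffer sizes the kernel actually granted."""
    snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return snd, rcv


if __name__ == "__main__":
    requested = 1_000_000  # e.g. ~1 MB for a 320 Mbps, 25 ms path
    s = open_tuned_socket(requested)
    snd, rcv = granted_buffers(s)
    print(f"requested {requested} bytes; granted send={snd}, receive={rcv}")
    s.close()
```

If the granted sizes are well below the request, the system-wide buffer limits need to be raised by an administrator, which is part of the host tuning the slide calls for.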
Key Network Issues & Challenges
None of this scales from 0.08 Gbps to 10 Gbps
• New (expensive) hardware
• The last mile, and tenth-mile problem
• Firewall performance; security issues
Concerns
• The “Wizard Gap” (ref: Matt Mathis; Jason Lee)
• RFC2914 and the Network Police; “Clever” Firewalls
• Net Infrastructure providers (Local, regional, national, int’l) who may or may not want (or feel able) to accommodate HENP “bleeding edge” users
• New TCP/IP developments (or TCP alternatives) are required for multiuser Gbps links [UDP/RTP ?]
Internet2 HENP WG [*]
• To help ensure that the required
  • National and international network infrastructures (end-to-end)
  • Standardized tools and facilities for high performance and end-to-end monitoring and tracking, and
  • Collaborative systems
  are developed and deployed in a timely manner, and used effectively to meet the needs of the US LHC and other major HENP Programs, as well as the general needs of our scientific community.
• To carry out these developments in a way that is broadly applicable across many fields
• Forming an Internet2 WG as a suitable framework
[*] Co-Chairs: S. McKee (Michigan), H. Newman (Caltech); Sec’y J. Williams (Indiana); With thanks to Rob Gardner (Indiana)
http://www.usatlas.bnl.gov/computing/mgmt/lhccp/henpnet/
Network-Related Hard Problems
“Query Estimation”: Reliable Estimate of Performance
• Throughput monitoring, and also Modeling
• Source and Destination Host & TCP-stack Behavior
Policy Versus Technical Capability Intersection
• Strategies: (New Algorithms)
• Authentication, Authorization, Priorities and Quotas Across Sites
• Metrics of Performance
• Metrics of Conformance to Policy
• Key Role of Simulation (for Grids as a Whole): “Now Casting” ?
US CMS Remote Control Room for LHC
US CMS will use the CDF/KEK remote control room concept for Fermilab Run II as a starting point. However, we will (1) expand the scope to encompass a US-based physics group and US LHC accelerator tasks, and (2) extend the concept to a Global Collaboratory for realtime data acquisition + analysis.
Networks, Grids and HENP
• Next generation 10 Gbps network backbones are almost here: in the US, Europe and Japan
  • First stages arriving in 6-12 months
• Major International links at 2.5 - 10 Gbps in 0-12 months
• There are Problems to be addressed in other world regions
• Regional, last mile and network bottlenecks and quality are all on the critical path
• High (reliable) Grid performance across network means
  • End-to-end monitoring (including s/d host software)
  • Getting high performance toolkits in users’ hands
  • Working with Internet E2E, the HENP WG and DataTAG to get this done
• iVDGL as an Inter-Regional Effort, with a GGOC
  • Among the first to face and address these issues
Agent-Based Distributed System: JINI Prototype (Caltech/NUST)
[Figure: architecture diagram; labels: Station Server (x3), Lookup Service, Lookup Discovery Service, Service Listener, Registration, Remote Notification, Proxy Exchange]
• Includes “Station Servers” (static) that host mobile “Dynamic Services”
• Servers are interconnected dynamically to form a fabric in which mobile agents can travel with a payload of physics analysis tasks
• Prototype is highly flexible and robust against network outages
• Amenable to deployment on leading edge and future portable devices (WAP, iAppliances, etc.)
  • “The” system for the travelling physicist
• Studies with this prototype use the MONARC Simulator, and build on the SONN study
See http://home.cern.ch/clegrand/lia/
6800 Hosts; 36 (7 I2) Reflectors; Users in 56 Countries; Annual Growth 250%