ESnet Network Requirements
ASCAC Networking Sub-committee Meeting, April 13, 2007
Eli Dart, ESnet Engineering Group, Lawrence Berkeley National Laboratory
Requirements from Instruments and Facilities
• This is the 'hardware infrastructure' of DOE science – its requirements can be summarized as follows:
  • Bandwidth: quantity of data produced, requirements for timely movement
  • Connectivity: geographic reach – location of instruments, facilities, and users, plus the network infrastructure involved (e.g. ESnet, Internet2, GEANT)
  • Services: guaranteed bandwidth, traffic isolation, etc.; IP multicast
• Data rates and volumes from facilities and instruments – bandwidth, connectivity, services
  • Large supercomputer centers (NERSC, NLCF)
  • Large-scale science instruments (e.g. LHC, RHIC)
  • Other computational and data resources (clusters, data archives, etc.)
• Some instruments have special characteristics that must be addressed (e.g. Fusion) – bandwidth, services
• Next generation of experiments and facilities, and upgrades to existing facilities – bandwidth, connectivity, services
  • Addition of facilities increases bandwidth requirements
  • Existing facilities generate more data as they are upgraded
  • Reach of collaboration expands over time
  • New capabilities require advanced services
Requirements from Examining the Process of Science (1)
• The geographic extent and the size of the user base of scientific collaboration are continuously expanding
  • DOE's US and international collaborators rely on ESnet to reach DOE facilities
  • DOE scientists rely on ESnet to reach non-DOE facilities nationally and internationally (e.g. LHC, ITER)
• In the general case, modern scientific collaboration assumes the existence of a robust, high-performance network infrastructure interconnecting collaborators with each other and with the instruments and facilities they use
  • Therefore, close collaboration with other networks is essential for end-to-end service deployment, diagnostic transparency, etc.
• Robustness and stability (network reliability) are critical
  • The large-scale investment in science facilities and experiments makes network failure unacceptable when the experiments depend on the network
  • Dependence on the network is the general case
Requirements from Examining the Process of Science (2)
• Science requires several advanced network services for different purposes
  • Predictable latency, quality-of-service guarantees
    • Remote real-time instrument control
    • Computational steering
    • Interactive visualization
  • Bandwidth guarantees and traffic isolation
    • Large data transfers (potentially using TCP-unfriendly protocols)
    • Network support for deadline scheduling of data transfers
• Science requires other services as well – for example:
  • Federated trust / Grid PKI for collaboration and middleware
    • Grid authentication credentials for DOE science (researchers, users, scientists, etc.)
    • Federation of international Grid PKIs
  • Collaboration services such as audio and video conferencing
Aggregation of Requirements from All Case Studies
• Analysis of diverse programs and facilities yields dramatic convergence on a well-defined set of requirements
• Reliability
  • Fusion – 1 minute of slack during an experiment (99.999%)
  • LHC – a small number of hours of outage per year (99.95+%)
  • SNS – limited instrument time makes outages unacceptable
  • Drives the requirement for redundancy, both in site connectivity and within ESnet
• Connectivity
  • Geographic reach equivalent to that of scientific collaboration
  • Multiple peerings to add reliability and bandwidth to interdomain connectivity
  • Critical both within the US and internationally
• Bandwidth
  • 10 Gbps site-to-site connectivity today
  • 100 Gbps backbone by 2010
  • Multiple 10 Gbps R&E peerings
  • Ability to easily deploy additional 10 Gbps lambdas and peerings
  • Per-lambda bandwidth of 40 Gbps or 100 Gbps should be available by 2010
• Bandwidth and service guarantees
  • All R&E networks must interoperate as one seamless fabric to enable end-to-end service deployment
  • Flexible-rate bandwidth guarantees
• Collaboration support (federated trust, PKI, audio/video conferencing, etc.)
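The reliability figures above translate directly into yearly downtime budgets, which is what makes redundancy non-negotiable. A minimal sketch of that arithmetic (the function name is illustrative, not from the slides):

```python
# Convert an availability percentage into an allowable downtime budget.
# 99.999% (the Fusion requirement) leaves only about 5 minutes per year;
# 99.95% (the LHC requirement) leaves a few hours per year.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes_per_year(availability_pct: float) -> float:
    """Maximum unavailable minutes per year at a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

print(round(downtime_minutes_per_year(99.999), 1))  # ~5.3 minutes/year
print(round(downtime_minutes_per_year(99.95), 1))   # ~262.8 minutes/year (~4.4 hours)
```

At these budgets, even a single unprotected fiber cut can consume a year's allowance, which is why the slide calls redundancy the only way to meet the requirement.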
ESnet Traffic Has Increased by 10X Every 47 Months, on Average, Since 1990
• Log plot of ESnet monthly accepted traffic (terabytes/month), January 1990 – June 2006, with a milestone at each factor of 10:
  • Aug. 1990 – 100 GBy/mo
  • Oct. 1993 – 1 TBy/mo (38 months later)
  • Jul. 1998 – 10 TBy/mo (57 months later)
  • Nov. 2001 – 100 TBy/mo (40 months later)
  • Apr. 2006 – 1 PBy/mo (53 months later)
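The "47 months" headline figure is just the average of the intervals between the successive factor-of-10 milestones on the plot, which can be checked directly:

```python
# Average time for ESnet monthly traffic to grow 10x, computed from the
# milestone dates on the log plot (each milestone is 10x the previous one:
# 100 GBy/mo, 1 TBy/mo, 10 TBy/mo, 100 TBy/mo, 1 PBy/mo).

milestones = [(1990, 8), (1993, 10), (1998, 7), (2001, 11), (2006, 4)]

def months_between(a, b):
    """Whole months between two (year, month) dates."""
    return (b[0] - a[0]) * 12 + (b[1] - a[1])

intervals = [months_between(a, b) for a, b in zip(milestones, milestones[1:])]
print(intervals)                        # [38, 57, 40, 53]
print(sum(intervals) / len(intervals))  # 47.0 months per 10x, on average
```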
Requirements from Network Utilization Observation
• In 4 years, we can expect a 10x increase in traffic over current levels, without the addition of production LHC traffic
  • Nominal average load on the busiest backbone links is greater than 1 Gbps today
  • In 4 years that figure will be over 10 Gbps if current trends continue
• Measurements of this kind are science-agnostic
  • It doesn't matter who the users are – the traffic load is increasing exponentially
• Bandwidth trends drive the requirement for a new network architecture
  • The new ESnet4 architecture is designed with these drivers in mind
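The 4-year extrapolation follows from the 10x-per-47-months trend applied to today's ~1 Gbps average on the busiest links. A rough sketch, under that assumption (the function is illustrative):

```python
# Project average backbone link load assuming the observed exponential
# trend of 10x growth every 47 months continues unchanged.

def projected_load_gbps(current_gbps: float, months_ahead: float,
                        months_per_10x: float = 47.0) -> float:
    """Load after months_ahead, given 10x growth every months_per_10x."""
    return current_gbps * 10 ** (months_ahead / months_per_10x)

print(round(projected_load_gbps(1.0, 48), 1))  # ~10.5 Gbps after 4 years
```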
Requirements from Traffic Flow Observations
• Most ESnet science traffic has a source or sink outside of ESnet
  • Drives the requirement for high-bandwidth peering
  • Reliability and bandwidth requirements demand that peering be redundant
  • Multiple 10 Gbps peerings today; must be able to add more flexibly and cost-effectively
  • Bandwidth and service guarantees must traverse R&E peerings
    • "Seamless fabric"
    • Collaboration with other R&E networks on a common framework is critical
• Large-scale science is becoming the dominant user of the network
  • Satisfying the demands of large-scale science traffic into the future will require a purpose-built, scalable architecture
  • Its traffic patterns differ from those of the commodity Internet
  • Since large-scale science will be the dominant user going forward, the network should be architected to serve large-scale science
Aggregation of Requirements from Network Observation
• Traffic load continues to increase exponentially
  • The 15-year trend indicates a 10x increase in the next 4 years
  • This means the backbone traffic load will exceed 10 Gbps within 4 years, requiring increased backbone bandwidth
  • Need a new architecture – ESnet4
• Large science flows typically cross network administrative boundaries, and are beginning to dominate
  • Requirements such as bandwidth capacity, reliability, etc. apply to peerings as well as to ESnet itself
  • Large-scale science is becoming the dominant network user
Required Network Services Suite for DOE Science
• We have collected requirements from diverse science programs, program offices, and network analysis – the following summarizes them:
• Reliability
  • 99.95% to 99.999% reliability
  • Redundancy is the only way to meet the reliability requirements
    • Redundancy within ESnet
    • Redundant peerings
    • Redundant site connections where needed
• Connectivity
  • Geographic reach equivalent to that of scientific collaboration
  • Multiple peerings to add reliability and bandwidth to interdomain connectivity
  • Critical both within the US and internationally
• Bandwidth
  • 10 Gbps site-to-site connectivity today
  • 100 Gbps backbone by 2010
  • Multiple 10+ Gbps R&E peerings
  • Ability to easily deploy additional lambdas and peerings
• Service guarantees
  • All R&E networks must interoperate as one seamless fabric to enable end-to-end service deployment
  • Guaranteed bandwidth, traffic isolation, quality of service
  • Flexible-rate bandwidth guarantees
• Collaboration support
  • Federated trust, PKI (Grid, middleware)
  • Audio and video conferencing
• Production ISP service