Rethinking Network Control and Management

Presentation Transcript

  1. Rethinking Network Control and Management David A. Maltz dmaltz@microsoft.com

  2. Context for Network Control and Management • Many different network environments • Access, backbone networks • Data-center networks, enterprise/campus • Many different technologies • Longest-prefix routing, label switching, circuit switching • IP, Ethernet, MPLS, optical circuits • Outsourcing of responsibility into the network • Middle-boxes: firewalls, network monitoring, … • Many different policies • Routing, reachability, transit, traffic engineering, robustness

  3. ATT/CMU Study of 31 Production Networks • Provider & enterprise networks (10-1200 routers) • Many different routing designs • Packet filters, multiple OSPF instances, multiple ASs [Figure: lines in config file (axis 0-2000) vs. router ID (axis 0-881)]

  4. Fundamental Problem: Wrong Abstractions • Management Plane • Figure out what is happening in network • Decide how to change it • Control Plane • Multiple routing processes on each router • Each router with different configuration program • Huge number of control knobs: metrics, ACLs, policy • Data Plane • Distributed routers • Forwarding, filtering, queueing • Based on FIB or labels [Diagram labels: shell scripts, traffic eng, planning tools, databases, configs, SNMP, netflow, modems; OSPF, BGP, link metrics, routing policies; FIB, packet filters]

  5. Inside a Single Network • Management Plane • Figure out what is happening in network • Decide how to change it • Control Plane • Multiple routing processes on each router • Each router with different configuration program • Huge number of control knobs: metrics, ACLs, policy • Data Plane • Distributed routers • Forwarding, filtering, queueing • Based on FIB or labels • State everywhere! • Dynamic state in FIBs • Configured state in settings, policies, packet filters • Programmed state in magic constants, timers • Many dependencies between bits of state • State updated in uncoordinated, decentralized way! [Diagram labels: shell scripts, traffic eng, planning tools, databases, configs, SNMP, netflow, modems; OSPF, BGP, link metrics, routing policies; FIB, packet filters]

  6. Inside a Single Network • Management Plane • Figure out what is happening in network • Decide how to change it • Control Plane • Multiple routing processes on each router • Each router with different configuration program • Huge number of control knobs: metrics, ACLs, policy • Data Plane • Distributed routers • Forwarding, filtering, queueing • Based on FIB or labels • Logic everywhere! • Path computation built into routing protocols • Routing policy distributed across the routers • Packet filters placed by tools in the management plane • No way to arbitrate inconsistencies between logic • State everywhere! • Dynamic state in FIBs • Configured state in settings, policies, packet filters • Programmed state in magic constants, timers • Many dependencies between bits of state • State updated in uncoordinated, decentralized way! [Diagram labels: shell scripts, traffic eng, planning tools, databases, configs, SNMP, netflow, modems; OSPF, BGP, link metrics, routing policies; FIB, packet filters]

  7. Control Plane: The Key Leverage Point • Great Potential: control plane determines the behavior of the network • Reaction to events, reachability, services • Great Opportunities • Each network (administrative domain) has its own control plane • A radical clean-slate control plane can be deployed • Agnostic to user data format: IPv4/v6, ethernet, circuit • No changes to end-system software • Control plane is the nexus of network evolution • Changing the control plane logic can smooth transitions in network technologies and architectures

  8. An Alternative: The 4D Architecture • Key principles • Network-level objectives • Network-wide views • Direct control • Corollaries • Predictable behavior (including overload threshold) • Zero device-specific or manual configuration • Data plane support for network-wide view • Define objectives in terms of organizationally salient entities

  9. Good Abstractions Reduce Complexity • All decision-making logic lifted out of control plane • Eliminates duplicate logic in management plane • Dissemination plane provides robust communication to/from data plane switches [Diagram labels: Management Plane, Configs, Control Plane, Decision Plane, FIBs, ACLs, Dissemination, Data Plane]

  10. Overview of the 4D Architecture • Decision Plane: • All management logic implemented on centralized servers making all decisions • Decision Elements use views to compute data plane state that meets objectives, then directly write this state to routers [Diagram: Decision, Dissemination, Discovery, and Data planes, annotated with network-level objectives, network-wide views, and direct control]
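
The decision-plane loop on this slide (read a network-wide view, compute data plane state, write it to routers) can be sketched roughly as follows. This is a minimal illustration, not the prototype's logic: the function name `compute_fibs`, the adjacency-dict view format, and the hop-count metric are all assumptions made for the example.

```python
from collections import deque

def compute_fibs(view):
    """Compute per-router forwarding tables from a network-wide view.

    view: dict mapping router -> set of neighbor routers (hypothetical
    format; links are assumed to be symmetric).
    Returns dict router -> {destination: next_hop}, using hop-count
    shortest paths.
    """
    fibs = {router: {} for router in view}
    for dest in view:
        # Breadth-first search outward from the destination. The node from
        # which a router is first reached is that router's next hop toward
        # the destination.
        seen = {dest}
        queue = deque([dest])
        while queue:
            node = queue.popleft()
            for nbr in sorted(view[node]):  # sorted for deterministic ties
                if nbr not in seen:
                    seen.add(nbr)
                    fibs[nbr][dest] = node
                    queue.append(nbr)
    return fibs

# Hypothetical four-router square topology.
view = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A", "D"}, "D": {"B", "C"}}
fibs = compute_fibs(view)
for router in sorted(fibs):
    print(router, fibs[router])
```

In a 4D-style design the resulting tables would be pushed to the routers through the dissemination plane rather than being computed independently on each router.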

  11. Concerns and Challenges • Distributed Systems issues • How will communication between routers and DEs survive failures in the network? • Latency means DE’s view of network is behind reality. Will the control loop be stable? • What is the overhead to/from the DEs? • What happens in a network partition? • Networking issues • Does the 4D simplify control and management? • Can we create logic to meet multiple objectives?

  12. Evaluation of the 4D Prototype • Evaluated using Emulab (www.emulab.net) • Linux PCs used as routers (650-800 MHz) • Tested on 9 enterprise network topologies (10-100 routers each) [Figure: example network with 49 switches and 5 DEs]

  13. Performance of the 4D Prototype Trivial prototype has performance comparable to well-tuned production networks • Recovers from single link failure in < 300 ms • < 1 s response considered “excellent” • Faster forwarding reconvergence possible • Survives failure of master Decision Element • New DE takes control within 1 s • No disruption unless second fault occurs • Gracefully handles complete network partitions • Less than 1.5 s of outage

  14. Thanks!

  15. Future Work • Scalability • Evaluate over 1-10K switches, 10-100K routes • Networks with backbone-like propagation delays • Structuring decision logic • Arbitrate among multiple, potentially competing objectives • Unify control when some logic takes longer than others • Protocol improvements • Better dissemination and discovery planes • Deployment in today’s networks • Data center, enterprise, campus, backbone (RCP)

  16. Future Work • Expand relationships with security • Securing the infrastructure • Using 4D as mechanism for monitoring/quarantine • Formulate models that establish bounds of 4D • Scale, latency, stability, failure models, objectives • Generate evidence to support/refute principles

  17. Themes of Network Control & Management Holistic Design • Many different technologies – a few common problems • Find the right abstractions: exploit commonality Clean Slate • How much autonomy do routers/switches need? • New principles for controlling networks • Separate networking issues from distributed system issues Leverage Network Structure • Many different types of networks exist - each with different objectives and topologies

  18. Recent Publications • G. Xie, J. Zhan, D. A. Maltz, H. Zhang, A. Greenberg, G. Hjalmtysson, J. Rexford, “On Static Reachability Analysis of IP Networks,” IEEE INFOCOM 2005, Orlando, FL, March 2005. • J. Rexford, A. Greenberg, G. Hjalmtysson, D. A. Maltz, A. Myers, G. Xie, J. Zhan, H. Zhang, “Network-Wide Decision Making: Toward a Wafer-Thin Control Plane,” Proceedings of ACM HotNets-III, San Diego, CA, November 2004. • D. A. Maltz, J. Zhan, G. Xie, G. Hjalmtysson, A. Greenberg, H. Zhang, “Routing Design in Operational Networks: A Look from the Inside,” Proceedings of the 2004 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (ACM SIGCOMM 2004), Portland, Oregon, 2004. • D. A. Maltz, J. Zhan, G. Xie, H. Zhang, G. Hjalmtysson, A. Greenberg, J. Rexford, “Structure Preserving Anonymization of Router Configuration Data,” Proceedings of ACM/Usenix Internet Measurement Conference (IMC 2004), Sicily, Italy, 2004.

  19. A Clean-slate Design • What are the fundamental causes of network problems? • How to secure the network and protect the infrastructure? • What functionality needs to be distributed – what can be centralized? • How to reduce/simplify the software in networks? • What would a “RISC” router look like? • How to leverage technology trends? • CPU and link-speed growing faster than # of switches

  20. Three Principles for Network Control & Management • Network-level Objectives: • Express goals explicitly • Security policies, QoS, egress point selection • Do not bury goals in box-specific configuration [Diagram labels: reachability matrix, traffic engineering rules, management logic]

  21. Three Principles for Network Control & Management • Network-wide Views: • Design network to provide timely, accurate info • Topology, traffic, resource limitations • Give logic the inputs it needs [Diagram labels: reachability matrix, traffic engineering rules, management logic, read state info]

  22. Three Principles for Network Control & Management • Direct Control: • Allow logic to directly set forwarding state • FIB entries, packet filters, queuing parameters • The logic computes the desired network state, so let it implement that state [Diagram labels: reachability matrix, traffic engineering rules, management logic, read state info, write state]
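
One way to picture the step from a network-level objective to directly written state is to turn a reachability matrix into per-router packet filters. The sketch below is illustrative only: the matrix format, the `attachment` map, the choice to place deny rules at the router where the source subnet attaches, and the trailing permit-all rule are assumptions, not details given in the slides.

```python
def filters_from_matrix(matrix, attachment):
    """Derive per-router packet filters from a reachability matrix.

    matrix: {(src_subnet, dst_subnet): allowed} (hypothetical format).
    attachment: {subnet: router} naming the router each subnet attaches to.
    Returns {router: [(action, src, dst), ...]}.
    """
    rules = {}
    # Assumption: block forbidden pairs as close to the source as possible.
    for (src, dst), allowed in sorted(matrix.items()):
        if not allowed:
            rules.setdefault(attachment[src], []).append(("deny", src, dst))
    # Assumption: anything not explicitly denied is permitted.
    for router in set(attachment.values()):
        rules.setdefault(router, []).append(("permit", "any", "any"))
    return rules

# Hypothetical two-subnet example: 10.1/16 must not reach 10.2/16.
matrix = {
    ("10.1.0.0/16", "10.2.0.0/16"): False,
    ("10.2.0.0/16", "10.1.0.0/16"): True,
}
attachment = {"10.1.0.0/16": "R1", "10.2.0.0/16": "R2"}
rules = filters_from_matrix(matrix, attachment)
for router in sorted(rules):
    print(router, rules[router])
```

The point of the sketch is that the operator states the goal once, as a matrix, and the box-specific filter lists are generated from it.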

  23. Overview of the 4D Architecture • Dissemination Plane: • Provides a robust communication channel to each router – and robustness is the only goal! • May run over same links as user data, but logically separate and independently controlled [Same 4D plane diagram as slide 10]

  24. Overview of the 4D Architecture • Discovery Plane: • Each router discovers its own resources and its local environment • E.g., the identity of its immediate neighbors [Same 4D plane diagram as slide 10]
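
As a rough illustration of how per-router discovery could feed a network-wide view, the sketch below merges neighbor reports into a single adjacency map. The report format and the rule that a link counts only when both ends report each other are assumptions made for the example.

```python
def build_view(reports):
    """Merge per-router neighbor reports into a network-wide adjacency map.

    reports: {router: iterable of neighbor IDs it discovered} (hypothetical).
    A link is kept only if both endpoints report each other (an assumption).
    """
    view = {router: set() for router in reports}
    for router, neighbors in reports.items():
        for nbr in neighbors:
            if router in reports.get(nbr, ()):
                view[router].add(nbr)
    return view

# Router A hears C, but C has not (yet) reported A, so that link is left out.
reports = {"A": ["B", "C"], "B": ["A"], "C": []}
view = build_view(reports)
print(view)
```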

  25. Overview of the 4D Architecture • Data Plane: • Spatially distributed routers/switches • Can deploy with today’s technology • Looking at ways to unify forwarding paradigms across technologies [Same 4D plane diagram as slide 10]

  26. Fundamental Problem: Conflation of Issues • Ideal case: all routing information flooded to all routers inside network • Robustness achieved via flooding • Reality: routing information filtered and aggregated extensively • Route filtering used to implement security and resource policies • Route aggregation used to achieve scalability
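
The effect of filtering and aggregation on what a router actually learns can be shown with Python's standard `ipaddress` module. The prefixes and the blocked range below are made-up example values.

```python
import ipaddress

# Hypothetical routes known somewhere inside the network.
routes = [ipaddress.ip_network(p) for p in (
    "10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24",
    "192.168.5.0/24",
)]

# Route filtering: a policy that hides one address range from a neighbor.
blocked = ipaddress.ip_network("192.168.0.0/16")
filtered = [r for r in routes if not r.subnet_of(blocked)]

# Route aggregation: adjacent prefixes are merged into a covering prefix.
aggregated = list(ipaddress.collapse_addresses(filtered))

print([str(n) for n in aggregated])
```

After both steps the neighbor sees one covering prefix in place of the five original routes, which is why a failure of a single /24 may be invisible to it.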

  27. 4D Separates Distributed Computing Issues from Networking Issues • Distributed computing issues → protocols and network architecture • Overhead • Resiliency • Scalability • Networking issues → management logic • Traffic engineering and service provisioning • Egress point selection • Reachability control (VPNs) • Precomputation of backup paths

  28. 4D Can Leverage Network Structure • Decision plane logic can be specialized for structure of each physical network • Distributed protocols must be prepared for arbitrary topology graphs • 4D enables network logic specialized differently for access and for backbone • E.g., creating aggregation tree in access network • Advantages • Faster route computations • Retain flexibility to evolve network as needed • Support transition to 100x100 architecture

  29. The Feasibility of the 4D Architecture We designed and built a prototype of the 4D Architecture • 4D Architecture permits many designs – prototype is a single, simple design point • Decision plane • Contains logic to simultaneously compute routes and enforce reachability matrix • Multiple Decision Elements per network, using simple election protocol to pick master • Dissemination plane • Uses source routes to direct control messages • Extremely simple & robust • Quickly route around failed data links, even multiple failures
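
The two mechanisms named on this slide, master election among Decision Elements and source-routed control messages that avoid failed links, might look roughly like the sketch below. The slides do not give the election rule or the route computation, so the lowest-ID rule, the breadth-first search, and all names here are assumptions.

```python
from collections import deque

def elect_master(alive_des):
    """Pick a master among live Decision Elements.

    Assumption: the lowest ID wins. Returns None if no DE is alive.
    """
    return min(alive_des) if alive_des else None

def source_route(links, failed, src, dst):
    """Return a hop list from src to dst that avoids failed links, or None.

    links: iterable of (a, b) bidirectional links.
    failed: set of (a, b) links currently down (either orientation).
    """
    adj = {}
    for a, b in links:
        if (a, b) in failed or (b, a) in failed:
            continue
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk parent pointers back to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nbr in adj.get(node, []):
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return None

# Hypothetical topology: one DE attached to a triangle of routers.
links = [("DE1", "R1"), ("R1", "R2"), ("R1", "R3"), ("R3", "R2")]
normal = source_route(links, set(), "DE1", "R2")
rerouted = source_route(links, {("R1", "R2")}, "DE1", "R2")
master = elect_master({"DE2", "DE3"})
print(normal, rerouted, master)
```

Because the full path travels in the message, a sender that learns of a failed link only has to recompute its own route; no per-router routing state for control traffic needs to reconverge.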