
Data Center Networks



  1. Data Center Networks CS 401/601 Computer Network Systems Mehmet Gunes Slides modified from: Mohammad Alizadeh, Albert Greenberg, Changhoon Kim, Srinivasan Seshan

  2. What are Data Centers? • Large facilities with 10s of thousands of networked servers • Compute, storage, and networking working in concert • “Warehouse-Scale Computers” • Huge investment: ~$0.5 billion for a large datacenter

  3. Data Center Costs • Source: Greenberg, Hamilton, Maltz, Patel. “The Cost of a Cloud: Research Problems in Data Center Networks.” SIGCOMM CCR, 2009. • *3-yr amortization for servers, 15-yr for infrastructure; 5% cost of money

  4. Server Costs 30% utilization considered “good” in most data centers! • Uneven application fit • Each server has CPU, memory, disk: • most applications exhaust one resource, stranding the others • Uncertainty in demand • Demand for a new service can spike quickly • Risk management • Not having spare servers to meet demand brings failure just when success is at hand

  5. Goal: Agility – Any service, Any Server • Turn the servers into a single large fungible pool • Dynamically expand and contract service footprint as needed • Benefits • Lower cost (higher utilization) • Increase developer productivity • Achieve high performance and reliability

  6. Datacenter Networks • Provide the illusion of “One Big Switch” with 10,000s of ports • Interconnects compute and storage (disk, flash, …)

  7. Datacenter Traffic Growth • Today: Petabits/s of traffic inside one DC • More than the core of the Internet! • Source: “Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network”, SIGCOMM 2015.

  8. Latency is King • 1 user request → 1000s of messages over the DC network • Microseconds of latency matter • Even at the tail (e.g., 99.9th percentile) • Figure (omitted): a traditional application runs on a single machine with app logic next to its data structures (<< 1 µs latency); a large-scale web application (e.g., building a page that answers “Who does she know? What has she done?”) splits into an app tier and a data tier connected by the data center fabric (10 µs–1 ms latency) • Based on slide by John Ousterhout (Stanford)

  9. Datacenter Arms Race • Amazon, Google, Microsoft, Yahoo!, … race to build next-gen mega-datacenters • Industrial-scale Information Technology • 100,000+ servers • Located where land, water, fiber-optic connectivity, and cheap power are available

  10. Computers + Net + Storage + Power + Cooling

  11. DC Networks • L2 pros, cons? L3 pros, cons? • Figure (omitted): Internet → core routers (CR) → access routers (AR) at DC Layer 3, then Ethernet switches (S) at DC Layer 2 down to racks of app servers (A) • Key: CR = Core Router (L3), AR = Access Router (L3), S = Ethernet Switch (L2), A = Rack of app. servers • ~1,000 servers/pod == IP subnet • Reference: “Data Center: Load Balancing Data Center Services”, Cisco 2004

  12. Reminder: Layer 2 vs. Layer 3 • Ethernet switching (layer 2) • Fixed IP addresses and auto-configuration (plug & play) • Seamless mobility, migration, and failover • Broadcast limits scale (ARP) • No multipath (Spanning Tree Protocol) • IP routing (layer 3) • Scalability through hierarchical addressing • Multipath routing through equal-cost multipath • Can’t migrate w/o changing IP address • Complex configuration
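The equal-cost multipath point above is easy to picture as hash-based path selection. Below is a minimal, illustrative sketch: the uplink names are made up, and real switches hash the flow 5-tuple in hardware rather than with SHA-256.

```python
import hashlib

def ecmp_next_hop(src_ip, dst_ip, src_port, dst_port, proto, next_hops):
    """Pick one of several equal-cost next hops by hashing the flow 5-tuple.

    Hashing keeps every packet of a flow on the same path (no reordering),
    while different flows spread across the available equal-cost paths.
    """
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]

# Example: two flows to the same destination may take different uplinks.
paths = ["aggr-switch-1", "aggr-switch-2", "aggr-switch-3", "aggr-switch-4"]
print(ecmp_next_hop("10.0.1.5", "10.0.9.7", 52311, 80, "tcp", paths))
print(ecmp_next_hop("10.0.1.6", "10.0.9.7", 41200, 80, "tcp", paths))
```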

  13. Layer 2 vs. Layer 3 for Data Centers

  14. Data center networks • Load balancer: application-layer routing • Receives external client requests • Directs workload within the data center • Returns results to the external client (hiding data center internals from the client) • Figure (omitted): border router and access router at the Internet edge, load balancers in front of tier-1 switches, then tier-2 switches, TOR switches, and numbered server racks
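To make the load balancer’s role concrete, here is a toy round-robin dispatch sketch. The backend names and the rotation policy are illustrative assumptions, not anything specified in the slides.

```python
import itertools

class LoadBalancer:
    """Toy application-layer load balancer: rotate requests across backends."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)  # round-robin iterator

    def handle_request(self, request):
        backend = next(self._cycle)  # pick the next server in rotation
        # In a real data center the reply also flows back through the load
        # balancer, so the client never sees internal server addresses.
        return f"forwarded {request!r} to {backend}"

lb = LoadBalancer(["rack1-srv07", "rack2-srv03", "rack3-srv11"])
for req in ["GET /index", "GET /search?q=dc", "POST /cart"]:
    print(lb.handle_request(req))
```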

  15. Scaling a LAN • Self-learning Ethernet switches work great at small scales, but buckle at larger scales • Broadcast overhead of self-learning is linear in the total number of interfaces • Broadcast storms possible in non-tree topologies • Goals • Scalability to a very large number of machines • Isolation of unwanted traffic from unrelated subnets • Ability to accommodate general types of workloads (Web, database, MapReduce, scientific computing, etc.)

  16. Data center networks • Rich interconnection among switches and racks: • Increased throughput between racks (multiple routing paths possible) • Increased reliability via redundancy • Figure (omitted): tier-1 switches, tier-2 switches, TOR switches, and numbered server racks, with multiple paths between tiers

  17. Broad questions • How are massive numbers of commodity machines networked inside a data center? • Virtualization: How to effectively manage physical machine resources across client virtual machines? • Operational costs: • Server equipment • Power and cooling

  18. Data Center Network

  19.–22. Hierarchical Addresses (figure-only slide sequence; the addressing diagrams are not included in the transcript)

  23. PortLand: Location Discovery Protocol • Location Discovery Messages (LDMs) exchanged between neighboring switches • Switches self-discover location on boot up
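A rough sketch of what a Location Discovery Message and a bootstrapping check might look like. The field names and the edge-detection rule below are illustrative simplifications in the spirit of PortLand, not its exact message format or algorithm.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LDM:
    """Illustrative Location Discovery Message; the real PortLand LDM also
    carries an up/down direction and is exchanged periodically on every port."""
    switch_id: str
    pod: Optional[int]       # None until discovered
    position: Optional[int]  # position within the pod
    level: Optional[int]     # 0 = edge, 1 = aggregation, 2 = core

def infer_is_edge(num_ports: int, ldms_by_port: dict) -> bool:
    """Illustrative rule: hosts never send LDMs, so a switch that hears LDMs
    on at most half of its ports concludes its silent ports face hosts and
    that it is an edge switch. This mirrors the spirit of PortLand's
    bootstrapping, not its exact algorithm."""
    return len(ldms_by_port) <= num_ports // 2

# A freshly booted k = 4 switch hears LDMs only on its two uplink ports:
heard = {2: LDM("sw-agg-0", None, None, None),
         3: LDM("sw-agg-1", None, None, None)}
print(infer_is_edge(num_ports=4, ldms_by_port=heard))  # -> True (edge switch)
```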

  24. Data Center Packet Transport • Large purpose-built DCs • Huge investment: • R&D • business • Transport inside the DC • TCP rules • 99.9% of traffic

  25. TCP in the Data Center • TCP does not meet the demands of apps • Suffers from bursty packet drops, incast, … • Builds up large queues: • Adds significant latency • Wastes precious buffers, esp. bad with shallow-buffered switches • Operators work around TCP problems • Ad-hoc, inefficient, often expensive solutions • No solid understanding of consequences, tradeoffs

  26. Partition/Aggregate Application Structure • Time is money • Strict deadlines (SLAs) • Missed deadline → lower-quality result • Figure (omitted): a top-level aggregator (TLA, deadline = 250 ms) fans a “Picasso” query out to mid-level aggregators (MLAs, deadline = 50 ms), which fan out to worker nodes (deadline = 10 ms); the workers return partial lists of Picasso quotations (“Art is a lie that makes us realize the truth”, “The chief enemy of creativity is good sense”, …) that are merged on the way back up

  27. Generality of Partition/Aggregate • The foundation for many large-scale web applications • Web search, social network composition, ad selection, etc. • Example: Facebook • Partition/Aggregate ~ Multiget • Aggregators: web servers • Workers: memcached servers • Figure (omitted): clients on the Internet hit the web servers, which fan out over the memcached protocol to the memcached servers
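A minimal scatter-gather sketch of the partition/aggregate pattern, assuming in-process worker functions and a single per-level deadline. Everything here is illustrative; a real deployment would issue memcached multigets over the network.

```python
from concurrent.futures import (ThreadPoolExecutor, as_completed,
                                TimeoutError as FuturesTimeout)

def worker(shard_id: int, query: str) -> str:
    """Stand-in for a worker/memcached lookup against one data shard."""
    return f"shard{shard_id}: partial result for {query!r}"

def aggregate(query: str, num_shards: int = 8, deadline_s: float = 0.05) -> list:
    """Fan the query out to every shard, then merge whatever returns in time.
    Responses that miss the deadline are dropped (a lower-quality result)."""
    pool = ThreadPoolExecutor(max_workers=num_shards)
    futures = [pool.submit(worker, i, query) for i in range(num_shards)]
    results = []
    try:
        for fut in as_completed(futures, timeout=deadline_s):
            results.append(fut.result())
    except FuturesTimeout:
        pass  # deadline hit: return only the on-time partial results
    pool.shutdown(wait=False, cancel_futures=True)  # don't wait for stragglers
    return results

print(len(aggregate("picasso quotes")), "of 8 shards answered before the deadline")
```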

  28. Workloads • Partition/Aggregate (query traffic) — delay-sensitive • Short messages [50 KB–1 MB] (coordination, control state) — delay-sensitive • Large flows [1 MB–50 MB] (data update) — throughput-sensitive

  29. Tension Between Requirements • Goals: high throughput, low latency, high burst tolerance • Deep buffers: queuing delays increase latency • Shallow buffers: bad for bursts & throughput • AQM (RED): average queue not fast enough for incast • Reduced RTOmin: doesn’t help latency • Objective: low queue occupancy & high throughput

  30. Review: The TCP/ECN Control Loop • ECN = Explicit Congestion Notification • Figure (omitted): Sender 1 and Sender 2 share a switch queue toward the Receiver; the switch signals congestion with a 1-bit ECN mark that the senders react to

  31. Two Key Ideas • React in proportion to the extent of congestion, not its presence • Reduces variance in sending rates, lowering queuing requirements • Mark based on instantaneous queue length. • Fast feedback to better deal with bursts.
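These two ideas are what DCTCP (shown on the next slides) builds on. The sketch below shows the DCTCP control law from the DCTCP paper in simplified form: the switch marks on instantaneous queue length above a threshold K, and the sender cuts its window in proportion to the smoothed fraction of marked packets. The specific values of K and g here are illustrative.

```python
# Simplified sketch of DCTCP marking and window adjustment.
# Control law from the DCTCP paper:
#   alpha <- (1 - g) * alpha + g * F        (F = fraction of marked packets)
#   cwnd  <- cwnd * (1 - alpha / 2)          when marks were seen
K = 30 * 1024    # switch marking threshold in bytes (instantaneous queue)
g = 1.0 / 16     # EWMA gain for the marked-fraction estimate

def switch_should_mark(queue_bytes: int) -> bool:
    """Mark based on instantaneous queue length, not an average."""
    return queue_bytes > K

class DctcpSender:
    def __init__(self, cwnd: float):
        self.cwnd = cwnd    # congestion window (packets)
        self.alpha = 0.0    # running estimate of the fraction of marked packets

    def on_window_acked(self, acked: int, marked: int):
        frac = marked / acked if acked else 0.0
        self.alpha = (1 - g) * self.alpha + g * frac   # smooth the estimate
        if marked:
            # React in proportion to the extent of congestion, not its presence:
            self.cwnd = max(1.0, self.cwnd * (1 - self.alpha / 2))
        else:
            self.cwnd += 1  # standard additive increase per window

s = DctcpSender(cwnd=100)
s.on_window_acked(acked=100, marked=20)   # 20% of the window was ECN-marked
print(round(s.cwnd, 1), round(s.alpha, 3))
```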

  32. DCTCP in Action • Figure (omitted): queue length in KBytes over time • Setup: Windows 7 hosts, Broadcom 1 Gbps switch • Scenario: 2 long-lived flows, K = 30 KB

  33. Why it Works • High Burst Tolerance • Large buffer headroom → bursts fit • Aggressive marking → sources react before packets are dropped • Low Latency • Small buffer occupancies → low queuing delay • High Throughput • ECN averaging → smooth rate adjustments, low variance

  34. Current solutions for increasing data center network bandwidth • Examples: FatTree, BCube • Drawbacks: 1. Hard to construct 2. Hard to expand

  35. Fat-Tree • Inter-connect racks (of servers) using a fat-tree topology • Fat-tree: a special type of Clos network (after C. Clos) • K-ary fat tree: three-layer topology (edge, aggregation, and core) • Each pod consists of (k/2)^2 servers & 2 layers of k/2 k-port switches • Each edge switch connects to k/2 servers & k/2 aggregation switches • Each aggregation switch connects to k/2 edge & k/2 core switches • (k/2)^2 core switches: each connects to k pods
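A quick sketch that turns these counts into code so the scaling relationships are easy to check. This is pure arithmetic from the slide; nothing here is specific to any real deployment.

```python
def fat_tree_sizes(k: int) -> dict:
    """Counts for a k-ary fat tree built from k-port switches (k must be even)."""
    assert k % 2 == 0, "k-ary fat trees use k/2 splits, so k must be even"
    half = k // 2
    return {
        "pods": k,
        "servers_per_pod": half * half,     # (k/2)^2
        "edge_switches": k * half,          # k/2 per pod
        "aggregation_switches": k * half,   # k/2 per pod
        "core_switches": half * half,       # (k/2)^2
        "total_servers": k ** 3 // 4,       # k^3 / 4
    }

print(fat_tree_sizes(4))   # small example: 16 servers (as on slide 36)
print(fat_tree_sizes(6))   # 54 servers, the example on slide 37
print(fat_tree_sizes(48))  # commodity 48-port switches: 27,648 servers
```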

  36. Fat-Tree Fat-tree with K=4

  37. Why Fat-Tree? • Fat tree has identical bandwidth at any bisection • Each layer has the same aggregate bandwidth • Can be built using cheap devices with uniform capacity • Each port supports the same speed as an end host • All devices can transmit at line speed if packets are distributed uniformly along available paths • Great scalability: k-port switch supports k^3/4 servers • Figure (omitted): fat-tree network with k = 6 supporting 54 hosts
