
Scalable Rule Management for Data Centers

NSDI’13. Scalable Rule Management for Data Centers. Masoud Moshref, Minlan Yu, Abhishek Sharma, Ramesh Govindan. 4/3/2013. Introduction: Definitions. Datacenters use rules to implement management policies.


Presentation Transcript


  1. NSDI’13 Scalable Rule Management for Data Centers Masoud Moshref, Minlan Yu, Abhishek Sharma, Ramesh Govindan 4/3/2013

  2. Introduction: Definitions. Datacenters use rules to implement management policies: • Access control • Rate limiting • Traffic measurement • Traffic engineering. A rule is an action on a set of ranges on flow fields. • Action examples: Deny, Accept, Enqueue • Flow field examples: Src IP / Dst IP, Protocol, Src Port / Dst Port

  3. Introduction: Definitions. Datacenters use rules to implement management policies. A rule is an action on a set of ranges on flow fields. • R1: Accept, SrcIP: 12.0.0.0/7, DstIP: 10.0.0.0/8 • R2: Deny, SrcIP: 12.0.0.0/8, DstIP: 8.0.0.0/6 [Figure: R1 and R2 drawn on Src IP vs. Dst IP axes]
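The two rules on slide 3 overlap: any flow with a source in 12.0.0.0/8 and a destination in 10.0.0.0/8 matches both. The sketch below illustrates this with Python's standard `ipaddress` module. The `Rule` class, the `lookup` helper, and above all the priority order (R2 before R1) are assumptions made for illustration; the slide does not say which rule wins.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule model: an action on ranges of two flow fields."""
    name: str
    action: str
    src: str  # CIDR range on the Src IP field
    dst: str  # CIDR range on the Dst IP field

    def matches(self, src_ip: str, dst_ip: str) -> bool:
        return (ip_address(src_ip) in ip_network(self.src)
                and ip_address(dst_ip) in ip_network(self.dst))


# ASSUMPTION: rules are listed in priority order, R2 first.
# The slide gives the ranges and actions but not the priorities.
RULES = [
    Rule("R2", "Deny", "12.0.0.0/8", "8.0.0.0/6"),
    Rule("R1", "Accept", "12.0.0.0/7", "10.0.0.0/8"),
]


def lookup(src_ip, dst_ip, rules=RULES):
    """Return the first (highest-priority) matching rule, or None."""
    for rule in rules:
        if rule.matches(src_ip, dst_ip):
            return rule
    return None


def both_match(src_ip, dst_ip, rules=RULES):
    """True when the flow falls in the overlap of every rule in `rules`."""
    return all(rule.matches(src_ip, dst_ip) for rule in rules)
```

Because the answer for a flow in the overlap depends on both rules being present, placing R1 on a device without R2 could change the outcome; this is why the later slides treat overlapping rules as a placement constraint.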

  4. Current practice: Rules are saved on predefined, fixed machines • On hypervisors • On switches

  5. Machines have limited resources Top-of-Rack switch TCAM Network Interface Card Software switches on servers

  6. Future datacenters will have many fine-grained rules • VLAN per server: traffic management (NetLord, Spain), 1M rules • Per-flow decision: flow measurement for traffic engineering (MicroTE, Hedera), 10M – 100M rules • Regulating VM pair communication: access control (CloudPolice), bandwidth allocation (Seawall), 1B – 20B rules

  7. Rule location trade-off (resource vs. bandwidth usage) • Storing rules at the hypervisor incurs CPU overhead [Figure labels: TCAM, R0]

  8. Rule location trade-off (resource vs. bandwidth usage) • Storing rules at the hypervisor incurs CPU overhead • Alternative: move the rule to the ToR switch and forward traffic there [Figure labels: TCAM, R0]

  9. Rule location trade-off: Offload to servers [Figure labels: TCAM, R1]

  10. Challenges: Concrete example [Figure: rules R0–R6 laid out on a grid of Src IP (0–7) vs. Dst IP (0–7); sources labeled VM0–VM7]

  11. Challenges: Overlapping rules • Source Placement: saving rules on the source machine means minimum overhead [Figure: Src IP vs. Dst IP rule grid (R0–R6); labels VM2, VM6]

  12. Challenges: Overlapping rules • If Source Placement is not feasible [Figure: Src IP vs. Dst IP rule grid; labels VM2, VM6]

  13. Challenges • Preserve the semantics of overlapping rules • Respect resource constraints (heterogeneous devices) • Minimize traffic overhead • Handle dynamics (traffic changes, rule changes, VM migration)

  14. Contribution: vCRIB, a Virtual Cloud Rule Information Base • Proactive rule placement abstraction layer • Optimize traffic given resource constraints & changes [Figure labels: Rules R1, R2, R3, R4]

  15. vCRIB design [Diagram labels: Rules; Topology & Routing; Source Partitioning with Replication (• Overlapping Rules); Partitions; Minimum Traffic Feasible Placement]

  16. Partitioning with replication • Smaller partitions have more flexibility • Cutting causes rule inflation [Figure: rules R0–R8 over the 8 × 8 flow space, cut into partitions P1, P2, P3]

  17. Partitioning with replication • Introduce the concept of similarity to mitigate inflation [Figure labels: P1 (5 rules), P2 (5 rules), P3 (5 rules); P1 P2 (7 rules)]
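The labels on slide 17 suggest why similarity matters: two 5-rule partitions stored together can need only 7 rule entries rather than 10 when they share rules. The sketch below assumes that similarity means the number of rules two partitions have in common, which the slide does not define explicitly. The rule names assigned to each partition are made up so that the sizes agree with the figure labels.

```python
def similarity(p, q):
    """ASSUMED definition: number of rules shared by partitions p and q."""
    return len(p & q)


def merged_size(p, q):
    """Rule entries needed when p and q sit on the same device.

    Shared rules are stored once, so the cost is the size of the union.
    """
    return len(p | q)


# Hypothetical rule membership; only the sizes (5, 5, and 7 when merged)
# correspond to the labels visible on the slide.
P1 = {"R0", "R1", "R3", "R4", "R7"}
P2 = {"R0", "R1", "R3", "R6", "R8"}

shared = similarity(P1, P2)       # rules stored only once when co-located
together = merged_size(P1, P2)    # entries used on the shared device
saved = len(P1) + len(P2) - together
```

Under this definition the saving from co-locating two partitions always equals their similarity, which is why a placement heuristic would prefer to group similar partitions.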

  18. Per-source partitions • Limited resource for forwarding • No need for replication to approximate source-placement • Closer partitions are more similar [Figure: Src IP vs. Dst IP rule grid (R0–R6)]

  19. vCRIB design: Placement • Resource Constraints • Traffic Overhead [Diagram labels: Rules; Topology & Routing; Source Partitioning with Replication; Partitions; Placement; Minimum Traffic Feasible Placement; T11, T21, T22, T23, T32, T33]

  20. vCRIB design: Placement [Diagram labels: Rules; Topology & Routing; Source Partitioning with Replication; Partitions; Resource-Aware Placement (• Resource Constraints); Feasible Placement; Traffic-Aware Refinement (• Traffic Overhead); Minimum Traffic Feasible Placement]

  21. FFDS (First Fit Decreasing Similarity) • Put a random partition on an empty device • Add the most similar partitions to the initial partition until the device is full • Analysis: find a lower bound on the optimal solution for rules; prove the algorithm is a 2-approximation of the lower bound
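The two FFDS steps on slide 21 can be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: it assumes every device has the same capacity (counted in rule entries), that similarity is the number of shared rules, that new empty devices are always available, and that "until the device is full" means "until no remaining partition fits". The function name `ffds` and its parameters are hypothetical.

```python
import random


def ffds(partitions, capacity, seed=0):
    """Group partitions onto devices, most similar first.

    partitions: dict mapping partition name -> set of rule names.
    capacity:   rule entries per device (ASSUMED uniform).
    Returns a list of (partition_names, rules_on_device), one per device.
    """
    rng = random.Random(seed)
    remaining = {name: set(rules) for name, rules in partitions.items()}
    devices = []
    while remaining:
        # Step 1: put a random partition on an empty device.
        first = rng.choice(sorted(remaining))
        rules = remaining.pop(first)
        if len(rules) > capacity:
            raise ValueError(f"partition {first} exceeds device capacity")
        names = [first]
        # Step 2: keep adding the most similar partition that still fits.
        while True:
            fitting = [n for n in sorted(remaining)
                       if len(rules | remaining[n]) <= capacity]
            if not fitting:
                break  # device is full for every remaining partition
            best = max(fitting, key=lambda n: len(rules & remaining[n]))
            rules |= remaining.pop(best)
            names.append(best)
        devices.append((names, rules))
    return devices


# Hypothetical input: A and B share two rules, C shares none.
example = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {7, 8, 9}}
placed = ffds(example, capacity=4)
groups = sorted(sorted(names) for names, _ in placed)
```

With a capacity of 4, A and B fit together only because their shared rules are stored once, so they land on one device regardless of which partition is drawn first.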

  22. vCRIB design: Heterogeneous resources [Diagram labels: Rules; Topology & Routing; Source Partitioning with Replication; Partitions; Resource-Aware Placement with Resource Usage Function (• Resource Heterogeneity); Feasible Placement; Traffic-Aware Refinement; Minimum Traffic Feasible Placement]

  23. vCRIB design: Traffic-Aware Refinement [Diagram labels: Rules; Topology & Routing; Source Partitioning with Replication; Partitions; Resource-Aware Placement; Resource Usage Function; Feasible Placement; Traffic-Aware Refinement (• Traffic Overhead); Minimum Traffic Feasible Placement]

  24. Traffic-aware refinement • Overhead greedy approach • Pick the maximum-overhead partition • Put it where it minimizes the overhead and maintains feasibility [Figure labels: P4, P2, VM2, VM4]

  25. Traffic-aware refinement • Overhead greedy approach • Pick the maximum-overhead partition • Put it where it minimizes the overhead and maintains feasibility • Problem: local minima • Our approach: benefit greedy [Figure labels: P4, P2, VM2, VM4]
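One step of the overhead greedy approach from slides 24 and 25 might look like the sketch below. The data model is an assumption made for illustration: traffic overhead is given as a lookup table per partition and device, each partition has a fixed size, and feasibility means the device's capacity is not exceeded. The benefit greedy variant that the slide names as the authors' approach is not sketched, because the slide does not describe it.

```python
def overhead_greedy_step(placement, overhead, size, capacity):
    """Move the maximum-overhead partition to its best feasible device.

    placement: dict partition -> device currently hosting it.
    overhead:  dict partition -> {device: traffic overhead if placed there}.
    size:      dict partition -> rule entries it needs (ASSUMED additive).
    capacity:  dict device -> rule entries available.
    Returns a new placement dict; the input is left unchanged.
    """
    # Current load on every device.
    load = {device: 0 for device in capacity}
    for part, device in placement.items():
        load[device] += size[part]

    # Pick the partition whose current location costs the most traffic.
    worst = max(sorted(placement), key=lambda p: overhead[p][placement[p]])
    current = placement[worst]

    # Staying put is always feasible; other devices need spare capacity.
    feasible = [d for d in sorted(capacity)
                if d == current or load[d] + size[worst] <= capacity[d]]
    best = min(feasible, key=lambda d: overhead[worst][d])

    updated = dict(placement)
    updated[worst] = best
    return updated


# Hypothetical scenario: P2 sits on a ToR switch, away from its source server.
placement = {"P2": "tor", "P4": "server4"}
overhead = {
    "P2": {"server2": 0, "server4": 8, "tor": 5},
    "P4": {"server2": 6, "server4": 0, "tor": 3},
}
size = {"P2": 2, "P4": 2}
capacity = {"server2": 2, "server4": 2, "tor": 4}
after = overhead_greedy_step(placement, overhead, size, capacity)
```

Repeating this step stops making progress once the worst partition has no better feasible device, even if moving a different partition first would free space; that is one way to picture the local minima the slide mentions.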

  26. vCRIB design: Dynamics [Diagram labels: Rules; Topology & Routing; Source Partitioning with Replication; Partitions; Resource-Aware Placement; Resource Usage Function; Feasible Placement; Traffic-Aware Refinement; Minimum Traffic Feasible Placement; Dynamics: Rule/VM Dynamics, Major Traffic Changes]

  27. vCRIB design [Diagram labels: Rules; Topology & Routing; Source Partitioning with Replication (• Overlapping Rules); Partitions; Resource-Aware Placement (• Resource Constraints); Resource Usage Function (• Resource Heterogeneity); Feasible Placement; Traffic-Aware Refinement (• Traffic Overhead); Minimum Traffic Feasible Placement; Rule/VM Dynamics and Major Traffic Changes (• Dynamics)]

  28. Evaluation • Comparing vCRIB vs. Source-Placement • Parameter sensitivity analysis: rules in partitions, traffic locality, VMs per server, different memory sizes • Where is the traffic overhead added? • Traffic-aware refinement for online scenarios • Heterogeneous resource constraints • Switch-only scenarios

  29. Simulation setup • 1k servers with 20 VMs per server in a Fat-tree network • 200k rules generated by ClassBench with random actions • IPs are assigned in two ways: Random, Range • Flows: size follows a long-tail distribution; local traffic matrix (0.5 same rack, 0.3 same pod, 0.2 inter-pod)
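The locality split on slide 29 can be read as a probability distribution over where a flow's destination lies relative to its source. The sketch below draws flow scopes with those weights and computes the VM count implied by the setup. It is only an illustration of the stated parameters; the scope names, the `sample_scopes` helper, and the seed are assumptions, and the slide does not describe how the authors generated flows.

```python
import random

SERVERS = 1000          # "1k servers" from the slide
VMS_PER_SERVER = 20     # "20 VMs per server" from the slide
TOTAL_VMS = SERVERS * VMS_PER_SERVER

# Locality weights from the slide; the key names are hypothetical.
LOCALITY = {"same_rack": 0.5, "same_pod": 0.3, "inter_pod": 0.2}


def sample_scopes(n, seed=0):
    """Draw n flow scopes according to the locality weights."""
    rng = random.Random(seed)
    scopes = list(LOCALITY)
    weights = [LOCALITY[s] for s in scopes]
    return rng.choices(scopes, weights=weights, k=n)


flows = sample_scopes(10_000)
rack_share = flows.count("same_rack") / len(flows)
```

A full generator would still have to pick a concrete destination VM inside the chosen scope and a flow size from a long-tail distribution, neither of which the slide specifies.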

  30. Comparing vCRIB vs. Source-Placement • Maximum load is 5K, capacity is 4K • Random: average load is 4.2K • vCRIB finds a low-traffic feasible solution • Range is better as similar partitions are from the same source • Adding more resources helps vCRIB reduce traffic overhead

  31. Parameter sensitivity analysis: Rules in partitions [Plot labels: Total space; vCRIB; Feasible; Source placement] • Defined by maximum load on a server

  32. Parameter sensitivity analysis: Rules in partitions • Lower traffic overhead for smaller partitions and more similar ones [Plot labels: Total space; vCRIB; Source placement; <10% Traffic; A]

  33. Conclusion and future work • Conclusion: vCRIB allows operators and users to specify rules, and manages their placement in a way that respects resource constraints and minimizes traffic overhead. • Future work: support reactive placement by adding the controller in the loop; break up a partition when a VM has a large number of rules; test on other rulesets

  34. NSDI’13 Scalable Rule Management for Data Centers Masoud Moshref, Minlan Yu, Abhishek Sharma, Ramesh Govindan 4/3/2013
