Best Current Practices for IPv4 Multicast Deployment • Bill Nickless • nickless@mcs.anl.gov • http://www.mcs.anl.gov/home/nickless
What is Multicast? • A multicast sender simply sends its data, and intervening routers "conspire" to get the data to all interested listeners. (S. Deering) • Destination of IP multicast packets is a “Group” address, within 224.0.0.0/4.
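As a quick illustration of the group address range, here is a minimal Python sketch that tests whether a destination falls within 224.0.0.0/4. It uses only the standard `ipaddress` module; the helper name `is_group_address` is illustrative and does not come from the slides.

```python
import ipaddress

# The IPv4 multicast ("Group") range named on the slide.
MULTICAST_V4 = ipaddress.ip_network("224.0.0.0/4")


def is_group_address(addr: str) -> bool:
    """Return True if addr is an IPv4 multicast group address."""
    return ipaddress.ip_address(addr) in MULTICAST_V4
```

For example, `is_group_address("224.2.177.155")` is true, while a unicast host address such as `140.221.11.103` is not a group address.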
Notation • Specific source address(es): S • Specific group address(es): G • Traffic from a specific source to a group: (S,G) • Traffic from all sources to a group: (*,G) • Rendezvous Point: RP
Any Source Multicast • Senders send multicast group-addressed packets. • Receivers register their interest in groups by way of IGMPv2 (*,G) Joins • Network keeps track of all senders for each group, and delivers packets from all senders to each interested Receiver.
Source Specific Multicast • Senders send multicast group-addressed packets. • Receivers register their interest in specific sources sending to specific groups by way of IGMPv3 (S,G) Joins (well, group membership reports….) • Receivers are responsible for specifying which Senders’ traffic they want to receive.
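The difference between the two service models can be sketched as a lookup against the join state held on behalf of a receiver. This is a toy model for illustration only, not how any router stores state; the function name `delivers` and the use of `"*"` as a wildcard string are assumptions made for this sketch.

```python
def delivers(joins, source, group):
    """Decide whether traffic from `source` to `group` reaches a receiver.

    `joins` is a set of (source, group) tuples; an Any Source Multicast
    join is recorded with "*" in the source position.
    """
    return ("*", group) in joins or (source, group) in joins


# Any Source Multicast: one (*,G) join accepts every sender to the group.
asm_joins = {("*", "224.2.177.155")}

# Source Specific Multicast: the receiver names the sender it wants.
ssm_joins = {("140.221.34.1", "233.2.171.1")}
```

With `asm_joins`, traffic from any sender to 224.2.177.155 is delivered. With `ssm_joins`, only traffic from 140.221.34.1 to 233.2.171.1 is delivered. The addresses are borrowed from the monitoring examples later in the deck.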
Reachability (Where To?) • NOT DEFINED BY INTERNET STANDARDS • Unicast reachability is interpreted by implementation and practice as: Send me IP packets with destination addresses that match this advertisement. • Think ‘show ip route’
Reachability (Whence?) • NOT DEFINED BY INTERNET STANDARDS • Multicast reachability is interpreted by implementation and practice as: Here’s where to get IP packets from sources that match this advertisement. • Think ‘show ip rpf’
Reachability Examples
terra% netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags Iface
140.221.11.103  0.0.0.0         255.255.255.255 UH    eth0
140.221.8.0     0.0.0.0         255.255.252.0   U     eth0
127.0.0.0       0.0.0.0         255.0.0.0       U     lo
224.0.0.0       0.0.0.0         240.0.0.0       U     eth0
0.0.0.0         140.221.11.253  0.0.0.0         UG    eth0
Reachability Examples
Kiwi#show ip route 140.221.11.103
Routing entry for 140.221.8.0/22
  Known via "ospf 683", distance 110, metric 1117, type intra area
  Last update from 140.221.20.124 on GigabitEthernet5/0, 03:35:56 ago
  Routing Descriptor Blocks:
  * 140.221.20.124, from 140.221.47.6, 03:35:56 ago, via GigabitEthernet5/0
      Route metric is 1117, traffic share count is 1
Reachability Examples
Kiwi#show ip rpf 140.221.11.103
RPF information for terra.mcs.anl.gov (140.221.11.103)
  RPF interface: GigabitEthernet5/0
  RPF neighbor: stardust-msfc-20.mcs.anl.gov (140.221.20.124)
  RPF route/mask: 140.221.8.0/22
  RPF type: unicast (ospf 683)
  RPF recursion count: 0
  Doing distance-preferred lookups across tables
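The "whence" interpretation behind `show ip rpf` can be sketched as a longest-prefix match on the source address, followed by a comparison with the interface the packet arrived on. This is a simplified model, not router code. The 140.221.8.0/22 entry is taken from the output above; the default route via ATM3/0.145 is a hypothetical entry added only so the table has more than one candidate.

```python
import ipaddress

# (prefix, RPF interface) pairs.
RPF_TABLE = [
    (ipaddress.ip_network("140.221.8.0/22"), "GigabitEthernet5/0"),  # from the slide
    (ipaddress.ip_network("0.0.0.0/0"), "ATM3/0.145"),  # hypothetical default route
]


def rpf_interface(source):
    """Return the interface on which traffic from `source` is expected."""
    src = ipaddress.ip_address(source)
    matches = [(net, ifname) for net, ifname in RPF_TABLE if src in net]
    if not matches:
        return None
    # Longest prefix wins, as in unicast forwarding.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def rpf_check(source, arrival_interface):
    """A multicast packet passes only if it arrived on the RPF interface."""
    return rpf_interface(source) == arrival_interface
```

Under these assumptions a packet from 140.221.11.103 passes the check when it arrives on GigabitEthernet5/0 and fails it on any other interface.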
The Old MBONE • Excellent first approximation. • Used tunnels to encapsulate multicast traffic over unicast paths. • Routing done by user-space daemons running on general purpose Unix boxes. • Internet Group Management Protocol (IGMP) (think "multicast ARP"). • Pre-dates the World Wide Web (hence SDR).
Lessons Learned from MBONE • Distance Vector Multicast Routing Protocol (DVMRP) does not scale • Easy to create IP Multicast “amplifiers”. • Separate tunneled routing infrastructure not aligned with modern BGP Internetworking. • Flood & Prune does not scale • Examples: PIM-Dense Mode, DVMRP. • Not sensitive to available bandwidth. • Requires downstream routers that are smart and powerful enough to send prune messages.
Applying Those Lessons • Multicast Border Gateway Protocol. • Provides reachability and policy control for multicast routing, just as BGP does for unicast. • Protocol Independent Multicast (Sparse Mode) • Listeners receive traffic only when requested. • Forms multicast distribution trees. • Multicast Source Discovery Protocol • Finding active sources in other PIM Sparse Mode domains (usually other ASes).
Setting Reachability Policy: Multicast Border Gateway Protocol • RFC 2283 adds the MP_REACH_NLRI attribute to BGP-4. • Identifies a BGP route as unicast, multicast, or both. • When implemented in a router, all the standard BGP machinery is available for prefix filtering, preference setting, MEDs, AS-path length comparisons, etc. • M-BGP routes can be independent of unicast BGP routes, allowing for different inter-AS unicast/multicast reachability.
Cisco M-BGP Configuration
router bgp 683
 network 130.202.0.0 nlri unicast multicast
 network 140.221.0.0 nlri unicast multicast
 neighbor 192.5.170.130 remote-as 145 nlri unicast multicast
 neighbor 192.5.170.130 description vBNS
 neighbor 192.5.170.130 soft-reconfiguration inbound
 neighbor 192.5.170.130 route-map from-vbns-lp-400 in
 neighbor 192.5.170.130 route-map to-vbns-med-10 out
Cisco M-BGP Configuration
route-map from-vbns-lp-400 permit 10
 match nlri unicast
 set local-preference 400
!
route-map from-vbns-lp-400 permit 15
 match as-path 145
 match nlri multicast
 set local-preference 400
!
route-map to-vbns-med-10 permit 10
 match ip address 50
 set metric 10
Cisco M-BGP Configuration
access-list 50 permit 140.221.0.0
access-list 50 permit 130.202.0.0
!
ip as-path access-list 145 deny _24_
ip as-path access-list 145 deny _293_
ip as-path access-list 145 deny _11537_
ip as-path access-list 145 permit .*
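To see which AS paths the as-path access-list 145 above would let through, the following Python sketch emulates its first-match evaluation. It assumes the usual Cisco meaning of `_` (start or end of string, space, comma, brace, or parenthesis) and translates it to a Python regular expression. The names `cisco_to_python` and `evaluate` are illustrative only.

```python
import re

# Assumed Cisco semantics of "_": start/end of string, space, comma,
# braces, or parentheses.
_DELIM = r"(?:^|$|[ ,{}()])"

# The entries of "ip as-path access-list 145", in configured order.
AS_PATH_LIST_145 = [
    ("deny", "_24_"),
    ("deny", "_293_"),
    ("deny", "_11537_"),
    ("permit", ".*"),
]


def cisco_to_python(pattern):
    """Translate the Cisco "_" metacharacter into a Python regex group."""
    return pattern.replace("_", _DELIM)


def evaluate(as_path, acl=AS_PATH_LIST_145):
    """Return the action of the first entry that matches the AS path."""
    for action, pattern in acl:
        if re.search(cisco_to_python(pattern), as_path):
            return action
    return "deny"  # implicit deny when nothing matches
```

With this model, `evaluate("145 10490 10437")` returns `"permit"`, while any path that traverses AS 24, 293, or 11537 is denied. A path containing AS 2400 is not caught by `_24_`, because the underscores require a delimiter on both sides.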
Juniper M-BGP Configuration
routing-options {
    rib inet.2 {
        static {
            route 141.142.0.0/16 reject;
            route 141.142.109.0/25 next-hop 141.142.11.74;
            route 141.142.109.128/25 next-hop 141.142.11.74;
            route 141.142.104.0/24 next-hop 141.142.11.74;
            route 141.142.105.0/24 next-hop 141.142.11.74;
            route 141.142.108.0/24 next-hop 141.142.11.74;
        }
    }
}
Juniper M-BGP Configuration
routing-options {
    rib-groups {
        ifrg {
            import-rib [ inet.0 inet.2 ];
        }
        mcrg {
            export-rib inet.2;
            import-rib inet.2;
        }
        igp-rg {
            export-rib inet.0;
            import-rib [ inet.0 inet.2 ];
        }
    }
}
Juniper M-BGP Configuration
protocols {
    bgp {
        group anl {
            import [ bgp-anl-accept reject-all ];
            family inet {
                any;
            }
            export [ bgp-announce-ncsa reject-all ];
            peer-as 683;
            neighbor 206.220.243.21;
        }
    }
}
Monitoring M-BGP (Cisco)
Kiwi#show ip mbgp sum
BGP router identifier 192.5.170.2, local AS number 683
MBGP table version is 324285
4121 network entries and 12621 paths using 862335 bytes of memory

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
192.5.170.130   4   145   53420   20497   324285    0    0 5d14h         346
Kiwi#show ip mbgp 128.163.3.214
MBGP routing table entry for 128.163.0.0/16, version 323761
Paths: (3 available, best #2)
  24 145 10490 10437, (aggregated by 10437 128.163.55.253), (received-only)
    192.12.123.10 from 192.12.123.10 (198.10.80.66)
      Origin IGP, localpref 100, valid, external, atomic-aggregate
  145 10490 10437, (aggregated by 10437 128.163.55.253)
    192.5.170.130 from 192.5.170.130 (204.147.135.241)
      Origin IGP, localpref 400, valid, external, atomic-aggregate, best
  145 10490 10437, (aggregated by 10437 128.163.55.253), (received-only)
    192.5.170.130 from 192.5.170.130 (204.147.135.241)
      Origin IGP, localpref 100, valid, external, atomic-aggregate
Monitoring M-BGP (Juniper)
nickless@charlie> show bgp neighbor 206.220.243.21
Peer: 206.220.243.21+179 AS 683 Local: 206.220.243.160+1969 AS 1224
  [. . .]
  NLRI advertised by peer: inet-unicast inet-multicast
  NLRI for this session: inet-unicast inet-multicast
  Peer supports Refresh capability (2)
  Table inet.0 Bit: 10006
    Active Prefixes: 13
    Received Prefixes: 13
    Suppressed due to damping: 0
  Table inet.2 Bit: 20006
    Active Prefixes: 9
    Received Prefixes: 9
    Suppressed due to damping: 0
nickless@charlie> show route table inet.2 140.221.34.1

inet.2: 5046 destinations, 5046 routes (5045 active, 0 holddown, 1 hidden)
+ = Active Route, - = Last Active, * = Both

140.221.0.0/16     *[BGP/170] 2w5d 19:24:04, MED 0, localpref 1000
                      AS path: 683 I
                    > to 206.220.243.21 via at-1/0/0.683
                    [BGP/170] 3d 04:38:22, MED 0, localpref 60
                      AS path: 11537 683 I
                    > to 141.142.11.246 via so-2/2/0.0
                    [BGP/170] 1w0d 11:18:35, localpref 60
                      AS path: 145 683 I
                    > to 141.142.11.1 via at-1/0/0.145
                    [BGP/170] 2w5d 19:23:42, localpref 60
                      AS path: 38 683 I
                    > to 192.17.8.32 via at-1/0/0.38
                    [BGP/170] 4d 05:55:21, MED 5, localpref 20
                      AS path: 2914 683 I
                    > to 192.17.8.34 via at-1/0/0.2914
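Why the first path in the inet.2 output above is the active one can be illustrated with a deliberately simplified model of BGP path selection: prefer the highest local preference, then the shortest AS path. Real implementations apply further tie-breakers (origin, MED, router ID, and so on) that this sketch omits. The `Path` type and `best_path` function are illustrative names; the values are copied from the output above.

```python
from typing import NamedTuple, Tuple


class Path(NamedTuple):
    local_pref: int
    as_path: Tuple[int, ...]
    next_hop: str


# The five candidate paths for 140.221.0.0/16 shown in the output above.
CANDIDATES = [
    Path(1000, (683,), "206.220.243.21"),
    Path(60, (11537, 683), "141.142.11.246"),
    Path(60, (145, 683), "141.142.11.1"),
    Path(60, (38, 683), "192.17.8.32"),
    Path(20, (2914, 683), "192.17.8.34"),
]


def best_path(paths):
    """Pick the path with the highest local-pref, then the shortest AS path."""
    return min(paths, key=lambda p: (-p.local_pref, len(p.as_path)))
```

Under this model `best_path(CANDIDATES).next_hop` is `206.220.243.21`, the direct peering with AS 683, which matches the route marked active in the router output.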
PIM Sparse Mode • RFC 2362 defines PIM Sparse Mode. • No PIM-SM activity until: • A host starts transmitting traffic (or) • A host subscribes to a group. • A Rendezvous Point (RP) is the root of the shared distribution tree for multicast traffic within a PIM Domain. • Given enough traffic, a source-based distribution tree is created. (Enough is typically anything greater than zero). • Inter-PIM Domain distribution trees are all source-based.
Multicast Source Discovery Protocol (MSDP) • Not yet an RFC (in Last Call stage). See http://www.ietf.org/html.charters/msdp-charter.html and ftp://ftp.ietf.org/internet-drafts/draft-ietf-msdp-spec-09.txt • Currently only covers IPv4. • PIM-SM RPs communicate through MSDP to find active multicast sources. • If “interested”, the RP initiates a PIM-SM Join towards each active source.
Reachability Redux • A BGP NLRI=Multicast route is a statement of reachability. • Inter-domain PIM-Sparse Mode Joins follow the BGP reachability topology. • MSDP forwarding between RPs follows the BGP reachability topology. • Not doing MSDP where you do M-BGP means that you’ve formed an MSDP “black hole”.
Cisco PIM-SM w/ MSDP Configuration
interface ATM3/0.145 point-to-point
 description vBNS MBGP+PIM-SM+MSDP
 ip address 192.5.170.129 255.255.255.252
 ip pim border
 ip pim sparse-mode
 ip multicast ttl-threshold 32
 ip multicast boundary 10
ip msdp peer 204.147.128.141
ip msdp description 204.147.128.141 vBNS
ip msdp sa-filter in 204.147.128.141 list 111
ip msdp sa-filter out 204.147.128.141 list 111
ip msdp sa-request 204.147.128.141
ip msdp ttl-threshold 204.147.128.141 32
ip msdp cache-sa-state
access-list 10 deny 224.0.1.39 ! CISCO-RP-ANNOUNCE.MCAST.NET
access-list 10 deny 224.0.1.40 ! CISCO-RP-DISCOVERY.MCAST.NET
access-list 10 deny 239.0.0.0 0.255.255.255
access-list 10 permit 224.0.0.0 15.255.255.255

access-list 111 deny ip any host 224.0.2.2 ! SUN-RPC.MCAST.NET
access-list 111 deny ip any host 224.0.1.3 ! RWHOD.MCAST.NET
access-list 111 deny ip any host 224.0.1.24 ! MICROSOFT-DS.MCAST.NET
access-list 111 deny ip any host 224.0.1.22 ! SVRLOC.MCAST.NET
access-list 111 deny ip any host 224.0.1.2 ! SGI-DOG.MCAST.NET
access-list 111 deny ip any host 224.0.1.35 ! SVRLOC-DA.MCAST.NET
access-list 111 deny ip any host 224.0.1.60 ! HP-DEVICE-DISC.MCAST.NET
access-list 111 deny ip any host 224.0.1.39 ! CISCO-RP-ANNOUNCE.MCAST.NET
access-list 111 deny ip any host 224.0.1.40 ! CISCO-RP-DISCOVERY.MCAST.NET
access-list 111 deny ip any 239.0.0.0 0.255.255.255
access-list 111 deny ip 10.0.0.0 0.255.255.255 any
access-list 111 deny ip 127.0.0.0 0.255.255.255 any
access-list 111 deny ip 172.16.0.0 0.15.255.255 any
access-list 111 deny ip 192.168.0.0 0.0.255.255 any
access-list 111 permit ip any
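The effect of access-list 111 as an MSDP SA filter can be sketched in Python by treating each entry as a (source, group) match evaluated in order. The wildcard masks are converted by hand to prefix lengths (for example, 0.15.255.255 becomes /12). The final entry is truncated on the slide; this sketch assumes it is meant to permit everything not denied earlier, which is an assumption and not something the slide confirms. The function name `sa_filter` is illustrative.

```python
import ipaddress


def _net(cidr):
    return ipaddress.ip_network(cidr)


ANY = _net("0.0.0.0/0")

# (action, source network, group network), in configured order.
ACL_111 = [
    ("deny", ANY, _net("224.0.2.2/32")),    # SUN-RPC
    ("deny", ANY, _net("224.0.1.3/32")),    # RWHOD
    ("deny", ANY, _net("224.0.1.24/32")),   # MICROSOFT-DS
    ("deny", ANY, _net("224.0.1.22/32")),   # SVRLOC
    ("deny", ANY, _net("224.0.1.2/32")),    # SGI-DOG
    ("deny", ANY, _net("224.0.1.35/32")),   # SVRLOC-DA
    ("deny", ANY, _net("224.0.1.60/32")),   # HP-DEVICE-DISC
    ("deny", ANY, _net("224.0.1.39/32")),   # CISCO-RP-ANNOUNCE
    ("deny", ANY, _net("224.0.1.40/32")),   # CISCO-RP-DISCOVERY
    ("deny", ANY, _net("239.0.0.0/8")),     # administratively scoped groups
    ("deny", _net("10.0.0.0/8"), ANY),      # private and loopback sources
    ("deny", _net("127.0.0.0/8"), ANY),
    ("deny", _net("172.16.0.0/12"), ANY),
    ("deny", _net("192.168.0.0/16"), ANY),
    ("permit", ANY, ANY),                   # ASSUMED reading of the truncated last line
]


def sa_filter(source, group):
    """Return "permit" or "deny" for an MSDP Source-Active (S,G) pair."""
    src = ipaddress.ip_address(source)
    grp = ipaddress.ip_address(group)
    for action, src_net, grp_net in ACL_111:
        if src in src_net and grp in grp_net:
            return action
    return "deny"  # implicit deny at the end of every access list
```

Under these assumptions, `sa_filter("128.197.160.27", "224.2.177.155")` returns `"permit"`, while an SA for a source in 10.0.0.0/8 or for a group in 239.0.0.0/8 is filtered in both directions.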
Juniper PIM-SM w/ MSDP Config
protocols {
    pim {
        rib-group mcrg;
        rp {
            local {
                address 141.142.12.1;
            }
        }
        interface all {
            mode sparse;
            version 2;
        }
    }
}
Juniper PIM-SM w/ MSDP Config
protocols {
    msdp {
        rib-group mcrg;
        group anl {
            /* kiwi-loop.anchor.anl.gov */
            peer 192.5.170.2 {
                local-address 141.142.12.1;
            }
        }
    }
}
Monitoring MSDP and PIM-Sparse
• Verify that MSDP session has come up with your peer:
Kiwi#show ip msdp sum
MSDP Peer Status Summary
Peer Address     AS    State    Uptime/  Reset Peer Name
                                Downtime Count
204.147.128.141  145   Up       1d12h    11    cs.dng.vbns.net

nickless@charlie> show msdp peer 192.5.170.2
Peer address    Local address   State       Last up/down Peer-Group
192.5.170.2     141.142.12.1    Established 2w5d18h      anl
Monitoring MSDP and PIM-Sparse
• Verify that active sources are being discovered:
Kiwi#show ip msdp sa-cache 224.2.177.155
MSDP Source-Active Cache - 4020 entries
(128.197.160.27, 224.2.177.155), RP 204.147.128.141, MBGP/AS 145, 03:40:18/00:05:03
[…etc]

nickless@charlie> show msdp source-active group 233.2.171.1
Group address   Source address  Peer address     Originator      Flags
233.2.171.1     140.221.34.1    141.142.11.246   192.5.170.2     Accept
                                192.5.170.2      192.5.170.2     Accept
                                192.17.8.32      192.5.170.2     Accept
                                204.147.128.141  192.5.170.2     Accept
Monitoring MSDP and PIM-Sparse
• Verify that you are receiving traffic from those active sources, and are forwarding:
Kiwi#show ip mroute count 224.2.177.155 128.163.3.214
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kilobits per second
Other counts: Total/RPF failed/Other drops(OIF-null, rate-limit etc)

Group: 224.2.177.155, Source count: 26, Group pkt count: 31060731
  RP-tree: Forwarding: 159/0/429/0, Other: 72/0/0
  Source: 128.163.3.214/32, Forwarding: 7089/0/480/0, Other: 6/0/0
Kiwi#show ip mroute 224.2.177.155 128.163.3.214
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, C - Connected, L - Local, P - Pruned
       R - RP-bit set, F - Register flag, T - SPT-bit set, J - Join SPT,
       M - MSDP created entry, X - Proxy Join Timer Running
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(128.163.3.214, 224.2.177.155), 03:55:28/00:03:22, flags: MT
  Incoming interface: ATM3/0.145, RPF nbr 192.5.170.130, Mbgp
  Outgoing interface list:
    ATM0/0.216, Forward/Sparse, 03:55:28/00:03:08
    ATM0/0.200, Forward/Sparse, 03:55:28/00:02:04
nickless@charlie> show multicast route group 233.2.171.1 \
    source-prefix 140.221.34.1 extensive
Group           Source prefix     Act Pru NHid Packets  IfMismatch T/O
233.2.171.1     140.221.34.1 /32  A   F   68   1829657  0          355
    Upstream interface: at-1/0/0.683
    Session name: Static Allocations

nickless@charlie> show multicast route group 233.2.171.1 \
    source-prefix 140.221.34.1 extensive
Group           Source prefix     Act Pru NHid Packets  IfMismatch T/O
233.2.171.1     140.221.34.1 /32  A   F   68   1830512  0          355
    Upstream interface: at-1/0/0.683
    Session name: Static Allocations
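The command above is run twice so that the Packets counter can be compared between samples: a rising counter shows the (S,G) entry is actually forwarding. A minimal Python sketch of that comparison follows, with the two counter values copied from the output; the function names are illustrative.

```python
def packets_forwarded(first_sample, second_sample):
    """Return how many packets were forwarded between two counter samples."""
    delta = second_sample - first_sample
    if delta < 0:
        # A smaller second reading suggests the counter was reset or wrapped.
        raise ValueError("counter went backwards between samples")
    return delta


def is_forwarding(first_sample, second_sample):
    """True if the (S,G) packet counter advanced between the two samples."""
    return packets_forwarded(first_sample, second_sample) > 0


# Packets column from the two samples above.
FIRST, SECOND = 1829657, 1830512
```

Here `packets_forwarded(FIRST, SECOND)` is 855, so the router was forwarding traffic for (140.221.34.1, 233.2.171.1) between the two commands.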
nickless@charlie> show pim join 233.2.171.1 extensive
Group           Source          RP              Flags
[. . .]
233.2.171.1     140.221.34.1                    sparse,spt-pending
    Upstream interface: at-1/0/0.683
    Upstream State: Local RP, Join to Source
    Downstream Neighbors:
        Interface: ge-1/1/0.103
            141.142.0.14 State: Join Flags: S Timeout: 182
        Interface: gr-1/2/0.0
            141.142.11.74 State: Join Flags: S Timeout: 208
Other Tips • ATM peerings are best done with point-to-point subinterfaces. (What’s a Designated Router in the context of an ATM exchange point, anyway?) • MSDP Source Actives are made from PIM Register messages. If you’re not sending MSDP SA messages for a source, you may have a problem with the Designated Router for that source.
More Tips • MSDP encapsulates data in its Source Active messages (just like they were encapsulated in the PIM Sparse Mode Register messages). This was done primarily to support SDR. • It is possible for MSDP to work while PIM-SM is not working, so you can’t always count on SDR to verify multicast routing.
Debugging Multicast • You must have: • at least one constantly active source • at least one constantly active receiver • Start near the receiver • Identify the PIM-SM Designated Router • Verify IGMP state in the Designated Router • Look for (S,G) state in the Designated Router
Debugging Multicast • Follow the Reverse Path Forwarding (RPF) from the Designated Router back towards the source • Verify PIM-SM has been configured on each interface along the RPF, because that determines the forwarding tree topology. • Check (S,G) state in each router. • Check (S,G) counters in each router.
Debugging Multicast • If the source is external to your PIM Domain: • Verify that you have an MSDP SA for that source. • Verify that the M-BGP Next Hop is: • A PIM Sparse Mode neighbor • An MSDP peer • Verify that you’re actually choosing the NLRI=Multicast route as your preferred RPF path. (hello BGP distance)
Debugging Multicast • What if nobody can hear your source? • Verify that the (S,G) shows up at your RP. • Verify that your RP is MSDP announcing the source, and that it shows up in your peer’s MSDP SA cache. • Verify your PIM-SM adjacency with your peer. • Verify that you have your peer’s interface in the outgoing list for the (S,G). • Verify that packet counters show traffic going out.
The Beacon: Test Signal • Testing Multicast requires active sessions • http://dast.nlanr.net/projects/beacon • In Java, so runs anywhere
The Beacon: Issues • Shows current state only. • Archive state over time? • How to visualize evolving state? Inherently a 3-dimensional problem, since state is 2D already. • Server scaling problems with O(40) beacons. • Currently seeing O(70) beacons at any time. • Assumes Any Source Multicast model.
Core Multicast Building Blocks • M-BGP: RFC 2283 is implemented by Juniper and Cisco in all major releases. AG community has used Juniper/Cisco the most. • MSDP: Implemented by Juniper, Cisco, Foundry... • PIM-Sparse Mode: RFC 2362 is implemented by a whole raft of vendors, including Cisco, Juniper, Foundry, Extreme, Marconi, etc.
Edge Multicast Building Blocks • IGMPv2 is widely available in Layer 2 and Layer 3 devices, and in most host operating systems. • IGMPv3 is coming soon to support SSM: • Available in Layer 3 devices from Cisco and Juniper. • IGMPv3 will be available in Windows XP (Whistler). • Ugly hack workarounds exist (URD et al).
North American IP Multicast Status • ESNet, Abilene, vBNS+, and NREN all running M-BGP, MSDP, and PIM-SM amongst themselves and with their customers/peers. • Regional and Institutional networks are currently the most common stumbling blocks for multicast apps. • STARTAP in Chicago is an international IP multicast meeting point. • International / commercial networks are coming online.