NetFlow: Digging Flows Out of the Traffic
Evandro de Souza, ESnet
ESnet Site Coordinating Committee Meeting
Columbus/OH – July/2004
Outline
• Motivation
• Possible Approaches
• What is NetFlow
• Solution Design
• Snapshots
• Trouble-Shooting Example
• Present State
ESCC Meeting - Columbus/OH
Motivation
• CHALLENGE
  • Steve Wolf challenge: “Show me all traffic exchanged between ESnet and Abilene.”
  • Generalized challenge: show ingress and egress traffic exchanged with ESnet, broken down by AS.
• MAIN REQUIREMENTS
  • ability to identify the top 100 flows involving institutions directly using ESnet
  • ability to identify AS-AS traffic
  • ability to visualize the top 10 flows and their evolution during a period of time
  • scalability to process data from all ESnet border routers
Solutions Available
• Hardware Solutions
  • Dedicated Router Monitoring Board
    • Example: Juniper’s Monitoring Services PIC
    • Manufacturer dependent
    • Very expensive
  • Dedicated Link Monitoring Box
    • Example: BSD box using Bro
    • Scalability issues
    • Real-time information about routing tables
• Software Solutions
  • Example: NetFlow
  • Adopted by several router and switch products (Cisco, Juniper, etc.)
  • May require huge computing power to process data from large networks
NetFlow Characteristics (1)
• What is a Flow?
  • A flow is defined as a unidirectional stream of packets. It is uniquely identified by the combination of the following seven key fields:
    • Source IP address
    • Destination IP address
    • Source port number
    • Destination port number
    • Layer 3 protocol type
    • ToS byte
    • Input logical interface (ifIndex)
  • A flow is not a TCP connection: because flows are unidirectional, one TCP connection shows up as two flows, one per direction.
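To make the seven-field definition concrete, here is a minimal Python sketch that groups packet summaries into flows. It is only an illustration, not how a router implements its flow cache: the `FlowKey`, `aggregate`, and `packet` names, the dictionary layout of a packet, and the addresses (taken from documentation ranges) are all assumptions made for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The seven key fields that identify a flow."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int   # IP protocol number, e.g. 6 = TCP
    tos: int
    input_if: int   # ifIndex of the input logical interface


def aggregate(packets):
    """Group packets into flows; returns {FlowKey: (packet_count, byte_count)}."""
    counters = defaultdict(lambda: [0, 0])
    for pkt in packets:
        key = FlowKey(pkt["src_ip"], pkt["dst_ip"], pkt["src_port"],
                      pkt["dst_port"], pkt["protocol"], pkt["tos"],
                      pkt["input_if"])
        counters[key][0] += 1
        counters[key][1] += pkt["length"]
    return {key: tuple(counts) for key, counts in counters.items()}


def packet(src_ip, src_port, dst_ip, dst_port, input_if, length):
    """Build a hypothetical TCP packet summary for the example."""
    return {"src_ip": src_ip, "dst_ip": dst_ip, "src_port": src_port,
            "dst_port": dst_port, "protocol": 6, "tos": 0,
            "input_if": input_if, "length": length}


# One TCP connection seen at a router: two packets out, one packet back.
packets = [
    packet("192.0.2.1", 40000, "198.51.100.2", 80, 1, 60),
    packet("192.0.2.1", 40000, "198.51.100.2", 80, 1, 1500),
    packet("198.51.100.2", 80, "192.0.2.1", 40000, 2, 1500),
]
flows = aggregate(packets)
for key, (pkts, octets) in flows.items():
    print(key.src_ip, "->", key.dst_ip, pkts, "packets", octets, "octets")
```

The single TCP connection in the example produces two entries in `flows`, one per direction, because source and destination are part of the key.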
NetFlow Characteristics (2)
NetFlow Packet Version 5
• Packet Count
• Byte Count
• Source IP Address
• Destination IP Address
• Start sysUpTime
• End sysUpTime
• Source TCP/UDP Port
• Destination TCP/UDP Port
• Input ifIndex
• Output ifIndex
• Next Hop Address
• Source AS Number
• Destination AS Number
• Source Prefix Mask
• Destination Prefix Mask
• Type of Service
• TCP Flags
• Protocol
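The fields above map onto the fixed 48-byte flow record of a NetFlow version 5 export packet. The following Python sketch decodes one such record with the standard `struct` module. The field order follows the commonly published v5 record layout as I understand it; the `FIELDS` names, the `parse_v5_record` helper, and every value in the sample record other than the two IP addresses are assumptions chosen for illustration.

```python
import socket
import struct

# NetFlow v5 flow record, network byte order, 48 bytes.
# 'x' marks the pad bytes (pad1 after dst_port, pad2 at the end).
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHxBBBHHBBxx")

FIELDS = (
    "src_addr", "dst_addr", "next_hop",
    "input_if", "output_if",
    "packets", "octets",
    "first_uptime_ms", "last_uptime_ms",   # sysUpTime at flow start / end
    "src_port", "dst_port",
    "tcp_flags", "protocol", "tos",
    "src_as", "dst_as",
    "src_mask", "dst_mask",
)


def parse_v5_record(data: bytes) -> dict:
    """Decode one 48-byte v5 flow record into a dict keyed by FIELDS."""
    record = dict(zip(FIELDS, V5_RECORD.unpack(data)))
    for name in ("src_addr", "dst_addr", "next_hop"):
        record[name] = socket.inet_ntoa(record[name])
    return record


# Hypothetical record built by hand so the example is self-contained.
sample = V5_RECORD.pack(
    socket.inet_aton("129.105.21.229"), socket.inet_aton("198.49.208.10"),
    socket.inet_aton("0.0.0.0"),
    1, 2,             # input / output ifIndex
    1000, 1500000,    # packets, octets
    100, 5100,        # start / end sysUpTime in milliseconds
    40000, 80,        # source / destination port
    0x18, 6, 0,       # TCP flags, protocol (6 = TCP), ToS
    64500, 64501,     # source / destination AS (documentation AS numbers)
    16, 24,           # source / destination prefix mask length
)
rec = parse_v5_record(sample)
print(rec["src_addr"], "->", rec["dst_addr"], rec["octets"], "octets")
```

A real export datagram also carries a 24-byte header in front of up to 30 such records; the header is left out here to keep the sketch short.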
System Architecture
• Network Statistics System (Linux Cluster)
  • Collectors
  • Web Servers
  • Computing Nodes
  • Disk Storage
• Software Tools
  • Flow-Tools (OSU)
  • Perl
  • MySQL
• Data Flow Processing
  • Router sends NetFlow
  • Collectors scale up and store raw data
  • Cluster performs:
    • Intercloud filtering
    • Aggregation
    • Sorting
    • Truncation (Top 100)
  • SQL Store
  • Display Data
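The filter, aggregate, sort, and truncate steps performed on the cluster can be sketched in a few lines of Python. This is not the ESnet implementation (which the slides say uses flow-tools, Perl, and MySQL); the `top_as_pairs` function, the record layout, and the filter rule "keep flows with the local AS on one side" are assumptions standing in for the real "intercloud filtering", and the AS numbers come from the documentation range.

```python
from collections import Counter


def top_as_pairs(records, local_as, top_n=100):
    """Return the top_n (src_as, dst_as) pairs by octets, largest first."""
    totals = Counter()
    for rec in records:
        # Filtering (assumed rule): keep only traffic that enters or
        # leaves the local AS.
        if local_as not in (rec["src_as"], rec["dst_as"]):
            continue
        # Aggregation: sum octets per AS pair.
        totals[(rec["src_as"], rec["dst_as"])] += rec["octets"]
    # Sorting and truncation: most_common sorts by total, descending.
    return totals.most_common(top_n)


# Hypothetical flow records.
records = [
    {"src_as": 64500, "dst_as": 64501, "octets": 1000},
    {"src_as": 64501, "dst_as": 64500, "octets": 400},
    {"src_as": 64500, "dst_as": 64501, "octets": 500},
    {"src_as": 64502, "dst_as": 64503, "octets": 9999},  # not local, dropped
]
top = top_as_pairs(records, local_as=64500, top_n=2)
for (src_as, dst_as), octets in top:
    print(f"AS{src_as} -> AS{dst_as}: {octets} octets")
```

Truncating to a fixed top N before storing the result keeps the SQL tables small regardless of how many raw flows each router exports.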
Data Accuracy
• ESnet has a variety of router models from Cisco and Juniper. The two companies take different approaches to generating NetFlow information.
• Cisco
  • Conditions for end of a flow
    • end of TCP connection (RST/FIN)
    • traffic not seen on a flow for 15 seconds
    • 30 minutes after the flow starts
    • when the flow table fills
  • No sampling for models below the 12000
• Juniper
  • Statistical sampling per interface
• We used SNMP data to check the information obtained from NetFlow data
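The Cisco end-of-flow conditions listed above can be written as a small decision function. This Python sketch is a simplification under stated assumptions: the `expiry_reason` name and its parameters are hypothetical, the 15-second and 30-minute values are taken from the slide, FIN and RST are treated as the TCP end-of-connection signals, and a full table is modelled as expiring the flow being checked, whereas a real router picks which entries to age out.

```python
TCP_FIN = 0x01
TCP_RST = 0x04


def expiry_reason(now, first_seen, last_seen, tcp_flags=0, table_full=False,
                  inactive_timeout=15.0, active_timeout=30 * 60.0):
    """Return why a cached flow should be exported, or None to keep it.

    Times are in seconds; tcp_flags is the OR of flags seen on the flow.
    """
    if tcp_flags & (TCP_FIN | TCP_RST):
        return "tcp-end"          # connection closed or reset
    if now - last_seen >= inactive_timeout:
        return "inactive"         # no traffic for 15 seconds
    if now - first_seen >= active_timeout:
        return "active"           # flow has lasted 30 minutes
    if table_full:
        return "table-full"       # cache pressure forces an export
    return None


# A long transfer that is still sending traffic is cut at 30 minutes.
print(expiry_reason(now=1800.0, first_seen=0.0, last_seen=1799.0))
```

The active timeout matters for accuracy checks: one long transfer is reported as several flow records, so byte counts have to be summed across records before they are compared with SNMP interface counters.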
SNMP Comparison (Juniper)
SNMP Comparison (Cisco)
User Interface
• Long Term Analysis
  • Use data stored in SQL database
  • Trend analysis
• Short Term Analysis
  • Use raw data collected from routers
  • Network troubleshooting
Top Flows Screenshot - 1
Top Flows Screenshot - 2
Top Flows Screenshot - 3
Top Flows Screenshot - 4
Trouble-Shooting Example (1)
• Topology
  • FNAL CE -- GE --> FNAL-RT1 -- OC12 POS --> CHI-CR1
• Hypothesis
  • Traffic from FNAL GE connection (FNAL CE -> FNAL-RT1) was over-running OC12 POS (FNAL-RT1 -> CHI-CR1)
• Issue
  • Regular egress discards on OC12 POS between FNAL-RT1 router and CHI-CR1 router.
Trouble-Shooting Example (2)
• Flow Analysis
  • Isolate flows within discard time window
  • Mark time window by referencing “originating file”
  • Sort by “octets” field

#  --- ---- ---- Report Information --- --- ---
#
# Fields:    Total
# Symbols:   Disabled
# Sorting:   Descending Field 3
# Name:      Source/Destination IP
#
# Args:      flow-stat -f10 -S3
#
#
# src IPaddr     dst IPaddr       flows  octets      packets  originating file
#
129.105.21.229   198.49.208.10    193    1140264700  1014000  fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
129.105.21.229   198.49.208.10    174    1138227500  1014600  fnal-rt1.burst.2004-06-24.0120-2004-06-24.0125
198.49.208.10    129.105.21.229   196    1106719500  1114000  fnal-rt1.burst.2004-06-24.0120-2004-06-24.0125
129.105.21.229   198.49.208.10    175    1086035800  980500   fnal-rt1.burst.2004-06-23.1920-2004-06-23.1925
198.49.208.10    128.100.190.11   182    1085264900  980500   fnal-rt1.burst.2004-06-23.1920-2004-06-23.1925
198.49.208.10    128.100.190.11   213    1062479100  960000   fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
198.49.208.10    129.105.21.229   180    1051220800  1093500  fnal-rt1.burst.2004-06-23.1920-2004-06-23.1925
128.100.190.11   198.49.208.10    242    1012027800  842100   fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
198.49.208.10    128.100.190.11   206    1007483100  916300   fnal-rt1.burst.2004-06-24.0120-2004-06-24.0125
128.100.190.11   198.49.208.10    200    1001671900  842300   fnal-rt1.burst.2004-06-23.1920-2004-06-23.1925
128.100.190.11   198.49.208.10    231    989225200   817700   fnal-rt1.burst.2004-06-24.0120-2004-06-24.0125
198.49.208.10    129.105.21.229   211    957567200   1050100  fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
131.215.144.227  198.49.208.10    198    946292400   876500   fnal-rt1.burst.2004-06-23.2050-2004-06-23.2055
131.215.144.227  198.49.208.10    209    936021800   882900   fnal-rt1.burst.2004-06-24.0850-2004-06-24.0855
131.215.144.227  198.49.208.10    196    932688300   857700   fnal-rt1.burst.2004-06-24.0250-2004-06-24.0255
131.215.144.227  198.49.208.10    206    904774900   848500   fnal-rt1.burst.2004-06-24.0650-2004-06-24.0655
…

• Verification
  • Reroute 198.49.208.10 (dmzmon0.deemz.net) via an alternate route
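The flow-analysis steps (isolate a time window using the originating file name, then sort by octets) can be sketched in Python as a post-processing pass over report text. The helper names `parse_rows`, `file_window`, and `flows_in_window` are hypothetical, and the file-name layout `<router>.burst.<date>.<HHMM>-<date>.<HHMM>` is inferred from the listing on the slide rather than from any documented convention. The sample rows are copied from that listing.

```python
from datetime import datetime


def file_window(name):
    """Extract (start, end) datetimes from an originating file name."""
    parts = name.split(".")
    start = datetime.strptime(parts[-3] + " " + parts[-2][:4], "%Y-%m-%d %H%M")
    end = datetime.strptime(parts[-2][5:] + " " + parts[-1], "%Y-%m-%d %H%M")
    return start, end


def parse_rows(report):
    """Parse report lines, skipping '#' comment lines and blanks."""
    rows = []
    for line in report.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, dst, flows, octets, packets, origin = line.split()
        rows.append({"src": src, "dst": dst, "flows": int(flows),
                     "octets": int(octets), "packets": int(packets),
                     "file": origin})
    return rows


def flows_in_window(rows, start, end):
    """Rows whose capture file overlaps [start, end], largest octets first."""
    selected = []
    for row in rows:
        file_start, file_end = file_window(row["file"])
        if file_start < end and file_end > start:
            selected.append(row)
    return sorted(selected, key=lambda row: row["octets"], reverse=True)


report = """\
# src IPaddr dst IPaddr flows octets packets originating file
198.49.208.10 128.100.190.11 213 1062479100 960000 fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
129.105.21.229 198.49.208.10 174 1138227500 1014600 fnal-rt1.burst.2004-06-24.0120-2004-06-24.0125
129.105.21.229 198.49.208.10 193 1140264700 1014000 fnal-rt1.burst.2004-06-23.2120-2004-06-23.2125
131.215.144.227 198.49.208.10 198 946292400 876500 fnal-rt1.burst.2004-06-23.2050-2004-06-23.2055
"""
window = (datetime(2004, 6, 23, 21, 0), datetime(2004, 6, 23, 22, 0))
suspects = flows_in_window(parse_rows(report), *window)
for row in suspects:
    print(row["src"], "->", row["dst"], row["octets"], row["file"])
```

In the listing above, 198.49.208.10 appears as source or destination in every row, which is presumably why it was the host chosen for rerouting in the verification step.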
Present State of Development
• Porting application to Cluster
  • Some problems with the OS and Disk Array
• Testing Scalability of the System
  • Amount of disk space necessary per day to store data for all border routers
  • CPU and Memory necessary to process data
  • Other issues
• Developing a Web Interface to display the stored data
Small Flows Percentage
Flow Rate