1 / 16

CINBAD CERN/HP ProCurve Joint Project on Networking

CINBAD CERN/HP ProCurve Joint Project on Networking. 26 May 2009 Ryszard Erazm Jurga - CERN Milosz Marian Hulboj - CERN. Outline. Introduction to CERN network CINBAD project and its goals Data - sources, collection and analysis Results and conclusions.

marc
Télécharger la présentation

CINBAD CERN/HP ProCurve Joint Project on Networking

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CINBADCERN/HP ProCurve Joint Projecton Networking 26 May 2009 RyszardErazmJurga - CERN Milosz Marian Hulboj - CERN

  2. Outline • Introduction to CERN network • CINBAD project and its goals • Data - sources, collection and analysis • Results and conclusions

  3. Simplified overall CERN campus network topology Computer Center Vault Vault EXT Computer Center LCG GPN Farms 230x BB20 Firewall RB20 PG1 PL1 SL1 SG1 IB1 Internet BL7 RL7 40x BB16 BB17 50x RB16 RB17 PG2 SG2 IB2 Monitoring BB51 ZB51 BB52 ZB52 BB53 ZB53 ZB54 BB54 ZB50 BB50 ZB55 BB55 BB56 ZB56 AB51 AB52 AB53 RB54 AB54 RB50 AB50 RB55 AB55 AB56 Meyrin area Prevessin area TN RB51 RB52 RB53 RB56 BL5 2-S 10-1 40-S2 376-R 513-C 874-R 887-R 513-CCR 212 BT2 354 BT3 376 BT13 513 BT12 BT1 874 BT16 874 RL5 RT3 RT4 RT14 RT13 RT17 RT2 513-CCR BL4 513 874 RL4 100x BT15 or 874-CCC LHC area BL3 RL3 BL2 RL2 1000x RT16 BT4 RT5 BT5 RT6 BT6 RT7 BT7 RT8 BT8 RT9 BT9 RT10 BT10 RT11 BT11 RT12 BL1 RL1 TT1 TT2 TT3 TT4 TT5 TT6 TT7 Minor Starpoints 100x 2175 2275 2375 2475 2575 2675 2775 2875 TT8 CERN sites 21x 433x AG AG AG ATCN 15x Control 15x Control 10x Control 10x 15x TN 15x TN 10x TN 10x TN AG AG 8x AG CDR AG CDR AG CDR CDR AG AG HLT DAQ Tiers'1 ATLAS DAQ ALICE DAQ CMS LHCB DAQ 25x 90x 90x

  4. CINBAD codename deciphered CERN Investigation of Network Behaviour and Anomaly Detection Project Goal “To understand the behaviour of large computer networks (10’000+ nodes) in High Performance Computing or large Campus installations to be able to: Detect traffic anomalies in the system Be able to perform trend analysis Automatically take counter measures Provide post-mortem analysis facilities “

  5. What is anomaly? (1) • Anomalies are a fact in computer networks • Anomaly definition is very domain specific: • But there is a common denominator: • “Anomaly is a deviation of the system from the normal (expected) behaviour (baseline)” • “Normal behaviour (baseline) is not stationary and is not always easy to define” • “Anomalies are not necessarily easy to detect”

  6. What is anomaly? (2) • Just a few examples of anomalies: • The network infrastructure misuse • unauthorised DHCP/DNS server (either malicious or accidental) • network scans • worms and viruses • Violation of a local network/security policy • NAT, TOR usage (not allowed at CERN)

  7. CINBAD project principle data sources storage analysis collectors

  8. sFlow – main data source • Based on packet sampling (RFC 3176) • on average 1-out-of-N packet is sampled by an agent and sent to a collector • packet header and payload included (max 128 bytes) • switching/routing/transport protocol information • application protocol data (e.g. http, dns) • SNMP counters included • low CPU/memory requirements – scalable • For more details, see our technical report http://openlab-mu-internal.web.cern.ch/openlab-mu-internal/Documents/2_Technical_Documents/Technical_Reports/2007/RJ-MM_SamplingReport.pdf

  9. sFlow data usage • Typically: traffic accounting (e.g. for billing, network planning, SLA etc.) • Useful for post-mortem network analysis • CERN-wide network tcpdump • access to the whole CERN sampled network traffic • information where and when the traffic has been seen • particularly useful in detecting packets ‘that should not be there’ (policy violation) • Rare examples of sflow usage for anomaly detection

  10. Other data sources • Packet sampling data is not enough! • Data is partial, cannot provide 100% accuracy • Not always easy to identify the anomaly • More data to understand flow of data in the network • External sources provide useful information and time triggers • Correlation between various data sources • Example: • Central Antivirus Service at CERN

  11. CINBAD sFlow data collection • Current collection based on traffic from ~1000 switches • ~6000 sampled packets per second • ~ 3500 snmp counter sets per second

  12. Multistage data storage • Stage 1 • sFlow datagram tree-like format unpacked into CINBAD file format to enable fast direct access • Minimal space overhead introduced • Stage 2 • Oracle DB as a long-term storage • data aggregation • tradeoff between sFlow randomness and space, data lifetime and anomaly detection requirements • e.g. number of destination IPs for a given source IP

  13. Data analysis • Various approaches are being investigated • Statistical analysis methods • detect a change from “normal network behaviour” • selection of suitable metrics is needed • can detect new, unknown anomalies • poor anomaly type identification • Signature based • we ported SNORT to work with sampled data • performs well against know problems • tends to have low false positive rate • does not work against unknown anomalies

  14. Synergy from both detection techniques Translation Rules Sample-based SNORT evaluation engine Anomaly Alerts Incoming sflow stream Statistical Analysis engine New Signatures New baselines Traffic Profiles

  15. Current results • Campus and Internet traffic analysis • Identified anomalies which went undetected by the central CERN IDS • Detected number of misbehaviors • both statistical and pattern matching approaches used • TOR, DNS abuse, Trojans, worms, network scans, p2p applications, rogue DHCP servers, etc. • Findings reported to the CERN security team • Security team adapted their policies

  16. Achievements and next steps • It has been demonstrated that pattern matching for anomaly detection is possible with sflow data • Payload data is a key advantage of sflow • sflow allows distributed detection at the edge of the network • Combining statistical analysis with pattern matching is providing encouraging initial results • Integration of the two mechanisms holds promise for zero-day detection

More Related