
Internet Measurement Masterclass 2006


Presentation Transcript


  1. Internet Measurement Masterclass 2006 • 10:00 Session 1: Kick-off, problem space, thinking ahead, you and the law - Andrew Moore, Queen Mary, University of London • 11:00 Morning tea • 11:15 Session 2: Monitoring with Windows and how not to be deluged with data - Dinan Gunawardena, Microsoft Research Cambridge • 12:15 Hardware selection for monitoring - Fabian Schneider, TU Berlin • 12:45 Lunch, concurrently with Endace hardware demonstration • 13:45 Session 3: Netflow, and routing data as a source of measurement - Steve Uhlig, Delft University of Technology • 14:45 Afternoon tea • 15:00 Session 4: Statistics for the measurement community - Steven Gilmour, Queen Mary, University of London • 15:45 Wrap-up • 16:00 Beer / NGN ProgNet06 workshop starts

  2. Kick-off Andrew Moore Queen Mary, University of London www.dcs.qmul.ac.uk/~awm

  3. What we won’t cover • Active measurement (AMP, ping, traceroute, RTT, PlanetLab) • An exhaustive survey of current measurement research • I’m happy to offer opinions on these things in a break, but I am not an active-measurement expert; I don’t even play one on television.

  4. WHY Measure? • Measuring something helps you understand it, and few would dispute that the Internet is important enough to be worth understanding • “Good data outlives bad theory” - Jeff Dozier • “Measure what is measurable, and make measurable what is not.” - after Galileo

  5. Why? A non-exhaustive list • Measurements are inputs to • validate a model • drive a simulation • test a new approach • Measurements help understanding (fault-finding) • Measurements are often part of the accounting process

  6. Why so hard? Pick your (Endace¹) DAG board, plug it in and go. Right? • Data on the wire is not the only first-class measurement object • Hardware doesn’t work • Wrong measurements • Wrong interpretation • Wrong problem • Wrong. • Law • Layer 2 is not always accessible or monitor-able • Operations staff hate you   ¹Other monitoring boards are available

  7. Where should I start? • Ask WHY you are measuring. “Measure twice and cut once” is great for carpenters, but “Think (at least) twice and measure once” is better for us.

  8. Pick the right tool for the right job • Measurement of packets on a wire in your lab • Great for observing one specific use of one set of applications in one place in the Internet • Terrible for telling you how many mobile devices are used for IPTV in China, or the connectivity among the world’s ISPs, or …

  9. Uh-Oh • Who are you going to measure? 1 user? 1000 users? • When? (What time of day?) • Where? (Your personal machine? A campus? A country?) • How? • How long? A day? A week? A month? • What method are you going to use?

  10. Law (I am Not a Lawyer, and this is UK law) • If in doubt, seek out advice • Everything is illegal • Don’t ask a question you don’t want to know the answer to. • We care about • RIPA (interception) • DPA (personal-data storage) Many thanks to Richard Clayton and Andrew Cormack

  11. Data Protection Act 1998 • The overriding aim is to protect the interests of (and avoid risks to) the Data Subject • Data processing must comply with the eight principles (as interpreted by the regulator) • All data controllers must “notify” (£35) the Information Commissioner (unless exempt) • Exceptions for “private use” and “basic business purpose”: see the website

  12. Data Protection Act 1998 • Principle 7 is especially relevant • “Appropriate technical and organisational measures shall be taken against unauthorised or unlawful processing of personal data and against accidental loss or destruction of, or damage to, personal data” • The Information Commissioner advises that a risk-based approach should be taken in determining what measures are appropriate • Management and organisational measures are as important as technical ones • Pay attention to data over its entire lifetime

  13. RIP Act 2000 • Part I, Chapter I: interception • Part I, Chapter II: communications data • Part II: surveillance and informers • Part III: encryption • not as relevant for this • Part IV: oversight • sets up the tribunal and the Interception Commissioner

  14. RIP Act 2000 - Interception • Tapping a telephone (or copying an email) is “interception”. It must be authorized by a warrant signed by the Secretary of State. • SoS means the Home Secretary (or similar); delegation of the power is temporary. Intercept product is not admissible in court • Some sensible exceptions exist • Delivered data • Stored data that can be accessed by the production of an order • Techies running a network • “Lawful business practice”

  15. Lawful Business Practice • Regulations prescribe how not to commit an offence under the RIP act. They do not specify how to avoid problems with DPA (or other legislation) • Must make all reasonable efforts to tell all users of system that interception may occur

  16. Law One-slider • If in doubt - ask someone! • Why do you want to do this? • bare minimum, no “data for data’s sake” • the onus is on you at all times to justify what you are doing • Unless you enjoy keeping the DPA happy, don’t keep any personal identifiers • Use your University ethics committee I am NOT a Lawyer!

  17. (Good) Measurement Principles • Check your methodology • Keep all Meta-data • Calibrate your experiments • Automate all processing • it’s a documentation trail • cache those intermediate results; they tell you where you went wrong • Visualize your data at every stage • this helps ensure you didn’t goof

  18. Check your Methodology • Talk to people around you, find a mentor and even an antagonist • Better they find something wrong than the external examiner or the reviewers of the paper • Consider the scope of a reasonable measurement and the claims you can make

  19. Meta-Data • the filter you used on tcpdump is meta-data • your methodology is meta-data • the day/time of the week is meta-data • the hardware you used is meta-data • (possibly) how much alcohol was in your bloodstream is meta-data Keep it all
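
To make this concrete, here is a minimal Python sketch of keeping meta-data with the trace rather than in your head: it writes a JSON “sidecar” file next to a capture recording the filter, the capture command, the host and the time. The file layout, field names and example values are illustrative assumptions; record whatever your methodology actually needs.

```python
# sidecar.py - a minimal sketch: keep capture meta-data next to the trace.
# File name, field names and the example values are illustrative only.
import json
import platform
import time

def write_sidecar(trace_file, capture_cmd, bpf_filter, notes=""):
    """Write trace_file + '.meta.json' describing how the trace was taken."""
    meta = {
        "trace_file": trace_file,
        "capture_cmd": capture_cmd,          # exactly what was typed
        "bpf_filter": bpf_filter,            # the tcpdump filter IS meta-data
        "started_utc": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "hostname": platform.node(),         # which monitor this was
        "platform": platform.platform(),     # OS / kernel environment
        "notes": notes,                      # methodology, link type, oddities
    }
    with open(trace_file + ".meta.json", "w") as f:
        json.dump(meta, f, indent=2)

if __name__ == "__main__":
    write_sidecar("campus-2006-03-14.pcap",
                  "tcpdump -i eth1 -w campus-2006-03-14.pcap 'tcp port 80'",
                  "tcp port 80",
                  notes="SPAN port on the edge switch; term-time weekday")
```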

  20. Calibrate your experiments • Test your assumptions • (you’ve been assuming the network is busiest at midday? This is the moment you find out that 3:30 is the busy time) • “bench-test” your setup; this is just good science • test your processing scripts many (many) times • Most departments do not have good test equipment, but this is no excuse
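
As an illustration of the “test your processing scripts” point, here is a minimal Python sketch: a busiest-hour computation is fed synthetic records with a known answer before it is ever trusted on a real trace. The record format (timestamp, byte count) and the function names are assumptions for the sake of the example.

```python
# calibrate.py - a minimal sketch: bench-test a processing script against
# synthetic input with a known answer before running it on real traces.
# The record format (epoch_seconds, byte_count) is an assumption.
import time
from collections import Counter

def busiest_hours(records):
    """Sum bytes per local hour-of-day; busiest hour first."""
    per_hour = Counter()
    for ts, nbytes in records:
        per_hour[time.localtime(ts).tm_hour] += nbytes
    return per_hour.most_common()

def test_busiest_hours():
    # Synthetic day in which 15:30 clearly dominates: the script must say so.
    mk = lambda h, m: time.mktime((2006, 3, 14, h, m, 0, 0, 0, -1))
    records = [(mk(12, 0), 1000), (mk(15, 30), 9000), (mk(3, 15), 500)]
    assert busiest_hours(records)[0][0] == 15, "processing script is wrong"

if __name__ == "__main__":
    test_busiest_hours()
    print("self-test passed - now run it on the real trace (many times)")
```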

  21. Automate your processing • Make is your friend • intermediate results (and the scripts/code that produced them) are more meta-data • critical when you want to reproduce your results (and have others reproduce them)
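
Make itself is the obvious tool here; purely to illustrate the idea in the same language as the other sketches, below is a make-like sketch in Python that reruns a stage only when its input is newer than its cached output, so the intermediate results and the script itself become the documentation trail. The stage commands (extract_flows.py, summarise.py) are hypothetical placeholders.

```python
# pipeline.py - a make-like sketch: rerun a stage only when its input is newer
# than its cached output. Stage names and commands are placeholders.
import os
import subprocess

def stale(target, source):
    """True if target is missing or older than source."""
    return (not os.path.exists(target)
            or os.path.getmtime(target) < os.path.getmtime(source))

def run_stage(source, target, cmd):
    if stale(target, source):
        print(f"rebuilding {target}")
        subprocess.run(cmd, shell=True, check=True)
    else:
        print(f"{target} is up to date (cached intermediate result)")

if __name__ == "__main__":
    # trace.pcap -> flows.txt -> summary.txt  (commands are hypothetical)
    run_stage("trace.pcap", "flows.txt",
              "python extract_flows.py trace.pcap > flows.txt")
    run_stage("flows.txt", "summary.txt",
              "python summarise.py flows.txt > summary.txt")
```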

  22. Visualize your data • visualize your data early and often • scatter plots are always useful • identify/understand those outliers now • problem? or expected result?
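
As an example of the “early and often” scatter plot, here is a minimal Python/matplotlib sketch that plots packet length against arrival time. The input format (a text file of “timestamp length” pairs) is an assumption; produce it however your capture pipeline allows, e.g. by post-processing tcpdump output.

```python
# scatter.py - a minimal sketch: scatter-plot packet length against arrival
# time to eyeball outliers early. Assumes a text file of "epoch_seconds length"
# pairs on stdin-style lines; adapt the parsing to your own records.
import sys
import matplotlib.pyplot as plt

times, sizes = [], []
with open(sys.argv[1]) as f:
    for line in f:
        ts, length = line.split()[:2]
        times.append(float(ts))
        sizes.append(int(length))

t0 = times[0] if times else 0
plt.scatter([t - t0 for t in times], sizes, s=2)
plt.xlabel("seconds since start of trace")
plt.ylabel("packet length (bytes)")
plt.title("Is that cluster expected, or a problem?")
plt.show()
```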

  23. My first network monitor • configurations • monitor and method • gotcha • backhaul network • storage, archive, index

  24. Configuration • Hardware selection • How are you going to remote-admin this machine? • OS / software selection • Much work is in the Unix domain; that doesn’t make it good work (more from Dinan) • tcpdump/pcap is standard, with lots of tools • but it is not fast, it is loss- and error-prone, and the timestamps are junk • divorce the data representation from the method • tcpdump is a useful offline tool, but dagtools, CoMo and others (nprobe, etc.) are simply better online • consider the right tool for the task
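
One way to read “divorce the data representation from the method”: whatever captured the trace, convert it once into a trivial, tool-independent record format and analyse that. Below is a minimal Python sketch that turns a classic (microsecond) libpcap file into “timestamp,caplen,wirelen” lines; it is a sketch of the idea under that assumption, not a substitute for dagtools, CoMo or the like.

```python
# pcap2csv.py - a minimal sketch: whatever wrote the pcap (tcpdump, a DAG
# conversion, ...), downstream analysis only ever sees plain
# "timestamp,caplen,wirelen" records. Classic microsecond libpcap files only.
import struct
import sys

def pcap_records(path):
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic == b"\xd4\xc3\xb2\xa1":      # little-endian, microseconds
            endian = "<"
        elif magic == b"\xa1\xb2\xc3\xd4":    # big-endian, microseconds
            endian = ">"
        else:
            raise ValueError("not a classic libpcap file")
        f.read(20)                            # skip the rest of the global header
        while True:
            hdr = f.read(16)                  # per-packet record header
            if len(hdr) < 16:
                break
            sec, usec, caplen, wirelen = struct.unpack(endian + "IIII", hdr)
            f.read(caplen)                    # skip the packet bytes themselves
            yield sec + usec / 1e6, caplen, wirelen

if __name__ == "__main__":
    for ts, caplen, wirelen in pcap_records(sys.argv[1]):
        print(f"{ts:.6f},{caplen},{wirelen}")
```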

  25. Hardware (getting the traffic) • Passive taps • invasive installation • no impact in operation • “stealing photons” • Port Mirrors (e.g. Cisco SPAN) • be vewy vewy careful. • jitter, loss, reordering • fantastic for multiple/redundant links • multiple copies of packets

  26. Hardware 2 • Remember about physical layers? • Observing traffic at end systems is pretty easy (but imposes an overhead) • intermediate networks may not be trivial to monitor: • Packet over Ethernet and Packet over SONET are not the only possibilities • aside from weird layer-2s, the link may be encrypted

  27. Getting the data to somewhere useful • Out of Band backhaul • Co-schedule Measurements • FedEx the disks (realistically - postgrad-u-haul) • Co-locate storage/processing • storage & processing = heat/power • Dedicated backhaul e.g. using (a piece of) the dedicated research net

  28. Tools • tcpdump (libpcap) - but know the limitations • no record of loss • microsecond accuracy only - and RARELY that • simultaneous arrival times are possible • no record of precision or accuracy or filter or conditions or monitor-circumstance or equipment failure or … • gnuplot (or any plotting package) - scatter plots are always useful (combined with eye-squared)
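
Since libpcap will not flag these problems itself, it is worth checking for them: a minimal Python sketch that counts duplicate and backwards-going timestamps in a trace. It assumes one “timestamp,caplen,wirelen” record per line (for example the CSV produced by the sketch after slide 24); adapt the parsing to your own format.

```python
# ts_sanity.py - a minimal sketch of a sanity check libpcap will not do for
# you: flag duplicate and backwards-going timestamps in a trace.
# Assumes one "timestamp,caplen,wirelen" record per line on stdin.
import sys

def check_timestamps(lines):
    prev = None
    dupes = backwards = total = 0
    for line in lines:
        ts = float(line.split(",")[0])
        total += 1
        if prev is not None:
            if ts == prev:
                dupes += 1          # "simultaneous" arrivals: resolution ran out
            elif ts < prev:
                backwards += 1      # clock stepped, or records were reordered
        prev = ts
    return total, dupes, backwards

if __name__ == "__main__":
    total, dupes, backwards = check_timestamps(sys.stdin)
    print(f"{total} packets: {dupes} duplicate timestamps, "
          f"{backwards} backwards steps")
```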

  29. Sharing / Providing access to the data • Law may prevent access • Either need to control who gets data OR • Ship code to the monitor (Mogul et al., MineNet 2005/6) • One platform: CoMo http://como.sourceforge.net

  30. These guys do run the Internet (or why I should be nice to my ops guys) • Looking for a real problem? • Wondering about actual impact? • Talk to your front line • Sysadmins and operators are the front line • They are rarely stupid • They don’t have the time to “think outside the box” • they will be honest with you (brutally honest in most cases) • www.nanog.org • www.ripe.org

  31. Next… • Let’s examine hardware and operating-system issues, specifically: • Windows: the other operating system • Data management: how to prevent success-disaster • So you want to monitor 10Gbps?

  32. Suppliers • NetOptics - fibre splitters • Endace - capture hardware

  33. UK specific resources • Janet’s NDA and AUP: http://www.ja.net/development/traffic-data/ • Data Protection Act: http://www.hmso.gov.uk/acts/acts1998/19980029.htm • RIPA http://www.legislation.hmso.gov.uk/acts/acts2000/20000023.htm

  34. Specific references • Mark Crovella & Bala Krishnamurthy, Internet Measurement, Wiley, 2006 • Walter Willinger, A Pragmatic Approach to Dealing with High Variability, IMC 2004 • Vern Paxson, Strategies for Sound Internet Measurement, IMC 2004 • Two very early “what I did with my measurements” papers; these papers grandparent much Internet measurement work: • kc claffy et al., A Parameterizable Methodology for Internet Traffic Flow Profiling, IEEE JSAC, 1995 • V. Paxson, End-to-End Routing Behavior in the Internet, IEEE/ACM Transactions on Networking, Vol. 5, No. 5, pp. 601-615, October 1997
