
  Evaluation of Header Field Entropy for Hash-Based Packet Selection

  Christian Henke, Carsten Schmoll, Tanja Zseby. Fraunhofer Institute FOKUS, Berlin, Germany. Outline: Introduction Multipoint Sampling, Problem Statement, Approach, Measurement Setup, Measurement Results, Conclusion.


Presentation Transcript


  1. Evaluation of Header Field Entropy for Hash-Based Packet Selection. Christian Henke, Carsten Schmoll, Tanja Zseby, Fraunhofer Institute FOKUS, Berlin, Germany

  2. Outline • Introduction Multipoint Sampling • Problem Statement • Approach • Measurement Setup • Measurement Results • Conclusion

  3. Introduction Multipoint Sampling: Passive Multipoint Measurements • at each observation point, a packet ID and a timestamp are exported for every packet • a packet's trace is observable from the points at which its packet ID occurs • delay = timestamp A - timestamp B for packets with equal ID [Figure: observation points A, B and C exporting to a multipoint collector]
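
As a minimal sketch of the delay rule on this slide, the snippet below matches records from two observation points by packet ID and subtracts the timestamps. The record layout (pairs of packet ID and integer timestamp) and the function name `delays_between` are assumptions made for illustration; the slides do not specify an export format.

```python
def delays_between(records_a, records_b):
    """Return {packet_id: timestamp_a - timestamp_b} for IDs seen at both points.

    records_a, records_b: iterables of (packet_id, timestamp) pairs, one per
    observation point (hypothetical layout, timestamps in any integer unit).
    """
    ts_b_by_id = dict(records_b)
    return {
        pid: ts_a - ts_b_by_id[pid]
        for pid, ts_a in records_a
        if pid in ts_b_by_id  # packets seen at only one point are skipped
    }


# Example: timestamps in microseconds; packet 0xC3 was only seen at point B.
point_a = [(0xA1, 1050), (0xB2, 2075)]
point_b = [(0xA1, 1000), (0xB2, 2000), (0xC3, 3000)]
delays = delays_between(point_a, point_b)
```

The sign of the result depends on the direction of travel: with the slide's convention (A minus B), the delay is positive when the packet passes B first.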

  4. Introduction Multipoint Sampling: Challenge in Passive Multipoint Measurements • immense amounts of measurement data • high infrastructure costs: processing, storing, exporting. Random Packet Selection and Estimation • random sampling (n-out-of-N, probabilistic) is unsuitable because it yields inconsistent samples across observation points • Duffield and Grossglauser propose hash-based packet selection in "Trajectory Sampling for Direct Traffic Observation".

  5. Introduction Multipoint Sampling: Hash-Based Packet Selection [Figure: parts of the IP header, transport header and payload form the hash input x; the hash function h is applied to x; depending on the hash value, the packet is selected or not selected] • the selected subset is consistent if x, h and S are equal at all observation points
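
The selection rule on this slide can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: CRC32 from Python's `zlib` stands in for the hash function h, the selection range S is taken to be all hash values below a threshold derived from a target fraction, and the names `is_selected` and `select_packets` are hypothetical.

```python
import zlib


def is_selected(hash_input: bytes, fraction: float) -> bool:
    """Select a packet if its hash value falls in the range [0, fraction * 2**32)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    hash_value = zlib.crc32(hash_input)  # stand-in hash, unsigned 32-bit result
    return hash_value < int(fraction * 2**32)


def select_packets(hash_inputs, fraction):
    """Return the hash inputs of the packets that would be selected."""
    return [x for x in hash_inputs if is_selected(x, fraction)]


# Two observation points that see the same packets and use the same
# hash input, hash function and selection range pick the same subset.
seen = [bytes([i, i + 1, i + 2]) for i in range(50)]
subset_at_a = select_packets(seen, 0.25)
subset_at_b = select_packets(list(seen), 0.25)
```

Because the decision depends only on the packet content, no coordination between observation points is needed beyond agreeing on the configuration.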

  6. Problem Statement: Which packet content to use as hash input? Requirements for header fields • static between network nodes (which rules out IP TTL and checksum) • variable among packets. Challenge • hash-based selection (HBS) is deterministic, but the goal is to emulate random selection • the choice of hash input can introduce bias into the selection

  7. Problem Statement: How bias is introduced • packets in a hash input collision have the same hash input • their selection decisions are therefore not independent • the more packets in a collision, the more severe the bias • using the whole packet is unsuitable because hash calculation time increases with hash input length

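  8. Approach • packets differ more often in highly variable bytes • entropy per byte is used to measure variability. Entropy: $H(B) = -\sum_i p_i \log_2 p_i$. Information efficiency: $H(B) / H_{\max}$, with $H_{\max} = 8$ bits for a single byte. Here $p_i$ is the probability that byte value $i$ occurs, and $H(B)$ is the entropy of the discrete random variable $B$ of byte values.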
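
A minimal sketch of the per-byte measurement: for each byte position, the Shannon entropy $H = -\sum_i p_i \log_2 p_i$ is computed over the values observed at that position across packets, and the information efficiency is taken as the entropy divided by the 8-bit maximum. The function names and the way headers are passed in (as equal-length `bytes` objects) are assumptions for illustration.

```python
import math
from collections import Counter


def byte_entropy(values):
    """Shannon entropy in bits of a sequence of byte values (0..255)."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_efficiency(values):
    """Entropy relative to the maximum of 8 bits for one byte."""
    return byte_entropy(values) / 8.0


def entropy_per_position(headers, length):
    """Entropy of each of the first `length` byte positions across all headers."""
    return [byte_entropy([h[i] for h in headers]) for i in range(length)]


# Four toy "headers": byte 0 is constant, byte 1 takes four distinct values.
headers = [bytes([0x45, v]) for v in (0, 1, 2, 3)]
per_position = entropy_per_position(headers, 2)
```

A constant byte yields 0 bits, while a byte that is uniform over all 256 values would reach the maximum of 8 bits, that is, an information efficiency of 1.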

  9. Measurement Setup: The evaluation depends on the analyzed traces • 6 IPv4 trace groups, 1 IPv6 • geographical locations (NZ, AUT, FR, NED – 2 LEO) • network location (university, peering point, large ISP) • application mix

  10. Measurement Results: Entropy IPv4 [Figure: entropy of IPv4 header bytes]

  11. Measurement Results: High-Entropy Header Fields • IPv4: Identification, Length LSB, Src/Dst Address 2 LSB • TCP: Checksum, SeqNo, AckNo, Src/Dst Port 2 LSB • UDP: Checksum, Length LSB, Src/Dst Port 2 LSB • ICMP: Checksum, Bytes 12, 13, 18, 19 • IPv6: Length LSB; more IPv6 traces are required for further evaluation, because addresses were anonymized and no transport header was available, so only 8 bytes could be evaluated. Recommended 8-Byte Configuration: IP ID field + 6 transport header bytes • TCP: Checksum, 2 LSB of SeqNo and AckNo • UDP: Checksum, Source Port, LSB of Destination Port, LSB of Length • ICMP: Checksum, Bytes 12, 13, 18, 19

  12. Measurement Results: Empirical Hash Input Collision Evaluation • 4 configurations used: whole IP and transport header (minimum reachable collisions); only IP header (bad configuration); 8 high-entropy bytes; Molina's 16 bytes • metric: sum of packets in the 20 largest collisions of each trace • large collisions: all-or-none decision for all packets that have the same attributes • small collisions: packets equal in one collision but different between
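
The collision metric named on this slide (sum of packets in the 20 largest collisions of a trace) can be sketched as follows. It assumes that a collision is any group of two or more packets with an identical hash input; the function name `packets_in_largest_collisions` is hypothetical.

```python
from collections import Counter


def packets_in_largest_collisions(hash_inputs, top=20):
    """Sum the sizes of the `top` largest groups of packets sharing a hash input."""
    group_sizes = Counter(hash_inputs).values()
    collisions = [size for size in group_sizes if size > 1]  # unique inputs do not collide
    return sum(sorted(collisions, reverse=True)[:top])


# Toy trace: three colliding groups (5, 3 and 2 packets) and one unique packet.
trace = [b"a"] * 5 + [b"b"] * 3 + [b"c"] + [b"d"] * 2
total_top20 = packets_in_largest_collisions(trace)
total_top2 = packets_in_largest_collisions(trace, top=2)
```

Running this once per hash input configuration on the same trace gives directly comparable numbers, with lower values indicating fewer packets whose selection decisions are tied together.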

  13. Measurement Results: Hash Input Collision Comparison • the recommended 8 bytes perform better than Molina's 16 bytes • LEO2 traces include a large VPN traffic flow with UDP checksum == 0, so more high-entropy bytes should be used in such cases

  14. Conclusion: Outcome • recommendation of 8 bytes for use as hash input for HBS • the 8 recommended bytes are sufficient to obtain unique hash inputs. Henke, Schmoll, Zseby, "Empirical Evaluation of Hash Functions for Multipoint Measurements" • hash calculation time increases linearly with input length • hash functions are able to select a representative subset based on 8 bytes

  15. Future Work: Correlation between bytes • correlation between address bytes • entropy of combined bytes expected to be average of entropy. IPv6 • entropy evaluation of IPv6 addresses • transport headers
