Explore barriers and solutions for flexible, high-speed networking, including software and hardware problems, language barriers, and network speed challenges. Learn about FFPF toolkits, kernel structures, and distributed processing. Enhance network monitoring capabilities with dynamic port applications and efficient packet processing techniques.
OS support for (flexible) high-speed networking
Herbert Bos, Vrije Universiteit Amsterdam
[Diagram labels: uspace (u), kspace (k), nspace (n)]
we’re hurting because of
• hardware problems
• software problems
• language problems
• lack of flexibility
• network speed may grow too fast anyway
• multiple apps problem
[Diagram labels: CPU, MEM, NIC]
What are we building?
[Diagram labels: different barriers, flows]
What are we building?
[Diagram labels: uspace, kernel]

    if (pkt.proto == TCP) then
        if (http) then
            scan webattacks;
        else if (smtp)
            scan spam; scan viruses
        fi
    else if (pkt.proto == UDP) then
        if (NFS traffic)
            mem = statistics (pkt);
        else
            scan SLAMMER;
    fi

• reconfigure when load changes
• dynamic optical infrastructure:
  - configure at startup time
  - construct datapaths
reduce copying
• FFPF avoids both ‘horizontal’ and ‘vertical’ copies
  - no ‘vertical’ copies
  - no ‘horizontal’ copies within flow group
  - more than ‘just filtering’ in kernel (e.g., statistics)
[Diagram labels: Application A, Application B, ‘filter’, U, K]
• minimise copying, context switching
• flowgraph = DAG
• different copying regimes
• endianness
• new languages
• combine multiple requests
• control dataplane
• different buffer regimes
Software structure
• applications
• FFPF toolkit
• MAPI
• FFPF userspace
• libpcap
• userspace-kernel boundary
• FFPF kernelspace
• host-NIC boundary
• FFPF ixp2xxx
current status
• working for Linux PC with
  - NICs
  - IXP1200s and IXP2400s (plugged in)
  - remote IXP2850 (with poetic license)
• rudimentary support for distributed packet processor
• speed comparable to tailored solutions
Thank you! http://ffpf.sourceforge.net/
Three flavours
• Three flavours of packet processing on IXP & host
• Regular
  - copy only when needed
  - may be slow depending on access pattern
• Zero copy
  - never copy
  - may be slow depending on access pattern
• Copy once
  - copy always
combining filters
(dev,expr=/dev/ixp0) > [ [ (BPF,expr="…") > (PKTCOUNT) ] | (FPL2,expr="…") ] > (PKTCOUNT)
Example 1: writing applications

    #include <stdio.h>
    #include <string.h>
    /* plus the FFPF header declaring ffpf_open() etc. (its name is not shown on the slide) */

    int main(int argc, char **argv)
    {
        void *opaque_handle;
        int count, returnval;

        if (argc < 2)
            return -1;

        /* pass the string to the subsystem; it initialises and returns an opaque pointer */
        opaque_handle = ffpf_open(argv[1], strlen(argv[1]));

        if (start_listener(asynchronous_listener, NULL)) {
            fprintf(stderr, "spawn failed\n");
            ffpf_close(opaque_handle);
            return -1;
        }

        count = ffpf_process(FFPF_PROCESS_FOREVER);
        stop_listener(1);
        returnval = ffpf_close(opaque_handle);
        printf("processed %d packets\n", count);
        return returnval;
    }
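Example 1: writing applications (continued)

    /* exported[] and export_len are defined elsewhere; they are not shown on the slide */
    void *asynchronous_listener(void *arg)
    {
        int i;
        void *pkt;   /* packet type is not shown on the slide; an opaque pointer is assumed */

        (void)arg;
        while (!should_stop()) {
            sleep(5);
            for (i = 0; i < export_len; i++) {
                printf("flowgrabber %d: read %d, written %d\n", i,
                       get_stats_read(exported[i]),
                       get_stats_processed(exported[i]));
                /* drain every packet currently available for this flow */
                while ((pkt = get_packet_next(exported[i]))) {
                    dump_pkt(pkt);
                }
            }
        }
        return NULL;
    }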
Example 2: FPL-2
• new pkt processing language: FPL-2
  - for IXP, kernel and userspace
  - simple, efficient and flexible
• simple example: filter all webtraffic

    IF ( (PKT.IP_PROTO == PROTO_TCP) && (PKT.TCP_PORT == 80)) THEN RETURN 1;

• more complex example: count pkts in all TCP flows

    IF (PKT.IP_PROTO == PROTO_TCP) THEN
        R[0] = Hash[ 14, 12, 1024];
        M[ R[0] ]++;
    FI
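For readers more at home in C, the flow-counting example can be sketched as follows. This is only an illustration under stated assumptions: the three Hash arguments are read as byte offset, length, and table size; FNV-1a stands in for whatever hash FPL-2 really uses; and the names hash_bytes, count_tcp_packet, and counters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE      1024u
#define HASH_OFFSET     14u   /* assumed meaning of the first Hash argument  */
#define HASH_LENGTH     12u   /* assumed meaning of the second Hash argument */
#define IP_PROTO_OFFSET 23u   /* Ethernet header (14) + IPv4 protocol field (9) */
#define PROTO_TCP       6u

static uint32_t counters[TABLE_SIZE];   /* plays the role of M[] */

/* FNV-1a over len bytes starting at offset, reduced modulo buckets.
 * FNV-1a is a stand-in; the slides do not name the hash function. */
static uint32_t hash_bytes(const uint8_t *pkt, size_t offset, size_t len,
                           uint32_t buckets)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= pkt[offset + i];
        h *= 16777619u;
    }
    return h % buckets;
}

/* Counts the packet if it is TCP. Returns the counter slot used,
 * or -1 if the packet is too short or not TCP. */
static int count_tcp_packet(const uint8_t *pkt, size_t pkt_len)
{
    assert(pkt != NULL);
    if (pkt_len < HASH_OFFSET + HASH_LENGTH)
        return -1;
    if (pkt[IP_PROTO_OFFSET] != PROTO_TCP)
        return -1;
    uint32_t slot = hash_bytes(pkt, HASH_OFFSET, HASH_LENGTH, TABLE_SIZE);
    counters[slot]++;
    return (int)slot;
}
```

Calling count_tcp_packet once per received frame leaves a per-bucket packet count in counters, which is the persistent state the FPL-2 version keeps in M[].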
FPL-2
• all common arithmetic and bitwise operations
• all common logical ops
• all common integer types
  - for packet
  - for buffer (useful for keeping state!)
• statements
  - Hash
  - External functions
    - to call hardware implementations
    - to call fast C implementations
  - If … then … else
  - For … break; … Rof
  - Return
Example application: dynamic ports

    // R[0] stores no. of dynports found (initially 0)
    IF (PKT.IP_PROTO==PROTO_TCP) THEN
        IF (PKT.TCP_DPORT==554) THEN
            M[R[0]]=EXTERN("GetDynTCPDPortFromRTSP",0,0);
            R[0]++;
        ELSE
            // compare pkt’s dst port to all ports in array – if match, return pkt
            FOR (R[1]=0; R[1] < R[0]; R[1]++)
                IF (PKT.TCP_DPORT == M[ R[1] ] ) THEN
                    RETURN TRUE;
                FI
            ROF
        FI
    FI
    RETURN FALSE;
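The same logic can be sketched in C to make the control flow explicit. The names below (dynport_state, dynport_filter, MAX_DYN_PORTS) are hypothetical, and the rtsp_port argument merely stands in for the result of the external function GetDynTCPDPortFromRTSP, whose implementation the slides do not show. The bound check on the array is an addition that the FPL-2 text does not contain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DYN_PORTS 64    /* hypothetical capacity */
#define RTSP_PORT     554
#define PROTO_TCP     6

struct dynport_state {
    uint16_t ports[MAX_DYN_PORTS];  /* plays the role of M[]  */
    size_t   count;                 /* plays the role of R[0] */
};

/* Returns true if the packet belongs to a dynamically negotiated port.
 * rtsp_port is only consulted for RTSP control packets (dport 554) and
 * stands in for the value the external function would extract. */
static bool dynport_filter(struct dynport_state *st, uint8_t ip_proto,
                           uint16_t tcp_dport, uint16_t rtsp_port)
{
    assert(st != NULL);
    if (ip_proto != PROTO_TCP)
        return false;

    if (tcp_dport == RTSP_PORT) {
        /* remember the newly negotiated port (if there is room) */
        if (st->count < MAX_DYN_PORTS)
            st->ports[st->count++] = rtsp_port;
        return false;
    }

    /* compare the destination port to every port learned so far */
    for (size_t i = 0; i < st->count; i++) {
        if (st->ports[i] == tcp_dport)
            return true;
    }
    return false;
}
```

A packet to an unknown port is rejected until an RTSP packet has announced that port; after that, packets to the same port are accepted.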
efficient
• reduced copying and context switches
• sharing data
• flowgraphs: sharing computations
“push filtering tasks as far down the processing hierarchy as possible”
[Diagram labels: userspace ?, kernel ?, network card ?]
Network Monitoring
• Increasingly important
  - traffic characterisation, security, traffic engineering, SLAs, billing, etc.
• Existing solutions:
  - designed for slow networks or traffic engineering/QoS
  - not very flexible
• We’re hurting because of
  - hardware (bus, memory)
  - software (copies, context switches)
demand for solution:
- scales to high link rates
- scales in no. of apps
- flexible
approach:
- process at lowest possible level
- minimise copying
- minimise context switching
- freedom at the bottom
[Figure: spread of SAPPHIRE in 30 minutes]
generalised notion of flow
Flow: “a stream of packets that match arbitrary user criteria”
Flowgraph
[Diagram labels: eth0, IP, TCP, UDP, TCP SYN, HTTP, RTSP, RTP, UDP with CodeRed, “contains worm”, UID 0, bytecount, U]
Extensible
• modular framework
• language agnostic
• plug-in filters

(device,eth0) -> (sampler,2) -> (BPF,"..") -> (packetcount)
(device,eth0) | (device,eth1) -> (sampler,2) -> (FPL-2,"..") | (BPF,"..") -> (bytecount)
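One way to picture such a chain of plug-in filters is as an array of stage functions, each deciding whether a packet flows on. The sketch below is not FFPF code: the types and function names are hypothetical, and it assumes that (sampler,2) means "pass every second packet", which the slide does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct packet {
    const uint8_t *data;
    size_t         len;
};

/* A stage returns true if the packet should continue down the chain. */
typedef bool (*stage_fn)(void *state, const struct packet *pkt);

struct stage {
    stage_fn fn;
    void    *state;   /* per-stage private state */
};

struct sampler {
    unsigned n;       /* pass one packet in every n (assumed semantics) */
    unsigned seen;
};

static bool sampler_stage(void *state, const struct packet *pkt)
{
    struct sampler *s = state;
    (void)pkt;
    s->seen++;
    return s->seen % s->n == 0;
}

static bool packetcount_stage(void *state, const struct packet *pkt)
{
    unsigned long *count = state;
    (void)pkt;
    (*count)++;
    return true;
}

/* Runs the packet through the stages in order; stops at the first
 * stage that rejects it. Returns true if every stage accepted. */
static bool run_chain(const struct stage *stages, size_t n_stages,
                      const struct packet *pkt)
{
    assert(stages != NULL && pkt != NULL);
    for (size_t i = 0; i < n_stages; i++) {
        if (!stages[i].fn(stages[i].state, pkt))
            return false;
    }
    return true;
}
```

Because every stage shares one signature, a filter written in another language only needs a wrapper of this shape to be plugged in, which is roughly what "language agnostic" asks for.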
Buffers
• PacketBuf
  - circular buffer with N fixed-size slots
  - large enough to hold packet
• IndexBuf
  - circular buffer with N slots
  - contains classification result + pointer
[Diagram: circular buffer of slots with read (R) and write (W) positions]
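The two buffers can be sketched as plain C structures. This is an illustrative layout only: the slot count, slot size, field names, and the use of a slot number as the "pointer" are assumptions, not FFPF's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define N_SLOTS   8      /* hypothetical N */
#define SLOT_SIZE 1514   /* large enough for a full Ethernet frame */

struct packet_buf {
    uint8_t  data[N_SLOTS][SLOT_SIZE];
    uint16_t len[N_SLOTS];
    size_t   write;      /* W: next slot to fill */
};

struct index_entry {
    uint32_t classification;  /* result produced by the filter */
    size_t   pkt_slot;        /* "pointer": slot number in the PacketBuf */
};

struct index_buf {
    struct index_entry entry[N_SLOTS];
    size_t             write;
};

/* Copies a packet into the next PacketBuf slot and returns that slot.
 * Oversized packets are truncated to the slot size. */
static size_t packet_buf_put(struct packet_buf *pb, const uint8_t *pkt,
                             size_t len)
{
    assert(pb != NULL && pkt != NULL);
    if (len > SLOT_SIZE)
        len = SLOT_SIZE;
    size_t slot = pb->write;
    memcpy(pb->data[slot], pkt, len);
    pb->len[slot] = (uint16_t)len;
    pb->write = (slot + 1) % N_SLOTS;   /* wrap around: circular buffer */
    return slot;
}

/* Records a classification result together with the packet's slot. */
static void index_buf_put(struct index_buf *ib, uint32_t classification,
                          size_t pkt_slot)
{
    assert(ib != NULL);
    ib->entry[ib->write].classification = classification;
    ib->entry[ib->write].pkt_slot = pkt_slot;
    ib->write = (ib->write + 1) % N_SLOTS;
}
```

Keeping the index entries separate from the packet data means an entry is only a few bytes, so a reader can scan classification results without touching the packets themselves.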
Buffer management
what to do if writer catches up with slowest reader?
• slow reader preference
  - drop new packets (traditional way of dealing with this)
  - overall speed determined by slowest reader
• fast reader preference
  - overwrite existing packets
  - application responsible for keeping up
  - can check that packets have been overwritten
  - different drop rates for different apps
[Diagram: ring of slots with writer (W) and reader (R1) positions]
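The difference between the two policies shows up at the moment the ring is full. The sketch below models a single reader and a ring of integers; the structure and function names are hypothetical, and for simplicity the writer does the overwrite bookkeeping, whereas the slide says the application itself checks whether packets were overwritten.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 4   /* hypothetical, kept small for illustration */

enum overflow_policy { SLOW_READER_PREFERENCE, FAST_READER_PREFERENCE };

struct ring {
    int           slots[RING_SLOTS];
    unsigned long written;      /* total items ever written (W)    */
    unsigned long read;         /* total items consumed so far (R) */
    unsigned long dropped;      /* new items refused               */
    unsigned long overwritten;  /* unread items lost to overwrite  */
    enum overflow_policy policy;
};

/* Returns false only when the item was dropped. */
static bool ring_write(struct ring *r, int item)
{
    assert(r != NULL);
    if (r->written - r->read == RING_SLOTS) {   /* writer caught up */
        if (r->policy == SLOW_READER_PREFERENCE) {
            r->dropped++;                       /* drop the new packet */
            return false;
        }
        r->read++;                              /* oldest unread item is lost */
        r->overwritten++;
    }
    r->slots[r->written % RING_SLOTS] = item;
    r->written++;
    return true;
}

/* Returns false when there is nothing to read. */
static bool ring_read(struct ring *r, int *out)
{
    assert(r != NULL && out != NULL);
    if (r->read == r->written)
        return false;
    *out = r->slots[r->read % RING_SLOTS];
    r->read++;
    return true;
}
```

With slow reader preference the reader still sees the oldest items but the newest are lost; with fast reader preference the writer never stalls and the reader finds that it has skipped ahead.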
    IF (PKT.IP_PROTO == PROTO_TCP) THEN
        // reg.0 = hash over flow fields
        R[0] = Hash (14,12,1024);
        // increment pkt counter at this location in MBuf
        MEM[ R[0] ]++;
    FI

• simple to use
• compiles to optimised native code
• resource limited (e.g., restricted FOR loop)
• access to persistent storage (scratch memory)
• calls to external functions (e.g., fast C functions or hardware assists)
• compiler for uspace, kspace, and nspace (ixp1200)

Languages
• FFPF is language neutral
• Currently we support:
  - BPF
  - C
  - OKE Cyclone
  - FPL
packet sources
• currently three kinds implemented
  - netfilter
  - net_if_rx()
  - IXP1200
• implementation on IXPs: NIC-FIX
  - bottom of the processing hierarchy
  - eliminates mem & bus bottlenecks
[Diagram labels: uspace, kspace, nspace]
Network Processors
“programmable NIC”
• zero copy
• copy once
• on-demand copy
Performance results
pkt loss: FFPF: < 0.5%; LSF: 2-3%
Performance results
pkt loss: LSF: 64-75%; FFPF: 10-15%
Summary
• concept of ‘flow’ generalised
• copying and context switching minimised
• processing in kernel/NIC
• complex programs + ‘pipes’
• FPL: FFPF Packet Languages
• fast + flexible
• persistent storage
• flow-specific state
• authorisation + third-party code
• any user
• flow groups
• applications sharing packet buffers
we’re hurting because of
• kernel: too little, too often
• vertical + horizontal copies
• context switching
• popular ones: not expressive
• expressive ones: not popular
• same for speed
• no way to mix and match
• hardware problems
• software problems
• language problems
• lack of flexibility
• network speed may grow too fast anyway
• multiple apps problem
• heterogeneous hardware
• programmable cards, routers, etc.
• no easy way to combine
• build on abstractions that are too hi-level
• hard to use as ‘generic’ packet processor
• 10 → 40 Gbps: what will you do in 10 ns?
• beyond capacity of single processor
[Diagram labels: CPU, MEM, NIC]
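The "10 ns" question can be made concrete with a little arithmetic: the time a frame occupies the wire is its size in bits divided by the link rate. The helper names below are hypothetical, and the figures ignore preamble and inter-frame gap, so they are a rough lower bound on the time available per packet rather than a measurement.

```c
#include <assert.h>
#include <stddef.h>

/* Time one frame occupies the link, in nanoseconds.
 * bits / (Gbit/s) yields nanoseconds directly. */
static double packet_time_ns(size_t frame_bytes, double link_gbps)
{
    assert(link_gbps > 0.0);
    return (double)frame_bytes * 8.0 / link_gbps;
}

/* CPU cycles available in that time on a core running at cpu_ghz. */
static double cycle_budget(double time_ns, double cpu_ghz)
{
    assert(cpu_ghz > 0.0);
    return time_ns * cpu_ghz;
}
```

For a minimum-size 64-byte Ethernet frame this gives 51.2 ns at 10 Gbps and 12.8 ns at 40 Gbps; on a hypothetical 3 GHz core the latter is fewer than 40 cycles per packet, which is in line with the slide's point that a single processor cannot keep up.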