RDMA vs TCP experiment
Goal • Environment • Test tool - iperf • Test Suites • Conclusion
Goal • Test maximum and average bandwidth usage in a 40 Gbps (InfiniBand) and a 10 Gbps (iWARP) network environment • Compare CPU usage between the TCP and RDMA data transfer modes • Compare CPU usage between RDMA READ and RDMA WRITE modes
Environment • Links: 40 Gbps InfiniBand and 10 Gbps iWARP • Hosts: Netqos03 (client) and Netqos04 (server)
Tool - iperf • Migrated iperf 2.0.5 to the RDMA environment with OFED (librdmacm and libibverbs). • 2000+ source lines of code added: from 8382 to 10562. • iperf usage extended • -H: RDMA transfer mode instead of TCP/UDP • -G: pr (passive read) or pw (passive write) • pr: data is read from the server. • pw: the server writes into the clients. • -O: output data file, for both the TCP server and the RDMA server • Only one stream is used per transfer
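Given the flag descriptions above, a typical invocation of the modified iperf might look like the following. This is a hypothetical usage sketch: the flags -H, -G, and -O are those described on this slide, and the host names are the experiment's Netqos machines; exact syntax of the patched binary may differ.

```shell
# Server side (Netqos04): accept the RDMA stream, log results to a file
# (-s: server mode, -H: RDMA instead of TCP/UDP, -O: output data file)
iperf -s -H -O /tmp/server_result.txt

# Client side (Netqos03): passive-read mode, i.e. data is read
# from the server (RDMA READ)
iperf -c netqos04 -H -G pr

# Client side, passive-write mode: the server writes into the client
# (RDMA WRITE)
iperf -c netqos04 -H -G pw
```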
Test Suites • test suite 1: memory -> memory • test suite 2: file -> memory -> memory • test case 2.1: file(regular file) -> memory -> memory • test case 2.2: file(/dev/zero) -> memory -> memory • test case 2.3: file(lustre) -> memory -> memory • test suite 3: memory -> memory -> file • test case 3.1: memory -> memory -> file(regular file) • test case 3.2: memory -> memory -> file(/dev/null) • test case 3.3: memory -> memory -> file(lustre) • test suite 4: file -> memory -> memory -> file • test case 4.1: file(regular file) -> memory -> memory -> file(regular file) • test case 4.2: file(/dev/zero) -> memory -> memory -> file(/dev/null) • test case 4.3: file(lustre) -> memory -> memory -> file(lustre)
File choice • File operations use the standard I/O library • fread and fwrite, cached by the OS • Reading input from /dev/zero measures the maximum application data transfer rate including the file read operation, so the disk is not the bottleneck • Writing output to /dev/null measures the maximum application data transfer rate including the file write operation, so the disk is not the bottleneck
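The /dev/zero → /dev/null idea can be checked outside iperf with a plain dd copy, which exercises the same read/write path with no disk involved (a minimal sketch; block size and count are arbitrary):

```shell
# Read from /dev/zero and write to /dev/null: all CPU and syscall
# overhead of the file path, but no disk I/O, so the reported
# throughput bounds the file-handling side of the transfer.
dd if=/dev/zero of=/dev/null bs=10M count=100
```

dd prints the byte count, elapsed time, and throughput on stderr when it finishes, which makes it a quick sanity check that the disk, rather than the file-copy code, is the limiting factor in the regular-file cases.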
Buffer choice • The RDMA operation block size is 10 MB • One RDMA READ/WRITE per block • A previous experiment showed that, in this environment, block sizes above 5 MB have little effect on transfer speed • The TCP read/write buffer size is the default • TCP window size: 85.3 KByte (default)
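The 85.3 KByte default window iperf reports comes from the kernel's TCP socket buffer settings (85.3 KB ≈ 87380 bytes, the long-standing Linux default). On a Linux host these defaults can be inspected directly; this is an illustrative sketch assuming a Linux system with the standard procfs layout:

```shell
# Kernel TCP socket buffer settings: three values per file,
# min / default / max, in bytes. The "default" column is where
# iperf's reported ~85.3 KB window comes from.
cat /proc/sys/net/ipv4/tcp_rmem   # receive buffer sizes
cat /proc/sys/net/ipv4/tcp_wmem   # send buffer sizes

# iperf's -w flag overrides the default TCP window, e.g. 256 KB:
# iperf -c netqos04 -w 256K
```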
Test case 2.1: (fread) file(regular file) -> memory -> memory CPU
Test case 2.1: (fread) file(regular file) -> memory -> memory Bandwidth
Test case 2.2 (five minutes) file(/dev/zero) -> memory -> memory CPU
Test case 2.2 (five minutes) file(/dev/zero) -> memory -> memory Bandwidth
Test case 3.1 (a 200 GB file is generated): memory -> memory -> file(regular file) CPU
Test case 3.1 (a 200 GB file is generated): memory -> memory -> file(regular file) Bandwidth
Test case 3.2: memory -> memory -> file(/dev/null) Bandwidth
Test case 4.1: file(regular file) -> memory -> memory -> file(regular file) Bandwidth
Test case 4.2: file(/dev/zero) -> memory -> memory -> file(/dev/null) CPU
Test case 4.2: file(/dev/zero) -> memory -> memory -> file(/dev/null) Bandwidth
Conclusion • For a single data transfer stream, RDMA transport is twice as fast as TCP, while imposing only about 10% of TCP's CPU load, when no disk operation is involved. • FTP comprises two components: networking and file operations. Compared with the RDMA operations, the file operations (limited by disk performance) account for most of the CPU usage. A well-designed file buffering scheme is therefore critical.
Future work • Set up a Lustre environment, and configure Lustre with RDMA support • Start the FTP migration • Source control • Bug database • Documentation • etc. (refer to The Joel Test)
Memory Cache Cleanup
# sync
# echo 3 > /proc/sys/vm/drop_caches