ZHT: A Fast, Reliable and Scalable Zero-hop Distributed Hash Table. Tonglin Li, Xiaobing Zhou, Kevin Brandstatter, Dongfang Zhao, Ke Wang, Zhao Zhang, Ioan Raicu. Illinois Institute of Technology, Chicago, U.S.A.

Presentation Transcript


  1. ZHT: A Fast, Reliable and Scalable Zero-hop Distributed Hash Table. Tonglin Li, Xiaobing Zhou, Kevin Brandstatter, Dongfang Zhao, Ke Wang, Zhao Zhang, Ioan Raicu. Illinois Institute of Technology, Chicago, U.S.A.

  2. "A supercomputer is a device for turning compute-bound problems into I/O-bound problems." (Ken Batcher)

  3. Big problem: file system scalability • Parallel file systems (GPFS, PVFS, Lustre) • Separate compute resources from storage • Centralized metadata management • Distributed file systems (GFS, HDFS) • Special-purpose designs (MapReduce etc.) • Centralized metadata management

  4. The bottleneck of file systems • Metadata • Figure: concurrent file creates

  5. Proposed work • A distributed hash table (DHT) for high-end computing (HEC) • A building block for high-performance distributed systems • Performance • Latency • Throughput • Scalability • Reliability

  6. Related work: Distributed Hash Tables • Many DHTs: Chord, Kademlia, Pastry, Cassandra, C-MPI, Memcached, Dynamo ... • Why another?

  7. Zero-hop hash mapping

  8. 2-layer hashing

  9. Architecture and terms • Name space: 2^64 • Physical node • Manager • ZHT instance • Partitions: n (fixed) • n = max(k), the largest node count the deployment is expected to reach
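
The zero-hop, two-layer mapping from slides 7-9 can be sketched in a few lines of C++. This is only an illustration under assumed names (Member, key_to_partition, a partition count of 1024), not ZHT's actual code: a key hashes into a fixed partition space, and a locally cached membership table maps the partition to the node that holds it, so a request reaches the right server with no routing hops.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

struct Member {            // one physical node in the membership table
    std::string host;
    int port;
};

// The number of partitions is fixed at bootstrap to the largest node count
// the deployment is expected to reach (n = max(k)); it never changes.
constexpr std::uint64_t kNumPartitions = 1024;

// Layer 1: hash the key into the fixed partition space.
std::size_t key_to_partition(const std::string& key) {
    return std::hash<std::string>{}(key) % kNumPartitions;
}

// Layer 2: a locally cached table maps each partition to the member that
// currently holds it, so the client contacts the owner directly (zero hops).
const Member& key_to_node(const std::string& key,
                          const std::vector<std::size_t>& partition_table,
                          const std::vector<Member>& members) {
    return members[partition_table[key_to_partition(key)]];
}
```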

  10. How many partitions per node can we support?

  11. Membership management • Static membership: Memcached, ZHT • Dynamic membership • Logarithmic routing: most DHTs • Constant routing: ZHT

  12. Membership management • Updating membership • Incremental broadcasting • Remapping k-v pairs • Traditional DHTs: rehash all affected pairs • ZHT: move whole partitions • HEC has a fast local network!
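
To make the "move whole partitions" point concrete, here is a minimal sketch (the Node and Partition types and the migrate_partition helper are assumptions for illustration, not ZHT's implementation): a membership change ships a partition wholesale and updates a single entry of the partition table, while the key-to-partition hash stays untouched, so no individual k-v pair is ever rehashed.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Partition = std::unordered_map<std::string, std::string>;

struct Node {
    // Partitions currently hosted by this node, keyed by partition id.
    std::map<std::size_t, Partition> partitions;
};

// Hand partition `p` from `from` to `to` and update the shared
// partition-to-node table (the one entry every node learns about through
// incremental broadcasting). Only the partition's location changes.
void migrate_partition(std::size_t p, Node& from, Node& to,
                       std::vector<std::size_t>& partition_table,
                       std::size_t to_index) {
    to.partitions[p] = std::move(from.partitions[p]);  // ship the whole partition
    from.partitions.erase(p);
    partition_table[p] = to_index;                     // broadcast this one entry
}
```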

  13. Consistency • Updating membership tables • Planned node joins and leaves: strong consistency • Node failures: eventual consistency • Updating replicas • Configurable • Strong consistency: consistent, reliable • Eventual consistency: fast, highly available
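
The configurable replica consistency amounts to a choice of write path. The sketch below is illustrative only (send_to_replica is a hypothetical stand-in for the real replication RPC): strong consistency waits for every replica to acknowledge before returning, while eventual consistency returns as soon as the primary succeeds and pushes the remaining replicas in the background.

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

enum class Consistency { Strong, Eventual };

// Stand-in for the real replication RPC; always "succeeds" in this sketch.
bool send_to_replica(int replica, const std::string& key,
                     const std::string& value) {
    (void)replica; (void)key; (void)value;
    return true;
}

// Write to the primary first; then either wait for every replica (strong)
// or hand the remaining replicas off to background threads (eventual).
bool put(const std::string& key, const std::string& value,
         const std::vector<int>& replicas, Consistency mode) {
    if (replicas.empty() || !send_to_replica(replicas[0], key, value))
        return false;
    if (mode == Consistency::Eventual) {
        for (std::size_t i = 1; i < replicas.size(); ++i)
            std::thread(send_to_replica, replicas[i], key, value).detach();
        return true;   // fast path: the caller does not wait for the replicas
    }
    bool ok = true;    // strong: every replica must acknowledge
    for (std::size_t i = 1; i < replicas.size(); ++i)
        ok = send_to_replica(replicas[i], key, value) && ok;
    return ok;
}
```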

  14. Persistence: NoVoHT • A persistent in-memory hash map • Append operation • Live migration
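
The slide's description of NoVoHT (a persistent in-memory hash map with an append operation) can be sketched roughly as a map backed by an append-only log. The class below only illustrates that idea, not the real NoVoHT code, and it omits log compaction and live migration.

```cpp
#include <fstream>
#include <string>
#include <unordered_map>

// Illustrative persistent in-memory hash map: reads are served from memory,
// while every mutation is also written to an append-only log on disk.
class PersistentMap {
public:
    explicit PersistentMap(const std::string& log_path)
        : log_(log_path, std::ios::app) {}

    void put(const std::string& key, const std::string& value) {
        map_[key] = value;
        log_ << "PUT\t" << key << '\t' << value << '\n';
        log_.flush();
    }

    // "Append" extends an existing value in place instead of rewriting it.
    void append(const std::string& key, const std::string& value) {
        map_[key] += value;
        log_ << "APPEND\t" << key << '\t' << value << '\n';
        log_.flush();
    }

    bool get(const std::string& key, std::string& out) const {
        auto it = map_.find(key);
        if (it == map_.end()) return false;
        out = it->second;
        return true;
    }

private:
    std::unordered_map<std::string, std::string> map_;  // in-memory state
    std::ofstream log_;                                  // on-disk append log
};
```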

  15. Failure handling • Insert and append • Send to the next replica • Mark that record as the primary copy • Lookup • Get from the next available replica • Remove • Mark the record on all replicas
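
The lookup side of this failover fits in a few lines. The sketch below is an illustration under assumed names (try_lookup is a hypothetical stand-in for ZHT's per-replica request): when one replica does not respond, the request simply falls through to the next one in the list. Inserts follow the same pattern, with the write landing on the next live replica, which then holds the primary copy.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical single-replica request: returns the value, or nothing if the
// replica is unreachable or does not hold the key. Stubbed out here.
std::optional<std::string> try_lookup(int replica, const std::string& key) {
    (void)replica; (void)key;
    return std::nullopt;
}

// Failover lookup: ask the primary first, then walk down the replica list
// and return the first answer that comes back.
std::optional<std::string> lookup(const std::string& key,
                                  const std::vector<int>& replicas) {
    for (int replica : replicas) {
        if (auto value = try_lookup(replica, key)) {
            return value;
        }
    }
    return std::nullopt;  // every replica failed
}
```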

  16. Evaluation: test beds • IBM Blue Gene/P supercomputer • Up to 8,192 nodes • 32,768 instances deployed • Commodity cluster • Up to 64 nodes • Amazon EC2 • m1.medium and cc2.8xlarge • 96 VMs, 768 ZHT instances deployed

  17. Latency on BG/P

  18. Latency distribution

  19. Throughput on BG/P

  20. Aggregated throughput on BG/P

  21. Latency on commodity cluster

  22. ZHT on cloud: latency

  23. ZHT on cloud: latency distribution • Figure: ZHT on cc2.8xlarge instances (8 server-client pairs per instance, 4 to 64 nodes) vs. DynamoDB reads and writes (8 clients per instance)

  24. ZHT on cloud: throughput

  25. Amortized cost

  26. Applications • FusionFS • A distributed file system • Metadata: ZHT • IStore • An information dispersal storage system • Metadata: ZHT • MATRIX • A distributed many-task computing execution framework • ZHT is used to submit tasks and monitor task execution status

  27. FusionFS results: concurrent file creates

  28. IStore results

  29. MATRIX results

  30. Future work • Larger scale • Active failure detection and notification • Spanning-tree communication • Network topology-aware routing • Fully synchronized replicas and membership: Paxos protocol • Support for more protocols (UDT, MPI, …) • Further optimizations

  31. Conclusion • ZHT: a distributed key-value store • Lightweight • High performance • Scalable • Dynamic • Fault tolerant • Versatile: works on clusters, clouds, and supercomputers

  32. Questions? Tonglin Li tli13@hawk.iit.edu http://datasys.cs.iit.edu/projects/ZHT/
