
Memory-efficient Data Management Policy for Flash-based Key-Value Store


Presentation Transcript


  1. Memory-efficient Data Management Policy for Flash-based Key-Value Store Wang Jiangtao 2013-4-12

  2. Outline • Introduction • Related work • Two works • BloomStore[MSST2012] • TBF[ICDE2013] • Summary

  3. Key-Value Store • KV store efficiently supports simple operations: Key lookup & KV pair insertion • Online Multi-player Gaming • Data deduplication • Internet services

  4. Overview of Key-Value Store • A KV store system should provide high access throughput (> 10,000 key lookups/sec) • Replaces traditional relational DBs thanks to its superior scalability & performance • Applications prefer KV stores for their simplicity and better scalability • Popular management (index + storage) solution for large volumes of records, often implemented through an index structure mapping Key -> Value

  5. Challenge • To meet the high throughput demand, the performance of both index access and KV pair (data) access is critical • Index access: search for the KV pair associated with a given “key” • KV pair access: get/put the actual KV pair • Available memory space limits the maximum number of stored KV pairs • An in-RAM index structure can only address the index access performance demand

  6. DRAM must be Used Efficiently • Example: 1 TB of data, 4 bytes of DRAM per key-value pair for the index • 32 B per pair (data deduplication) => 125 GB of index! • 168 B per pair (tweet) => 24 GB of index • 1 KB per pair (small image) => 4 GB of index • [Chart: index size (GB) vs. per key-value pair size (bytes)]
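
For reference, here is a quick back-of-the-envelope check of the index sizes quoted above, assuming the slide uses decimal units (1 TB = 10^12 bytes) and 4 bytes of DRAM per indexed pair:

```python
# Back-of-the-envelope check of the index sizes on the slide above.
DATA_SIZE = 10**12          # 1 TB of data (decimal)
BYTES_PER_ENTRY = 4         # DRAM spent on the index per key-value pair

for label, pair_size in [("dedup chunk", 32), ("tweet", 168), ("small image", 1024)]:
    n_pairs = DATA_SIZE // pair_size
    index_gb = n_pairs * BYTES_PER_ENTRY / 10**9
    print(f"{label} ({pair_size} B per pair): ~{index_gb:.0f} GB of index in DRAM")
```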

  7. Existing Approaches to Speed up Index & KV pair Accesses • Maintain the index structure in RAM to map each key to its KV pair on SSD • RAM size cannot scale up linearly with flash size • Keep the minimum index structure in RAM, while storing the rest of the index structure on SSD • The on-flash index structure must be designed carefully • Space is precious • Random writes are slow and bad for flash lifetime (wear-out)

  8. Outline • Introduction • Related work • Two works • BloomStore[MSST2012] • TBF[ICDE2013] • Summary

  9. Bloom Filter • A Bloom filter represents a set with a bit array and tests whether an element belongs to the set. Initially, every bit of the m-bit array is set to 0. The Bloom filter uses k mutually independent hash functions, each of which maps every element of the set to a position in {1, …, m}. For any element x, the position hi(x) produced by the i-th hash function is set to 1 (1 ≤ i ≤ k). Note that if a position is set to 1 more than once, only the first setting takes effect; later settings have no further impact. • False positive rate • Bloom filter parameter selection • Number of hash functions k, bit array size m, number of elements n • Reducing the false positive rate
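
A minimal sketch of the insert/lookup behaviour described above; the sizes and the SHA-256-based hash construction are illustrative choices, not taken from the papers.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash functions over an m-bit array."""

    def __init__(self, m: int = 1024, k: int = 3):
        self.m, self.k = m, k
        self.bits = bytearray(m)              # one byte per bit, for readability

    def _positions(self, key: bytes):
        # Derive k positions in {0, ..., m-1} from one digest (illustrative).
        digest = hashlib.sha256(key).digest()
        for i in range(self.k):
            yield int.from_bytes(digest[4 * i:4 * i + 4], "big") % self.m

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1                # re-setting a set bit changes nothing

    def __contains__(self, key: bytes) -> bool:
        # May report a false positive, never a false negative (without deletions).
        return all(self.bits[pos] for pos in self._positions(key))
```

For example, after `bf = BloomFilter(); bf.add(b"key1")`, the test `b"key1" in bf` returns True, while an unrelated key is wrongly reported present with probability roughly (1 - e^(-kn/m))^k for n inserted elements, which is what the parameter choices for k, m, and n trade off.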

  10. FlashStore[VLDB2010] • Flash as a cache • Components • Write buffer • Read cache • Recency bit vector • Disk-presence bloom filter • Hash table index • Cons • 6 bytes of RAM per key-value pair

  11. SkimpyStash[SIGMOD2011] • Components • Write buffer • Hash table • Bloom filter • Hash table buckets are organized as linked lists • Each bucket stores only a pointer to the beginning of its linked list on flash • The linked lists themselves are stored on flash • Each pair holds a pointer to an earlier key in the log • Cons • Multiple flash page reads per key lookup • High garbage collection cost
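
A rough sketch of the bucket-to-flash-chain lookup idea described above; the record layout (key, value, pointer to an earlier record) and the read_record helper are assumptions for illustration, not SkimpyStash's actual on-flash format.

```python
def skimpy_lookup(bucket_head_offset, key, read_record):
    """Walk the on-flash linked list of one hash bucket.

    read_record(offset) -> (key, value, prev_offset) is assumed to issue one
    flash page read per record, which is why a lookup may need multiple
    flash page reads before the key (or the end of the chain) is found.
    """
    offset = bucket_head_offset          # RAM holds only this head pointer
    while offset is not None:
        k, v, prev = read_record(offset) # one flash read per hop
        if k == key:
            return v
        offset = prev                    # follow the pointer to an earlier record
    return None                          # key not present in this bucket
```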

  12. Outline • Introduction • Related work • Two works • BloomStore[MSST2012] • TBF[ICDE2013] • Summary

  13. BloomStore [MSST2012]

  14. Introduction • Key lookup throughput is the bottleneck for data-intensive applications • One approach: keep a large in-RAM hash table • Another: move the index structure to secondary storage (SSD) • Expensive random writes • High garbage collection cost • Larger storage space

  15. BloomStore • BloomStore Design • Extremely low amortized RAM overhead • High key lookup/insertion throughput • Components • KV pair write buffer • Active Bloom filter • covers the write buffer (one flash page of KV pairs) • Bloom filter chain • covers the many flash pages already written • Key-range partition • corresponds to one flash “block” • [Figure: BloomStore architecture]
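
A simplified sketch of one BloomStore partition as listed above, reusing the BloomFilter sketch from the Bloom filter slide; the one-Bloom-filter-per-flushed-page granularity and the in-memory page list are simplifying assumptions.

```python
class BloomStorePartition:
    """Illustrative model of one key-range partition.

    In RAM: a write buffer (about one flash page of KV pairs) and its active
    Bloom filter, plus a chain of Bloom filters for pages already flushed.
    On flash: the flushed pages of KV pairs (modelled here as dicts).
    """

    def __init__(self, page_capacity: int = 64):
        self.page_capacity = page_capacity
        self.write_buffer = {}             # buffered KV pairs
        self.active_bf = BloomFilter()     # summarizes keys in the write buffer
        self.bf_chain = []                 # one Bloom filter per flushed page (assumed)
        self.flash_pages = []              # flushed pages, oldest first

    def put(self, key: bytes, value) -> None:
        self.write_buffer[key] = value
        self.active_bf.add(key)
        if len(self.write_buffer) >= self.page_capacity:
            self._flush()

    def _flush(self) -> None:
        # Append the buffered pairs as a new flash page and move the active
        # Bloom filter to the end of the chain; start a fresh buffer and filter.
        self.flash_pages.append(dict(self.write_buffer))
        self.bf_chain.append(self.active_bf)
        self.write_buffer = {}
        self.active_bf = BloomFilter()
```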

  16. KV Store Operations • Key Lookup • Active Bloom filter • Bloom filter chain • Lookup cost

  17. Parallel lookup • Key Lookup • Read the entire BF chain • Bit-wise AND the resultant rows • High read throughput • [Figure: the bits at positions h1(ei), …, hk(ei) are read across all Bloom filters in parallel and combined with a bit-wise AND to find where ei may be stored]
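
A sketch of the row-wise AND lookup shown above, again reusing the BloomFilter sketch; the real design reads each bit row across the chain in one flash access, whereas this model simply loops over in-memory filters.

```python
def parallel_lookup(bf_chain, key: bytes):
    """Return the indices of Bloom filters in the chain that may contain `key`,
    newest first, by AND-ing one bit row per hash position across all filters."""
    if not bf_chain:
        return []
    n = len(bf_chain)
    candidates = (1 << n) - 1                # start with every filter as a candidate
    for pos in bf_chain[0]._positions(key):  # the same k positions apply to each filter
        row = 0
        for i, bf in enumerate(bf_chain):    # one bit per filter at this position
            if bf.bits[pos]:
                row |= 1 << i
        candidates &= row                    # bit-wise AND across the k rows
    return [i for i in range(n - 1, -1, -1) if candidates >> i & 1]
```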

  18. KV Store Operations • KV pair Insertion • KV pair Update • Append a new key-value pair • KV pair Deletion • Insert a null value for the key
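
Continuing the partition sketch above, updates and deletes can both be expressed as appends, with a delete writing a null value (tombstone); the kv_get() helper below is an assumption about how lookups resolve the newest version first.

```python
TOMBSTONE = None   # illustrative convention: a null value marks a deleted key

def kv_delete(partition: "BloomStorePartition", key: bytes) -> None:
    # Deletion is just an insertion of a null value for the key.
    partition.put(key, TOMBSTONE)

def kv_get(partition: "BloomStorePartition", key: bytes):
    # Check the write buffer first, then flushed pages from newest to oldest,
    # so the most recent version (or tombstone) of a key wins.
    if key in partition.active_bf and key in partition.write_buffer:
        value = partition.write_buffer[key]
        return None if value is TOMBSTONE else value
    for i in parallel_lookup(partition.bf_chain, key):
        page = partition.flash_pages[i]
        if key in page:                      # a Bloom filter hit may be a false positive
            value = page[key]
            return None if value is TOMBSTONE else value
    return None
```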

  19. Experimental Evaluation • Experiment setup • 1 TB SSD (PCIe) / 32 GB (SATA) • Workloads

  20. Experimental Evaluation • Effectiveness of the prefilter • Amortized RAM per KV pair is 1.2 bytes • Linux workload • Vx workload

  21. Experimental Evaluation • Lookup Throughput • Linux workload • H = 96 (BF chain length) • m = 128 (size of each BF) • Vx workload • H = 96 (BF chain length) • m = 64 (size of each BF) • With a prefilter

  22. TBF [ICDE2013]

  23. Motivation • Using flash as an extension of the cache is cost-effective • The desired cache size is otherwise too large for RAM alone • Goals: a caching policy that is memory-efficient • a replacement algorithm that achieves performance comparable to existing policies • a caching policy that is agnostic to the organization of data on the SSD

  24. Defects of the existing policies • Recency-based caching algorithms • Clock or LRU • Need an access data structure and index

  25. Defects of the existing policies • Recency-based caching algorithms • Clock or LRU • Need an access data structure and index

  26. System view • DRAM buffer • An in-memory data structure (Bloom filter) to maintain access information • No special index to locate key-value pairs • Key-value store • Provides an iterator operation to traverse the store • Write-through • [Figure: Key-Value cache prototype architecture]

  27. Bloom Filter with Deletion (BFD) • BFD • Removing a key from the SSD requires removing it from the filter • A Bloom filter with deletion • Delete by resetting the bits at the corresponding hash values for a subset of the hash functions • [Figure: deleting X1 from the filter]
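
A sketch of the deletion rule described above, applied to the BloomFilter class from earlier; which subset of hash functions gets reset (here simply the first one) is an assumption, and the comments note the false-negative risk the next slide mentions.

```python
def bfd_delete(bf: BloomFilter, key: bytes, reset_count: int = 1) -> None:
    """Delete `key` by clearing the bits at a subset of its hash positions.

    Clearing only some of the k positions limits the collateral damage, but
    other keys that share the cleared bits may now look absent (false
    negatives), while false positives remain possible as usual.
    """
    positions = list(bf._positions(key))
    for pos in positions[:reset_count]:      # reset a subset of the k positions
        bf.bits[pos] = 0
```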

  28. Bloom Filter with Deletion (BFD) • Flow chart • Tracks recency information • Cons • False positives • pollute the cache • False negatives • hurt the hit ratio

  29. Two Bloom sub-Filters (TBF) • Flow chart • Drops many elements in bulk • Flips the sub-filters periodically • Cons • Rarely-accessed objects may be kept • polluting the cache • Longer traversal length per eviction
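
A simplified sketch of the two-sub-filter idea: accesses are recorded in the current sub-filter, recency is checked against both, and a periodic flip discards the older sub-filter to drop many stale entries in bulk. The flip trigger and exact semantics here are assumptions for illustration, not the paper's precise algorithm.

```python
class TwoBloomFilters:
    """Illustrative recency tracker built from two Bloom sub-filters."""

    def __init__(self):
        self.current = BloomFilter()     # records recent accesses
        self.previous = BloomFilter()    # accesses from the previous period

    def record_access(self, key: bytes) -> None:
        self.current.add(key)

    def recently_used(self, key: bytes) -> bool:
        # An object counts as recently used if either sub-filter remembers it;
        # eviction candidates are the objects this test rejects.
        return key in self.current or key in self.previous

    def flip(self) -> None:
        # Called periodically: drop the older sub-filter wholesale instead of
        # deleting keys one by one, then start a fresh current sub-filter.
        self.previous = self.current
        self.current = BloomFilter()
```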

  30. Traversal cost • Key-value store traversal during eviction • Objects unmarked on insertion vs. marked on insertion • Marking on insertion produces longer stretches of marked objects • False positives also lengthen the traversal

  31. Evaluation • Experiment setup • Two 1 TB 7200 RPM SATA disks in RAID-0 • 80 GB Fusion-io ioDrive, PCIe x4 • A mixture of 95% read operations and 5% updates • Key-value pairs: 200 million (256 B each) • Bloom filter • 4 bits per marked object • one byte per object in TBF • Hash functions: 3

  32. Outline • Introduction • Related work • Two works • BloomStore[MSST2012] • TBF[ICDE2013] • Summary

  33. Summary • KV stores are particularly suitable for some special applications • Flash improves the performance of KV stores thanks to its faster access • Some index structures need to be redesigned to minimize RAM usage • Don’t just treat flash as a disk replacement

  34. Thank You!
