
An Ensemble of Replication and Erasure Codes for CFS

Presentation Transcript


  1. An Ensemble of Replication and Erasure Codes for CFS Ma, Nandagopal, Puttaswamy, Banerjee Presenter: Sen Yang

  2. Outline • Background • CAROM • Evaluation • Discussion

  3. Amazon S3

  4. Windows Azure

  5. A single failure • A DC is completely inaccessible to the outside world. • In reality, data stored in a particular DC is internally replicated (on different networks in the same DC) to tolerate equipment failures.

  6. k-resiliency • Data remains accessible even in the event of simultaneous failures of k different DCs.

  7. Cloud • Amazon S3 and Google FS provide resiliency against two failures using the AllReplica model. • A total of three copies of each data item are stored.

  8. AllReplica • Given any k, store k+1 full copies of each file in the system. • One copy is stored in a primary DC and k other copies are stored in k backup DCs. The primary DC serves read and write requests directly. • Upon write operations, k backup DCs are updated on file close.
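
A minimal sketch of the AllReplica flow just described (not code from the paper; the DataCenter class and its put/get methods are illustrative assumptions):

    class DataCenter:
        """Hypothetical helper: one data center's key-value file store."""
        def __init__(self, name):
            self.name = name
            self.store = {}                    # file name -> full copy of the file

        def put(self, fname, data):
            self.store[fname] = data

        def get(self, fname):
            return self.store[fname]

    class AllReplicaFile:
        """One file under AllReplica: k+1 full copies, primary serves all I/O."""
        def __init__(self, fname, primary, backups):   # len(backups) == k
            self.fname, self.primary, self.backups = fname, primary, backups

        def read(self):
            # Reads are served directly from the primary DC's full copy.
            return self.primary.get(self.fname)

        def write(self, data):
            # Writes update only the primary copy while the file is open.
            self.primary.put(self.fname, data)

        def close(self):
            # On file-close, the k backup DCs receive the updated full copy.
            data = self.primary.get(self.fname)
            for dc in self.backups:
                dc.put(self.fname, data)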

  9. Cloud • To reduce storage cost, cloud file systems (CFSs) are transitioning from replication to erasure codes.

  10. Erasure code

  11. Erasure code-Cloud

  12. AllCode • Divide an object into multiple data chunks (say m data chunks) and use an encoding scheme to produce k redundant chunks. • Both data chunks and redundant chunks are stored in different locations. • Only m out of m + k chunks are necessary to reconstruct the object.
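
A toy illustration of the chunking idea, assuming the simplest possible code, an (m + 1, m) XOR parity scheme (k = 1); production CFSs typically use Reed-Solomon-style codes for general k, which this sketch does not implement:

    from functools import reduce

    def encode(data: bytes, m: int):
        """Split data into m equal-size chunks plus one XOR parity chunk (k = 1)."""
        size = -(-len(data) // m)                              # ceiling division
        chunks = [data[i * size:(i + 1) * size].ljust(size, b"\0") for i in range(m)]
        parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))
        return chunks + [parity]                               # m + k chunks in total

    def rebuild(chunks, missing):
        """Recover one lost chunk (data or parity) by XORing the m surviving chunks."""
        survivors = [c for i, c in enumerate(chunks) if i != missing]
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*survivors))

    # Any m of the m + k chunks suffice to recover the object:
    chunks = encode(b"hello cloud file systems", m=4)          # 4 data + 1 parity chunk
    assert rebuild(chunks, missing=2) == chunks[2]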

  13. AllCode • Upon a read request: • If a full copy of the file has already been reconstructed after file-open, the primary DC serves the read directly. • Otherwise, the primary DC contacts the backup DCs to get the requested data and then serves the read.

  14. AllCode • For the first write operation after file-open, the primary DC contacts m−1 other backup DCs to reconstruct the whole file while it performs the write. • The full copy is kept until file-close, so subsequent reads and writes are served directly from this copy until the file closes. • The primary DC updates the modified data chunks and all the redundant chunks on file-close.
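
A control-flow sketch of the AllCode read/write path above (illustrative, not the paper's implementation); reconstruction is modelled as concatenating the data chunks, and the encode argument of close() can be an encoder like the XOR-parity helper sketched earlier:

    class AllCodeFile:
        """Sketch of AllCode I/O: data lives as m data chunks + k redundant chunks."""
        def __init__(self, data_chunks, redundant_chunks):
            self.data_chunks = data_chunks            # imagined as spread over m DCs
            self.redundant_chunks = redundant_chunks  # imagined as spread over k DCs
            self.full_copy = None                     # reconstructed copy, if any
            self.dirty = False

        def _reconstruct(self):
            # Fetch any m chunks from the DCs and rebuild the whole object
            # (modelled here as simple concatenation of the data chunks).
            if self.full_copy is None:
                self.full_copy = b"".join(self.data_chunks)

        def read(self):
            # If a full copy already exists after file-open, serve it directly;
            # otherwise the chunks must be fetched first (the expensive path).
            self._reconstruct()
            return self.full_copy

        def write(self, data):
            # The first write after file-open forces a full reconstruction; the
            # write is then applied to the in-memory copy until file-close.
            self._reconstruct()
            self.full_copy = data
            self.dirty = True

        def close(self, encode):
            # On file-close, re-encode: modified data chunks and *all* redundant
            # chunks are pushed back out to the DCs, then the full copy is dropped.
            if self.dirty:
                m = len(self.data_chunks)
                all_chunks = encode(self.full_copy, m)
                self.data_chunks = all_chunks[:m]
                self.redundant_chunks = all_chunks[m:]
            self.full_copy, self.dirty = None, False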

  15. AllCode • Efficient: instead of storing whole copies, it stores chunks and thus reduces the overall cost of storage. • But it requires reconstructing the entire object and updating all the redundant chunks upon write requests.

  16. AllCode • When erasure codes are applied to CFSs, they lead to significantly higher access latencies and large bandwidth overhead between data centers.

  17. Goal • Provide an access latency close to that of the AllReplica scheme, • and storage effectiveness close to that of the AllCode scheme.

  18. DESIGN OVERVIEW • Consistency • Data accesses

  19. Consistency • Replication-on-close consistency model: as soon as a file is closed after a write, the next file-open request sees the updated version of the file. • Very hard to achieve in a cloud setting.

  20. Consistency • In a cloud, this requires that the written parts of the file be updated before any new file-open requests are processed. • This may cause large delays.

  21. Assumption • The authors assume that all requests for a file are sent to the primary DC responsible for that file. • All file operations are essentially serialized via the primary DC, thus ensuring this consistency.
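
One way to picture this serialization (an assumption about the mechanism, not code from the paper): a per-file lock at the primary DC stands in for its request queue, so the next open always sees the version committed by the previous close:

    import threading

    class PrimaryDC:
        """All opens for a file go through its primary DC, which serializes them."""
        def __init__(self):
            self.locks = {}           # file name -> per-file lock
            self.versions = {}        # file name -> last committed version

        def open_file(self, fname):
            # A new open blocks until the previous writer has closed the file,
            # so it always observes the version committed by that close.
            lock = self.locks.setdefault(fname, threading.Lock())
            lock.acquire()
            return self.versions.get(fname)

        def close_file(self, fname, new_version=None):
            # Closing commits the written data before the next open is served.
            if new_version is not None:
                self.versions[fname] = new_version
            self.locks[fname].release()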

  22. Assumption

  23. In case of a failure • If a failure occurs between a file-close operation and a file-update operation, the written data is inaccessible. • AllCode and AllReplica use a journaling file system that writes updates to a different DC. • Given that the amount of written data is very small, such a file system can safely be used in this model without compromising latency or storage any more than it would in an AllReplica- or AllCode-based CFS.

  24. Data accesses • In file systems, the bulk of data is not accessed at all, or is accessed only after very long intervals. • File size has no impact on whether a file is written after the initial store.

  25. data

  26. So • Regardless of file size, around two-thirds of the files stored are untouched. • It is not always necessary to store multiple copies of a file.

  27. CAROM • Cache A Replica On Modification uses a block-level cache at each data center. “Cache” here refers to local storage, which can be a hard disk or even DRAM at the local DC. • Each DC acts as the primary DC for a set of files in the CFS. When files are accessed, the accesses are sent to this primary DC. • Any block accessed for the first time is immediately cached.

  28. Observation • The same block is often accessed again within a very short span of time. • Caching reduces the access latency of successive operations on the same data block.

  29. Read request • For a read request, retrieve the requested data blocks from the data chunks and cache them in the primary DC. • If a subsequent read request arrives for those cached blocks, use the cached data to serve the request without retrieving from the data chunks.
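
A sketch of the block-level caching behaviour on reads, with illustrative names; the LRU eviction policy and the fetch_from_chunks callback are assumptions, not details from the paper:

    from collections import OrderedDict

    class BlockCache:
        """Block cache at the primary DC (LRU eviction is an assumption)."""
        def __init__(self, capacity_blocks):
            self.capacity = capacity_blocks
            self.blocks = OrderedDict()           # (file, block_no) -> block bytes

        def get(self, key):
            if key in self.blocks:
                self.blocks.move_to_end(key)      # refresh recency on a hit
                return self.blocks[key]
            return None

        def put(self, key, block):
            self.blocks[key] = block
            self.blocks.move_to_end(key)
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)   # evict the least recently used block

    def read_block(cache, fname, block_no, fetch_from_chunks):
        """Serve a read from cache if possible, else fetch the block and cache it."""
        key = (fname, block_no)
        block = cache.get(key)
        if block is None:                         # miss: retrieve from the data chunks
            block = fetch_from_chunks(fname, block_no)
            cache.put(key, block)                 # first access caches the block
        return block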

  30. Write • When a write is performed, two operations run in parallel: 1) the written blocks are stored in the cache of the primary DC and the write is acknowledged to the user; 2) the entire file is reconstructed from any m out of m + k chunks and updated in the cache.

  31. File-close • The redundant chunks are recalculated using the erasure code, and the updated chunks are written across the other DCs. • However, the entire file is kept in cache even after the update is complete, until the cache is full, because the same file may be accessed again.
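
A sketch of the write and file-close path described in the last two slides, under stated assumptions: the block cache is an object like the BlockCache above, BLOCK_SIZE is an illustrative constant, and encode() is an erasure encoder like the XOR-parity helper sketched earlier:

    import threading

    BLOCK_SIZE = 4096        # illustrative block size, not a figure from the paper

    class CaromFile:
        """Cache written blocks and ack immediately; reconstruct the full file in
        the background; re-encode and push chunks only on file-close."""
        def __init__(self, fname, data_chunks, redundant_chunks, block_cache):
            self.fname = fname
            self.data_chunks = data_chunks            # imagined as spread over m DCs
            self.redundant_chunks = redundant_chunks  # imagined as spread over k DCs
            self.cache = block_cache
            self.full_copy = None
            self.dirty = {}                           # block_no -> written bytes

        def write_block(self, block_no, data):
            # (1) Store the written block in the primary DC's cache and ack the user.
            self.cache.put((self.fname, block_no), data)
            self.dirty[block_no] = data
            # (2) In parallel, reconstruct the whole file from any m of m + k chunks.
            t = threading.Thread(target=self._reconstruct)
            t.start()
            return t                                  # the user is acked without waiting

        def _reconstruct(self):
            # Modelled as concatenating the data chunks; a real system would fetch
            # any m of the m + k chunks from other DCs and decode them.
            if self.full_copy is None:
                self.full_copy = b"".join(self.data_chunks)

        def close(self, encode):
            # On file-close, apply the written blocks, recompute the redundant chunks
            # with the erasure code, and push the updated chunks to the other DCs.
            # The full copy stays in the cache until it is evicted.
            self._reconstruct()
            buf = bytearray(self.full_copy)
            for block_no, data in self.dirty.items():
                off = block_no * BLOCK_SIZE
                if off + len(data) > len(buf):
                    buf.extend(b"\0" * (off + len(data) - len(buf)))
                buf[off:off + len(data)] = data
            m = len(self.data_chunks)
            all_chunks = encode(bytes(buf), m)
            self.data_chunks, self.redundant_chunks = all_chunks[:m], all_chunks[m:]
            self.dirty.clear()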

  32. Cache size

  33. Cache size adaptation:

  34. Hybrid • Store one complete primary copy of the file; the backup copies are split using an (m + k, m) erasure code. • The primary replica answers the read/write queries from users; the chunks are mainly used to provide resiliency. • Upon write requests, all modified data chunks plus redundant chunks are updated on file-close, as in AllCode.
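
A minimal sketch of the Hybrid layout, again assuming an encode() helper that returns m data chunks plus k redundant chunks (all names illustrative):

    class HybridFile:
        """One full primary copy serves I/O; an (m + k, m) coded set of chunks
        kept at other DCs provides the resiliency."""
        def __init__(self, fname, primary_copy, encode, m):
            self.fname = fname
            self.copy = primary_copy                  # full replica at the primary DC
            self.encode, self.m = encode, m
            self.chunks = encode(primary_copy, m)     # m data + k redundant chunks
            self.dirty = False

        def read(self):
            return self.copy                          # served from the primary replica

        def write(self, data):
            self.copy, self.dirty = data, True        # primary replica updated directly

        def close(self):
            # On file-close, modified data chunks plus all redundant chunks are
            # re-derived from the primary copy and pushed to the backup DCs.
            if self.dirty:
                self.chunks = self.encode(self.copy, self.m)
                self.dirty = False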

  35. Summary • Storage: cost savings of up to 60% over replication-based schemes. • Bandwidth: cost savings of up to 43% over erasure-code-based schemes.

  36. Trace details • CIFS network traces collected from two large-scale, enterprise-class file servers deployed at NetApp. • One file server is deployed in the corporate DC and hosts data used by over 1000 marketing, sales, and finance employees; • the other is deployed in the engineering DC and is used by over 500 engineering employees.

  37. Table • A total of 3.8 million unique files, representing over 35 TB of data.

  38. Storage-time

  39. Storage-time

  40. Bandwidth cost

  41. Bandwidth cost

  42. Monetary cost • Cost components: • (1) the cost of storing the data for a month, in GBs; • (2) the cost of I/O operations and the cost of the bandwidth used to transfer data in and out of the data centers.
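
A back-of-the-envelope helper reflecting the cost components listed above; all prices in the example are placeholders, not figures from the paper:

    def monthly_cost(stored_gb, io_ops, transfer_gb,
                     price_gb_month=0.03, price_per_million_ops=0.50,
                     price_gb_transfer=0.09):
        """Total monthly cost = storage (GB-months) + I/O operations + bandwidth.
        All prices here are illustrative placeholders."""
        storage = stored_gb * price_gb_month
        io = (io_ops / 1e6) * price_per_million_ops
        bandwidth = transfer_gb * price_gb_transfer
        return storage + io + bandwidth

    # Example: AllReplica stores k+1 = 3 full copies of 1 TB; a coded layout stores
    # only (m + k) / m times the data, which is where the storage term shrinks.
    print(monthly_cost(stored_gb=3 * 1024, io_ops=5e6, transfer_gb=200))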

  43. Cost

  44. Cache size & Cost

  45. Cache size & Cost

  46. Access latency • Hit: the data is available on the local disk in the primary DC. • Miss: the data has to be fetched from a backup DC. • (A, B) denotes the local disk access latency (A) and the remote disk access latency (B), in milliseconds.
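
With cache hit ratio h, the expected per-access latency is simply the weighted average h*A + (1 - h)*B; a tiny helper (with placeholder numbers) makes the comparison on the next slide concrete:

    def expected_latency(hit_ratio, local_ms, remote_ms):
        """Expected access latency given a cache hit ratio and (A, B) latencies in ms."""
        return hit_ratio * local_ms + (1.0 - hit_ratio) * remote_ms

    # Illustrative numbers only: A = 1 ms local, B = 100 ms remote.
    print(expected_latency(0.9, 1, 100))    # mostly hits (CAROM-like): ~10.9 ms
    print(expected_latency(0.0, 1, 100))    # always remote (AllCode-like): 100 ms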

  47. Access latency • CAROM provides read access latencies that are close to those of AllReplica (local disk access latencies), rather than those of AllCode (remote disk access latencies).

  48. Xorbas Hadoop

  49. Hadoop • XORing Elephants: Novel Erasure Codes for Big Data. • Uses erasure codes to reduce the large storage overhead of triple-replicated systems.
