
DeDu : Building a Deduplication Storage system over Cloud computing

DeDu: Building a Deduplication Storage System over Cloud Computing. Speaker: Yen-Yi Chen (MA190104). Date: 2013/05/28. This paper appears in: Computer Supported Cooperative Work in Design (CSCWD), 2011 15th International Conference on. Date of conference: 8-10 June 2011.

Presentation Transcript


  1. DeDu: Building a Deduplication Storage System over Cloud Computing. Speaker: Yen-Yi Chen (MA190104). Date: 2013/05/28. This paper appears in: Computer Supported Cooperative Work in Design (CSCWD), 2011 15th International Conference on. Date of conference: 8-10 June 2011. Author(s): Zhe Sun, Jun Shen, Fac. of Inf., Univ. of Wollongong, Wollongong, NSW, Australia; Jianming Yong, Fac. of Bus., Univ. of Southern Queensland, Toowoomba, QLD, Australia

  2. Outline • Introduction • Two issues to be addressed • Deduplication • Theories and approaches • System design • Simulations and Experiments • Conclusions

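  3. Introduction • The rise of cloud computing and distributed system architectures • Information explosion and massive data volumes • Rising cost of storage equipment • Improving data transfer and reducing network bandwidth usage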

  4. Introduction • System name: DeDu • Front-end: deduplication application • Back-end: Hadoop Distributed File System • HDFS • HBase

  5. Two issues to be addressed • How does the system identify duplicates? Hash functions: MD5 and SHA-1 • How does the system manage the data? HDFS and HBase
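
The first issue, identifying duplicates with MD5 and SHA-1, can be sketched in a few lines of Python. This is a minimal illustration, not the DeDu implementation: the function name `fingerprint`, the block size, and the choice to concatenate the two digests into one identifier are all assumptions made for the example.

```python
import hashlib
import io


def fingerprint(stream, block_size=1 << 20):
    """Return the MD5 and SHA-1 hex digests of a binary stream, concatenated.

    Concatenating the two digests is an assumption for this sketch; the
    slides only say that MD5 and SHA-1 are used.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    # Read in fixed-size blocks so large files never have to fit in memory.
    for block in iter(lambda: stream.read(block_size), b""):
        md5.update(block)
        sha1.update(block)
    return md5.hexdigest() + sha1.hexdigest()


# Two streams with identical content get the same fingerprint.
same = fingerprint(io.BytesIO(b"hello")) == fingerprint(io.BytesIO(b"hello"))
# A single changed byte gives a different fingerprint.
different = fingerprint(io.BytesIO(b"hello")) != fingerprint(io.BytesIO(b"hellp"))
```

To fingerprint a file on disk, open it in binary mode and pass the file object: `with open(path, "rb") as f: fingerprint(f)`.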

  6. Deduplication [Figure: three data stores of chunks labelled A, B, and C] 1. Data chunks are evaluated to determine a unique signature for each. 2. Signature values are compared to identify all duplicates. 3. Duplicate data chunks are replaced with pointers to a single stored chunk, saving storage space.
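
The three steps on this slide can be sketched as follows. This is an illustrative in-memory version only: the names `deduplicate` and `restore` are hypothetical, SHA-1 is picked as the signature for brevity, and a Python dictionary stands in for the real data store.

```python
import hashlib


def deduplicate(chunks):
    """Store each distinct chunk once and represent the input as pointers."""
    store = {}     # signature -> the single stored copy of that chunk
    pointers = []  # one signature per input chunk, in the original order
    for chunk in chunks:
        # Step 1: compute a signature for the chunk.
        signature = hashlib.sha1(chunk).hexdigest()
        # Step 2: compare against the signatures already seen.
        if signature not in store:
            store[signature] = chunk
        # Step 3: keep only a pointer to the stored copy.
        pointers.append(signature)
    return store, pointers


def restore(store, pointers):
    """Rebuild the original chunk sequence by following the pointers."""
    return [store[signature] for signature in pointers]


chunks = [b"A", b"C", b"A", b"B", b"C", b"B", b"A"]
store, pointers = deduplicate(chunks)
```

Here seven input chunks collapse to three stored chunks, and `restore(store, pointers)` returns the original sequence unchanged.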

  7. Theories and approaches A. The architecture of source data and link files B. Architecture of deduplication cloud storage system

  8. Source data and link files

  9. Deduplication Cloud storage system

  10. System design • Data organisation • Storage of the files • Access to the files • Deletion of files

  11. Data organisation

  12. Storage of the files

  13. Access to the files

  14. Deletion of files
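
The diagrams for slides 11 to 14 are not part of this transcript, so the following is only a rough sketch of how storing, accessing, and deleting files can work in a deduplicating store built on source files and link files. Everything here is an assumption for illustration: the class `DedupStore`, the reference-counting scheme, and the in-memory dictionaries that stand in for HDFS and HBase.

```python
import hashlib


class DedupStore:
    """Toy file-level deduplicating store with reference counting."""

    def __init__(self):
        self.sources = {}     # hash -> file bytes (stand-in for source files)
        self.ref_counts = {}  # hash -> number of links (stand-in for an index table)
        self.links = {}       # file name -> hash (stand-in for link files)

    def put(self, name, data):
        """Store a file; identical content is kept only once."""
        if name in self.links:
            self.delete(name)  # overwriting a name releases its old content
        digest = hashlib.sha1(data).hexdigest()
        if digest not in self.sources:
            self.sources[digest] = data
            self.ref_counts[digest] = 0
        self.ref_counts[digest] += 1
        self.links[name] = digest
        return digest

    def get(self, name):
        """Follow the link for `name` to the stored content."""
        return self.sources[self.links[name]]

    def delete(self, name):
        """Remove the link; drop the content when no links remain."""
        digest = self.links.pop(name)
        self.ref_counts[digest] -= 1
        if self.ref_counts[digest] == 0:
            del self.ref_counts[digest]
            del self.sources[digest]


demo = DedupStore()
demo.put("a.txt", b"same content")
demo.put("b.txt", b"same content")
stored_copies = len(demo.sources)  # both names share one stored copy
```

Deleting `a.txt` in this sketch leaves `b.txt` readable, and the stored content disappears only once the last name pointing at it is deleted.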

  15. Simulations and Experiments

  16. Performance evaluations

  17. Conclusions • 1. The fewer the data nodes, the higher the writing efficiency but the lower the reading efficiency. • 2. The more the data nodes, the lower the writing efficiency but the higher the reading efficiency. • 3. When a single file is large, calculating its hash value takes longer, but the transmission cost is low. • 4. When a single file is small, calculating its hash value takes less time, but the transmission cost is high.

  18. Thank you for listening
