1 / 21

Archival Storage Venti : A new approach to archival storage Sean Quinlan and Sean Dorward

Archival Storage Venti : A new approach to archival storage Sean Quinlan and Sean Dorward. CS 599 Special topics in OS and Distributed Storage Systems Rohit Kulkarni 4 th Feb 2004. Outline. Archival Storage Venti : key ideas Applications Implementation Performance

MikeCarlo
Télécharger la présentation

Archival Storage Venti : A new approach to archival storage Sean Quinlan and Sean Dorward

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Archival StorageVenti : A new approach to archival storageSean Quinlan and Sean Dorward CS 599 Special topics in OS and Distributed Storage Systems Rohit Kulkarni 4th Feb 2004

  2. Outline • Archival Storage • Venti : key ideas • Applications • Implementation • Performance • Reliability and Recovery • Conclusion & Questions?

  3. Archival Storage • Storing data for long periods of time - forever • Tape backup • Central server for no. of clients • Restoring data painful • Full backup Vs Incremental backup • Snapshots • consistent read-only view of file system at some point in past • Maintains file system permissions • Can be accessed by standard tools – ls, cat, cp, grep, diff • Avoids tradeoff between full Vs incremental backup • Looks like full backup • Implementation resembles incremental backup – share blocks

  4. Venti • GOAL: “To provide a write-once archival repository that can be shared by multiple client machines and applications” • Block level network storage system • actually a backend storage for client apps • Blocks addressed by hash of their contents • uses SHA-1 algorithm • SHA-1 output is 160 bit (20 byte) fingerprint of data block • Write once policy • once written cannot be deleted • Multiple writes of same data coalesced • data sharing – saves storage capacity • makes write operation idempotent

  5. Venti (2) • Multiple clients can share a Venti server • Hash fn gives an universal namespace • Inherent integrity checking of data • Fingerprint computation on retrieval • Caching is simplified • Uses magnetic disks as storage technology • access time comparable to non-archival data

  6. Data divided into blocks App needs fingerprint for retrieval Fingerprints packed together -> pointer blks Above repeated recursively to get single fingerprint -> root of tree Data Organization

  7. New or modified data blocks are stored Unchanged sections of tree reused Data organization (2)

  8. Data organization (3) • More complex data structures • Mixing data and fingerprints in a block • e.g. structure for storing file system • 3 types of blocks • Directory – has file meta info + root fingerprint • Pointer • Data

  9. Venti application 1 : Vac • Similar to zip & tar – storing collection of files and directories as single object • tree of blocks formed for selected files • vac archive file -> 45 bytes long • 20 bytes for root fingerprint • 25 bytes fixed header string • any amount of data compressed down to 45 bytes • Unvac – to restore file from archive • Duplicate copies of file coalesced on server • Multiple users vac same data – only 1 copy stored • vac on changed contents

  10. Venti application 2 : Physical Level Backup • Copy the raw disk blocks to Venti • No need to walk file hierarchy • Gives higher throughput • Duplicate blocks are coalesced • User sees full backup of device • Storage space advs of incremental backup retained • Random access possible • Directly mounting a backup file system image • lazy restore – restore on demand

  11. Venti application 3 : Plan 9 File System • Plan 9 FS on top of Venti • Primary location for data • Small amount of read/write storage • Stores daily changes to file system • Smaller than active file system • Venti stores permanent changes

  12. Append-only log of data blocks RAID array Separate index maps fingerprints to log location Fingerprint location in index is random Striped across multiple disks Write buffering Block cache Hit -> index lookup & data log bypassed index cache Hit -> index lookup bypassed Implementation

  13. Implementation (2)

  14. no. of fingerprints >> no. of blocks on a server Index as disk-resident hash table Hashing fn maps fingerprints to index buckets Implementation (3)

  15. Performance: computing environments • 2 plan 9 file servers, bootes and emelie • Spanning 1990 to 2001 • 522 user accounts, 50-100 active all the time • Numerous development projects hosted • Several large data sets • Astronomical data, satellite imagery, multimedia files

  16. Performance (2)

  17. Performance (3) • When stored on Venti, size of jukebox data reduced by 3 factors • Elimination of duplicate blocks • Elimination of block fragmentation • Compression of block contents

  18. Reliability and Recovery • Tools for integrity checking & error recovery • Verifying structure of arena • Checking there is an index entry for every block in data log, vice versa • Rebuilding index form data log • Copying arena to removable media • Data log on RAID 5 disk array • Protection against single drive failures • Off-site mirrors for server • Storing to write-once read-many optical jukeboxes

  19. Future Work • Load balancing • Distribute Venti across multiple machines • Replicate server • Use of proxy server to hide it from client • Better access control • Currently just authentication to server • Single root fingerprint gives access to entire file tree • Use of variable sized blocks as in LBFS

  20. Conclusion • Addressing block by SHA-1 hash of contents • Write once model • Reduces accidental or malicious data loss • Simplifies administration • Simplifies caching • Allows sharing of data • Magnetic disks as storage technology • Large capacity at low price • Random access • Performance comparable to non-archival data

  21. Questions ? Thank You

More Related