OceanStore is a proposed global-scale, reliable, and scalable storage utility, motivated by the growing quantity of inconsistent data and the mobility of the people who produce and consume it. It aims to simplify data maintenance by subsuming the Web, email, filesystems, and databases under one system, so that anyone can share information with anyone else regardless of location. The design relies on promiscuous caching and redundancy to disseminate data efficiently and keep it available, and it treats security as mandatory: data is never stored in cleartext. A data economy, in which federated utility providers buy and sell capacity, is meant to sustain the system.
The Oceanic Data Utility (OceanStore): Global-Scale Persistent Storage
John Kubiatowicz
Properties of the ODU
• Motivation:
• Growing quantity of inconsistent data
• Widespread mobility of producers and consumers
• Simplicity: subsume Web, email, filesystems, databases
• Nomadic Data: Serverless, Homeless
• Sharing of information between anyone, anywhere
• Promiscuous caching of data enabled by tacit information (option 5/introspection)
• Efficient dissemination of information (multicast)
• Federation of many different companies, just like phone service or electric grid
• Highly-available: data always duplicated
• Higher-probability access
• Copies placed with low probability of correlated failure
• Shares technology with options 1, 4, 5, and 8
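The slide's goal of placing copies "with low probability of correlated failure" can be illustrated with a small sketch. This is not the OceanStore placement algorithm, which the slides do not describe. It is a greedy heuristic under the assumption that each server is labelled with two hypothetical failure domains (`provider` and `region`), and that copies sharing a domain are more likely to fail together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    provider: str  # hypothetical failure domain: administrative owner
    region: str    # hypothetical failure domain: geographic area


def place_replicas(servers, copies):
    """Greedily pick servers that share as few failure domains as possible."""
    chosen = []
    candidates = list(servers)
    while candidates and len(chosen) < copies:

        def overlap(s):
            # Number of failure domains this candidate shares with
            # the replicas that have already been chosen.
            return sum(
                (s.provider == c.provider) + (s.region == c.region)
                for c in chosen
            )

        best = min(candidates, key=overlap)  # ties go to the earliest candidate
        chosen.append(best)
        candidates.remove(best)
    return chosen


servers = [
    Server("a", "p1", "us"),
    Server("b", "p1", "us"),
    Server("c", "p2", "eu"),
    Server("d", "p3", "asia"),
]
placement = place_replicas(servers, copies=3)
print([s.name for s in placement])
```

With these made-up servers, the heuristic skips `b` because it shares both provider and region with `a`. A real system would also need to weigh capacity, cost, and distance to the data's consumers.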
Technical Challenges
• Scalability: performance easy to destroy
• vast number of entities: ~billions
• cross-administrative domains
• Security is not optional: data never cleartext
• Availability
• Should bootstrap redundancy available on global scale
• Economies of scale applied to achieving data reliability
• Maintainability
• Too large for human intervention in normal operation
• Naming: How to maintain global namespace?
• Indexability
• Must enable efficient location/searching of data
• Consistency/Conflict resolution
• Multiple copies must have well-defined relationship
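To see why redundancy is the lever for availability, a back-of-the-envelope calculation helps. The model below is an assumption, not something stated in the slides: each of $n$ copies is unreachable with the same probability $p$, and failures are independent.

```latex
\[
  P(\text{at least one copy reachable}) = 1 - p^{\,n}
\]
% Worked example with assumed numbers: p = 0.1, n = 3
\[
  1 - (0.1)^{3} = 1 - 0.001 = 0.999
\]
```

The independence assumption is the weak point. If copies tend to fail together (same provider, same region), the real figure is lower than $1 - p^{n}$, which is why the correlation between failures matters as much as the number of copies.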
State of the Art?
• Remote file-system community: NFS, AFS
• All have single points of failure
• Only caching at endpoints
• Mobile computing community: Coda
• Small scale, fixed coherence mechanism
• Web caching community: Inktomi, others?
• Specialized, incremental solutions
• Caching along client/server path, various bottlenecks
• Database Community: Mariposa
• Still small scale, specialized types of queries
• Economic model not quite right but on right track
• Internet backup companies: Medley
• Very limited in scope and flexibility
• PalmPilot: inspired general conflict-resolution
Our Enabling Technologies
• Data Economy
• User pays monthly fee to a primary utility provider who is responsible for reliability of data
• Utilities buy and sell capacity (both data and bandwidth); prices set for quantity and reliability
• Authoritative naming servers paid per query?
• Underlying database organization
• User-visible structure (e.g. filesystem) synthesized
• Federation of overlapping data location structures (indices) + Introspection
• Separate the absolute authority for data location from moment-to-moment “hearsay” authorities
• Partially consistent indices continually adapted to improve performance
• Conflict Resolution, not consistency
• policies set via domain specific language
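The split between an absolute authority for data location and moment-to-moment "hearsay" authorities can be sketched as a two-tier lookup. This is only an illustration under stated assumptions: the `Locator` class, its dictionaries, and the `holds` callback are hypothetical names, and the slides do not specify how OceanStore implements either tier. The idea shown is that a cheap, possibly stale hint is tried first and verified, and the authoritative index is consulted only when the hint fails.

```python
class Locator:
    """Two-tier data location: cheap hearsay hints backed by an authority."""

    def __init__(self, authority):
        # Hypothetical authoritative index: object id -> set of servers
        # that currently hold the object.
        self.authority = authority
        # Hypothetical hearsay index: object id -> one server hint.
        # Entries may be stale, so they are always verified before use.
        self.hearsay = {}

    def lookup(self, obj_id, holds):
        """Return (server, source). holds(server, obj_id) verifies a hint."""
        hint = self.hearsay.get(obj_id)
        if hint is not None and holds(hint, obj_id):
            return hint, "hearsay"
        # Hint missing or stale: fall back to the authority.
        servers = self.authority.get(obj_id)
        if not servers:
            raise KeyError(obj_id)
        server = sorted(servers)[0]     # deterministic choice for the sketch
        self.hearsay[obj_id] = server   # adapt the hint for the next lookup
        return server, "authority"


authority = {"doc": {"s2"}}
locator = Locator(authority)
locator.hearsay["doc"] = "s1"           # stale hint: s1 no longer holds "doc"


def holds(server, obj_id):
    # Stand-in for asking the server itself whether it has the object.
    return server in authority.get(obj_id, set())


first = locator.lookup("doc", holds)
second = locator.lookup("doc", holds)
print(first, second)
```

The first lookup falls through to the authority and repairs the hint, so the second is answered from hearsay. That repair step is a simple stand-in for the "partially consistent indices continually adapted to improve performance" on the slide.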
3 Year Plan for Success
• Year 1:
• Initial design and refinement of four components:
• naming & security scheme (security based on name)
• fluid, partially coherent index structures
• introspection for intelligent migration of data
• initial take on economic models
• Begin prototype implementation with all components
• Year 2:
• Finish prototyping and refinement of first-generation
• Client implementation for Windows and/or UNIX
• Year 3:
• Second-generation prototype on Millennium infrastructure
• formulate plan for large-scale test
• Final evaluation and usability results