This presentation describes an approach to using a PostgreSQL database as the backend for an NFS server. It outlines the motivation and goals behind integrating NFS with a database, highlighting that transactions provide idempotency naturally and that the database supports graceful backup and recovery. The implementation strategy is to implement the standard UNIX file API on top of the database and to modify the NFS server to use that API. Key results indicate that an efficient implementation is possible, with read/write performance in the same order of magnitude as a native file system, provided the database schema is chosen carefully and the server caches are used to avoid database round-trips.
NFS on a Database: Structure and Performance • Alan Halverson, Babis Samios
Motivation • Goal: NFS Server / Database Backend • Why Database? • Transactions provide idempotency naturally • Graceful backup/recovery • Why NFS? • Nearly universal client availability • Transparent access for existing applications • Ease of implementation
Approach • Implement standard UNIX file API • open(), read(), write(), etc. • All routines talk to the database • Modify NFS server to use new API • … • Profit!
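The idea of file routines that talk only to the database can be sketched as follows. This is a minimal illustration, not the authors' code: `DbFileStore`, the `chunks` table, and the `CHUNK` size are hypothetical names, and Python's built-in `sqlite3` stands in for PostgreSQL so the sketch runs without a server. Each `write()` runs in one transaction, so replaying the same request leaves the same state.

```python
import sqlite3

CHUNK = 8192  # hypothetical chunk size in bytes


class DbFileStore:
    """Hypothetical read()/write() routines backed by a database table."""

    def __init__(self):
        # sqlite3 is a stand-in for PostgreSQL (assumption for runnability).
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE chunks (file_id INTEGER, chunk_no INTEGER, "
            "data BLOB, PRIMARY KEY (file_id, chunk_no))"
        )

    def _load(self, file_id, chunk_no):
        row = self.conn.execute(
            "SELECT data FROM chunks WHERE file_id = ? AND chunk_no = ?",
            (file_id, chunk_no),
        ).fetchone()
        return bytes(row[0]) if row else b""

    def write(self, file_id, offset, data):
        # One transaction per call: the whole write commits or none of it does.
        with self.conn:
            pos = 0
            while pos < len(data):
                chunk_no, start = divmod(offset + pos, CHUNK)
                n = min(CHUNK - start, len(data) - pos)
                # Pad a short chunk with zero bytes up to the write position.
                old = self._load(file_id, chunk_no).ljust(start, b"\0")
                new = old[:start] + data[pos:pos + n] + old[start + n:]
                # INSERT OR REPLACE is SQLite syntax for an upsert.
                self.conn.execute(
                    "INSERT OR REPLACE INTO chunks VALUES (?, ?, ?)",
                    (file_id, chunk_no, new),
                )
                pos += n
        return len(data)

    def read(self, file_id, offset, count):
        out = b""
        while count > 0:
            chunk_no, start = divmod(offset, CHUNK)
            piece = self._load(file_id, chunk_no)[start:start + count]
            if not piece:  # end of stored data (no sparse-file handling)
                break
            out += piece
            offset += len(piece)
            count -= len(piece)
        return out
```

For example, `store.write(1, 0, b"hello world")` followed by `store.read(1, 6, 5)` returns the bytes `b"world"`, and issuing the same write a second time does not change what is stored.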
Main Results • Efficient Implementation is Possible • Same order of magnitude as the native file system for read/write operations • Choice of Database Schema is Important • Server Cache Usage is Critical • Avoids database round-trips
Roadmap • Approach • NFS server choices • Database choices • Architecture/Design • Experimental Setup & Results • Summary/Conclusions
Database Choices • Many available DBMSs • We chose PostgreSQL • Free, open source • Inspiration for our work was the Inversion File System, which was also implemented on top of Postgres • Uses client/server model
NFS Server Choices • Kernel mode • Pros: included in Linux, supports NFS v3 • Cons: difficult to debug • User mode (UNFSD) • Pros: easier to debug, communication with PostgreSQL is possible • Cons: only supports NFS v2 • Our choice: User mode
Database Schema • meta-data -> file_attributes • dir hierarchy -> naming • data -> Many options • Table/File (used by Inversion FS) • Single Table (avoids table creation overhead) • Intermediate solutions (e.g. table/dir)
Single Table Schema • [Diagram: tables file_attributes, naming, and all_files, connected by 1:N relationships]
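One way the single-table layout could look is sketched below. Only the three table names come from the slides; every column name, the key choices, and the `make_db` and `lookup` helpers are assumptions made for illustration, and `sqlite3` again stands in for PostgreSQL so the sketch is runnable.

```python
import sqlite3

# Hypothetical columns; only the table names appear in the slides.
SCHEMA = """
CREATE TABLE file_attributes (          -- meta-data, one row per file
    file_id INTEGER PRIMARY KEY,
    mode    INTEGER NOT NULL,
    size    INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE naming (                   -- directory hierarchy
    parent_id INTEGER NOT NULL REFERENCES file_attributes(file_id),
    name      TEXT    NOT NULL,
    file_id   INTEGER NOT NULL REFERENCES file_attributes(file_id),
    PRIMARY KEY (parent_id, name)
);
CREATE TABLE all_files (                -- data for every file in one table
    file_id  INTEGER NOT NULL REFERENCES file_attributes(file_id),
    chunk_no INTEGER NOT NULL,
    data     TEXT    NOT NULL,          -- encoded chunk contents
    PRIMARY KEY (file_id, chunk_no)
);
"""


def make_db():
    """Create an in-memory database holding the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def lookup(conn, parent_id, name):
    """Resolve a name inside a directory to a file id, or None if absent."""
    row = conn.execute(
        "SELECT file_id FROM naming WHERE parent_id = ? AND name = ?",
        (parent_id, name),
    ).fetchone()
    return row[0] if row else None
```

Because every file's chunks live in `all_files`, creating a file is a row insert rather than a table creation, which is the overhead the single-table option is meant to avoid.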
Caching • Old Story: Client Side Caching • Buffer cache • New Story: Server Side Caching (major contribution) • Minimize the number of round-trips to the DB by maintaining three different caches: • Stat cache • Naming cache • Buffer cache (significantly beneficial only in a multi-client environment)
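How a server-side stat cache saves round-trips can be sketched as follows. The class, its methods, and the `fetch_from_db` callback are hypothetical; the slides do not describe the cache's actual structure or invalidation policy.

```python
class StatCache:
    """Hypothetical server-side cache of file attributes keyed by file id."""

    def __init__(self, fetch_from_db):
        self._fetch = fetch_from_db  # callable that performs the real query
        self._entries = {}
        self.round_trips = 0         # counts queries that reached the database

    def get(self, file_id):
        # Only a miss costs a database round-trip.
        if file_id not in self._entries:
            self.round_trips += 1
            self._entries[file_id] = self._fetch(file_id)
        return self._entries[file_id]

    def put(self, file_id, attrs):
        # Called after the server updates attributes, to keep the cache current.
        self._entries[file_id] = attrs

    def invalidate(self, file_id):
        # Called when a file is removed or its cached attributes become stale.
        self._entries.pop(file_id, None)
```

Repeated attribute requests for the same file then reach the database once; a naming cache could follow the same pattern with a (directory id, name) key.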
Binary Data • SQL statements issued to PostgreSQL must contain ASCII data only • PostgreSQL provides an escaping function • size(escape(data)) ≤ 4 x size(data) • We used base64 encoding • size(base64(data)) = 4/3 x size(data) • Best case: a raw write moves 4/3 as many bytes as a native file system write, so write throughput is at most 3/4 of native
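The 4/3 expansion can be checked with Python's standard `base64` module. The 8192-byte chunk and the `encoded_size` helper are illustrative choices, not values from the slides.

```python
import base64


def encoded_size(n):
    """Length of the padded base64 encoding of n bytes."""
    return 4 * ((n + 2) // 3)


chunk = bytes(range(256)) * 32           # 8192 bytes covering every byte value
encoded = base64.b64encode(chunk)

assert len(encoded) == encoded_size(len(chunk))
assert encoded.isascii()                 # safe to embed in an ASCII-only statement
assert base64.b64decode(encoded) == chunk

ratio = len(encoded) / len(chunk)        # close to 4/3; padding adds a few bytes
```

Every 3 input bytes become 4 output characters, so the expansion is fixed regardless of content, whereas an escaping scheme expands only the bytes that need escaping and can reach 4x in the worst case.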
Summary/Conclusions • Design and implementation of NFS operating on top of PostgreSQL • Use of 3-tier architecture for maximum flexibility • Performance comparable to native UNIX FS for read/write operations • Factors that affect performance • Caching (both server and client side) • Chunk size and NFS r/w message size • Database Schema
Things we will not do • Asynchronous database writes (for both data and meta-data) • Compare recovery times with both ext2 and ext3 • Test multi-client environment • Add mechanism for querying system meta-data