
Using DSVM to Implement a Distributed File System




Presentation Transcript


  1. Using DSVM to Implement a Distributed File System Ramon Lawrence Dept. of Computer Science umlawren@cs.umanitoba.ca

  2. Background Work on DFS • Extensive research in the late eighties • Research focused on using replication to improve efficiency of file access • Work at Cornell produced a system called Deceit, which allowed default and user-specified replication of files • Deceit used a client/server architecture which allowed multiple servers

  3. Distributed Shared Virtual Memory (DSVM) • a global address space accessible by any number of processes distributed across a network • instead of explicit message passing, DSVM processes read and write to the shared memory space, and it is the responsibility of the DSVM manager to ensure that the information they see is consistent

  4. Why use DSVM? • distribution transparency • process designer does not have to code explicit IPC • increased network bandwidth makes efficiency vs. development cost trade-off more realistic • easier implementation of distributed or parallel algorithms on generic workstations

  5. Treadmarks • commercial implementation of DSVM • designed for running parallel algorithms on a network of workstations • implemented as a C++ library • increased portability • allows allocation of memory in DSVM region using malloc() • DSVM access is similar to regular dynamic memory access

  6. Treadmarks (cont.) • provides barriers and locks as synchronization primitives, which must be used if the DSVM region is to remain consistent • uses lazy release consistency, which guarantees that DSVM is consistent only after a lock acquire • allows multiple writers of the same page of DSVM
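
The lock-acquire guarantee can be illustrated with a toy model. This is a sketch in Python, not Treadmarks code: the `Lock` and `Node` classes and their methods are hypothetical, and a real lazy release consistency implementation tracks pages and diffs rather than dictionary entries.

```python
class Lock:
    """Hypothetical lock that carries the writes made by earlier holders."""
    def __init__(self):
        self.pending = {}          # writes released under this lock so far


class Node:
    """One process with its own local copy of the shared region."""
    def __init__(self):
        self.local = {}            # this node's view of shared memory
        self.dirty = {}            # writes made since the last release

    def acquire(self, lock):
        # Lazy step: updates are pulled in only now, at acquire time.
        self.local.update(lock.pending)

    def write(self, key, value):
        self.local[key] = value
        self.dirty[key] = value

    def release(self, lock):
        # Nothing is broadcast; the writes simply become available
        # to whichever node acquires the lock next.
        lock.pending.update(self.dirty)
        self.dirty.clear()


lock = Lock()
a, b = Node(), Node()

a.acquire(lock)
a.write("x", 1)
a.release(lock)

stale = b.local.get("x")   # b has not acquired the lock yet
b.acquire(lock)
fresh = b.local.get("x")   # b now sees a's write
print(stale, fresh)        # prints: None 1
```

Until `b` acquires the lock, its view is allowed to be out of date, which is why the synchronization primitives are mandatory rather than optional.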

  7. Treadmarks Limitations • all processes accessing the DSVM region must be homogeneous and must be started at the same time • Treadmarks uses UNIX signals to detect access to DSVM which limits its usefulness for system programming as signals interrupt system calls • despite its limitations, still useful for prototype demonstrations

  8. Environment Specification • network domain is a series of interconnected PCs on an Ethernet • an application is assumed to have a unique name across the network • other files can be given a unique name by concatenating the machine name, directory path, and filename • a global name table (GNT) manages the files on the network
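
The name construction described above might look like the following sketch. The `machine:/path/file` layout, the function name `global_name`, and the sample machine and directory names are assumptions made for illustration; the slides do not specify a concrete syntax.

```python
import posixpath


def global_name(machine, directory, filename):
    """Build a network-wide name from machine, directory path, and filename.

    The "machine:/path/file" layout is an assumption for illustration.
    """
    path = posixpath.join("/", directory.strip("/"), filename)
    return f"{machine}:{path}"


name = global_name("pc07", "/home/alice/docs", "report.txt")
print(name)   # prints: pc07:/home/alice/docs/report.txt
```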

  9. Global Name Table • provides a flat name space to identify files distributed across the network • managed by the OS • provides a table look-up mechanism to find file locations by the unique name • enforcing the relative path constraint may allow transparent access to files by applications and users
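
As a minimal sketch of the table look-up, a flat name space can be modeled as a single dictionary keyed by unique name. The class name `GlobalNameTable`, its methods, and the sample entry are hypothetical and only illustrate the idea.

```python
class GlobalNameTable:
    """Flat table mapping a unique name to a (machine, path) location."""

    def __init__(self):
        self._entries = {}

    def register(self, name, machine, path):
        # Names must be unique across the whole network.
        if name in self._entries:
            raise KeyError(f"name already in use: {name}")
        self._entries[name] = (machine, path)

    def lookup(self, name):
        return self._entries.get(name)   # None if not registered


gnt = GlobalNameTable()
gnt.register("editor", "pc03", "/apps/editor/bin/editor")
location = gnt.lookup("editor")
missing = gnt.lookup("no-such-file")
print(location, missing)
```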

  10. Relative Path Constraint • all file access is done relative to home directories • absolute paths pose problems as they are not the same across machines • applications and users have home directories and all file access should be specified relative to them • most files are location independent
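
The constraint can be sketched as a resolver that accepts only relative paths. The function name `resolve` and the home directories shown are assumptions for illustration.

```python
import posixpath


def resolve(home, relative):
    """Resolve a path against a home directory; absolute paths are refused."""
    if posixpath.isabs(relative):
        raise ValueError("absolute paths violate the relative path constraint")
    return posixpath.normpath(posixpath.join(home, relative))


# The same relative name resolves under a different home on each machine.
on_pc1 = resolve("/home/alice", "docs/report.txt")
on_pc2 = resolve("/users/alice", "docs/report.txt")
print(on_pc1, on_pc2)
```

Because callers never name the home directory itself, the file can sit under a different absolute path on each machine without the caller noticing.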

  11. Benefits of a GNT • presents all users with a consistent and identical view of the network independent of their site location • all applications appear to be local • enhances user familiarity with the system • provides a similar transparency to icons in windowing systems except that the view is defined by the user not by the site location

  12. Benefits of a GNT (cont.) • makes applications more movable • instead of reconfiguring all icons or links, just have to update one table entry • files can be moved by a user or the OS without affecting the views of other users • if used with a standardized display mechanism, application execution also becomes transparent • allows for load balancing, replication, etc.
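
The "one table entry" point can be shown with a small sketch. The table contents, the per-user `views`, and the `move` helper are hypothetical.

```python
# Hypothetical GNT contents: unique name -> (machine, path).
gnt = {"editor": ("pc03", "/apps/editor/bin/editor")}

# Each user's view refers to the application by name only.
views = {"alice": ["editor"], "bob": ["editor"]}


def move(table, name, new_machine, new_path):
    """Relocate a file by rewriting its single table entry."""
    table[name] = (new_machine, new_path)


views_before = {user: list(names) for user, names in views.items()}
move(gnt, "editor", "pc09", "/opt/editor/editor")

print(gnt["editor"])
print(views == views_before)   # user views did not need to change
```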

  13. Distributing the GNT • the GNT provides a mechanism for individual sites to find files on the network • the GNT must be accessible by all sites • Two architectures: • client/server • DSVM

  14. Client/Server Architecture • every machine has a client process which handles user/application requests • one machine has a dedicated server process which stores the GNT and responds to requests from the clients • communication between the clients and server is done using sockets
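
A minimal sketch of this arrangement, assuming a one-line text protocol that is not part of the original design: the server alone holds the table, and every client lookup is an explicit request and reply over a socket. `socket.socketpair()` stands in for a real network connection, and a thread stands in for the dedicated server machine.

```python
import socket
import threading

GNT = {"editor": "pc03:/apps/editor/bin/editor"}   # held only by the server


def serve(conn):
    """Answer one-line lookup requests until the client disconnects."""
    with conn, conn.makefile("r") as reader, conn.makefile("w") as writer:
        for line in reader:
            writer.write(GNT.get(line.strip(), "NOT_FOUND") + "\n")
            writer.flush()


def lookup(reader, writer, name):
    """Client side: send the name, then block until the reply arrives."""
    writer.write(name + "\n")
    writer.flush()
    return reader.readline().strip()


client_sock, server_sock = socket.socketpair()
server = threading.Thread(target=serve, args=(server_sock,))
server.start()

with client_sock, client_sock.makefile("r") as reader, \
        client_sock.makefile("w") as writer:
    found = lookup(reader, writer, "editor")
    missing = lookup(reader, writer, "no-such-file")

server.join()   # the server loop ends once the client closes its socket
print(found, missing)
```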

  15. Client/Server Architecture

  16. Client/Server Analysis • distinguishable client and server processes • explicit communication • problem of dividing tasks (e.g. buffering) • single point of failure • pessimistic sharing - data is only shared by explicit requests to the server • efficient with a single server as network communication is minimized

  17. DSVM Architecture • the GNT is allocated in DSVM shared by all processes, each of which acts as both a client and a server • an update to the GNT by any process is reflected at all other processes without explicit communication • all communication is handled by the DSVM manager
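
The programming model can be sketched as follows, with loud caveats: threads in one Python process stand in for distributed processes, an ordinary dictionary stands in for the GNT allocated in DSVM, and `threading.Lock` stands in for a DSVM lock. All names are hypothetical. The point is only that the code contains no send or receive calls.

```python
import threading

shared_gnt = {}                 # stands in for the GNT allocated in DSVM
gnt_lock = threading.Lock()     # stands in for a DSVM lock


def register(name, location):
    """Update the table in place; there is no send or receive call."""
    with gnt_lock:
        shared_gnt[name] = location


def lookup(name):
    with gnt_lock:
        return shared_gnt.get(name)


def participant(index):
    # Every participant runs the same code: it registers a file and
    # could equally look up files registered by the others.
    register(f"file{index}", f"pc{index:02d}:/home/user{index}/file{index}")


workers = [threading.Thread(target=participant, args=(i,)) for i in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(sorted(shared_gnt))
print(lookup("file2"))
```

In a real DSVM system the consistency traffic still exists; it is simply generated by the DSVM manager instead of by the application code.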

  18. DSVM Architecture

  19. DSVM Architecture Analysis • optimistic sharing - everything in DSVM is shared by all processes • transparent sharing • processes do not know they are sharing the GNT with other processes • communication details are hidden from implementation of GNT

  20. DSVM Architecture Analysis (cont.) • hidden costs • implementation of DSVM still requires communication to maintain consistency • frequent updates and false sharing may be a problem • overhead in determining when a process accesses the shared memory region
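
False sharing follows from simple arithmetic once table entries are smaller than a page. The sketch below assumes a 4096-byte page and a 256-byte GNT entry; both sizes are illustrative assumptions, not figures from the presentation.

```python
PAGE_SIZE = 4096     # assumed page size in bytes
ENTRY_SIZE = 256     # assumed size of one GNT entry in bytes


def page_of(entry_index):
    """Return the page number that holds a given table entry."""
    return (entry_index * ENTRY_SIZE) // PAGE_SIZE


entries_per_page = PAGE_SIZE // ENTRY_SIZE

# Entries 0 and 15 describe unrelated files, yet they live on the same
# page, so a write to one forces consistency work for the other.
same_page = page_of(0) == page_of(15)
next_page = page_of(16)
print(entries_per_page, same_page, next_page)   # prints: 16 True 1
```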

  21. Architectural Differences • the main trade-off is efficiency vs. ease of implementation • very similar to object-oriented environment • costs of encapsulation and implementation transparency vs. generality and simplicity • sharing methodology • pessimistic vs. optimistic • DSVM provides illusion of isolation similar to a transaction in a DB system

  22. Architectural Differences (cont.) • amount of replication • DSVM - full replication • Client/Server - single point of failure • overhead • DSVM - must trap DSVM accesses and suffers from false sharing at the page level • Client/Server - communication minimized

  23. Conclusions • DSVM is a higher-level IPC protocol • like an object is a higher-level data structure • DSVM provides an easier programming environment and a standardized mechanism for IPC at the cost of higher communication overhead • increased bandwidth may justify the overhead to achieve a decrease in development costs

  24. Future Work • expanding the functionality of the GNT • defining a display standard to allow for application execution transparency • future work on DSVM including integration into OS
