This comprehensive guide explores the importance of file systems in managing information storage on disks, covering topics such as file subsystems, device drivers, UNIX file characteristics, physical storage mediums, and UNIX file system structure. Gain insights into logical and physical file organization, system implementation, and differences between UNIX and MS-DOS/Windows file systems.
Phones OFF Please File Systems Parminder Singh Kang Home: www.cse.dmu.ac.uk/~pkang Email: pkang@dmu.ac.uk
Introduction • File systems are needed: • because main memory is not big enough to hold all information • to maintain permanent copies of information • A file system should be device independent, so that programs can use the same commands with different devices. • Disks provide the bulk of the secondary storage on which a file system is maintained. • The advantages of using disks are: • a block can be rewritten in place • any given block of information can be accessed directly • data is transferred in units of blocks instead of byte by byte, which improves the efficiency and performance of operations.
1 File Subsystem • Provides users/applications with a logical interface • Imposes a uniform structure on storage, i.e. typically a hierarchical directory structure • Refers to elements by meaningful names • Specifies operations on storage in application terms, e.g. read a real number • Maps the logical organisation to the physical storage media • Hides details of the physical organisation • Using device drivers allows all I/O to be treated alike
1.1 Device drivers • A transfer from/to a peripheral requires a series of steps: • Check current status • Initiate status change • Request transfer • Receive notification of completion • 1.2 File Subsystem requirements • Specifying logical file characteristics • Cataloguing/locating files • Mapping logical to physical organisation • Supporting file operations • Controlling access
1.3 File System Structure • I/O Control: • Consists of device drivers and interrupt handlers that transfer information between main memory and the disk system. • The basic function of the I/O controller is to access a specific location on the device. • Basic File System: • Its main function is to issue generic commands to the appropriate device driver to read and write physical blocks on disk. • The basic file system uses the concept of a physical address space. Each block is identified by a numeric disk address (e.g. drive, cylinder, track, sector). • File Organization Module: • Keeps track of the file allocation method used and of the mapping between logical and physical blocks. • Knowing the type of file allocation and the physical location of the file, the file organization module translates a logical block address to a physical address.
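The translation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the disk geometry constants, the `DiskAddress` type and the per-file block list are hypothetical and do not come from the slides, and one block is assumed to equal one sector.

```python
from typing import List, NamedTuple


class DiskAddress(NamedTuple):
    cylinder: int
    track: int   # track (surface) within the cylinder
    sector: int


# Hypothetical geometry, chosen only for illustration.
TRACKS_PER_CYLINDER = 4
SECTORS_PER_TRACK = 16


def physical_address(physical_block: int) -> DiskAddress:
    """Basic file system view: linear physical block -> (cylinder, track, sector)."""
    if physical_block < 0:
        raise ValueError("block number must be non-negative")
    cylinder, rest = divmod(physical_block, TRACKS_PER_CYLINDER * SECTORS_PER_TRACK)
    track, sector = divmod(rest, SECTORS_PER_TRACK)
    return DiskAddress(cylinder, track, sector)


def locate(file_blocks: List[int], logical_block: int) -> DiskAddress:
    """File organization module view: logical block of a file -> disk address.

    file_blocks[i] is the physical block that holds logical block i of the file.
    """
    return physical_address(file_blocks[logical_block])


# A file whose three logical blocks are scattered over the disk.
example_file = [70, 16, 64]
first = locate(example_file, 0)
```

Here `locate(example_file, 0)` maps logical block 0 to physical block 70, which this assumed geometry places at cylinder 1, track 0, sector 6.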
Layers, from top to bottom: Application programs → Logical File System → File-Organization Module → Basic File System → I/O Control → Devices • Logical File System: • Manages metadata (file structure information, inode information and file control block information). • The information about each file is held in a file control block (FCB). The FCB includes the file name, inodes, permissions, location of the file content, etc.
1.4 File System Structure Implementation • A file system is implemented with structures both on disk and in memory. • The implementation varies with the operating system and the file system in use. • On Disk Implementation: • Disk Label (VTOC) • Boot Control Block • Primary Super Block • Backup Super Block • Inode
In Memory Implementation: • A partition table contains information about each mounted partition. • The in-memory directory structure holds information about recently accessed directories. • The open file table contains a copy of the inode for each open file. • Question: what do the on-disk and in-memory parts of a file system implementation contain, and why are both needed?
2. UNIX file characteristics • Structure • A file is a flat sequence of bytes • UNIX imposes no structure • Other operating systems sometimes do • Naming conventions • A character sequence is used as the name • The maximum length can vary • Type • UNIX does not infer the type from the name • UNIX has different file types: • Regular • Directories • Links (hard or soft; only a soft, i.e. symbolic, link is a separate file type) • Device
Organisation • Defines whether a file is accessed sequentially or randomly • Both are supported by the current offset pointer • Access • Defines who can do what to a file • Records when it was last done • 3. Physical Storage • Physical storage is mainly: • hard disks • CD-ROM/DVD • floppy disks • magnetic tape (backup)
4. UNIX File System • tree structured • only one tree (one root) • may span multiple disks • i.e. it uses a device independent hierarchical structure which is regarded as a tree: • root • user (containing user files) • bin, etc, dev, usr • Device file systems can be attached to the tree using the program /etc/mount
Once a device is mounted, files can be accessed using the directory name; • the device does not have to be known, e.g. • cp /user/test.dat mytest.dat • copies a file to the current directory. • This has the advantage that a file system can be moved to a different device without the programs that use it needing modification. • MS-DOS and Windows, on the other hand, are not device independent, • i.e. one has to use device names, e.g. • copy a:test.dat c:\user
4.1 Types of file • UNIX has: • regular files - users' programs and data, etc. • directories • special files - I/O devices, e.g. /dev/tty and /dev/hd1 • MS-DOS/Windows has: • regular files - users' programs and data, etc. • directories • special files - prn:, con:, com1:, etc. • 4.2 File names • Early standard UNIX allows up to 14 characters in a filename (later versions allow much longer names), with combinations of name and extension as required, e.g. test.data. • In UNIX the extension is solely for the programmer's convenience, • i.e. to identify types of files at a logical level, • e.g. a user may end all data files with .data
4.3 Using files • Two basic operations are needed: read and write. • At a program level one usually has a set of language, library or system calls to access files • 4.4 Directories (only the OS can write into a directory; justify?) • A directory is a special system file which contains details of other files • Directories provide a logical interface that lets the user keep track of files. • Simple file systems, e.g. CP/M, have one directory per device: • these become large with many users • one can have name conflicts. • Alternatives are to have one directory per user (RSX) or many directories per user (UNIX, MS-DOS, Windows). • Files can then be referred to by: • absolute path names (from the root), e.g. on UNIX /usr1/stf/bb/public/opsys/progs/pipe1.cc • path names relative to the current directory, e.g. network/programs/terminal or ../../test_data/test.data
4.5 File structure • At the lowest (physical) level one can read/write blocks from/to a device. • At a logical level one reads/writes records, where a record can be anything from a byte (read a character) to N bytes (read a large structure). • 4.6 Disk space management • The surface of a disk is divided into a number of concentric tracks, each of which is divided into a number of sectors. • The unit of reading/writing from/to a disk is a block (a block may be anything from a sector to a track). • Disk access time is made up of the following components (latency and transfer time depend on the rotation speed of the disk): • start-up time (for floppies) ~ 0.5 s • track seek time (time to move the head to the required track) ~ 5-20 ms for a hard disk • latency (time for the required sector to appear under the head) ~ 0.5-5 ms • transfer time ~ 0.05-0.1 ms for a 1 Kbyte block • The larger the block size, the faster the transfer rate, but the more space is wasted. • A disk cache can improve transfer rates significantly.
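The timing components listed above can be combined in a short calculation. The sketch below is illustrative only: the function name and the sample figures (10 ms seek, 6000 rpm, 0.1 ms per 1 Kbyte block) are assumptions picked from within the ranges quoted on the slide, and average latency is taken as half a rotation.

```python
def access_time_ms(seek_ms: float, rpm: float,
                   transfer_ms_per_block: float, blocks: int = 1) -> float:
    """Estimated time for one disk access that transfers `blocks` consecutive blocks."""
    rotation_ms = 60_000 / rpm          # time for one full rotation
    latency_ms = rotation_ms / 2        # on average, half a rotation
    return seek_ms + latency_ms + transfer_ms_per_block * blocks


# Assumed sample figures, not measurements.
SEEK_MS = 10.0
RPM = 6000
TRANSFER_MS = 0.1

# Eight separate one-block accesses versus one access of eight blocks.
eight_small = 8 * access_time_ms(SEEK_MS, RPM, TRANSFER_MS, blocks=1)
one_large = access_time_ms(SEEK_MS, RPM, TRANSFER_MS, blocks=8)

print(f"8 separate accesses: {eight_small:.1f} ms")
print(f"1 large access:      {one_large:.1f} ms")
```

With these assumed figures, seek time and latency dominate, which is why transferring larger units per access gives a much higher effective transfer rate.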
4.7 Disk Partitions • To assist in the overall organisation of the file system, large disks can be split into logical disks by partitions. • Boot block (Master boot record) • Partition 1 • Partition 2 etc. • Each partition can be either "raw", containing no file system, or "cooked", containing a file system. • A raw disk is used when no file system is appropriate. • For example, UNIX swap space uses a raw partition, as it uses its own format and does not use a file system. • Boot information can be stored in a separate partition. (Why?) • Root partition: contains the operating system kernel and system files, and is mounted at boot time.
5 Unix File storage • A file system needs to keep track of: • where every file is stored • details of each file • where the next block of a file is • which blocks are free • which blocks are in use • 5.1 Allocating Space to Files • Allocate a contiguous sequence of blocks • leaves small unused areas • problem with file growth • Linked list of blocks • problem with random access • Index blocks (UNIX uses a version of this)
5.2 UNIX file storage structure • only the O/S can write to a directory • each directory entry has an i-node number and a name:
┌───────────────┬───────────┐
│ i-node number │ file name │
└───────────────┴───────────┘
• the command df (disk free) can tell you how many i-nodes are free. • An i-node contains the following information on the file: • file mode (indicates type of file - normal file, special file, etc.) • number of links to the file (e.g. from other directories) • owner's user id • owner's group id • access permissions for each user type, e.g. read, write, and execute • file size in bytes • time last accessed, last modified, and when the i-node was last changed • location of the first 10 blocks (if the file is smaller than 10 blocks, these addresses cover the whole file) • single indirect, double indirect and triple indirect block addresses
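Most of the i-node fields listed above are visible from a program through the `stat` system call. The sketch below uses Python's standard `os.stat`; the helper name `describe_inode` and the temporary file are illustrative assumptions, and the block addresses themselves are not exposed through this interface.

```python
import os
import stat
import tempfile


def describe_inode(path: str) -> dict:
    """Collect the i-node information that os.stat exposes for `path`."""
    st = os.stat(path)
    if stat.S_ISDIR(st.st_mode):
        kind = "directory"
    elif stat.S_ISREG(st.st_mode):
        kind = "regular file"
    else:
        kind = "other"
    return {
        "inode": st.st_ino,                         # i-node number
        "type": kind,                               # derived from the file mode
        "links": st.st_nlink,                       # number of directory entries
        "uid": st.st_uid,                           # owner's user id
        "gid": st.st_gid,                           # owner's group id
        "permissions": stat.filemode(st.st_mode),   # e.g. -rw-r--r--
        "size_bytes": st.st_size,
        "accessed": st.st_atime,
        "modified": st.st_mtime,
    }


# Create a small throwaway file and inspect it.
with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "example.txt")
    with open(path, "wb") as handle:
        handle.write(b"hello")
    info = describe_inode(path)

print(info)
```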
5.3 Locating a file • Locating a file's data requires the following loop: • i-node → data → i-node → data … • Note: • On traditional UNIX file systems the root i-node is always i-node 2 • Each directory has entries for . (current directory) and .. (parent directory)
Figure: locating x.data through the root, user, staff and sam directories. Each directory entry leads to an i-node, and the location field of that i-node leads to the next directory (or, at the end, to the file):
root dir [user] ──► user i-node [location] ──► user dir [staff] ──► staff i-node [location] ──► staff dir [sam] ──► sam i-node [location] ──► sam dir [x.data] ──► x.data i-node [location] ──► file
The sequence of events is: • the root directory is searched for the user directory file entry • the i-node number is extracted and the location of the user directory found from the i-node • the user directory is searched for the staff directory file entry • the i-node number is extracted and the location of the staff directory found from the i-node • the staff directory is searched for the sam directory file entry • the i-node number is extracted and the location of the sam directory found from the i-node • the sam directory is searched for the x.data file entry • the i-node number is extracted and the locations of the x.data file found from the i-node
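The sequence of events above is a loop that alternates between directory data and i-nodes. The following is a minimal in-memory sketch: the i-node numbers other than the root (2), the dictionary layout and the `resolve` function are hypothetical and only model the lookup, not a real on-disk format.

```python
ROOT_INODE = 2  # the root i-node number used in the notes above

# Hypothetical i-node table: number -> i-node. Directories map names to i-node numbers.
inodes = {
    2: {"kind": "dir", "entries": {".": 2, "..": 2, "user": 5}},
    5: {"kind": "dir", "entries": {".": 5, "..": 2, "staff": 9}},
    9: {"kind": "dir", "entries": {".": 9, "..": 5, "sam": 14}},
    14: {"kind": "dir", "entries": {".": 14, "..": 9, "x.data": 21}},
    21: {"kind": "file", "data": b"contents of x.data"},
}


def resolve(path: str) -> int:
    """Return the i-node number for an absolute path, one component at a time."""
    if not path.startswith("/"):
        raise ValueError("only absolute paths are handled in this sketch")
    number = ROOT_INODE
    for name in path.split("/"):
        if not name:                      # skip empty parts from leading or double slashes
            continue
        node = inodes[number]             # read the i-node to find the directory data
        if node["kind"] != "dir":
            raise NotADirectoryError(name)
        if name not in node["entries"]:   # search the directory for the entry
            raise FileNotFoundError(path)
        number = node["entries"][name]    # extract the i-node number of the next step
    return number


target = resolve("/user/staff/sam/x.data")
print(target, inodes[target]["data"])
```

Because every directory carries `.` and `..` entries, the same loop also handles paths such as `/user/staff/..` without special cases.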
5.4 The i-node and data addressing • Addresses of the data blocks of a file are stored in the i-node: • 10 direct pointers • 1 single indirect pointer to a block of addresses • 1 double indirect pointer to a block which contains pointers to blocks of addresses. • Some systems have a third level pointer. • 5.4.1 How big should the pointers be? • Early versions of Unix used 16 bit pointers. • With 1K blocks this limited the largest disk to 64 Mbytes (65,536 blocks x 1 Kbyte). • Later versions of Unix used 32 bit pointers and 4K (or 8K) blocks. • This gives a maximum file/disk size of 4G blocks x 4 Kbytes = 16 Tbytes. • Modern versions of Unix use 64 bit pointers.
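The limits in 5.4.1 follow from simple arithmetic, which the sketch below reproduces. The function names are hypothetical. The second function estimates how many bytes the 10 direct, single indirect and double indirect pointers can address, assuming each indirect block is completely filled with pointers of the given size.

```python
def max_disk_bytes(pointer_bits: int, block_size: int) -> int:
    """Largest disk addressable when a block number must fit in `pointer_bits` bits."""
    return (2 ** pointer_bits) * block_size


def max_file_bytes(block_size: int, pointer_bytes: int,
                   direct: int = 10, indirect_levels: int = 2) -> int:
    """Bytes reachable through the direct pointers plus the indirect levels."""
    pointers_per_block = block_size // pointer_bytes
    blocks = direct + sum(pointers_per_block ** level
                          for level in range(1, indirect_levels + 1))
    return blocks * block_size


early = max_disk_bytes(16, 1024)    # 16 bit pointers, 1K blocks
later = max_disk_bytes(32, 4096)    # 32 bit pointers, 4K blocks

print(early // 2**20, "Mbytes")     # 64 Mbytes
print(later // 2**40, "Tbytes")     # 16 Tbytes
print(max_file_bytes(4096, 4), "bytes via 10 direct + single + double indirect")
```

Adding a third indirect level (`indirect_levels=3`) shows why some systems include it: each extra level multiplies the reachable blocks by the number of pointers per block.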
5.4.2 File Operations • Opening an existing file • Creating a file • Reading from a file • Writing to a file • Closing a file • Deleting a file • Changing access permissions • Renaming a file
5.4.3 Links in Unix • Each file has one i-node but may have many directory entries • Each name entry is a link to an i-node • links may be hard or soft • Hard links • each directory entry points directly at same i-node • the i-node maintains count of links to it • this only operates on a single device • Soft Links • Special file containing the path to the target file • Separate i-node • Can span devices
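The difference between the two kinds of link can be observed with the standard `os.link` and `os.symlink` calls. This is a small experiment in a temporary directory, assuming a UNIX-like system where both calls are permitted; the file names are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    target = os.path.join(directory, "target.txt")
    hard = os.path.join(directory, "hard.txt")
    soft = os.path.join(directory, "soft.txt")

    with open(target, "w") as handle:
        handle.write("data")

    os.link(target, hard)       # hard link: a second directory entry for the same i-node
    os.symlink(target, soft)    # soft link: a separate file that stores the target path

    same_inode = os.stat(target).st_ino == os.stat(hard).st_ino
    link_count = os.stat(target).st_nlink
    soft_has_own_inode = os.lstat(soft).st_ino != os.stat(target).st_ino
    soft_points_to = os.readlink(soft)

    # Remove the original name: the i-node survives while a hard link remains.
    os.remove(target)
    with open(hard) as handle:
        hard_still_readable = handle.read() == "data"
    soft_is_dangling = not os.path.exists(soft)

print(same_inode, link_count, soft_has_own_inode, hard_still_readable, soft_is_dangling)
```

After the original name is removed, the data is still reachable through the hard link, while the soft link is left pointing at a path that no longer exists.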
5.5 Efficiency and Performance • Unix uses a large block of memory as a buffer cache • As blocks are read they are stored in the cache; reading the next block can go on while the current block is being processed • Is the cache sufficiently large or not? • Further improvement comes from delayed write (this can be a problem if the system crashes) • i-nodes are written back immediately • Written data blocks are flushed after a few seconds • A block is treated as written to disk but is only marked delayed write in the cache. • The block can be modified further before it reaches the head of the list, when it is actually written. This is useful if the file is deleted before the block is written. • Eventually the cache fills • The block that was accessed longest ago (least recently used) is flushed • Read ahead improves efficiency
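The combination of a buffer cache, delayed write and least-recently-used replacement can be modelled in a few lines. This is a toy sketch, not how any particular Unix kernel implements it: the `BufferCache` class, its counters and the dictionary standing in for the disk are all assumptions made for illustration.

```python
from collections import OrderedDict


class BufferCache:
    """Toy block cache with delayed write and least-recently-used eviction."""

    def __init__(self, disk: dict, capacity: int):
        self.disk = disk                 # block number -> bytes; stands in for the device
        self.capacity = capacity         # maximum number of cached blocks
        self.blocks = OrderedDict()      # block number -> (data, dirty); oldest first
        self.disk_reads = 0
        self.disk_writes = 0

    def read(self, number: int) -> bytes:
        if number in self.blocks:        # cache hit: no disk access needed
            self.blocks.move_to_end(number)
            return self.blocks[number][0]
        self.disk_reads += 1
        data = self.disk[number]
        self._store(number, data, dirty=False)
        return data

    def write(self, number: int, data: bytes) -> None:
        # Delayed write: only the cached copy changes; the disk is updated later.
        self._store(number, data, dirty=True)

    def _store(self, number: int, data: bytes, dirty: bool) -> None:
        self.blocks[number] = (data, dirty)
        self.blocks.move_to_end(number)
        while len(self.blocks) > self.capacity:
            old, (old_data, old_dirty) = self.blocks.popitem(last=False)
            if old_dirty:                # flush the least recently used block
                self.disk[old] = old_data
                self.disk_writes += 1

    def flush(self) -> None:
        """Write every delayed block back to disk."""
        for number, (data, dirty) in list(self.blocks.items()):
            if dirty:
                self.disk[number] = data
                self.disk_writes += 1
                self.blocks[number] = (data, False)


disk = {0: b"a", 1: b"b", 2: b"c"}
cache = BufferCache(disk, capacity=2)
cache.read(0)
cache.read(0)               # second read is served from the cache
cache.write(1, b"B")
cache.write(1, b"BB")       # modified again before it ever reaches the disk
cache.read(2)               # cache full: block 0 (clean) is dropped
cache.read(0)               # cache full: block 1 (dirty) is flushed to disk
print(cache.disk_reads, cache.disk_writes, disk[1])
```

The two writes to block 1 result in a single disk write, which is the benefit of delayed write. The risk noted above is equally visible: until the flush happens, the disk still holds the old contents.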
6 Log Structured File Systems • CPUs are getting faster • Memory is getting faster • Disks are getting bigger, but not much faster • This creates a bottleneck in the file system, especially for large file servers (solution?) • Log structured file systems: • Most accesses are satisfied by the cache • Writes slow the system, because each transfers only a small quantity of data • Disks operate most efficiently with large writes (one or more tracks), therefore: • Collect writes together and write them all at once as a log record. • If the record is big (~ 1 Mbyte) the disk will operate efficiently. • A record contains i-nodes, directories and data mixed together. • A table is needed to keep track of where every i-node is. Keep this in memory and on disk.
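The idea of collecting small writes into one large log record, with a table that tracks where each i-node currently lives, can be sketched as follows. This is a heavily simplified model under stated assumptions: the `LogFS` class is hypothetical, each write replaces the whole content of an i-node, and a Python list stands in for the disk.

```python
class LogFS:
    """Toy log-structured store: small writes are batched into large log records."""

    def __init__(self, record_size: int):
        self.record_size = record_size   # bytes to collect before one large disk write
        self.log = []                    # flushed records, in order; stands in for the disk
        self.pending = []                # (i-node number, data) still held in memory
        self.pending_bytes = 0
        self.inode_map = {}              # i-node number -> (record index, position)

    def write(self, inode: int, data: bytes) -> None:
        self.pending.append((inode, data))
        self.pending_bytes += len(data)
        if self.pending_bytes >= self.record_size:
            self.flush()

    def flush(self) -> None:
        """Write all pending items to the log as a single record."""
        if not self.pending:
            return
        record_index = len(self.log)
        for position, (inode, _) in enumerate(self.pending):
            self.inode_map[inode] = (record_index, position)   # newest copy wins
        self.log.append(self.pending)
        self.pending = []
        self.pending_bytes = 0

    def read(self, inode: int) -> bytes:
        for pending_inode, data in reversed(self.pending):     # newest data first
            if pending_inode == inode:
                return data
        record_index, position = self.inode_map[inode]
        return self.log[record_index][position][1]


fs = LogFS(record_size=8)
fs.write(1, b"aaaa")
fs.write(2, b"bbbb")        # 8 bytes collected: one record is written
fs.write(1, b"cc")          # newer version of i-node 1, still in memory
fs.flush()
print(len(fs.log), fs.inode_map[1], fs.read(1))
```

After the second flush, the copy of i-node 1 in the first record is no longer referenced by the table. Stale copies like this are what a compaction pass over old log records would reclaim.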
Note: • Much more complex to administer. • Eventually the disk fills. • A garbage collection process goes through the log records and compacts them. The disk operates like a very large circular buffer. • 7 DOS/Windows • The file system has: • Boot sector • FAT • Root directory and data blocks • The directory entry contains all the details about the file, including the name. • It has a pointer to the first block. • To find the next block the system uses a FAT (File Allocation Table). • The FAT is a large one-dimensional array. There is an entry for each block, which contains either: • the address of the next block, or • an End Of File marker
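Following a file through the FAT amounts to walking a linked list stored in an array. The sketch below is illustrative only: the marker values, the table contents and the directory entry are made up, and real FAT variants use specific reserved numeric values rather than these placeholders.

```python
END_OF_FILE = -1    # placeholder End Of File marker (an assumption for this sketch)
FREE = None         # placeholder for an unused block

# One FAT entry per block. A file occupies blocks 2 -> 5 -> 3.
fat = [FREE] * 8
fat[2] = 5
fat[5] = 3
fat[3] = END_OF_FILE

# The directory entry holds the file details and a pointer to the first block.
directory_entry = {"name": "TEST.DAT", "first_block": 2}


def file_blocks(table: list, first_block: int) -> list:
    """Return the blocks of a file in order by following the FAT chain."""
    blocks = []
    seen = set()
    block = first_block
    while block != END_OF_FILE:
        if block in seen or table[block] is FREE:
            raise ValueError("corrupt FAT chain")
        seen.add(block)
        blocks.append(block)
        block = table[block]        # the entry for a block names the next block
    return blocks


chain = file_blocks(fat, directory_entry["first_block"])
print(chain)
```

Reaching the n-th block of a file means following n entries from the start, so random access depends on the FAT being held in memory.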