Auxiliary Storage Management: Chapter 10 • File attributes: • Name • Type (often defined by an extension such as .doc, .c, .java, .txt, .gif, .jpg, etc.) • Location – or at least where it begins • Size • Protection (access rights) • Time • Date (expiration, last access/update/modification – changing a file's data or characteristics)
Note the ls -l and chmod commands. • Program iostat.c shows how to display many of a file's attributes. • Often an attribute specifies the program that created the file. • This allows double-clicking on a file and having it opened by the correct program. • Linux touch command: • Ex. touch -t 0412250606 filename sets the last access and modification times to Christmas of 2004 at 6 minutes after 6:00 am (add -a to change only the access time) • Can see results via ls -l (modification time) or ls -lu (access time); ls -lc shows the inode change time, which touch sets to the current time
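The attributes listed above can be read from a C program with the POSIX stat() call. The sketch below is not the iostat.c program mentioned in these notes (that program is not reproduced here); show_attributes is a hypothetical helper name, and only a handful of the fields in struct stat are printed.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

/* Print a few attributes of the file at `path`.
   Returns 0 on success, -1 if stat() fails. */
int show_attributes(const char *path)
{
    struct stat sb;

    if (stat(path, &sb) == -1) {
        perror(path);
        return -1;
    }
    printf("size:        %lld bytes\n", (long long)sb.st_size);
    printf("permissions: %o\n", (unsigned)(sb.st_mode & 0777));
    printf("hard links:  %lu\n", (unsigned long)sb.st_nlink);
    printf("inode:       %lu\n", (unsigned long)sb.st_ino);
    printf("last access: %s", ctime(&sb.st_atime));  /* what ls -lu shows */
    printf("last modify: %s", ctime(&sb.st_mtime));  /* what ls -l shows  */
    printf("last change: %s", ctime(&sb.st_ctime));  /* what ls -lc shows */
    return 0;
}
```

Calling show_attributes("somefile") before and after a touch command should make the change in the access and modification times visible.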
File operations: • Creating, writing, reading, repositioning (like a seek), deleting, truncating (remove contents, leave attributes intact), appending. • Modes for opening a file in C • r for reading (file must exist or an error occurs) • r+ for reading and writing (positioned at beginning of file, which must exist or an error occurs) • w for writing: creates a new file or truncates an existing one • w+ for reading and writing (file is created if it does not exist, truncated if it does) • a for append (positioned at end of file; created if it does not exist) • a+ for reading and appending (writes always go to the end of file). See man fopen.
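A minimal sketch of three of the fopen modes. The function name mode_demo and the strings written are made up for illustration: "w" creates or truncates the file, "a" adds to the end, and "r" reads the result back.

```c
#include <stdio.h>

/* Write "abc" with mode "w", append "def" with mode "a", then read the
   file back with mode "r" into buf. Returns 0 on success, -1 on error. */
int mode_demo(const char *path, char *buf, size_t buflen)
{
    FILE *f = fopen(path, "w");          /* create, or truncate if it exists */
    if (f == NULL)
        return -1;
    fputs("abc", f);
    fclose(f);

    f = fopen(path, "a");                /* every write goes to end of file */
    if (f == NULL)
        return -1;
    fputs("def", f);
    fclose(f);

    f = fopen(path, "r");                /* fails if the file does not exist */
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)buflen, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return 0;
}
```

After a successful call, buf holds the text written by both the "w" and the "a" opens. Opening a nonexistent file with "r" returns NULL, which is the error case noted above.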
Information associated w/ an open file (stuff in FILE *) • disk location • # procs which opened file • current record position, etc.
File Types • Figure 10.1 on page 410 • Extensions do not require the file to be of a particular type – they just help with organization. • NOTE : A directory is also a file type.
File locking • shared locks for reading; • exclusive locks for writing • Try to open the same file on two different machines.
Access methods - Section 10.2 • Sequential access • Information processed in order. • Reads in order; writes by appending. • Direct access • Information stored by record/block. • Access by specifying record/block #. • Demo program fseek.c uses the fseek function. • Not strictly direct access, but it is something other than sequential.
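Record-style access with fseek can be sketched as follows. This is not the fseek.c demo program itself; RECORD_SIZE and read_record are hypothetical names, and the file is assumed to hold fixed-length records stored back to back.

```c
#include <stdio.h>

#define RECORD_SIZE 16   /* hypothetical fixed record length in bytes */

/* Read record number n (0-based) from f into rec.
   Returns 0 on success, -1 if the seek or the read fails. */
int read_record(FILE *f, long n, char rec[RECORD_SIZE])
{
    /* Jump straight to the byte offset where record n begins. */
    if (fseek(f, n * RECORD_SIZE, SEEK_SET) != 0)
        return -1;
    return fread(rec, RECORD_SIZE, 1, f) == 1 ? 0 : -1;
}
```

The record number is turned into a byte offset, so record 2 can be read without first reading records 0 and 1. Asking for a record past the end of the file fails in fread rather than in fseek.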
B+ tree indexes • Complex collection of indexes stored as a B+ tree structure. • Bottom of the tree has the records. • Allows a search based on key value • Hashed storage • A hash function transforms a key value into a location. Record is stored there. • To retrieve a record based on key value, apply the hash function to it and look in the resulting location. • There are issues dealing with good hash functions and what happens if two keys “hash” to the same location • Both of these are more appropriate for a data structures course
Directory structures – Section 10.3 • Single level (linear structure); some old computers did not support directories • Tree structured (most, if not all, systems today). • System-wide file name • consists of the path to the current directory (command pwd) followed by the file name inside that directory. • NOTE: Directories may be presented as subdirectories or as graphic folder icons. They really are the same.
Acyclic-graph directories (figure: subdir1 and subdir2 both contain the same shared file) • A file may exist in two separate subdirectories. • Shortcuts (Windows) – create a shortcut and store it in a different folder. • Aliases (Macintosh) • May be implemented by links (Linux)
Two ways to create links in Linux: • Hard link • Ex. ln /home/shayw/452/storage/test.txt hlink • Soft link (or symbolic link) • Ex. ln -s /home/shayw/452/storage/test.txt slink • We'll describe the difference later.
File System Mounting – Section 10.4 • A file system must be mounted before it can be used. • Example: one file system (say on a CD-ROM) is merged into another. • Allows a file system to be spread over multiple devices (see Linux man mount) • Mount point – location where the new file system will be attached (figure: a new file system attached at a mount point in the existing file system).
Section 10.6 Protection • Access lists • Linux ls -l command output: file type (- for a regular file, d for a directory, l for a link), then rwx permissions for owner, group, and world (rwxrwxrwx), then links, owner, size, date, filename • E.g., octal 777 means rwx for all; 754 means rwx (7) for owner, rx (5) for group, r (4) for world (some use universe, which I guess is a little broader in scope). • Windows has something similar: right-click on a file (or a shared folder) and select Properties.
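The mapping from an octal mode to the rwx string shown by ls -l is just a test of nine bits. A small sketch follows; mode_to_string is a hypothetical helper, not a library function.

```c
#include <string.h>

/* Convert the low nine permission bits of `mode` into the rwxrwxrwx
   form shown by ls -l (owner, group, world). out must hold 10 chars. */
void mode_to_string(unsigned mode, char out[10])
{
    memcpy(out, "rwxrwxrwx", 10);        /* copies the terminating '\0' too */
    for (int i = 0; i < 9; i++) {
        /* Bit 8 is owner-read, bit 0 is world-execute. */
        if (!(mode & (1u << (8 - i))))
            out[i] = '-';
    }
}
```

With mode 0754 this fills out with rwxr-xr--, matching the owner/group/world breakdown above.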
Chapter 11: File Systems • Layers: App → logical file system → file-org. module → basic file system → I/O control → devices • App: Like a read or write in any language • Logical file system: • How is the directory organized? • Does the file exist? • Does the user have access? • Read access? • Write access? Etc. • Where is the logical location of the file (logical block 0 through N)?
File-org module: • How is file allocated to disk? • Determine the physical address on disk (disk, track, and sector) from the logical block?
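One classic way to turn a logical block number into a physical address is simple division by the disk geometry. The sketch below assumes one sector per logical block, a fixed number of surfaces and sectors per track, and 0-based numbering throughout; these are all illustrative assumptions. Modern drives hide their real geometry behind logical block addressing, so treat this as a model rather than what a current driver does. The names struct chs and block_to_chs are hypothetical.

```c
/* Physical address in the (track, surface, sector) form used in these notes. */
struct chs {
    unsigned track;
    unsigned surface;
    unsigned sector;
};

/* Map a logical block number onto a disk with `surfaces` recording surfaces
   and `sectors_per_track` sectors on each track. All numbering is 0-based. */
struct chs block_to_chs(unsigned long block, unsigned surfaces,
                        unsigned sectors_per_track)
{
    struct chs a;
    unsigned long per_cylinder = (unsigned long)surfaces * sectors_per_track;

    a.track   = (unsigned)(block / per_cylinder);
    a.surface = (unsigned)((block / sectors_per_track) % surfaces);
    a.sector  = (unsigned)(block % sectors_per_track);
    return a;
}
```

Consecutive logical blocks fill one track, then move to the next surface at the same track position, and only then move the head to a new track, which keeps seeks to a minimum for sequential reads.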
See also [http://www.ntfs.com/hard-disk-basics.htm] • Seek time: time to move head from one track to another • Rotational delay (latency): time for proper sector to rotate past the head • Transfer time: time to transfer bits in a sector
Basic file system • generate simple commands to driver: • ex. read surface i, track j, sector k. • I/O control: • Consists of device drivers and interrupt handlers. • Issues low level hardware specific commands to a controller. • Can test status of controller or an operation. • Usually done by writing/testing certain bit patterns in a controller register. • Devices: as previously described.
File System: • Collection of FCBs (file control blocks), each of which describes a file. • Contains: permissions, dates, ownership, size, location of blocks (an FCB is called an inode in Linux terminology). • Boot control block: • contains code necessary to boot an OS from the volume or partition • Volume control block: • contains #blocks, size of blocks, #free blocks (called a superblock in UNIX/Linux; NTFS keeps this information in its master file table, MFT)
Open a file (What does open do?) • Search the system-wide open-file table to see if the file is already open. If so, a process open-file table entry is created, pointing to the system-wide entry (Fig 11.3). • Otherwise, search the directory for the file name. Does the file exist? Do permissions allow access? Where is it? • Copy the FCB into the system-wide open-file table. The table also knows how many processes have the file open. • Create an entry in the process open-file table and have it point to the system-wide open-file table entry. • Open returns a pointer to the process open-file table entry (file descriptor or file handle)
Raw disk: • no file system (swap space, some databases, RAID systems) • Disk may have multiple partitions each with its own file system. • Old Windows 98. When a disk increased beyond a particular size, had to have different drive letters to use entire disk space. • Partition may be spread over multiple disks.
Section 11.4 Allocation methods: • Contiguous allocation: • Stored in consecutive disk blocks: • problem w/ external fragmentation (no contiguous space big enough) • See Figure 11.5. • Linked allocation: • Stored as a linked list of disk blocks: • Initial sectors optimized to reduce seek times; many additions/deletions scatter a file's blocks across the disk, so the drive needs to be defragmented. • Internal fragmentation (unused space inside the last block) also occurs, as with any block-based allocation. • Mainly useful for sequential access files.
FAT (File Allocation Table) system • originated with MS-DOS • used through Windows 95 and 98
Directory (list of entries on each disk). Each directory entry includes: • File name (DOS originally had 8 bytes for this) • File extension • Attribute vector (bits indicating read only, system file, hidden file, directory) • Time and date of creation & last update. • File size
• Location of 1st FAT entry (also 1st cluster number or disk block) • Figure (FAT entries 23 through 36): the directory entry points to FAT entry 23; entry 23 holds 34, entry 34 holds 28, entry 28 holds 31, entry 31 holds 25, and entry 25 holds -1 (end of file). This file is allocated to clusters 23, 34, 28, 31, and 25.
See also figure 11.7 • NOTE: lots of disk head movements unless the FAT is cached.
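Following a FAT chain is a walk through an array in which each entry names the next cluster. A sketch, assuming the table is held in memory as an int array with -1 as the end marker (as in the example where a file occupies clusters 23, 34, 28, 31, and 25); fat_chain and FAT_END are hypothetical names, and a real FAT uses reserved bit patterns rather than -1.

```c
#include <stddef.h>

#define FAT_END (-1)     /* end-of-chain marker used in the example */

/* Follow the chain that starts at cluster `first`, storing at most `max`
   cluster numbers in out. Returns how many clusters were stored. */
size_t fat_chain(const int *fat, int first, int *out, size_t max)
{
    size_t n = 0;

    for (int c = first; c != FAT_END && n < max; c = fat[c])
        out[n++] = c;    /* fat[c] holds the cluster that follows c */
    return n;
}
```

Each step of the loop is one table lookup, which is why caching the FAT in memory matters: without the cache every lookup could cost a disk head movement.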
Indexed Allocation • First block is an index block - an array of indexes, each pointing to another block (of data) • Essentially a two-level hierarchy • Can expand to multiple levels • Figure 11.8.
The Unix inode (a directory entry associates a filename with an inode) • Fields: mode, owner, timestamps, size, block count • Direct blocks: up to 12 pointers, each to a data block • Single indirect: points to a block of pointers to data blocks • Double indirect: points to a block of pointers to indirect blocks • Triple indirect: points to a block of pointers to second-level indirect blocks • See also figure 11.9
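To see how the inode's pointers divide up a file, the sketch below reports which kind of pointer covers a given logical block number. It assumes 12 direct pointers (as in the figure) and takes the number of pointers that fit in one indirect block as a parameter; a value such as 1024 (4 KB blocks with 4-byte pointers) is only an illustrative assumption. inode_level is a hypothetical helper.

```c
/* Return which inode pointer covers logical block `lbn` of a file:
   0 = direct, 1 = single indirect, 2 = double indirect, 3 = triple indirect,
   -1 = beyond the largest file this inode layout can describe. */
int inode_level(unsigned long long lbn, unsigned long long ptrs_per_block)
{
    const unsigned long long direct = 12;
    unsigned long long p = ptrs_per_block;

    if (lbn < direct)
        return 0;
    lbn -= direct;               /* skip the blocks the direct pointers cover */
    if (lbn < p)
        return 1;
    lbn -= p;                    /* skip the single-indirect range */
    if (lbn < p * p)
        return 2;
    lbn -= p * p;                /* skip the double-indirect range */
    if (lbn < p * p * p)
        return 3;
    return -1;
}
```

Small files never touch an indirect block, so the common case costs no extra disk reads; only large files pay for one, two, or three extra levels of lookup.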
fid = fopen("filename", ...) • File descriptor table: one entry for each open file; entries 0, 1, and 2 are stdin, stdout, and stderr • System file table: one entry for each open file system-wide. Contains: location of inode, current position in the file, mode (rwx), # of fds (file descriptors) pointing to it. • Inode: contains file attributes • A disk directory is a list of FCBs, each locating one inode.
When inodes correspond to a directory • Directories are files. • Data blocks contain a collection of (entry, inode#) pairs. • In Linux, can open directories and read through them much as you would a file • See the library functions opendir and readdir • Each entry is the name of a file in that directory. • Also contains entries for “.” and “..”. • Program directory.c demonstrates how this works.
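Reading a directory with opendir and readdir looks much like reading a file one record at a time. The sketch below is not the directory.c program referred to in these notes; list_directory is a hypothetical function that prints each (inode number, name) pair it finds.

```c
#include <dirent.h>
#include <stdio.h>

/* Print the inode number and name of every entry in the directory `path`.
   Returns the number of entries read, or -1 if the directory cannot be opened. */
int list_directory(const char *path)
{
    DIR *d = opendir(path);
    struct dirent *e;
    int count = 0;

    if (d == NULL) {
        perror(path);
        return -1;
    }
    while ((e = readdir(d)) != NULL) {   /* NULL signals end of directory */
        printf("%10lu  %s\n", (unsigned long)e->d_ino, e->d_name);
        count++;
    }
    closedir(d);
    return count;
}
```

On a typical Linux file system the output for any directory should include the “.” and “..” entries, and the inode numbers should agree with what ls -i reports.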
Linux hard links: • Two directory entries that refer to the same inode. • The inode keeps track of how many directory entries reference it. • Removing the original file just removes the directory entry but leaves the inode (and the data) intact. • A hard link continues to point to data after the original file is deleted • Takes up only a directory slot.
Soft links: • A file with a separate inode. • Data is just the pathname of the actual file. • An original can be removed and re-created and the slink behaves accordingly. • Takes up a directory slot, an inode, and a data block. • Takes up more space than a hard link despite what the command ls -l shows.
Example: • Create a test file /home/shayw/452/storage/test.c. • Enter directory /home/shayw/452/memory and type • ln /home/shayw/452/storage/test.c hlink and • ln -s /home/shayw/452/storage/test.c slink • The first creates a hard link – the second a soft or symbolic link. • Do ls -l and note the results. • Change the test.c file and display the hlink and slink files.
Do ls -i in each subdirectory. Note that hlink has the same inode number as the original file. • Now remove the test.c file. Then display the contents of hlink and slink from the respective directories. The hard link still shows the data; the soft link is left dangling and can no longer be displayed. • Recreate the test.c file. The hard link is unchanged; the soft link reflects the new contents. • Do ls -i again. The inode numbers are now all different.
NOTE: Can set up links to subdirectories also. Makes it appear that one subdirectory can exist within two separate parent directories. • NOTE: must be a soft link (Linux does not let users create hard links to directories).
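The hard link versus soft link behavior can also be checked from C with the POSIX calls link(), symlink(), stat(), and lstat(). In the sketch below, link_demo and the file names passed to it are hypothetical; the three names are assumed to be plain file names in the current directory so that the symbolic link's stored path resolves correctly.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create file `orig`, a hard link `hl`, and a symbolic link `sl`, then remove
   `orig`. Returns 0 if everything behaved as described in the notes. */
int link_demo(const char *orig, const char *hl, const char *sl)
{
    struct stat a, b;
    FILE *f;

    unlink(hl);                          /* clear leftovers from earlier runs */
    unlink(sl);
    f = fopen(orig, "w");
    if (f == NULL)
        return -1;
    fputs("hello\n", f);
    fclose(f);

    if (link(orig, hl) == -1 || symlink(orig, sl) == -1)
        return -1;

    /* Hard link: same inode as the original, and the link count is now 2. */
    if (stat(orig, &a) == -1 || stat(hl, &b) == -1)
        return -1;
    if (a.st_ino != b.st_ino || a.st_nlink != 2)
        return -1;

    /* Soft link: lstat() examines the link itself, which has its own inode. */
    if (lstat(sl, &b) == -1 || !S_ISLNK(b.st_mode) || b.st_ino == a.st_ino)
        return -1;

    unlink(orig);                        /* remove the original name */
    int hard_ok  = (stat(hl, &b) == 0 && b.st_nlink == 1);
    int dangling = (stat(sl, &b) == -1); /* stat() follows the link and fails */

    unlink(hl);
    unlink(sl);
    return (hard_ok && dangling) ? 0 : -1;
}
```

After the original name is removed, the data is still reachable through the hard link, while the symbolic link still exists but points at a path that no longer resolves.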
Other allocation methods. • Keyed file (VSAM) • defined by a hierarchy similar to 2-3 trees or B+ trees • Hash strategy • combination of open/closed hashing • Packing • storing logical records inside of physical records (block) • Tapes • physical vs. logical records. • Inter-record gaps.
Windows • Create a small text file using, say, Notepad. • Right-click on it and select Properties. • NOTE: Size and Size on disk can differ by a potentially LARGE amount. • Why? • Space is allocated in whole clusters, so a file occupies at least one cluster (4K bytes here). • Can also do ls -s in Linux to show the number of blocks for a file.
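The gap between Size and Size on disk comes from rounding up to a whole number of clusters. A sketch of the arithmetic follows; size_on_disk is a hypothetical helper, and the simple model ignores special cases such as very small files that some file systems store inside the file's own metadata.

```c
/* Space occupied by a file of `size` bytes when disk space is handed out
   in clusters of `cluster` bytes: round size up to a cluster multiple. */
unsigned long long size_on_disk(unsigned long long size,
                                unsigned long long cluster)
{
    return (size + cluster - 1) / cluster * cluster;
}
```

With 4096-byte clusters, a 1-byte file takes 4096 bytes and a 4097-byte file takes 8192 bytes, so the waste is always less than one cluster per file.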
Section 11.5 Free space management • Bit map or bit vector: • Sequence of bits, each associated with a block. • 1 means free; 0 means allocated. • Simple approach: finding the first free block is easy. Find the first 1. • Could be a LOT of bits for anything but small disks – and most disks are large anyway. • Needs to be in memory for quick handling of output requests that need new space. • A 40GB disk (small) w/ 1K blocks would require 40 million bits or 5 MB of storage.
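Both the search for the first free block and the storage estimate can be sketched in a few lines. The bit ordering within each byte (bit 0 of byte 0 is block 0) is an assumption, and first_free and bitmap_bytes are hypothetical names; the 1-means-free convention is the one stated above.

```c
#include <stddef.h>

/* Return the number of the first free block (first bit set to 1) among
   `nblocks` blocks, or -1 if every block is allocated. */
long first_free(const unsigned char *map, size_t nblocks)
{
    for (size_t i = 0; i < nblocks; i++) {
        if (map[i / 8] & (1u << (i % 8)))
            return (long)i;
    }
    return -1;
}

/* Bytes of memory needed to hold one bit per disk block. */
unsigned long long bitmap_bytes(unsigned long long disk_bytes,
                                unsigned long long block_bytes)
{
    unsigned long long blocks = disk_bytes / block_bytes;
    return (blocks + 7) / 8;     /* round up to a whole byte */
}
```

Using decimal units, bitmap_bytes(40000000000ULL, 1000) gives 5,000,000 bytes, which is the 5 MB figure quoted for the 40GB disk. A production implementation would scan a word at a time and skip words that are all zero instead of testing bit by bit.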
Linked list of free blocks. Use the first free block when more space is needed (Figure 11.10) • Grouping: The first free block has the addresses of n more blocks. The last of those contains n more addresses, and so on. Can find multiple free blocks more quickly.
Performance (figure: disk → controller cache → OS cache → user memory) • Caching is used to speed up performance. • The first read causes a physical transfer from disk to the OS cache. Subsequent "reads" do not cause a physical I/O if the data can be retrieved from cache.
Recovery & consistency checking • See help chkdsk. • Run chkdsk C: from the command prompt. • Linux fsck command. • See man fsck.
Both can be used to fix some errors and check for consistency. Inconsistencies can occur as a result of a crash or problem during a file write.
Example, fsck will • Create two tables each containing a counter for each physical disk block. • Read through each inode; access each block from the inode; increment counter for that block in the first table. • Tracks which blocks are part of some file. • read through free list (or bit-map vector); increment counter for each block that is found. • Tracks which blocks are not part of some file • Consistent means: each block has a 0 in one table and a 1 in the other.
Possible problems/responses • a block has a 0-count in both tables. Missing block. • Usually add it to the free list. • A block has a count > 1 in the second table, meaning it appears twice in a free list • can't happen with bit map vector. • Adjust list to remove redundant entry
A block has a count > 1 in the first table meaning it appears twice (or more) in a file or in two or more files; • or the block has a non-0 count in both tables. • Copy data into a free block and adjust file system accordingly. • Probably notify user or administrator.
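The two-table check can be written as a classification of each block's pair of counters. This is a sketch of the idea only, not how any particular fsck is implemented; the enum names and classify_block are hypothetical.

```c
/* Possible verdicts for one disk block. */
enum block_state {
    BLK_OK,          /* exactly one reference, in exactly one table */
    BLK_MISSING,     /* in no file and not on the free list         */
    BLK_DUP_FREE,    /* appears more than once on the free list     */
    BLK_DUP_FILE,    /* appears more than once in files             */
    BLK_BOTH         /* in a file and also marked free              */
};

/* in_file: counter from the first table (references found via inodes).
   in_free: counter from the second table (appearances on the free list). */
enum block_state classify_block(unsigned in_file, unsigned in_free)
{
    if (in_file > 0 && in_free > 0)
        return BLK_BOTH;
    if (in_file > 1)
        return BLK_DUP_FILE;
    if (in_free > 1)
        return BLK_DUP_FREE;
    if (in_file == 0 && in_free == 0)
        return BLK_MISSING;
    return BLK_OK;
}
```

A checker would call classify_block once per physical block after both tables are filled in, then apply the matching repair: add a missing block to the free list, drop a duplicate free-list entry, or copy the data to a fresh block when a block is claimed twice.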
File directory consistency • Table of counters, one for each file. • Start at root directory and do a recursive search of the hierarchy (file system). • Inspect each directory. For each i-node in a directory, increment a counter for that file. Recall a file may appear in > 1 directory due to hard links.
When done the checker knows how many directories have each file. • Inodes contain a link count • set at 1 when file is created and incremented whenever a hard link is created. • These counts must agree.
If the link count is higher, all directory entries for the file could be deleted and the inode would still not be removed. • Set the link count in the inode to the proper value. • If the link count is lower, an inode is removed when its count goes to 0 but there could still be a reference to it from another directory entry. • This is bad. • Again, adjust the link count.
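The repair step for link counts amounts to comparing two arrays. A sketch, assuming the checker has already walked the directory tree and filled in counted[i] with the number of directory entries that refer to inode i; repair_link_counts and both array names are hypothetical.

```c
#include <stddef.h>

/* stored[i]  : link count recorded in inode i
   counted[i] : directory entries actually found for inode i
   Overwrites every stored count that disagrees with the counted value and
   returns the number of inodes that were corrected. */
size_t repair_link_counts(unsigned *stored, const unsigned *counted, size_t n)
{
    size_t fixed = 0;

    for (size_t i = 0; i < n; i++) {
        if (stored[i] != counted[i]) {
            stored[i] = counted[i];   /* trust what the directory walk found */
            fixed++;
        }
    }
    return fixed;
}
```

The same correction handles both cases: a stored count that is too high and one that is too low are each replaced by the number of references actually found.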
Bizarre protection levels. What if a file has a protection of 007? • User and group have no access but world has rwx access? • Linux allows this.