Lecture 7 Data Storage and Access Methods
Access Methods and Storage Organisation This lecture will introduce some aspects associated with Data Access Methods (disk based) and the associated Storage Organisations. These factors are part of the Implementation Planning stage and can be regarded as ‘tuning’ in that they provide a sound basis for performance and therefore response times. There are also additional Integrity control functions which can be introduced into the database at this stage.
Terminology MEDIA : Magnetic Disks, Optical Disks, CD Roms, Other devices (Smart Cards) TERMS : Seek Time, Rotational Delay, Cylinder, Track, Sector, Block, Page, Device ACCESS TIMES ~ 400ms or less - floppy disk 15 to 20 ms or less large fast disk (is a CD-Rom faster ?) AIM OF Storage Structures and DBMS : To reduce the number of I/O’s STORAGE STRUCTURE : An arrangement of data on a storage medium Data Access Software 1. Disk Manager (page level) 2. File Manager (record level)
Storage Terms • Track : Concentric division of the surface of a disk • Sector : Fixed size component of a track • Cylinder : Set of tracks in the corresponding position of all surfaces of a disk pack • Page : Physical amount of data transferred between memory and disk. Can be 1024, 2048, 4096 bytes depending on the disk unit and the controller • Bucket : Used to identify a physical storage address. Normally holds a number of logical records • Packing Density : Ratio of stored records to the number of spaces
Storage Terms • Data is accessed a ‘page’ at a time • 1 disk Input/Output process (I/O) is required per page • Keyed storage structures allow direct access to a page • Main pages are those originally allocated to a storage • structure to hold rows • Overflow pages are added as the table grows and the • main page is full. Duplicate rows can cause overflows • Choosing the appropriate structure is important for: • performance • concurrency • disk space availability
Storage Terms Tables are stored in files, files are divided into pages. An Oracle Page = 2048 bytes Approx. 2008 bytes for user data and 40 bytes for Ingres overheads Pages are divided into records. Records cannot span pages Record width = width of row + 2 bytes. There are 2 bytes used for start of row. This is not reused when (if) the row is deleted
Storage Terms • Fill Factor : Percentage of page which should be occupied by rows of data • Min Pages : Minimum number of pages (of storage) a table is to be allocated (watch for defaults) • Max Pages : Maximum number of pages a table is allocated • Index : A table or other data structure that is used to determine the location of rows in a tables (or tables) that satisfy some condition
Internal (Physical) Storage Directories: Disk Table of Contents Disk Directory Page Set Directory Index Directory
Storage Structures Application Buffers DBMS Buffers Operating System Logical records Logical/Physical Physical Records LR1 read LR1read LR2 LR2 PR1 LR3 write LR3 PR2 LR4 LR4 write
Storage Structures • At this level a database consists of physical records (aka blocks or pages) • These are organised into files A file is a collection of physical records organised for efficient access • A physical record is a collection of bytes which are transferred between volatile storage (memory) in main memory and stable storage on disk
Storage Structures • A physical record contains multiple logical records • The size of a physical record is a power of two - such as 1024 (210) , 4096 (212) • A large logical record may be split over multiple physical records - and logical records from more than one table may be stored in the same physical record
Data Base Access DBMS Stored record returned Functions Request for stored record Retrieve Replace Add Remove Create Destroy File Manager Stored page Returned Request for stored page Disk Manager Disk I/O Data read from disk Data Base
A Possible Problem ? Main memory capacity is always less than Disk Storage capacity But in both media, each ‘piece of data’ is addressable How can a ‘piece of data’ on a hard disk range be directed into memory and ‘alter’ its address and be locatable ? And how can memory located data be directed to the high order areas of a mass storage disk device ? There is a technique known as swizzling which addresses this interchange
Storage Structures Objective ; To minimise the number of Disk Accesses ( I/O's) Access times are in the range 15ms to 300ms STORAGE STRUCTURES : Arrangement of data on the storage medium NO storage structure will optimise ALL Application requirements DBMS systems should provide for a number of structures Requires a sound understanding of the uses of the database (as determined in the LOGICAL design.) Some Terms PAGE MANAGEMENT - DISK MANAGER RECORD MANAGEMENT - FILE MANAGER
Access Calculation Number of Pages: 300,000 No. of Disk I/O’s per second 30 Assume 1 disk I/O per page Assume HEAP Storage structure select * from employee where employee_name = “Johnson”; (and remember the closure feature of SQL) Time taken = 300,000 / 30 = 10,000 seconds or approximately 3 hours
Access Methods LINEAR and NON-LINEAR 1. Indexing - Sequential Index Keys - Direct Primary Index (index on Primary Key) - Inverted Lists Secondary Index on non- primary key 2. Hashing - Direct Access Attribute(s) value Computed Disk Address A database can have any number of Indexes, BUT only ONE HASH STRUCTURE
File Organisations A file organisation is a technique for physically arranging the records of a file on a secondary storage device. File organisations Sequential Indexed Direct Sequential Nonsequential Relative- Addressed Hash- Addressed (block index) (full index) Hardware- dependent (ISAM) Hardware- independent (VSAM)
Sequential Access • Arrangement in physical sequence -dependent on some attribute (sequencing attribute) • Can be ordered • Serially (as the rows occur) New additions (inserts) are placed after the last row (known as a HEAP structure) • Increasing / Decreasing order by value of the sequencing attribute Inserts, Deletes and Updates handled by rewriting the entire table • Requires progression beyond a starting key value
Indexed Sequential Access There are two basic implementations of the indexed sequential organisation: - hardware-dependent uses block index on the key, disk address to the prime area which contains the data records and the track index for the cylinder - hardware-independent uses a control interval which may be considered a virtual track, free space for new records is provided by distributed free space.
Indexed Sequential File Organisation Creates Tables - the Index tables Methods 1. related to the exact disk address of every record (records are held sequentially) Known as a DENSE index 2. The beginning disk address of a group of records (also held sequentially) Known as a SPARSE index
Indexed Sequential File Organisation Some terms to note: Track, Cylinder and Volume These lead to 1. Track Index 2. Cylinder Index ISAM provides Sequential Access Direct Access
Direct Access File Organisation Objective : To provide rapid, direct, non-sequential access to records Index tables are not created The organisation does not readily permit of sequenced output Based on deriving a TARGET address (disk address) Hashing algorithm on record keys(s)
Direct Access File Organisation 2. Hashed or Calculated address Based on transforming a key of each record to create a ‘unique’ value, which then becomes a disk address Used for Open or Sparse populations There are some possible problems: (a) collisions - more than 1 key transform generating the same ‘address’ (b) collisions - caused by record lengths being larger than a sector Possible Solutions: Overflow areas with Pointer mechanism with serial loading and searching
Reorganisation Access performance and Disk occupancy statistics will indicate when a database reorganisation should be performed Very time consuming (and requires additional disk space) Data is redistributed - any existing indexes will need to be remade Any existing pointers, or pointer chains, will need to be reset
Sequential Access In sequential access, record storage starts at a designated point, usually the beginning, and proceeds in a linear sequence through the file. Each record can only be retrieved by accessing all the records that physically precede it. Random Access In random access, a given record is accessed "out of the blue" without referencing other records in the file. Access Methods
A File organisation is established when the file is created, and is rarely changed. However, record access mode can change each time the file is used. Record access mode Sequential Random File Organisation Sequential Yes No (impractical) Indexed Seq. Yes Yes Direct-Relative Yes Yes Direct-Hashed No Yes (impractical) Access Comparisons
Hashing Routines Records are assigned to buckets by means of a hashing routine, or transformation, which is an algorithm that converts each primary key value into a relative disk address. An example of one that consistently performs best under most conditions is: * division/remainder method 1. Determine the number of buckets to be allocated to the file. 2. Select a prime number that is approximately equal to this number. 3. Divide each primary key value (usually the ASCII sum) by the prime number. 4. Use the remainder as the relative bucket address.
Binary Trees A non-linear data structure, each element having several "next" elements ( branching ). A binary tree has a maximum of two branches per element or node. A node consist of some data and a maximum of two pointers, a left pointer to the left branch and right pointer to the right branch. If there is no left or right branch then a nil pointer is used.
A Binary Tree Diagram PRODUCT# LLINK RLINK Primary Key Less Than Pointer Greater Than Pointer Data 1000 < 1000 1000 > > 1600 0350 1600 (1) Initial tree (2) Insert 1000 (3) Insert 1600 (4) Insert 0350 1000 1000 < 1000 < > > 0350 1600 < > 0350 1600 > > 0350 1600 > > > 0975 2000 < 2000 0975 2000 0625 (5) Insert 2000 (6) Insert 0975 (7) Insert 0625
An Example of a Binary Tree 1000 < > 1600 0350 > > < < 0975 2000 0100 1250 < > < 0625 1425 1775 Task: Indicate the different traversals on this diagram.
Binary and B Trees The problem with Binary Trees is balance, the tree can easily deteriorate to a linked list. Consequently, the reduced search times are lost. This problem is overcome in B-trees. B for Balanced, where all the leaves are the same distance from the root. B-trees guarantee a predictable efficiency.
B+ Trees There are several varieties of B-trees, most applications use the B+-tree. A B+-tree of degree m has the following properties: 1. Every node has between [m/2] and m children (where m is an integer > 3 and usually odd), except the root which is not bound by a lower limit. 2. All leaves are at the same level, that is the same depth from the root. 3. A nonleaf node that has n children will contain n-1 keys.
B+ Tree Node Structure A high level node P K P K .. . . . . . P K P n-1 n-1 n 1 1 2 2 Pointer to subtree for keys< K Pointer to subtree for keys>= K & < K Pointer to subtree for keys>= K & < K Pointer to subtree for keys< K n-2 n-1 n-1 1 1 2 A leaf node P K P K . . . . . . . P K P n-1 n-1 n 1 1 2 2 Pointer to leaf with smallest key greater than K Pointer to record (block) with key K Pointer to record (block) with key K Pointer to record (block) with key K n-1 n-1 1 2
B+ Tree Leaves 1250 0625 1000 1425 2000 0625 1000 1250 1300 1425 1600 2000 0350 0350 0625 1000 2000 1250 1425 1300 1600 Actual Data Records
A review of Trees Can permit rapid retrieval of data for both random and sequential processing. Can be used on primary or secondary keys. Trees are special cases of networks; in networks, records from different files are joined without a strict hierarchy being observed. This is addressed in the hierarchical and network model lectures.
Some Index Terminology • The attributes which contribute to the index are called the Indexed fields • If these attributes are built on the Primary Key, • then the Index is called a Primary Index • If the index is built on any other attributes, it is called a secondary index and, the attributes may not be unique • Clustering Index: The index is built on NON-UNIQUE attributes and includes one index entry for each distinct value of the attribute. The index entry points to the first data block which contains corresponding values. The file must be ordered on the chosen non-unique attributes
Multi Level Indexes A single level index is an ordered file. It is possible therefore to create a non-dense index to an index. This is known as a Second Level Index The process can be repeated until the ‘highest’ level of the index processes can fit into main memory This will probably be 1 page and has the effect of reducing the number of I/O’s by 1 This concept is known as a non-linear or tree structure
Index Usage An index is used to optimise retrievals E1 E2 E3 E20 E30 E40 E45 E56 p1 p2 p3 key page no E1 1 E20 2 E40 3 select * from emp where eno = ‘E4’; Total number of accesses = Index Accesses + 1 Data Access = 1 + 1 = 2 Average no. of Serial Accesses for a Table with 3 pages is n/2 = 2
Good/Bad Candidates for Indexes (Ingres specific) Create indexes on attributes used in predicates:- Read only and frequently accessed tables > 3 tables Attributes of a predicate in frequently executed transactions High update tables > 6 pages Attributes used in joins Attributes where aggregates are frequently calculated On FK’s if using RI. Integrity violations or cascade speed Good Candidates
Good/Bad Candidates for Indexes Poor Candidates: Attributes with a small number of unique values High update attributes - keep to 2 or 3 if possible
Storage Structures 1 Requirements Heap Hash ISAM BTree Need Pattern Matching 4 4 1 1 Need Range Searches 4 4 1 1 Exact Match Key Retrieval 4 1 2 2 Sorted Data 4 4 2 1 Concurrent Updates 4 1 1 2 Add Data - No Modify 2 3 3 1 Sequential Addition of Data 1** 2 5 1
Storage Structures 2 Requirements Heap Hash ISAM BTree Initial Bulk Copy of Data 1 2 2 2 Table Growth - nil/static N/A 1 1 2 Table growth - low (15%) N/A 1 1 2 Plan to modify periodically Table growth - high 3 3 3 1 Too fast to modify
Storage Structures 3 Requirements Heap Hash ISAM BTree Table size : small 2 1 1 3 Table size : medium (modify 4 1 1 1 disk space available) Table size : large (>1/2 disk 2** 4 4 1 Deletes Frequent 4 1 1 3 Updates Frequent 4 1 1 2 Secondary Index Structure N/A 1 1 1 ** secondary indexes used with a heap structure
Storage Structures HEAP * Supported by Ingres and Oracle HASH * Supported by Ingres and Oracle ISAM * Supported by Ingres BTREE * Supported by Ingres and Oracle * indicates that Compression is available SORTED HEAP Other techniques: Bitmaps Partition Indexing / Reverse Key Indexing Data Clustering - Indexed Data Clusters - Hash Clusters What does Microsoft Access support ?
1 Row 1 Row 2 Row 3 Row 4 1 1 Other Methods Bit Mapping • This is another table and its contents are • a bit to indicate the presence of some value • a row i.d. to reference the row (rowid) Row I.D. GreenRed % of distinct values to total values should be low useful for DSS/data warehouse applications Not good for frequent update or insert applications
Other Methods Clustering - Oracle Is a technique which ‘clusters’, or groups together, related rows of one or more tables in the same data block. The objective is to store (on disk) rows of an application which are used together (e.g. Orders and Items ) - this saves disk I/O on analysis applications A cluster key is necessary for each cluster. Not very successful for high volume processing
Ranking the Storage Structures • heap hash isam btree best used for • 1 - - 2 bulkloading table with data • - 1 1 1 removing duplicate rows • - 1 2 2 exact match • - - 1 1 range/pattern matching • 1 3 2 2 sequential searches • - - 1 1 partial key • - - - 1 access to sorted data • - - - 1 joins on large tables • - - - 1 index grows as table grows • 1 - - - very small tables • - - - 1 very large tables • number indicates ranking of the various structures for the given task • dash indicates that the structure is not appropriate for the particular task
We are now nearing the Semester break. See you again on the 1st May.