V. Data Storage Management
Yunsheng Liu
Software College, HUST
2007.11
5.1 The Memory Hierarchy
5.1.1 The Storage Levels
[Figure: the CPU issues a request for data, and the storage hierarchy returns the data satisfying the request. Levels shown: cache, main memory, flash memory (EEPROM), magnetic disk, optical disk (CD-ROM), and magnetic tape, grouped into primary, secondary, and tertiary storage.]
5.1.2 Disk Property
[Figure: structure of a magnetic disk, with labels for the spindle, disk arm, disk head, platter, tracks, cylinder, sector, gaps, and block.]
2. Performance property of disks
1) Data must be in main memory for the DBMS to operate on it.
2) The unit of data transfer between main memory and disk is a block. Reading or writing a disk block is called an I/O.
3) Block access time is the time from when a read/write is issued to when the block appears in main memory:
access time = seek time + rotational delay + transfer time
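The access-time formula can be worked through numerically. The sketch below is only an illustration: the function name and the drive parameters (9 ms seek, 7200 rpm, 100 MB/s, 4 KB blocks) are assumed values, not figures from the slides, and the rotational delay is taken as the usual average of half a revolution.

```python
def block_access_time_ms(seek_ms: float, rpm: float,
                         transfer_mb_per_s: float, block_kb: float) -> float:
    """access time = seek time + rotational delay + transfer time (in ms)."""
    # Average rotational delay: half of one full revolution.
    rotational_ms = 0.5 * 60_000.0 / rpm
    # Time for one block to pass under the head at the given transfer rate.
    transfer_ms = block_kb / 1024.0 / transfer_mb_per_s * 1000.0
    return seek_ms + rotational_ms + transfer_ms


# Assumed example drive, not taken from the slides.
t = block_access_time_ms(seek_ms=9.0, rpm=7200, transfer_mb_per_s=100.0, block_kb=4.0)
print(f"{t:.2f} ms")
```

With these assumed numbers the seek and rotational terms dominate and the transfer term is negligible, which is why block placement and arm scheduling matter so much.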
3. Performance measures of disks
• Capacity
• Access latency (seek time + rotational latency)
• Data transfer rate
4. Optimization of disk-block access
• File organization
• Scheduling (disk-arm, i.e. I/O scheduling)
• Nonvolatile RAM for writes: battery-backed RAM
• Log disk: a disk devoted to writing the log, used in much the same way as nonvolatile RAM
• Log-based file system
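Disk-arm scheduling can be illustrated with an elevator-style sweep. The slides do not name a specific algorithm, so this is one common choice shown as a sketch: the arm serves requests in one direction and reverses at the last pending request. The function names and the request queue are hypothetical.

```python
def elevator_order(pending: list[int], head: int, moving_up: bool = True) -> list[int]:
    """Order cylinder requests as an elevator sweep starting at `head`."""
    lower = sorted(c for c in pending if c < head)
    upper = sorted(c for c in pending if c >= head)
    if moving_up:
        return upper + lower[::-1]   # sweep up, then come back down
    return lower[::-1] + upper       # sweep down, then come back up


def arm_travel(order: list[int], head: int) -> int:
    """Total number of cylinders the arm crosses when serving `order`."""
    total = 0
    for cylinder in order:
        total += abs(cylinder - head)
        head = cylinder
    return total


# Hypothetical request queue and starting head position.
requests = [98, 183, 37, 122, 14, 124, 65, 67]
start = 53
fifo_cost = arm_travel(requests, start)
scan_order = elevator_order(requests, start)
scan_cost = arm_travel(scan_order, start)
print(fifo_cost, scan_cost)
```

For this queue the sweep crosses fewer than half the cylinders that arrival-order service does, because it avoids swinging back and forth across the platter.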
5.1.3 RAID Concept
RAID: Redundant Arrays of Independent (historically Inexpensive) Disks
A disk array is an arrangement of several disks, organized so as to
• Increase performance: data striping
• Improve reliability: redundancy
RAID levels
• Level 0: Nonredundant striping
• Level 1: Mirrored disks
• Level 2: Error-correcting code (ECC)
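The two ideas behind RAID, striping and redundancy, can be sketched in a few lines. This is a toy model under stated assumptions: block-level round-robin striping for level 0 and a two-disk mirror for level 1, with in-memory dictionaries standing in for disks. All names are hypothetical.

```python
def stripe_location(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Level 0 striping: block i lives on disk (i mod n) at offset (i div n)."""
    return logical_block % num_disks, logical_block // num_disks


class MirroredPair:
    """Level 1 mirroring: every write goes to both disks; a read needs only one."""

    def __init__(self):
        self.disks = [{}, {}]   # two in-memory stand-ins for physical disks

    def write(self, block_no, data):
        for disk in self.disks:
            disk[block_no] = data

    def read(self, block_no, failed=None):
        # Read from the first disk that is not marked as failed.
        for i, disk in enumerate(self.disks):
            if i != failed:
                return disk[block_no]


layout = [stripe_location(i, 4) for i in range(6)]
mirror = MirroredPair()
mirror.write(7, b"payload")
print(layout, mirror.read(7, failed=0))
```

Striping lets consecutive blocks be fetched from different disks in parallel, while mirroring lets a read succeed even when one copy is unavailable.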
5.2 Stored Data Management
5.2.1 Introduction
1. Stored data kinds: user database, data dictionary/directory, log
2. Stored data structures
• Arrangement: sequential, random
• Connection: address-adjacent, chaining
3. Access modes: sequential, indexed, hashing
4. I/O buffer management
5. Interface to the OS
5.2.2 Storage Management Structures
1. Logical structure
• Logical file, page, record, field
2. Physical structure
• Stored structure: stored file, stored record, stored item
• Device structure: device, volume, cylinder, track, physical record/sector
3. Allocation structure
• Extent, block
4. Mapping From Logical Structure to Physical Structure
[Figure: mapping between the logical structure (logical file, page, logical record, field), the stored structure (stored file, stored record, stored item), the allocation structure (extent, block), and the physical device structure (volume, cylinder, track, sector).]
5.2.3 Overview of File Organizations
1. Stored Data Arrangement
2. Access Modes
[Figure: tree of file organizations. Sequential files: heap file, sorted file. Random files: indexed files (general index file; tree index file: B-tree, B+-tree) and hash files (static hash, dynamic hash).]
3. Classification of File Organizations
[Figure: file organizations classified by storage structure (heap, sorted, chain, indexed, hashed) and by access mode. Sequential processing: adjacent sequential, chained sequential, indexed sequential. Random processing: tree-structured, static hash, dynamic hash.]
5.3 Sequential File Structure
- How to organize the blocks/pages of a file to support creating and destroying the file; getting, inserting, and deleting a record; and scanning all records in the file
• Contiguous (adjacent) arrangement of blocks
• Problems: how to insert and delete? How many free slots/pages?
[Figure: a file head (Fid, P) followed by data blocks 1 to N and then free blocks, laid out in frames 1 to N+m.]
• Example: the Student file

(a) Natural Sequence Structure
S#         SName   SAge  Dept
200103001  李红光  23    SW
200405840  何清溪  19    MS
200203123  刘要武  20    CS
200101015  李光    22    EE
200203101  刘民    20    CS
200305103  张一清  21    MS
200403123  张扬名  18    SW
200201123  王克勤  21    EE

(b) Ordered Sequence Structure
S#         SName   SAge  Dept
200101015  李光    22    EE
200103001  李红光  23    SW
200201123  王克勤  21    EE
200203101  刘民    20    CS
200203123  刘要武  20    CS
200305103  张一清  21    MS
200403123  张扬名  18    SW
200405840  何清溪  19    MS
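The practical difference between the two structures shows up on lookup. The sketch below reuses the Student rows from the example: the natural sequence can only be scanned, while the sequence ordered on S# can be binary searched. The function names are hypothetical, and in-memory lists stand in for file blocks.

```python
from bisect import bisect_left

# Student rows in natural (arrival) order: (S#, SName, SAge, Dept).
natural = [
    (200103001, "李红光", 23, "SW"),
    (200405840, "何清溪", 19, "MS"),
    (200203123, "刘要武", 20, "CS"),
    (200101015, "李光", 22, "EE"),
    (200203101, "刘民", 20, "CS"),
    (200305103, "张一清", 21, "MS"),
    (200403123, "张扬名", 18, "SW"),
    (200201123, "王克勤", 21, "EE"),
]
ordered = sorted(natural)   # ordered sequence structure, sorted on S#


def scan_lookup(records, s_no):
    """Natural sequence: examine records one by one; also count the probes."""
    for probes, rec in enumerate(records, start=1):
        if rec[0] == s_no:
            return rec, probes
    return None, len(records)


def sorted_lookup(records, s_no):
    """Ordered sequence: binary search on S#."""
    keys = [rec[0] for rec in records]
    i = bisect_left(keys, s_no)
    if i < len(keys) and keys[i] == s_no:
        return records[i]
    return None


rec, probes = scan_lookup(natural, 200201123)
print(probes, sorted_lookup(ordered, 200201123))
```

The price of the ordered structure is paid on insertion: a new record must go into its sorted position rather than simply at the end.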
5.4 Chained List File Structure
[Figure: (a) Hybrid chain: a file header (Fid, P) heads a single chain of data pages. (b) Separated chain: a file header (Fid, P1, P2) heads two separate chains of data blocks.]
• The chains need space for their pointers
• With variable-length records, the list of full blocks will be virtually empty
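A separated chain can be sketched as a file header with two chain heads. This sketch assumes that one chain links full blocks and the other links blocks that still have free slots; which of P1 and P2 plays which role is not stated in the slides, and all class and attribute names here are hypothetical. Records are fixed-size, so a block is full when it reaches its slot capacity.

```python
class Block:
    def __init__(self, capacity):
        self.capacity = capacity
        self.records = []
        self.next = None            # chain pointer stored inside the block


class SeparatedChainFile:
    """File header with two chain heads (assumed roles for P1 and P2)."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self.full_head = None       # chain of full blocks
        self.free_head = None       # chain of blocks with free slots

    def insert(self, record):
        if self.free_head is None:
            self.free_head = Block(self.capacity)
        block = self.free_head
        block.records.append(record)
        if len(block.records) == block.capacity:
            # Unlink the block from the free chain, push it onto the full chain.
            self.free_head = block.next
            block.next = self.full_head
            self.full_head = block

    def scan(self):
        """Visit every record by walking both chains."""
        for head in (self.full_head, self.free_head):
            block = head
            while block is not None:
                yield from block.records
                block = block.next


f = SeparatedChainFile(capacity=2)
for r in ["a", "b", "c"]:
    f.insert(r)
print(list(f.scan()))
```

An insert only ever looks at the head of the free chain, so it never has to walk past full blocks to find space.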
5.5 Index Structures
5.5.1 Overview of Indexes
1. Concepts
• An index is an auxiliary data structure intended to help find the Rids of records with a given search key value
• An index is a file/collection of records, referred to as index entries, which are usually pairs (k, Rid), where Rid is a pointer to a record with search key value k
• An index is a KTA (Key to Address) mechanism
2. Generic index structure
[Figure: indexing on a search key SK. A value k from the domain of SK is looked up among the index entries (ki, ridi), (kj, ridj), (kr, ridr); the matching Rids point to the records in the data file that have the value k of SK.]
3. Index file organizations
• How should index entries be organized to support rapid retrieval of the entries with a given search key value? E.g.:
• Sequential indexes
• Various tree-structured indexes
• Hash-based indexes (scatter tables)
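The (k, Rid) index entries from 5.5.1 can be sketched as a sequential index, the simplest of the organizations listed. In this illustration a Rid is assumed to be a (page, slot) pair, the data file is a small in-memory dictionary, and `build_index` and `lookup` are hypothetical names.

```python
from bisect import bisect_left, bisect_right

# Toy data file: Rid (page, slot) -> record (S#, Dept). Values are illustrative.
data_file = {
    (0, 0): (200103001, "SW"),
    (0, 1): (200405840, "MS"),
    (1, 0): (200203123, "CS"),
    (1, 1): (200101015, "EE"),
    (2, 0): (200203101, "CS"),
}


def build_index(data_file, key_of):
    """Return index entries (k, rid) sorted by k: a sequential index."""
    return sorted((key_of(rec), rid) for rid, rec in data_file.items())


def lookup(index, k):
    """Key to address: return the Rids of all records with search key value k."""
    keys = [entry[0] for entry in index]
    lo, hi = bisect_left(keys, k), bisect_right(keys, k)
    return [rid for _, rid in index[lo:hi]]


dept_index = build_index(data_file, key_of=lambda rec: rec[1])
print(lookup(dept_index, "CS"))
```

Because the search key here (Dept) is not unique, one key value maps to several Rids; the data file itself is left in its original order.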
5.5.2 Properties of Indexes
Dense vs. Sparse
• Dense: one index entry per individual data record
• Sparse: one index entry per set (usually a block/page) of data records
Clustered vs. Unclustered
• Clustered: the ordering of data records is the same as (or close to) the ordering of index entries, i.e. the two orderings match
• Unclustered: the two orderings do not match
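The dense/sparse distinction can be made concrete on a small sorted file. This sketch assumes the data file is ordered on the search key, which a sparse index requires; the page contents and the name `sparse_lookup` are made up for illustration.

```python
from bisect import bisect_right

# A data file sorted on the search key, split into pages of three records.
pages = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

# Dense index: one entry (key, (page, slot)) per data record.
dense = [(k, (p, s)) for p, page in enumerate(pages) for s, k in enumerate(page)]

# Sparse index: one entry (first key on page, page number) per page.
sparse = [(page[0], p) for p, page in enumerate(pages)]


def sparse_lookup(sparse, pages, k):
    """Find the last page whose first key is <= k, then search inside that page."""
    first_keys = [first for first, _ in sparse]
    i = bisect_right(first_keys, k) - 1
    if i < 0:
        return None                 # k is smaller than every key in the file
    page_no = sparse[i][1]
    for slot, key in enumerate(pages[page_no]):
        if key == k:
            return (page_no, slot)
    return None


print(len(dense), len(sparse), sparse_lookup(sparse, pages, 50))
```

The sparse index is a third of the size here, at the cost of a short search inside the page it points to.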
Primary vs. Secondary
• Primary index: built on the primary key
• Secondary index: built on a candidate/secondary key
Simple vs. Composite Key
• Simple key: a single field
• Composite key: more than one field
5.6 B-Tree Structured Indices
Nonleaf node structure:
P0 | K1 r1 | P1 | K2 r2 | P2 | ... | Pn-1 | Kn rn | Pn
[Figure: a root node and inner nodes with this layout; each ri points to the data record R(Ki).]
Leaf node structure:
K1 r1 | K2 r2 | ... | Km rm
[Figure: a B-tree structure. Keys shown in the root and inner nodes: 50, 10, 20, 60, 70, with pointers to the data records R(10), R(20), R(50), R(60), R(70). Keys shown in the leaves: 3, 5, 11, 13, 14, 22, 23, 52, 54, 55, 61, 63, 65, 69, 74, 78.]
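Searching a B-tree with these node layouts can be sketched as follows. The example tree uses the key values that appear in the figure, but its exact shape (root 50, inner nodes 10/20 and 60/70) is an assumption reconstructed from those numbers. The class and function names are hypothetical, and strings such as "R(50)" stand in for record pointers.

```python
class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                          # sorted keys K1..Kn
        self.rids = [f"R({k})" for k in keys]     # stand-ins for pointers r1..rn
        self.children = children or []            # P0..Pn; empty for a leaf


def btree_search(node, k):
    """Return the record pointer for key k, or None if k is not in the tree."""
    while node is not None:
        i = 0
        while i < len(node.keys) and k > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == k:
            return node.rids[i]                   # found; may be in a nonleaf node
        node = node.children[i] if node.children else None
    return None


# Assumed shape for the example tree.
root = BTreeNode([50], [
    BTreeNode([10, 20], [BTreeNode([3, 5]), BTreeNode([11, 13, 14]),
                         BTreeNode([22, 23])]),
    BTreeNode([60, 70], [BTreeNode([52, 54, 55]), BTreeNode([61, 63, 65, 69]),
                         BTreeNode([74, 78])]),
])
print(btree_search(root, 50), btree_search(root, 63), btree_search(root, 12))
```

A search can stop in a nonleaf node, since every key carries its own record pointer and appears in the tree exactly once.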
5.7 B+-Tree Structured Indices
Nonleaf node structure:
P0 | K1 | P1 | K2 | P2 | ... | Pn-1 | Kn | Pn
Leaf node structure:
… | K1 r1 | K2 r2 | … | Km rm | …
[Figure: a B+-tree structure. The index set supports random access; the sequence set supports sequential access and leads to the record set in the data file.]
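The split between index set and sequence set can be sketched with a range scan: one descent through the index set (random access), then a walk along the linked leaves (sequential access). This sketch assumes that keys greater than or equal to Ki are routed to child Pi; the class names, key values, and "R(k)" stand-ins for record pointers are hypothetical.

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys
        self.rids = [f"R({k})" for k in keys]   # stand-ins for record pointers
        self.next = None                        # link to the next leaf


class Inner:
    def __init__(self, keys, children):
        self.keys = keys                        # K1..Kn, routing only
        self.children = children                # P0..Pn


def find_leaf(node, k):
    """Random access: descend the index set to the leaf that may hold k."""
    while isinstance(node, Inner):
        i = 0
        while i < len(node.keys) and k >= node.keys[i]:
            i += 1
        node = node.children[i]
    return node


def range_scan(root, lo, hi):
    """Sequential access: walk the sequence set from lo up to hi inclusive."""
    leaf, out = find_leaf(root, lo), []
    while leaf is not None:
        for k, rid in zip(leaf.keys, leaf.rids):
            if k > hi:
                return out
            if k >= lo:
                out.append(rid)
        leaf = leaf.next
    return out


leaves = [Leaf([3, 5]), Leaf([10, 14]), Leaf([20, 23]), Leaf([50, 55])]
for left, right in zip(leaves, leaves[1:]):
    left.next = right
root = Inner([10, 20, 50], leaves)
print(range_scan(root, 5, 20))
```

Unlike the B-tree, nonleaf nodes here hold no record pointers, so every lookup ends in the sequence set.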
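5.8 Hashing File Structures
General Hashing Structure
[Figure: a hash function maps a key ki to h(ki), which selects one of the blocks B0, B1, ..., Bn-1 in the major data area; each block holds record slots, and an overflow area accompanies the major data area.]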
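The general hashing structure can be sketched as a fixed set of blocks plus one shared overflow area. The hash function (key mod n), the block and slot counts, and the class name are assumptions chosen for illustration; the slides do not specify them.

```python
class HashFile:
    def __init__(self, n_blocks=4, slots_per_block=2):
        self.n = n_blocks
        self.slots = slots_per_block
        self.blocks = [[] for _ in range(n_blocks)]   # major data area B0..Bn-1
        self.overflow = []                            # shared overflow area

    def h(self, key):
        return key % self.n                           # assumed hash function

    def insert(self, key, record):
        block = self.blocks[self.h(key)]
        if len(block) < self.slots:
            block.append((key, record))               # a free record slot exists
        else:
            self.overflow.append((key, record))       # home block is full

    def get(self, key):
        # Check the home block first, then fall back to the overflow area.
        for k, rec in self.blocks[self.h(key)]:
            if k == key:
                return rec
        for k, rec in self.overflow:
            if k == key:
                return rec
        return None


hf = HashFile(n_blocks=4, slots_per_block=2)
for key in (1, 5, 9, 2):
    hf.insert(key, f"r{key}")
print(hf.get(9), hf.overflow)
```

Keys 1, 5, and 9 all hash to block 1, so the third one lands in the overflow area; lookups for it cost an extra search, which is why a well-spread hash function matters.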
Bucket Hashing
[Figure: a hashing function performs the KTA transformation from a key to one of the buckets Bu1, Bu2, ..., Bun; each bucket consists of primary blocks.]