5. File Structure and Indexing

5. File Structure and Indexing • File Structure • Background • Overall System Structure • Disk Manager • File Manager • Buffer Manager • Indexing • Introduction • B-tree • Hasing

Background • The database technology is very useful to deal with very large • volume of data in many applications such as on-line transaction • processing, file management systems, mailing systems, etc. • A DBMS has a data access system as its lower subsystem. • A data access system manages various storage structures such as • sequential file and indices to be used for efficient key-associative • accesses to large volume of data. And it provides method to access • the storage structures for upper subsystem, i.e., a query processor. • Moreover, data access system has concurrency control and • recovery facility to support transaction concepts (atomic action).

Background (cont'd) • Data Access Systems • (1) RSS (Relational Storage System) - 1976 Astrahan • - Data access system of System R (IBM's first RDBMS) • - B+ tree, concurrency control, recovery facility • - generally accepted as one of pioneers • (2) WiSS (Wisconsin Storage System) - 1985 Chou • - Interface to UNIX • - B+ tree, extensible hash as indices • - Support a storage structure to store long data items. • - MWiSS (Gamma Relational Database Machine, 1989) • (3) RSAM (Relational Storage Access Method) • - 1987 Informix Inc. • - Lower subsystem of Informix-Turbo • - Manage disk instead of UNIX file system.

Overall System Structure user query parser strategy selector log user transaction recovery manager concurency controller buffer manager buffer (main memory) lock table file manager DISK MANAGER (OS) data files system catalog indices

Overall System Structure (cont'd) • Query Parser • translates statements in a query language into lower-level • language • Strategy Selector • transform a user's request into an equivalent and efficient • access plan, • thus finding a good strategy for executing the query. • Buffer Manager • replace pages using LRU algorithm. • File Manager • manage database file such as data files, system catalog. • * Indices are usually managed by Index Manager.

Overall System Structure (cont'd) • Recovery Manager • ensures that the database remains in consistent (correct) • state despite system failures. • Concurrency Controller • ensures that the database interactions with the database • proceed without conflicting with one another. • Data Files • database itself. • Indices • provide for fast access to data items holding particular values. • System Catalog • store information about the structure of database, and • authorization information, constraints and statistical data.

Database Access • Objectives of DBMS Performance • - Minimize disk access time • (micro floppy disk : 400 ms, Main Frame Disk : 30 ms) • --> needed good storage structure and access method • Access Procedure user request (set of records) FM decides pages which have given records request records records File Manager request pages pages Disk Manager DM decides the place in disk device and move it in memory I/O operation data move Disk

Disk Manager • Functions • - Mapping page number into physical disk address. • Operations • - Rerieval page P from a page set • - Insert page P from a page set • - Delete page P from a page set

File Manager • Functions • - Manage stored files • Operations • - Rerieval record R from a stored file (set of records) • - Insert record R into a stored file • - Delete record R from a stored file • - Create a stored file F • - Delete a stored file F

Data Clustering • Basic Idea • - Locate related data in the same or adjacent disk place • - very important in system performance • Example • We have read record R1. Next, We need record R2. Suppose R1 • is stored in page P1 and R2 is stored page P2. • (1) If R1 and R2 are the same page, • we have only one access. • (2) If R1 and R2 are not same page but adjacent, • we cn reduce the I/O seek time. R1 R2

Page Management • Example of Page Management COURSE STUDENT Sno Year Year Dept 100 Na 4 CE 200 Lee 3 EE 300 Jung 1 CE Cno Cname Credit Lecturer C123 PL 3 Kim C312 DS 3 Hwang C324 File 3 Lee S1 S2 S3 C1 C2 C3 ENROL Sno Cno Grade 100 C413 A 100 E412 A 200 C123 B 300 C312 A 300 C324 C UNIV database E1 E2 E3 E4 E5 * Suppose one record is stored in one page

Page Management (cont'd) • Disk Structure disk directory page number 0 1 3 2 4 S1 S2 S3 C1 • page = block • Transfer unit Memory • to Disk 5 7 9 6 8 C2 C3 E1 E2 E3 11 10 12 13 14 E4 E5 free page

Page Management (cont'd) • Loading UNIV database • - Disk Manager allocate the STUDENT table, • COURSE table, ENROL table. • Insertion Record • - Insert S4 (Sno = 600) • => allocate S4 in page #12 (free page), connect pointer • Deletion Record • - Delete S2 (Sno = 200) • => delete S2 (page #2) return to free page set. • * if another insertion is happened, the record is allocated in • front of free page set (page #2).

Page Management (cont'd) • Disk Structure after insertion and deletion operations disk directory 0 1 3 2 4 page number S1 S3 C1 5 7 9 6 8 C2 C3 E1 E2 E3 11 10 12 13 14 S4 E4 E5 free page

Page Management (cont'd) • Disk Directory : control information to manage pages 0 next page pointer / null page number Page set Address Free page set 2 STUDENT 1 COURSE 4 ENROL 7

Record Management • Example of Record Management : • Suppose several records are stored in one page • Loading STUDENT file in a page p next page pointer page number S1 S2 S3 Free space

Record Management (cont'd) • Insertion Record • - Insert S4 (Sno = 400) • => allocate S4 in free space of a page • Records are identified by RID (Record IDentifier) • - RID = Page Number + Offset p next page pointer page number S1 S2 S3 S4 Free space

Record Management (cont'd) • Deletion Record • - Delete S2 (Sno = 200) and Compact remaing records p next page pointer page number S1 S3 S4 Free space

Record Management (Implementation View)) • Structure of Data File Data Page FileDesc LastPageID FirstPageID .... record 1 record 2 ...... DATA Control Information recor Slot[2] Slot[1] RIDCnt Free FileID ThisPage PrevPage NextPage Slot[2] Slot[1] RIDCnt Free FileID ThisPage PrevPage NextPage

Record Management (Implementation View)) • Field Information typedef struct { char Data[ ] Data area short Slot[ ] Start address (offset) of records short RIDCnt Number of records short Free Start address of free space T_FID FileID File Identifier including this page T_FID ThisPage This page Identifier T_FID PrevPage Previous page address T_FID NextPage Next page address } T_DATAPAGE;

Data Page new record 1 record 2 DATA Slot[3] Slot[2] Slot[1] RIDCnt Free FileID ThisPage PrevPage NextPage Record Management (Implementation View)) • Record Management Functions (1) Add Record - insert new record in Free Space - update Free pointer and Slot - increase RIDCnt

Record Management (Implementation View)) (2) Delete Record - delete record 2 - Fill negative integer in record's length field - decrease RIDCnt Data Page -Length / Data record 3 record 1 DATA Slot[3] Slot[2] Slot[1] RIDCnt Free FileID ThisPage PrevPage NextPage

Data Page record 2 record 3 record 1 updated data Slot[3] Slot[2] Slot[1] RIDCnt Record Management (Implementation View)) (3) Update Record - Update record 1 - if length of new and old record is equal, overwrite record 1 else, delete old record and add new record in the end of last record. Free FileID ThisPage PrevPage NextPage

Indexing

Introduction • To find some information in a book • Sequential Scan Operation • - search the entire book • Index Scan Operation • - search the index and go to the page • To find some data in the database • Sequential Scan 화일자체가순서화, 색인존재안함 • Index Scan 데이타화일은무순, 별도색인화일존재 • + faster than a sequential scan • - more slow than a sequential scan in insert/delete operation • much storage occupation than sequential file

Classification • Three types of access method • Sequential File Method • Index File Method • (B+ tree index structure) • Hashing • Sequential File Method • - Logical order is equivalent to Physical order • - All the records may be moved to maintain the physical order • for Insertion / Deletion operation • - application : batch processing • - See Fig. 6-12 in Lee S. H.

Classification (cont'd) • Index File Method • - Concept : Index File + Data File • - We have several indices in one data file (multikey file) • (1) Inverted File • - There exist many index in a data file • - See Fig. 6-16 • (2) Multilist File • - records connected by pointer • - See Fig. 6-17

B+ tree • (3) B+tree • - most popular techniques for organizing an index structure • - duplication of the keys in index part and sequential order • - direct access + sequential access • - balanced tree • Component • Index part • - the internal nodes • - for rapid direct access to the keys in the leaves • Sequence set • - the leaves • - the linked list of the keys in sequential order

B+ tree (Definition) • Definition • Root have 0, 2, or n/2 to n subtree • All the node except root have n/2 ~ n subtree • All leaf nodes have the same depth • Internal node have (n - 1) key values • Leaf node connected in sequential 차수, 포인터수

B+ tree (operations) • Structures • Leaf node • Internal node P1 K1 P2 K2 P3 K3 .... Pn-1 Kn-1 Pn next leaf node file record with search-key Ki P1 K1 P2 K2 P3 K3 .... Pn-1 Kn-1 Pn bucket pointers (node) which have key value > Kn-1 bucket pointers (node) which have key value <= K1

B+ tree (operations) • Operations • Retrieval • log n/2 (K) n : # of pointer, • K : # of total key value in a file • (ex) K = 1,000,000 n = 20 ~ 200 • Number of Retrieval? 3 ~ 6 • Insertion (Fig. 6.21) • (1) search and store (insertion ‘115’) • (2) search and reorder (insertion ‘129’) • (3) split (insertion ‘90’) • * n/2 기존노드, 나머지  새노드 • 기존 노드의 끝 데이터는 부모 노드에 첨가 • Also see Korth book p. 256

B+ tree (operations) • Deletion • (1) search and delete key only leaf node(deletion ‘132’) • (2) merge • 삭제하니 노드가 empty 인 경우(부모에 포함): • 부모 노드에 삭제된 데이터가 있으면 삭제 • (deletion ‘130’) •  삭제하니 노드가 empty 인 경우 (부모에 미포함): • (deletion ‘110’) • 부모 노드에 삭제된 데이터가 없으면 • 부모 노드를 합병, 경우에 따라 리프 노드도 조정 • Also see Korth book p. 256

Hashing - Hash 함수값으로원하는값(데이타)에한번만에접근 - Hash 함수값의범위가대체로크고(무한대), 키값의집합은적다. - Hash 함수를잘설계하여가능한키값이고르게분포되게한다. open problem (overflow) 종류 Static Hashing - 버켓주소집합B가고정 - 초기에는공간낭비, 데이타가증가하면성능저하 - 주기적으로hash 구조재조정(소요시간, 사용금지등단점) Dynamic Hashing (Extendible Hashing 대표적; Fagin 1979, ACM TODS) - DB가축소, 확장됨에따라버켓을분리융합하여 DB의변화에대처한다. - 저장공간의효율성 - 버켓의분리, 융합에따른부단이다소있으나, static hashing에비하면경미하다.

Hashing (operation) (1) 초기화 depth: 접근에 필요한 bit depth: 같은 비트 수 0 0 0 Directory (주소 테이블) bucket (데이타) 1 (2) 입력 Round (0101) Red(1011) ? 1 1 0 1 Perry(1111) down(1010)

Hashing (operation) 1 (2) 입력 0 10 11 Round (0101) 2 2 down(1010) Red(1011) ? 00 01 10 11 2 Perry(1111)

연습문제 1. 사용자가 데이타베이스에 접근하는 과정을 단계별로 설명하라. 2. B+ 트리의 특성은? 3. 확장성 해싱의 동작 방법? 4. 페이지 관리 기법을 설명하라.

5. File Structure and Indexing