340 likes | 470 Vues
This session by Dr. S. Sudarshan and Dr. D.B. Phatak from IIT Bombay explores essential database technologies, focusing on storage devices, file and index structures, and the evolution from legacy systems like COBOL to modern relational databases and SQL. Key topics include ACID properties of transactions, system architectures, security, and the importance of RAID systems for data reliability. The session provides insights into programming paradigms, performance tuning, and access security, offering a comprehensive understanding of database management in contemporary applications. **Relevant
E N D
Database Technology Session II 10:15 - 11:00 Dr S Sudarshan & Dr D B Phatak IIT Bombay
OVERVIEW • Storage devices • Files and Index Structures • Legacy Systems and Cobol • Relational Databases and SQL • Transactions and ACID properties • System Architectures • Security and Audit Data Base Technologies
Storage Devices • Main memory • volatile, lost on power failure • expensive and relatively small • Hard disk • non-volatile, reasonably fast access • relatively cheap, and large • main storage system for databases • Mean time to Failure: ~5 years Data Base Technologies
RAID Systems • Goal: improve storage reliability • Data stored on multiple disks • if one disk fails, data still available on others • Essential for safety of data • Hardware RAID • expensive, very high availability • for 24x7 applications (24 hrs X 7 days/wk) • Software RAID • cheaper, use if some downtime is allowable Data Base Technologies
Storing Data on Disks • Byte: unit of information • one character • File: • sequence of bytes • File system: • stores multiple files • organizes files into directories/folders Data Base Technologies
File Structure • Information stored within large files • Sequential files • sorted on a key (e.g., account number) • Index needed for efficient access • e.g. find information of account 2345 • similar to library card catalogs Data Base Technologies
Traditional File Processing • COBOL: • Common Business Oriented language • Files contain sequence of records, • e.g. Record per account • Complex program for each task • e.g. withdrawal, deposit, average balance, … • File structures often very complex • motivated by efficiency, but become hard to understand Data Base Technologies
SEGMENT OF A SAMPLE COBOL PROGRAM Open Input SALARYTABLE-FILE. • Perform Varying I from 1 to 11. • Move zero to GROUPTOT [I]. • END-PERFORM. • PROCESS-NEXT. • Read EMPTAB-FILE • At End Go To End-job. • - - - - - • Add salary to Group tot [K]. • Go to Process-Next. • END-JOB. • - - - - Data Base Technologies
PROGRAMMING PARADIGMS • 4 GL: What to do • Set processing • 3 GL: How to do (Algorithm) • Record by Record Processing • 2 GL: Algorithm at the lowest level • Details of Individual Operations Data Base Technologies
Relational Databases • Motivation: • simplify storage structures • easy to use language for queries/updates • efficiency is job of system • automatic optimization • Legacy systems • Systems built using COBOL and older data models • Still in wide use, but declining usage Data Base Technologies
Relational Databases Provide: • Tabular Data model: simple, yet powerful • A Standard Query Language: SQL • Mature Products with Reliable, Fault-Tolerant Operations available • Good Performance • High number of transactions per second • Parallel operation for scalability (handle growth) • Distributed and Replicated Data Bases • Interoperation, High availability Data Base Technologies
Relational Model: Tabular Data Account Transactions Data Base Technologies
Querying the Database using SQL • select name, balancefrom accountwhere name = “D. B. Phatak” • insert into transactions values (3/5/99, cash, 5000, -, 14000) • select acct-num, avg(balance)from transactionswhere date between 1/4/99 and 30/4/99groupby acct-num Data Base Technologies
Transactions: ACID Properties • Transactions: • e.g. Debit/credit • Problems: • Failures (e.g., power, disk storage) • Concurrent transactions • Solution: • Support for ACID properties Data Base Technologies
ACID Properties • Atomicity: Transaction appears to either run completely or not at all -- no partial state • Consistency: Integrity checks (e.g., balance >= 0) • Isolation: Locks on data so that transactions do not step on each others toes • Durability: Data/updates are never lost Data Base Technologies
Database Application Classes • OLTP: Online Transaction Processing • supports many small transactions • Decision Support • Summaries/aggregates • OLAP: Online Analytical Processing Data Base Technologies
Performance Issues • Important for high volume systems • e.g., internet banking • Sizing / Performance tuning • deciding on CPUs, memory size, disk size, number of disks, etc • tuning transaction code • to reduce disk I/O • to reduce lock conflicts between concurrent transactions • Scalability via parallelism • Smoothly handling more traffic as the business grows Data Base Technologies
Access Security • Authentication • Identifying who a person is • Passwords • widely used, but quite insecure • Smart cards, biometrics, etc • More on this later • Access privileges • who is allowed to do what • Audit trail • Trace back what happened Data Base Technologies
Authorization Mechanisms • Privileges • e.g., read table, update table, insert row in table, delete row from table, privilege to grant privileges • Each user given specific set of privileges he/she needs • Roles • privileges given to roles • (e.g., teller, manager) • users authorized to play roles Data Base Technologies
Audit Trails • Database keeps track of all transactions in an audit table • what the transaction did, and who ran it • Mechanisms for tracing back what transactions affected a particular entity (such as an account) Data Base Technologies
MORE ON SECURITY • Encryption, Session Key • Data Encryption Standard (DES-3) • Shared Private key • RSA, DH Algorithms • Public/Private key Data Base Technologies
SQL • SQL is Intergalactic Dataspeak • Strong Data Definition Language (DDL) • Domain Definitions • Integrity Constraints • Security & Access Control Provided • Views, Permissions • Interactive Queries Data Base Technologies
SQL • Embedded SQL: • Use of SQL commands from within 3GL programs (programs in C/COBOL/…) • Open Data Base Connectivity (ODBC): • Standard for client server interconnectivity, using C language • JDBC: Like ODBC, for Java language • SQL Standards: 86, 89, 92, SQL-3 Draft Data Base Technologies
SQL • No Standards for User Interface • Screens (Forms & Menus) • Reports • Special Tools Available • Native to Product • Independent Vendors Data Base Technologies 24
DATABASE SERVERS • Major players • Oracle, IBM DB2, Microsoft SQL Server, Informix, Sybase, Ingress • Wide range of performance, features, and price Data Base Technologies
Database Architectures • Centralized • Dumb terminals connected to single server • Client Server • Smarter client machines connect to server • Main work still done at server • Parallel Servers • Work divided between multiple CPUs • Distributed • Multiple independent databases in cooperation Data Base Technologies
EVOLUTION OF CLIENT SERVER COMPUTING • Multiple Computers to Independently Handle Component Tasks of an Application • Need to Partition Tasks Judiciously Data Base Technologies
TWO TIER PARTITIONING Data Base Technologies
CLIENT - SERVER TERMINOLOGY • Service: Provided by the Server • Each Client Is a Consumer • Shared Resources : Managed by Server • Client : Initiator of a Request Data Base Technologies
SERVERS • File Servers : Novell • Database Servers : SQL • Transaction Servers : OLTP • TP lite (Stored Procedures) • TP heavy (TP monitors) Data Base Technologies
SERVER FUNCTIONS • Wait for Requests • Handle Concurrent Transactions • Take care of VIP requests • Assign priorities • Authentication, Authorisation • Audit trails Data Base Technologies
CLIENT FUNCTIONS • Typically Processes Running on Front-end Machine (PC) • Provide User Interface • Support Graphics, Multimedia Data Base Technologies
THREE TIER C/S • Tier One : Client • Tier Two : Application Server • Tier Three: Database Server Data Base Technologies
MORE C/S TERMINOLOGY • Thin or Fat Client • Fat Server • n-Tier Architecture • OOUI Data Base Technologies