CSE 480: Database Systems • Lecture 1: Introduction • Reference: • Read Chapters 1 & 2 of the textbook
Database Systems are Pervasive • Retail • Banking • Law enforcement
What is a Database? • Collection of related data central to a given enterprise (mini-world or universe of discourse) • Examples: • Banking – savings/checking accounts, mortgage, etc • Vehicle registration – car registration, year, make, etc • Student registration – name, PID, GPA, last semester enrolled, etc • Electronic Medical Records – name, SSN, date of birth, address, symptoms, diseases, medication, test results, etc
Example of a Database • Mini-world: UNIVERSITY environment • What are the mini-world concepts that need to be captured by the database? • Entities: • STUDENTs • COURSEs • SECTIONs • DEPARTMENTs • INSTRUCTORs
Example of a Database • Relationships between entities of the mini-world: • SECTIONs are for specific COURSEs • STUDENTs take SECTIONs • COURSEs have prerequisite COURSEs • INSTRUCTORs teach SECTIONs • COURSEs are offered by DEPARTMENTs • STUDENTs major in DEPARTMENTs
Example of a Database • Constraints on the entities and relationships: • Each course must have a unique course number • GPA must be a real number between 0 and 4.0 • Each section has only one instructor, but an instructor can teach more than one section • Database design (Lectures 2-4) • Specifying the entities, relationships, and constraints of a mini-world using the Entity-Relationship and Enhanced Entity-Relationship models • [Figure label: Database Architect or Designer]
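The three constraints above can be sketched as table declarations that the DBMS enforces on every update. This is a minimal illustration using Python's built-in sqlite3 module, not the design method covered in Lectures 2-4; the table and column names (CourseNumber, InstructorId, and so on) and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE COURSE (
    CourseNumber TEXT PRIMARY KEY              -- each course has a unique course number
);
CREATE TABLE STUDENT (
    PID  TEXT PRIMARY KEY,
    Name TEXT NOT NULL,
    GPA  REAL CHECK (GPA BETWEEN 0 AND 4.0)    -- GPA must lie between 0 and 4.0
);
CREATE TABLE INSTRUCTOR (
    InstructorId INTEGER PRIMARY KEY
);
CREATE TABLE SECTION (
    SectionId    INTEGER PRIMARY KEY,
    CourseNumber TEXT    NOT NULL REFERENCES COURSE(CourseNumber),
    -- a single NOT NULL column: every section has exactly one instructor,
    -- while the same instructor may appear in many sections
    InstructorId INTEGER NOT NULL REFERENCES INSTRUCTOR(InstructorId)
);
""")

def rejected(sql):
    """Return True if the DBMS refuses the statement because it violates a constraint."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# Hypothetical sample updates, in order.
results = {
    "first CSE480": rejected("INSERT INTO COURSE VALUES ('CSE480')"),
    "duplicate CSE480": rejected("INSERT INTO COURSE VALUES ('CSE480')"),
    "GPA of 4.5": rejected("INSERT INTO STUDENT VALUES ('A001', 'Mary', 4.5)"),
    "instructor 1": rejected("INSERT INTO INSTRUCTOR VALUES (1)"),
    "section 1, instructor 1": rejected("INSERT INTO SECTION VALUES (1, 'CSE480', 1)"),
    "section 2, instructor 1": rejected("INSERT INTO SECTION VALUES (2, 'CSE480', 1)"),
}
for update, was_rejected in results.items():
    print(update, "->", "rejected" if was_rejected else "accepted")
```

Only the duplicate course number and the out-of-range GPA are refused; the same instructor is accepted for two different sections.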
Database Management System (DBMS) • A collection of programs that enables users to create and maintain a database • Examples of DBMSs: • MS Access, MS SQL Server, IBM DB2, Oracle, Sybase, Postgres, MySQL, and many more • Why do we need a DBMS?
File Server Architecture (no DBMS) • Thick client • Source: Modern Database Management, 6th Edition, Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden
Client-Server DBMS Architecture • Thin client • DBMS running on database server; performs all data storage and access operations • Source: Modern Database Management, 6th Edition, Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden
Three-tier Architecture • Business rules stored on application server • Source: Modern Database Management, 6th Edition, Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden
Typical DBMS Functionalities • Define a database • Specify the structure of the data records • Construct a database • Store the data on some storage medium controlled by the DBMS • Manipulate the database • Query the database to retrieve specific data, update the database to reflect changes, and generate reports • Support concurrent processing and sharing by users and applications • while keeping all the data valid and consistent • Support protection/security measures to prevent unauthorized access
Characteristics of DBMS • Self-Describing • Provides insulation between programs and data • Allows multiple views • Allows multi-user transaction processing
Characteristics of DBMS • Self-describing nature of a database management system • A DBMS contains not only the data but also a complete description of its structure and constraints • Structure: Student ID is 10 characters long, GPA is a real number • Constraints: GPA must be between 0 and 4.0 (non-negative) • A DBMS catalog stores the description of the database • The description is called meta-data • This allows the DBMS software to work with any type of data (banking, university, company, etc)
Example of DBMS Catalog • Information in the DBMS catalog is needed for query processing and optimization (to be discussed further in Lectures 22-24)
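As a rough illustration of a self-describing database, the sketch below asks a DBMS for its own meta-data. It uses SQLite through Python's sqlite3 module, where the catalog is exposed as the `sqlite_master` table and the `PRAGMA table_info` command; other systems expose their catalogs differently. The STUDENT table and its column names are assumptions based on the structure and constraint examples above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE STUDENT (
    StudentID CHAR(10) PRIMARY KEY,                 -- structure: 10 characters long
    Name      TEXT,
    GPA       REAL CHECK (GPA BETWEEN 0 AND 4.0)    -- constraint on GPA
)""")

# The catalog is itself queryable: one row per table, with its full definition.
catalog = conn.execute(
    "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
).fetchall()

# Per-column meta-data: each row is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(STUDENT)")]

for table_name, definition in catalog:
    print(table_name)
    print(definition)
print(columns)
```

Because the description lives in the database rather than in the program, the same code works unchanged for a banking or company database.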
Characteristics of DBMS • Insulation between programs and data • Program-data independence • Allows changing data storage structures and operations without changing the DBMS access programs • Program-operation independence • In OO and OR database systems, users can define operations (methods) on data using an interface; implementation of the operation (method) can be separately specified
Characteristics of DBMS • Support multiple views of the data • A database typically has many users, each of whom requires a different perspective (view) of the database • A common principle used by many organizations is that data must be accessible on a need-to-know basis • Example: • Student database may contain information about a student’s name, SSN, courses taken and grades, salary, etc • Users of the database include the registrar’s office and the payroll department • The registrar doesn’t need to know the student’s salary • Payroll doesn’t need to know the student’s GPA
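The registrar/payroll example can be sketched with two SQL views over one base table. This is only an illustration in SQLite via Python's sqlite3 module: the view names, column names, and the sample row are hypothetical, and SQLite has no user accounts, so the step of granting each department access to its own view (which a multi-user DBMS would provide) is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (
    PID TEXT PRIMARY KEY, Name TEXT, SSN TEXT, GPA REAL, Salary REAL
);
INSERT INTO STUDENT VALUES ('A001', 'John', '000-00-0001', 3.2, 1500.0);  -- made-up row

-- The registrar sees grades but not salary.
CREATE VIEW REGISTRAR_VIEW AS SELECT PID, Name, GPA FROM STUDENT;
-- Payroll sees salary but not GPA.
CREATE VIEW PAYROLL_VIEW AS SELECT PID, Name, SSN, Salary FROM STUDENT;
""")

def visible_columns(view_name):
    """Return the column names a user of the given view can see."""
    cursor = conn.execute(f"SELECT * FROM {view_name}")
    return [description[0] for description in cursor.description]

registrar_columns = visible_columns("REGISTRAR_VIEW")
payroll_columns = visible_columns("PAYROLL_VIEW")
print(registrar_columns)
print(payroll_columns)
```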
Characteristics of DBMS • Multi-user transaction processing • Database stores information about the current state of an enterprise • Example: Bank database stores the balance for each customer account • When an event in the real world changes the state of the enterprise, a transaction is executed to cause the corresponding change in the database state • A transaction is an executing program or process that includes one or more database accesses, such as reading or updating database records • Each transaction is designed to maintain correctness of the relationship between the database state and the real-world enterprise it is modeling • Example: When a customer deposits $50 in a bank, a deposit transaction is executed to increase the account balance by $50 • Concurrency control of the DBMS ensures correctness of the database when multiple concurrent transactions are executed
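The deposit example might look like the following when written as a transaction. This is a single-user sketch with Python's sqlite3 module; the ACCOUNT table, the account number, the starting balance of 100, and the `deposit` function are all assumptions for illustration. It shows the all-or-nothing behaviour of a transaction, not the concurrency control a multi-user DBMS adds on top.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE ACCOUNT (
    AccountNo INTEGER PRIMARY KEY,
    Balance   REAL NOT NULL CHECK (Balance >= 0)
)""")
conn.execute("INSERT INTO ACCOUNT VALUES (1, 100.0)")  # hypothetical starting state
conn.commit()

def deposit(account_no, amount):
    """Run one deposit transaction: it either fully happens or leaves no trace."""
    if amount <= 0:
        raise ValueError("deposit amount must be positive")
    with conn:  # commits if the block succeeds, rolls back if it raises
        cursor = conn.execute(
            "UPDATE ACCOUNT SET Balance = Balance + ? WHERE AccountNo = ?",
            (amount, account_no),
        )
        if cursor.rowcount != 1:
            raise LookupError(f"no such account: {account_no}")

def balance_of(account_no):
    row = conn.execute(
        "SELECT Balance FROM ACCOUNT WHERE AccountNo = ?", (account_no,)
    ).fetchone()
    return row[0]

deposit(1, 50)        # the real-world event: a customer deposits $50
print(balance_of(1))  # the database state now reflects it
```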
Database System Concepts • Data Models • Database Schema vs Database Instances • DBMS Languages
Abstraction • Data is actually stored as bits, but it is difficult to work with data at this level • DBMS provides a level of abstraction by hiding the details of data organization and storage • A data model is used to hide storage details and present the users with a conceptual view of the database
Data Model • [Figure: User/Program → DBMS → Data model → Physical data storage. Data model level: Student (John, 21), (Mary, 19); Course (CSE480), (CSE331); Department (CSE, Engr), (ECE, Engr). Physical data storage level: 111000010010111011010111011011]
Examples of Data Models • Network Model • Hierarchical Model • Relational Model (most widely used) • Object-Oriented Data Models • Object-Relational Models • More recently, NoSQL • Google BigTable • Amazon Dynamo • Facebook Cassandra
Relational Data Model • Proposed by Edgar Codd • E. F. Codd: A Relational Model of Data for Large Shared Data Banks. Commun. ACM 13(6): 377-387 (1970) • Models the data as relations (tables) • Advantages: • Simple • Mathematically based • Has a set of powerful, high-level operators for expressing and analyzing relational expressions (queries) • Queries are transformed to equivalent expressions automatically (query processing and optimization) • Transformed expressions can be executed more efficiently
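To make "transformed to equivalent expressions" concrete, here is a toy sketch in plain Python of one textbook-style rewrite: applying a selection before a join instead of after it. The relations, the Dept attribute, and the `select` and `join` helpers are hypothetical and far simpler than a real query optimizer; the point is only that both expressions return the same answer while the second one joins fewer tuples.

```python
# Relations modeled as lists of dicts (one dict per tuple); sample data is made up.
STUDENT = [
    {"Name": "John", "Age": 21, "Dept": "CSE"},
    {"Name": "Mary", "Age": 19, "Dept": "ECE"},
]
DEPARTMENT = [
    {"Dept": "CSE", "College": "Engr"},
    {"Dept": "ECE", "College": "Engr"},
]

def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def join(left, right, attribute):
    """Join: combine tuples that agree on the shared attribute."""
    return [{**l, **r} for l in left for r in right if l[attribute] == r[attribute]]

def in_cse(t):
    return t["Dept"] == "CSE"

# Expression as written: join everything, then filter.
plan_a = select(join(STUDENT, DEPARTMENT, "Dept"), in_cse)
# Equivalent expression: filter first, so the join touches fewer tuples.
plan_b = join(select(STUDENT, in_cse), DEPARTMENT, "Dept")

print(plan_a == plan_b)
```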
Database Schemas versus Instances • In any data model, it is important to distinguish the description of the database from the database itself • Database Schema: • The description of a database • Includes descriptions of data elements, data types, and constraints • Schema Diagram: An illustrative display of a database schema • Database Instance (State/Snapshot): • The actual data stored in the database at a particular moment in time • Valid State: A state that satisfies the structure and constraints of the database
Database Schema vs. Database State • Distinction • The database schema changes very infrequently. • The database state changes every time the database is updated. • Schema is also called intension • State is also called extension
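The distinction can be observed directly: in the sketch below the stored schema text stays the same while every insert produces a new state. It uses SQLite via Python's sqlite3 module, and the COURSE table and the helper names `schema` and `state` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE COURSE (CourseNumber TEXT PRIMARY KEY)")

def schema():
    """The intension: the stored description of the COURSE table."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'COURSE'"
    ).fetchone()[0]

def state():
    """The extension: the rows in the table at this moment."""
    return [row[0] for row in conn.execute(
        "SELECT CourseNumber FROM COURSE ORDER BY CourseNumber")]

schema_before, state_before = schema(), state()

conn.execute("INSERT INTO COURSE VALUES ('CSE480')")
conn.execute("INSERT INTO COURSE VALUES ('CSE331')")

schema_after, state_after = schema(), state()
print(schema_before == schema_after)   # the schema did not change
print(state_before, state_after)       # the state did
```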
Three-Schema Architecture • [Figure labels: External schemas; Internal schema: physical storage for data about students, courses, employment, etc]
Internal Schema/Level • Describes the details of how data is physically stored • Specifies how data is stored in files, tracks, and cylinders • Specifies the indices that support fast access to the rows of a table • Specifies the machine that holds the data (data may be distributed)
Conceptual Schema/Level • Hides the details of physical data representation • In the relational model, the conceptual schema presents data as a set of tables (relations) • DBMS maps from conceptual to internal schema automatically • Physical data independence • Internal schema can be changed without changing the conceptual schema
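Physical data independence can be sketched by changing only the internal level, here by adding an index, and checking that the same query over the same table returns the same rows. The example uses SQLite via Python's sqlite3 module; the STUDENT table, the sample rows, and the index name `idx_student_name` are hypothetical. SQLite's `EXPLAIN QUERY PLAN` is used only to show that the access path differs before and after.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (PID TEXT PRIMARY KEY, Name TEXT, GPA REAL);
INSERT INTO STUDENT VALUES ('A001', 'John', 3.2);   -- made-up rows
INSERT INTO STUDENT VALUES ('A002', 'Mary', 3.8);
""")

# The application's query is written against the conceptual schema only.
QUERY = "SELECT PID, GPA FROM STUDENT WHERE Name = ?"

def run(name):
    return conn.execute(QUERY, (name,)).fetchall()

def access_plan(name):
    """Text of SQLite's plan for QUERY (the last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (name,)).fetchall()
    return " ".join(row[3] for row in rows)

rows_before, plan_before = run("Mary"), access_plan("Mary")

# Internal-level change: a new index. Tables, columns, and QUERY are untouched.
conn.execute("CREATE INDEX idx_student_name ON STUDENT (Name)")

rows_after, plan_after = run("Mary"), access_plan("Mary")

print(rows_before == rows_after)  # same answer
print(plan_before)
print(plan_after)                 # different access path
```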
External Schema/Level • External schema customizes the conceptual schema to the needs of various users • In the relational model, the external schema also presents data as a set of relations • [Figure label: External schemas]
External Schema • Application is written in terms of an external schema. • Different external schemas can be provided to different categories of users • DBMS maps external to conceptual schema automatically at run time • Logical data independence • Conceptual schema can be changed without changing external schema and application programs
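Logical data independence can be sketched in a similar way: the application below queries only a view, the base tables are reorganized, and the view is redefined so the application query keeps working unchanged. This is an illustration in SQLite via Python's sqlite3 module; the ROSTER view, the split into STUDENT_INFO and STUDENT_GRADES, and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE STUDENT (PID TEXT PRIMARY KEY, Name TEXT, GPA REAL);
INSERT INTO STUDENT VALUES ('A001', 'John', 3.2);   -- made-up row
CREATE VIEW ROSTER AS SELECT PID, Name, GPA FROM STUDENT;
""")

# The application is written in terms of the external schema (the view) only.
APP_QUERY = "SELECT Name, GPA FROM ROSTER ORDER BY PID"
result_before = conn.execute(APP_QUERY).fetchall()

# Conceptual-schema change: STUDENT is split into two tables.
# Only the view definition (the external-to-conceptual mapping) is updated.
conn.executescript("""
DROP VIEW ROSTER;
CREATE TABLE STUDENT_INFO   (PID TEXT PRIMARY KEY, Name TEXT);
CREATE TABLE STUDENT_GRADES (PID TEXT PRIMARY KEY, GPA REAL);
INSERT INTO STUDENT_INFO   SELECT PID, Name FROM STUDENT;
INSERT INTO STUDENT_GRADES SELECT PID, GPA  FROM STUDENT;
DROP TABLE STUDENT;
CREATE VIEW ROSTER AS
    SELECT i.PID, i.Name, g.GPA
    FROM STUDENT_INFO AS i JOIN STUDENT_GRADES AS g ON g.PID = i.PID;
""")

result_after = conn.execute(APP_QUERY).fetchall()
print(result_before == result_after)
```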
DBMS Languages • Data Definition Language (DDL): • Used to specify the conceptual schema of a database • In many DBMSs, DDL is also used to define internal and external schemas (views) • In some DBMSs, a separate storage definition language (SDL) and view definition language (VDL) are used to define internal and external schemas
CREATE TABLE DEPARTMENT (
    DNAME        VARCHAR(10) NOT NULL,
    DNUMBER      INTEGER     NOT NULL,
    MGRSSN       CHAR(11),   -- SSN stored with hyphens, e.g. '123-11-2344'
    MGRSTARTDATE DATE        -- e.g. '2005-06-22'
);
DBMS Languages • Data Manipulation Language (DML) • Used to specify database retrievals and updates • Both DML and DDL can be embedded in a general-purpose programming language, such as C, C++, Java, or PHP
INSERT INTO DEPARTMENT VALUES ('Payroll', 154, '123-11-2344', '2005-06-22');
SELECT MgrSSN FROM DEPARTMENT WHERE DName = 'Payroll';
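As a sketch of embedding DDL and DML in a general-purpose language, the DEPARTMENT statements can be issued from Python through the built-in sqlite3 module. SQLite stands in for the course's MySQL server here, so this is an assumption-laden illustration: SQLite treats the declared column types loosely and does not enforce their lengths, and the `manager_ssn` helper is hypothetical. Values are passed as parameters (`?`) rather than pasted into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for a real database server

# DDL, embedded as a string in the host language.
conn.execute("""
CREATE TABLE DEPARTMENT (
    DNAME        VARCHAR(10) NOT NULL,
    DNUMBER      INTEGER     NOT NULL,
    MGRSSN       CHAR(11),
    MGRSTARTDATE DATE
)""")

# DML update: the host program supplies the values as parameters.
with conn:  # commit on success, roll back on error
    conn.execute(
        "INSERT INTO DEPARTMENT VALUES (?, ?, ?, ?)",
        ("Payroll", 154, "123-11-2344", "2005-06-22"),
    )

def manager_ssn(department_name):
    """DML retrieval: return the manager's SSN, or None if no such department."""
    row = conn.execute(
        "SELECT MgrSSN FROM DEPARTMENT WHERE DName = ?", (department_name,)
    ).fetchone()
    return row[0] if row else None

print(manager_ssn("Payroll"))
```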
MySQL Account • Every registered student will have access to a MySQL account on mysql-user.cse.msu.edu • To log in, go to: • http://www.cse.msu.edu/facility/phpMyAdmin/index.php • Username is your CSE username • Password is your PID • Server Choice: mysql-user • Send an email to manager@cse.msu.edu if you have problems logging in