CS541 Database Systems
CS 541 Lecture Slides
Sunil Prabhakar
Instructor
• Sunil Prabhakar
  • LWSN 2142C
  • Office Hours: catch me or by appointment
  • sunil@cs.purdue.edu
  • http://www.cs.purdue.edu/homes/sunil/
• Teaching Assistant: Yasin Silva
  • ysilva@cs.purdue.edu
  • Office hours: TBA
  • Assignments and Projects
Course Information
• Web page:
  • http://www.cs.purdue.edu/homes/sunil/syllabi/CS541_Fall2004.html
  • Projects, Assignments, Solutions, Slides
• Email alias
  • Announcements: IMPORTANT
  • cs541@cs.purdue.edu
  • mailer add me to cs541
• WebCT
  • Grades
  • Check that you can log in
Course Description
• Introductory graduate course on databases
• Fundamental concepts & internals
• Some coverage of use of databases (Oracle projects)
• Will not teach use of databases!!!
• Focus on Relational Databases
Topics
• DBMS Concepts and Architecture
• Relational Database Model
• Relational Languages (Algebra, Calculus, SQL)
• Storage and Indexing
• Query Processing
• Query Optimization
• Transaction Processing
• Concurrency Control
• Recovery
• Advanced Topics: TBD (Mining, Indexing, Sensors, …)
Pre-Requisites
• Data Structures
  • Notions of trees, hashing, linked lists, etc.
• Operating Systems
  • I/O
• Java
  • Project 3 will be done in Java
  • RMI
  • Simple GUI
Text
• Database System Concepts (4th Edition)
  • Silberschatz, Korth, Sudarshan
  • ISBN: 0-07-228363-7
  • McGraw Hill
• Supplemental Text:
  • Concurrency Control and Recovery in Database Systems
  • Bernstein, Hadzilacos, Goodman
  • Out of print: available free on the Internet
  • Link from course web page
Grading Policy
• Tentative
  • Written Assignments (2): 20%
  • Programming Projects (3-4): 40%
  • Mid-term Exam: 20%
  • Final Exam: 20%
• Final not comprehensive
• Grading is curved
• No extra credit assignments
Academic Integrity
• CS Policy
  • IMPORTANT: visit, read and accept!!!
  • https://portals.cs.purdue.edu/student
  • Need CS login and password.
• Cheating will be taken very seriously.
• Make sure that you are familiar with what CS considers to be cheating!!
• You may discuss the problems, but the final solution must be your own.
Course Policy
• NO LATE SUBMISSIONS
• NO LATE SUBMISSIONS
• NO EXTENSIONS
• NO EXTENSIONS
*** Exceptions only for documented medical reasons or family emergencies.
Databases
• What is a database?
  • Software to manage data.
• Why do we need a database?
  • Ease of development
  • Efficiency
  • Concurrency
  • Reliability
  • Ease of administration
  • Data independence
• Importance of databases?
  • Increasing or decreasing? What is changing?
What is interesting?
• Essential to modern applications?
  • Data is a valuable commodity.
• Is there anything challenging?
  • Encompass PL, OS, Logic, Theory, …
  • Novel solutions with wider applicability: Transactions, Locking, …
• What remains to be done?
  • Modern applications: Multimedia, Sensors, Streams, Data Warehouses, Data Mining, Privacy and Security, Knowledge, Data on the Web, XML, …
Abstraction
• How to provide a generic, application-independent solution?
• Data Models
  • Abstract view of data
  • Database efficiently supports this model
  • Examples: Network, Relational, OO, O-R, …
  • Most successful model: RELATIONAL
• Users access the database as a black box that supports the model.
• Languages are used to interact with this box:
  • Relational Algebra, SQL
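To make the "black box" idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module (not the Oracle system used in the course projects). The `student` table and its rows are hypothetical, invented only for illustration. The user states what is wanted in SQL; the same request in relational algebra would be a projection on `name` applied to a selection on `dept = 'CS'`.

```python
import sqlite3

# Hypothetical schema and data, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ana", "CS"), (2, "Bo", "EE"), (3, "Chen", "CS")],
)

# Relational algebra equivalent: project[name](select[dept = 'CS'](student)).
# The query says WHAT is wanted; how rows are stored and scanned is hidden.
rows = conn.execute(
    "SELECT name FROM student WHERE dept = ? ORDER BY name", ("CS",)
).fetchall()
cs_names = [name for (name,) in rows]
print(cs_names)
conn.close()
```

Nothing in the query mentions files, pages, or access paths; that is the sense in which the database is accessed only through the model.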
Independence
• Databases allow applications and users to be shielded from the internal details:
• Physical data independence
  • How data is stored (bits, pages, formats, etc.)
  • Compare with flat file alternative
• Logical data independence
  • How data is structured logically.
  • Allows changes to the logical organization of data without having to rebuild applications
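Both kinds of independence can be sketched with Python's built-in `sqlite3` module. All table, view, and index names below are hypothetical. The application only ever runs `APP_QUERY` against the view `staff`. First an index is added (a physical change), then the base table is split in two and the view is redefined (a logical change). This is one simple way to illustrate the idea, not a description of how any particular system implements it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person_v1 (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO person_v1 VALUES (?, ?, ?)",
    [(1, "Ana", 90), (2, "Bo", 70), (3, "Chen", 85)],
)
# The application sees only this view (hypothetical name).
conn.execute("CREATE VIEW staff AS SELECT id, name, salary FROM person_v1")

APP_QUERY = "SELECT name FROM staff WHERE salary > 80 ORDER BY id"
baseline = conn.execute(APP_QUERY).fetchall()

# Physical change: add an index. Storage and access paths change, the query does not.
conn.execute("CREATE INDEX person_salary_idx ON person_v1 (salary)")
after_index = conn.execute(APP_QUERY).fetchall()

# Logical change: split the table in two and redefine the view over the new layout.
conn.execute("CREATE TABLE person_name (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE person_pay (id INTEGER PRIMARY KEY, salary INTEGER)")
conn.execute("INSERT INTO person_name SELECT id, name FROM person_v1")
conn.execute("INSERT INTO person_pay SELECT id, salary FROM person_v1")
conn.execute("DROP VIEW staff")
conn.execute("DROP TABLE person_v1")
conn.execute(
    "CREATE VIEW staff AS "
    "SELECT n.id AS id, n.name AS name, p.salary AS salary "
    "FROM person_name n JOIN person_pay p ON p.id = n.id"
)
after_split = conn.execute(APP_QUERY).fetchall()

print(baseline, after_index, after_split)
conn.close()
```

The application query string never changes. With a flat file, either change would typically force the parsing and lookup code in every application to be rewritten.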
Concurrency Control & Recovery
• Two highly desirable requirements:
  • Enable multiple users to access the data at the same time.
  • Automatic recovery from crashes.
• Challenge:
  • How to do this in an application-independent manner?
• Solution:
  • Transactions
  • “Contract” between the DB Black Box and users.
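One part of that contract, atomicity, can be sketched with Python's built-in `sqlite3` module. The `account` table and the `transfer` helper are hypothetical. The sketch shows only the all-or-nothing guarantee for a single connection; it does not demonstrate concurrent users or crash recovery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction (hypothetical helper)."""
    # `with conn:` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )


transfer(conn, "alice", "bob", 30)       # succeeds: both updates are committed
try:
    transfer(conn, "alice", "bob", 500)  # the debit violates the CHECK constraint
except sqlite3.IntegrityError:
    pass                                 # the credit to bob was rolled back as well

balances = dict(conn.execute("SELECT name, balance FROM account").fetchall())
print(balances)
conn.close()
```

The failed transfer leaves no partial effect: the credit that had already been applied to `bob` is undone without the application writing any cleanup code.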
Performance
• Critical for databases
  • Research focus for many years
• Must be transparent to the users
  • Query processing & Optimization
  • Indexing, storage organization (data independence)
• Challenge:
  • How to optimize without understanding the semantics of an application?
• Solution:
  • Relational data model: clean mathematical abstraction, allows for alternative equivalent evaluations
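"Alternative equivalent evaluations" can be illustrated with a toy relational algebra in plain Python. The relations and the `select` and `join` helpers are hypothetical, and a nested-loop join is used only for simplicity. Selecting before joining and joining before selecting return the same answer, so an optimizer is free to pick the cheaper order.

```python
def select(relation, predicate):
    """Relational selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]


def join(left, right, key):
    """Equi-join on `key` using a simple nested loop."""
    return [{**a, **b} for a in left for b in right if a[key] == b[key]]


# Hypothetical relations, for illustration only.
students = [
    {"sid": 1, "dept": "CS"},
    {"sid": 2, "dept": "EE"},
    {"sid": 3, "dept": "CS"},
]
enrolled = [
    {"sid": 1, "course": "CS541"},
    {"sid": 2, "course": "CS541"},
    {"sid": 3, "course": "CS503"},
    {"sid": 1, "course": "CS503"},
]


def is_cs(t):
    return t["dept"] == "CS"


# Plan A: join everything, then filter.
plan_a = select(join(students, enrolled, "sid"), is_cs)

# Plan B: filter first, then join the smaller input.
cs_only = select(students, is_cs)
plan_b = join(cs_only, enrolled, "sid")

print(plan_a == plan_b, len(students), len(cs_only))
```

Plan B feeds fewer tuples into the join, yet both plans produce identical results. The equivalence follows from the algebra alone, so no knowledge of the application's semantics is needed to apply it.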
This course
• Study the relational model, ER model, languages.
• Transactions
  • Concurrency Control
  • Recovery
• Storage and File Structures
• Indexing and Hashing
• Query Processing and Optimization
• Advanced Topics
  • New data types, applications, multi-dimensional data, data warehousing, data mining, design, …