
Principles of Database Management Systems CSE 544

Principles of Database Management Systems CSE 544. Introduction March 29th, 2000. Staff. Instructor: Alon Levy Sieg, Room 310, alon@cs.washington.edu Office hours: by appointment. TAs: Bart Niswonger and Stefan Saroiu Office hours: also by appointment. Mailing list: cse544@cs


Presentation Transcript


  1. Principles of Database Management Systems CSE 544 Introduction March 29th, 2000

  2. Staff • Instructor: Alon Levy • Sieg, Room 310, alon@cs.washington.edu • Office hours: by appointment. • TAs: Bart Niswonger and Stefan Saroiu • Office hours: also by appointment. • Mailing list: cse544@cs • Web page: (a lot of stuff already there) http://www.cs.washington.edu/education/courses/544/00sp/

  3. Course Times • In general, WF, 12-1:20pm (with a 3-minute breather in the middle). • Special dates: • Mondays, April 3, 10, 24. • No classes the week of May 15th.

  4. Goals of the Course • Purpose: • Foundations of database management systems. • Issues in building database systems. • Introduction to current research issues in databases. • Have fun: databases are not just bunches of tuples.

  5. Grading • Homeworks: 35% • Very little regurgitation. • Meant to be challenging (i.e., fun). • Project: 50% • More later. • Participation: 15% • No exams.

  6. Textbook • Database Management Systems, Ramakrishnan and Gehrke.

  7. Other Useful Texts • Pair of books by Ullman, Widom and Garcia-Molina • Foundations of Databases (Abiteboul, Hull & Vianu) • Parallel and Distributed DBMS (Ozsu and Valduriez) • Transaction Processing (Gray and Reuter) • Database Systems (Silberschatz, Korth and Sudarshan) • Data and Knowledge based Systems (volumes I, II) (Ullman) • Readings in Database Systems (Stonebraker and Hellerstein) • Proceedings of SIGMOD, VLDB, PODS conferences.

  8. Prerequisites

  9. Real Prerequisites • Operating systems • Data structures and algorithms • Distributed systems • Complexity theory • Mathematical Logic • Knowledge Representation • User interface design • Programming languages • Artificial Intelligence (Search) • Greek, Hebrew, French, Romanian

  10. Why Use a DBMS? All programs manipulate data, so why use a database? • Large amounts of data (gigabytes, terabytes) • Data is very structured • Persistent data • Valuable data • Performance requirements • Concurrent access to the data • Restricted access to data

  11. Functionality of a DBMS • Persistent storage management • Transaction management • Resiliency: recovery from crashes. • Separation between logical and physical views of the data. • High level query and data manipulation language. • Efficient query processing • Interface with programming languages

  12. Terminology • Relation name: Product • Attribute names: Name, Price, Category, Manufacturer • Tuples (arity = 4):

      Name          Price     Category      Manufacturer
      gizmo         $19.99    gadgets       GizmoWorks
      Power gizmo   $29.99    gadgets       GizmoWorks
      SingleTouch   $149.99   photography   Canon
      MultiTouch    $203.99   household     Hitachi

      Schema: Product(Name: string, Price: real, Category: enum, Manufacturer: string)
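The terminology above can be tried out with a minimal sketch using Python's built-in `sqlite3` module. This is an illustration, not part of the original slides: SQLite has no enum type, so `category` is stored as TEXT here, and prices are stored as plain numbers without the dollar sign.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Relation name: Product. Attribute names: name, price, category, manufacturer.
# Assumption: 'enum' on the slide is approximated by TEXT.
conn.execute(
    "CREATE TABLE Product ("
    " name TEXT, price REAL, category TEXT, manufacturer TEXT)"
)

# The four tuples shown on the slide.
rows = [
    ("gizmo", 19.99, "gadgets", "GizmoWorks"),
    ("Power gizmo", 29.99, "gadgets", "GizmoWorks"),
    ("SingleTouch", 149.99, "photography", "Canon"),
    ("MultiTouch", 203.99, "household", "Hitachi"),
]
conn.executemany("INSERT INTO Product VALUES (?, ?, ?, ?)", rows)

cursor = conn.execute("SELECT * FROM Product")
attribute_names = [col[0] for col in cursor.description]
tuples = cursor.fetchall()
arity = len(attribute_names)  # number of attributes per tuple
```

Here `arity` is the number of attributes in the schema, and `tuples` holds one Python tuple per row of the relation.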

  13. Querying a Database • SELECT Q.name, Q.phone FROM Purchase P, Person Q WHERE P.buyer = Q.name AND Q.city = 'seattle' AND Q.phone > '5430000' • SQL (Structured Query Language) • An acquired taste… • Datalog: kinder, gentler language
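A hedged, runnable version of the query above, using Python's `sqlite3`. The table layouts and sample rows are invented for illustration; the slides only show the names `Purchase`, `Person`, `buyer`, `name`, `city`, and `phone`. The select list uses `Q.name` and `Q.phone`, on the assumption that the person's name is what is wanted, since the FROM clause declares only the aliases `P` and `Q`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schemas and data, chosen only to exercise the query.
conn.executescript("""
    CREATE TABLE Person (name TEXT, city TEXT, phone TEXT);
    CREATE TABLE Purchase (buyer TEXT, product TEXT);
    INSERT INTO Person VALUES ('Joe', 'seattle', '5551234');
    INSERT INTO Person VALUES ('Ann', 'seattle', '5420000');
    INSERT INTO Person VALUES ('Bob', 'portland', '5559999');
    INSERT INTO Purchase VALUES ('Joe', 'gizmo');
    INSERT INTO Purchase VALUES ('Ann', 'SingleTouch');
    INSERT INTO Purchase VALUES ('Bob', 'gizmo');
""")

# Join Purchase with Person, keep Seattle buyers whose phone string
# sorts after '5430000' (a string comparison, as on the slide).
query = """
    SELECT Q.name, Q.phone
    FROM Purchase P, Person Q
    WHERE P.buyer = Q.name
      AND Q.city = 'seattle'
      AND Q.phone > '5430000'
"""
result = conn.execute(query).fetchall()
```

With this sample data only Joe qualifies: Ann's phone sorts below the threshold and Bob lives in another city.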

  14. [Figure: DBMS architecture] User/Application → (query, update) → Query optimizer → (query execution plan) → Execution engine → (record, index requests) → Index/record mgr. → (page commands) → Buffer manager → (read/write pages) → Storage manager → storage
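To make the bottom layers of this stack concrete, here is a toy buffer manager that sits between page requests and storage. It is only a sketch under stated assumptions: the class name `BufferManager`, the LRU replacement policy, and the dictionary standing in for the storage manager are all illustrative choices, not something the slides specify.

```python
from collections import OrderedDict


class BufferManager:
    """Toy buffer pool: caches pages in memory, evicts least recently used."""

    def __init__(self, disk, capacity):
        self.disk = disk              # stand-in for storage: page_id -> data
        self.capacity = capacity      # number of in-memory frames
        self.frames = OrderedDict()   # page_id -> [data, dirty_flag]
        self.disk_reads = 0

    def _load(self, page_id):
        if page_id in self.frames:
            # Hit: mark the page as most recently used.
            self.frames.move_to_end(page_id)
        else:
            if len(self.frames) >= self.capacity:
                # Evict the least recently used page, writing it back if dirty.
                victim, (data, dirty) = self.frames.popitem(last=False)
                if dirty:
                    self.disk[victim] = data
            self.frames[page_id] = [self.disk[page_id], False]
            self.disk_reads += 1
        return self.frames[page_id]

    def read(self, page_id):
        return self._load(page_id)[0]

    def write(self, page_id, data):
        frame = self._load(page_id)
        frame[0], frame[1] = data, True  # update in memory, mark dirty


disk = {1: "a", 2: "b", 3: "c"}
buf = BufferManager(disk, capacity=2)
buf.read(1)
buf.write(2, "B")   # page 2 is now dirty in the pool
buf.read(1)         # hit: page 1 becomes most recently used
buf.read(3)         # pool is full, so page 2 is evicted and written back
```

The dirty page reaches storage only when it is evicted, which is one reason crash recovery needs separate machinery.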

  15. Storage Management • Becomes a hard problem because of the interaction with the other levels of the DBMS: • What are we storing? • Efficient indexing, single and multi-dimensional • Exploit “semantic” knowledge • Issue: interaction with the operating system. Should we rely on the OS?

  16. Query Execution Plans • Find names and phones of people who bought telephony products. • [Figure: two alternative plan trees, each topped by a projection on Buyer, phone over a selection Category="telephony"] • Plan A: (hash join) on prod=pname of [(sort-merge join) on Buyer=name of Person and Purchase] with Product. • Plan B: (hash join) on Buyer=name of [(hash join) on prod=pname of Product and Purchase] with Person. • Imperative programs for evaluating queries. • Many choices to make.
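The two join algorithms named on this slide can be sketched in a few lines of Python. This is a simplified in-memory illustration (real systems work on pages and may spill to disk); the function names, the dictionary row format, and the sample rows are assumptions made for the example.

```python
def hash_join(left, right, left_key, right_key):
    """Build a hash table on `left`, then probe it with each row of `right`."""
    table = {}
    for l_row in left:
        table.setdefault(l_row[left_key], []).append(l_row)
    return [
        {**l_row, **r_row}
        for r_row in right
        for l_row in table.get(r_row[right_key], [])
    ]


def sort_merge_join(left, right, left_key, right_key):
    """Sort both inputs on the join key, then merge them in one pass."""
    left = sorted(left, key=lambda row: row[left_key])
    right = sorted(right, key=lambda row: row[right_key])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][left_key], right[j][right_key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of right rows sharing this key.
            j_end = j
            while j_end < len(right) and right[j_end][right_key] == lk:
                j_end += 1
            # Pair every left row with that key against the whole run.
            while i < len(left) and left[i][left_key] == lk:
                out.extend({**left[i], **r_row} for r_row in right[j:j_end])
                i += 1
            j = j_end
    return out


def canon(rows):
    """Order-independent form of a result, for comparing two plans."""
    return sorted(tuple(sorted(row.items())) for row in rows)


# Hypothetical sample data.
person = [
    {"name": "Joe", "phone": "5551234"},
    {"name": "Ann", "phone": "5420000"},
]
purchase = [
    {"buyer": "Joe", "prod": "phone"},
    {"buyer": "Ann", "prod": "gizmo"},
    {"buyer": "Joe", "prod": "gizmo"},
]

by_hash = hash_join(person, purchase, "name", "buyer")
by_merge = sort_merge_join(person, purchase, "name", "buyer")
```

Both functions compute the same join on Buyer=name; they differ only in how the matching rows are found, which is exactly the kind of choice an execution plan records.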

  17. Query Optimization • Goal: turn a declarative SQL query into an imperative query execution plan. • Declarative SQL query: SELECT Q.name, Q.phone FROM Purchase P, Person Q WHERE P.buyer = Q.name AND Q.city = 'seattle' AND Q.phone > '5430000' • Imperative query execution plan: projection on buyer, over a selection City='seattle' and phone>'5430000', over a (hash join) on Buyer=name of Person and Purchase. • Plan: tree of R.A. (relational algebra) ops, with a choice of algorithm for each op. • Ideally: want to find the best plan. • Practically: avoid the worst plans!
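Why plan choice matters can be seen in a small sketch that evaluates the same query two ways and counts the rows each join produces. The intermediate row count is used here as a crude stand-in for cost; that cost measure, the function names, and the sample data are assumptions for illustration, not the optimizer described in the course.

```python
# Hypothetical sample data.
persons = [
    {"name": "Joe", "city": "seattle", "phone": "5551234"},
    {"name": "Ann", "city": "seattle", "phone": "5420000"},
    {"name": "Bob", "city": "portland", "phone": "5559999"},
]
purchases = [
    {"buyer": name, "product": product}
    for name in ("Joe", "Ann", "Bob")
    for product in ("gizmo", "SingleTouch")
]


def qualifies(row):
    """The selection: City='seattle' and phone > '5430000'."""
    return row["city"] == "seattle" and row["phone"] > "5430000"


def join(people, purchase_rows):
    """Hash join on Buyer=name."""
    by_name = {}
    for person in people:
        by_name.setdefault(person["name"], []).append(person)
    return [
        {**person, **purchase}
        for purchase in purchase_rows
        for person in by_name.get(purchase["buyer"], [])
    ]


def plan_join_then_select():
    """Selection applied above the join."""
    joined = join(persons, purchases)
    answer = [(r["buyer"], r["phone"]) for r in joined if qualifies(r)]
    return answer, len(joined)


def plan_select_then_join():
    """Alternative plan: filter Person first, then join."""
    selected = [p for p in persons if qualifies(p)]
    joined = join(selected, purchases)
    answer = [(r["buyer"], r["phone"]) for r in joined]
    return answer, len(joined)


answer_a, join_size_a = plan_join_then_select()
answer_b, join_size_b = plan_select_then_join()
```

Both plans return the same answer, but the second joins far fewer rows. An optimizer searches among such equivalent plans using cost estimates rather than by running them.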

  18. TP and Recovery • For efficient use of resources, we want concurrent access to data. • Systems sometimes crash. • A “real” database guarantees ACID: • Atomicity: all or nothing of a transaction. • Consistency: always leave the DB consistent. • Isolation: every transaction runs as if it’s the only one in the system. • Durability: if committed, we really mean it. • Do we really want ACID?
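Atomicity ("all or nothing") can be demonstrated with a short sketch using Python's `sqlite3`. The `Account` table, the `transfer` helper, and the balances are hypothetical; the example relies on the documented behavior that using a connection as a context manager commits on success and rolls back when an exception escapes the block.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is the consistency rule: no negative balances.
conn.execute(
    "CREATE TABLE Account ("
    " owner TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO Account VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single transaction."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute(
                "UPDATE Account SET balance = balance + ? WHERE owner = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE Account SET balance = balance - ? WHERE owner = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint; the credit was undone too.
        return False


ok = transfer(conn, "alice", "bob", 30)
failed = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT owner, balance FROM Account"))
```

In the failed transfer the credit to bob executes first, yet it does not survive: the rollback removes both halves of the transaction.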

  19. Data Integration Uniform query capability across autonomous, heterogeneous data sources on LAN, WAN, or Internet

  20. XML: Semi-structured Data eXtensible Markup Language: • Emerging format for data exchange on the web and between applications.
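A small sketch of what "semi-structured" means in practice, using Python's standard `xml.etree.ElementTree`. The element names and values below are invented for illustration. Unlike tuples of a relation, the two `product` elements do not share the same set of fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: records with differing sets of child elements.
doc = """
<products>
  <product>
    <name>gizmo</name>
    <price>19.99</price>
  </product>
  <product>
    <name>SingleTouch</name>
    <manufacturer>Canon</manufacturer>
    <review rating="4">Sharp photos</review>
  </product>
</products>
"""

root = ET.fromstring(doc)

# Turn each <product> into a dict of child tag -> text content.
records = [
    {child.tag: child.text for child in product}
    for product in root.findall("product")
]

# The set of fields varies from record to record.
fields = [sorted(record) for record in records]
```

The structure travels with the data as tags instead of being fixed in advance by a schema, which is what makes the format convenient for exchange between applications.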

  21. Database Industry • Relational databases are a great success of theoretical ideas. • Oracle has a market cap of over $200B • Other players: IBM, MS, Sybase, Informix • Trends: • warehousing and decision support • data integration • XML, XML, XML.

  22. Course (Rough) Outline • The basics: (quickly) • The relational model • SQL • Views, integrity constraints • XML • Physical representation: • Index structures.

  23. Course Outline (cont) • Query execution: (Zack Ives) • Algorithms for joins, selections, projections. • Query Optimization • Data Integration • semi-structured data • Transaction processing and recovery (Phil Bernstein)

  24. Projects • Goal: identify and solve a problem in database systems. • (almost) anything goes. • Groups of 2-3 • Groups assembled end of week 2; • Proposals, end of week 3. • Touch base with me: every two weeks. • Example projects on web site. • Start Early.
