1 / 100

CPSC 534A – Background

CPSC 534A – Background. Rachel Pottinger January 13 and 18, 2005. Administrative notes. Please note you’re supposed to sign up for one paper presentation and one discussion… for different papers Please sign up for the mailing list WebCT has been populated – make sure you can access it

casey-guy
Télécharger la présentation

CPSC 534A – Background

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CPSC 534A – Background Rachel Pottinger January 13 and 18, 2005

  2. Administrative notes • Please note you’re supposed to sign up for one paper presentation and one discussion… for different papers • Please sign up for the mailing list • WebCT has been populated – make sure you can access it • HW 1 is on the web, due beginning of class a week from today

  3. Overview of the next two classes • Relational databases • Entity Relationship (ER) diagrams • Object Oriented Databases (OODBs) • XML • Other data types • Database internals (Briefly) • An extremely brief introduction to category theory Metadata management examples are interspersed

  4. Relational Database Basics • What’s in a relational database? • Relational Algebra • SQL • Datalog

  5. Relational Data Representation Attribute names or columns PName Price Category Manufacturer gizmo $19.99 gadgets GizmoWorks Power gizmo $29.99 gadgets GizmoWorks SingleTouch $149.99 photography Canon MultiTouch $203.99 household Hitachi Relation or table Tuples or rows

  6. Relational schema representation • Every attribute has an atomic type (e.g., Char, integer) • Relation Schema: Column headings:relation name + attribute names + attribute types Product(VarChar PName, real Price, VarChar Category, VarChar Manfacturer)often types are left off:Product(PName, Price, Category, Manfacturer) • Relation instance: The values in a table. • Database Schema: a set of relation schemas in the database. • Database instance: a relation instance for every relation in the schema.

  7. Querying – Relational Algebra • Select()- chose tuples from a relation • Project ()- chose attributes from relation • Join (⋈) - allows combining of 2 relations • Set-difference ( ) Tuples in relation 1, but not in relation 2. • Union( ) • Cartesian Product(×) Each tuple of R1 with each tuple in R2

  8. Find products where the manufacturer is GizmoWorks Product Selection: σManufacturer = GizmoWorksProduct

  9. Find the Name, Price, and Manufacturers of products whose price is greater than 100 Product Selection + Projection:πName, Price, Manufacturer (σPrice > 100Product)

  10. Find the product names and price of products that cost less than $200 and have manufacturers where there is a Company that has a CName that matches the manufacturer, and its country is Japan Product Company πPName, Price((σPrice < 200Product)⋈ Manufacturer = Cname (σCountry= ‘Japan’Company))

  11. When are two relations related? • You guess they are • I tell you so • Constraints say so • A key is a set of attributes whose values are unique; we underline a key Product(PName, Price, Category, Manfacturer) • Foreign keys are a method for schema designers to tell you so • A foreign key states that an attribute is a reference to the key of another relationex: Product.Manufacturer is foreign key of Company • Gives information and enforces constraint

  12. SQL • Data Manipulation Language (DML) • Query one or more tables • Insert/delete/modify tuples in tables • Data Definition Language (DDL) • Create/alter/delete tables and their attributes • Transact-SQL • Idea: package a sequence of SQL statements  server

  13. Querying – SQL Standard language for querying and manipulating data Structured Query Language • Many standards out there: • ANSI SQL • SQL92 (a.k.a. SQL2) • SQL99 (a.k.a. SQL3) • Vendors support various subsets of these • What we discuss is common to all of them

  14. SQL basics • Basic form: (many many more bells and whistles in addition) Select attributes From relations (possibly multiple, joined) Where conditions (selections)

  15. SQL – Selections SELECT * FROM Company WHERE country=“Canada” AND stockPrice > 50 Some things allowed in the WHERE clause: attribute names of the relation(s) used in the FROM. comparison operators: =, <>, <, >, <=, >= apply arithmetic operations: stockPrice*2 operations on strings (e.g., “||” for concatenation). Lexicographic order on strings. Pattern matching: s LIKE p Special stuff for comparing dates and times.

  16. SQL – Projections Select only a subset of the attributes SELECT name, stock price FROM Company WHERE country=“Canada” AND stockPrice > 50 Rename the attributes in the resulting table SELECT name AS company, stockPrice AS price FROM Company WHERE country=“Canada” AND stockPrice > 50

  17. SQL – Joins SELECT name, store FROM Person, Purchase WHEREname=buyer AND city=“Vancouver” AND product=“gizmo” Product ( name, price, category, maker) Purchase (buyer, seller, store, product) Company (name, stock price, country) Person( name, phone number, city)

  18. Selection: σManufacturer = GizmoWorks(Product) Product What’s the SQL?

  19. Selection + Projection:πName, Price, Manufacturer (σPrice > 100Product) Product What’s the SQL?

  20. πPName, Price((σPrice <= 200Product)⋈ Manufacturer = Cname (σCountry = ‘Japan’Company)) Product Company What’s the SQL?

  21. More SQL – Outer Joins • What happens if there’s no value available? Product Company Select pname, Country From Product, Company Where Manufacturer = Cname Select pname, Country From Product outer join Company on Manufacturer = Cname

  22. Querying – Datalog • Enables expressing recursive queries • More convenient for analysis • Some people find it easier to understand • Without recursion but with negation it is equivalent in power to relational algebra and SQL • Limited version of Prolog (no functions)

  23. Datalog Rules and Queries A datalog rule has the following form: head :- atom1, atom2, …, atom,… You can read this as then :- if ... ExpensiveProduct(N) :- Product(N,M,P) & P > $100 CanadianProduct(N) :- Product(N,M,P) & Company(M, “Canada”, SP) IntlProd(N) :- Product(N,M,P) & NOT Company(M, “Canada”, SP) Arithmetic comparison or interpreted predicate Distinguished variable Subgoal or EDB Existential variables constant Negated subgoal - also denoted by ¬ Head or IDB

  24. Conjunctive Queries • A subset of Datalog • Only relations appear in the right hand side of rules • No negation • Functionally equivalent to Select, Project, Join queries • Very popular in modeling relationships between databases

  25. Selection: σManufacturer = GizmoWorks(Product) Product What’s the Datalog?

  26. Selection + Projection:πName, Price, Manufacturer (σPrice > 100Product) Product What’s the Datalog?

  27. πPname,Price((σPrice <= 200Product)⋈ Manufacturer = Cname (σCountry = ‘Japan’Company)) Product Company What’s the Datalog?

  28. Bonus Relational Goodness: Views Views are relations, except that they are not physically stored. (Materialized views are stored) They are used mostly in order to simplify complex queries and to define conceptually different views of the database to different classes of users. Used also to model relationships between databases View: purchases of telephony products: CREATE VIEW telephony-purchases AS SELECT product, buyer, seller, store FROM Purchase, Product WHERE Purchase.product = Product.name AND Product.category = “telephony”

  29. Summarizing Relational DBs • Relational perspective: Data is stored in relations. Relations have attributes. Data instances are tuples. • SQL perspective: Data is stored in tables. Tables have columns. Data instances are rows. • Query languages • Relational algebra – mathematical base for understanding query languages • SQL – very widely used • Datalog – based on Prolog, very popular with theoreticians • Views allow complex queries to be written simply

  30. Relational Metadata problems

  31. Data Integration:Planning a Beach Vacation Beach Good Weather Cheap Flight Fodors weather.com wunderground AAA Expedia Orbitz

  32. Data Integration System Architecture User Query Mediated Schema Virtual database “Airport” Local Schema 1 Local Schema N Local Database 1 Local Database N Expedia Orbitz

  33. Data Translation Data exists in two different schemas. You have data in one, and you want to put data into the other • How are the schemas related to one another? • How do you change the data from one to another?

  34. Data Warehousing Data Warehouses store vast quantities of data for fast query processing, but only batch updating. • Import schemas of data sources • Identify overlapping attributes, etc. • Build data cleaning scripts • Build data transformation scripts • Enable data lineage tracing

  35. Schema Evolution and Data Migration Schemas change over time; data must change with it. • How do we deal with schema changes? • How can we make it easy for the data to migrate • How do we handle applications built on the old schema that store in the new database?

  36. Outline • Relational databases • Entity Relationship (ER) diagrams • Object Oriented Databases (OODBs) • XML • Other data types • Database internals (Briefly) • An extremely brief introduction to category theory

  37. Entity / Relationship Diagrams Entities Attributes Relationships between entities Product address buys

  38. Keys in E/R Diagrams • Every entity set must have a key name category price Product

  39. name category name price makes Company Product stockprice buys employs Person name sin address

  40. Multiplicity of E/R Relations 1 2 3 1 2 3 1 2 3 a b c d a b c d a b c d • one-one: • many-one • many-many

  41. name category name price makes Company Product stockprice What doesthis say ? buys employs Person name sin address

  42. Roles in Relationships What if we need an entity set twice in one relationship? Product Purchase Store buyer salesperson Person

  43. Attributes on Relationships date Product Purchase Store Person

  44. Subclasses in E/R Diagrams name category price Product isa isa Software Product Educational Product platforms Age Group

  45. Keys in E/R Diagrams name category Underline: price Product No formal way to specify multiple keys in E/R diagrams Person name SIN address

  46. From E/R Diagramsto Relational Schema • Entity set  relation • Relationship  relation

  47. Entity Set to Relation name category price Product Product(name, category, price) name category price gizmo gadgets $19.99

  48. Relationships to Relations price name category Start Year name makes Company Product Stock price Makes(product-name, product-category, company-name, year)Product-name Product-Category Company-name Starting-year gizmo gadgets gizmoWorks 1963 (watch out for attribute name conflicts)

  49. Relationships to Relations price name category Start Year name makes Company Product Stock price No need for Makes. Modify Product: name category price StartYear companyName gizmo gadgets 19.99 1963 gizmoWorks

  50. Multi-way Relationships to Relations address name Product Purchase Store price name Person Purchase( , , ) sin name

More Related