1 / 23

VIEWS

VIEWS. Prof. Navneet Goyal Department of Computer Science & Information Systems BITS, Pilani. Topics. Query Modification View Materialization Which Views to Materialize? How to exploit Materialized Views to answer queries? View Maintenance. View Modification (Evaluate On Demand).

Télécharger la présentation

VIEWS

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. VIEWS Prof. Navneet Goyal Department of Computer Science & Information Systems BITS, Pilani

  2. Topics • Query Modification • View Materialization • Which Views to Materialize? • How to exploit Materialized Views to answer queries? • View Maintenance

  3. View Modification(Evaluate On Demand) CREATE VIEWRegionalSales(category,sales,state) ASSELECT P.category, S.sales, L.state FROM Products P, Sales S, Locations L WHERE P.pid=S.pid AND S.locid=L.locid View Query SELECT R.category, R.state, SUM(R.sales) FROMRegionalSales AS R GROUP BY R.category, R.state Modified Query SELECT R.category, R.state, SUM(R.sales) FROM (SELECT P.category, S.sales, L.state FROM Products P, Sales S, Locations L WHERE P.pid=S.pid AND S.locid=L.locid) AS R GROUP BY R.category, R.state

  4. View Materialization (Precomputation) • Suppose we precompute RegionalSales and store it with a clustered B+ tree index on [category,state,sales]. • Then, previous query can be answered by an index-only scan. SELECT R.state, SUM(R.sales) FROM RegionalSales R WHERE R.category=“Laptop” GROUP BY R.state SELECT R.state, SUM(R.sales) FROM RegionalSales R WHERE R. state=“Wisconsin” GROUP BY R.category Index on precomputed view is great! Index is less useful (must scan entire leaf level).

  5. Materialized Views • A view whose tuples are stored in the database is said to be materialized • Provides fast access, like a (very high-level) cache. • Need to maintain the view as the underlying tables change. • Ideally, we want incremental view maintenance algorithms. • Close relationship to Data Warehousing, OLAP,

  6. Issues in View Materialization • What views should we materialize, and what indexes should we build on the precomputed results? • Given a query and a set of materialized views, can we use the materialized views to answer the query? • How frequently should we refresh materialized views to make them consistent with the underlying tables? (And how can we do this incrementally?)

  7. View Materialization:Example SELECT P.Category, SUM(S.sales) FROM Product P, Sales S WHERE P.pid=S.pid GROUP BY P.Category Both queries require us to join the Sales table with another table & aggregate the result SELECT L.State, SUM(S.sales) FROM Location L, Sales S WHERE L.locid=S.locid GROUP BY L.State How can we use materialization to speed up these queries?

  8. View Materialization:Example • Pre-compute the two joins involved ( product & sales & Location & sales) • Pre-compute each query in its entirety • OR let us define the following view: CREATE VIEW TOTALSALES (pid, lid, total) AS Select S.pid, S.locid, SUM(S.sales) FROM Sales S GROUP BY S.pid, S.locid

  9. View Materialization:Example • The View TOTALSALES can be materialized & used instead os Sales in our two example queries SELECT P.Category, SUM(T.Total) FROM Product P, TOTALSALES T WHERE P.pid=T.pid GROUP BY P.Category SELECT L.State, SUM(T.Total) FROM Location L, TOTALSALES T WHERE L.locid=T.locid GROUP BY L.State

  10. View Maintenance • A materialized view is said to be refreshed when it is made consistent with changes ot its underlying tables • Often referred to as VIEW MAINTENANCE • Two issues: • HOW do we refresh a view when an underlying table is refreshed? Can we do it incrementally? • WHEN should we refresh a view in response to a change in the underlying table?

  11. View Maintenance • The task of keeping a materialized view up-to-date with the underlying data is known as materialized view maintenance • Materialized views can be maintained by recomputation on every update • A better option is to use incremental view maintenance • Changes to database relations are used to compute changes to materialized view, which is then updated • View maintenance can be done by • Manually defining triggers on insert, delete, and update of each relation in the view definition • Manually written code to update the view whenever database relations are updated • Supported directly by the database

  12. View Maintenance • Two steps: • Propagate: Compute changes to view when data changes. • Refresh: Apply changes to the materialized view table. • Maintenance policy: Controls when we do refresh. • Immediate: As part of the transaction that modifies the underlying data tables. (+ Materialized view is always consistent; -updates are slowed) • Deferred: Some time later, in a separate transaction. (-View becomes inconsistent; + can scale to maintain many views without slowing updates)

  13. Deferred Maintenance Three flavors: • Lazy: Delay refresh until next query on view; then refresh before answering the query. • Periodic (Snapshot): Refresh periodically. Queries possibly answered using outdated version of view tuples. Widely used, especially for asynchronous replication in distributed databases, and for warehouse applications. • Event-based: E.g., Refresh after a fixed number of updates to underlying data tables.

  14. View Maintenance:Incremental Algorithms • Recomputing the view when an underlying table is modified – straightforward approach • Not feasible to do so for all changes made • Ideally algorithms for refreshing a view should be incremental • Cost of refresh is proportional to the extent of the change

  15. View Maintenance:Incremental Algorithms • Note that a given row in the materialized view can appear many times (duplicates are not eliminated) • Main idea behind incremental algorithms is to efficiently compute changes to the rows of the view • New rows • Changes to count associated with a row • A row is deleted if its count becomes 0

  16. View Maintenance • The changes (inserts and deletes) to a relation or expressions are referred to as its differential • Set of tuples inserted to and deleted from r are denoted ir and dr • To simplify our description, we only consider inserts and deletes • We replace updates to a tuple by deletion of the tuple followed by insertion of the update tuple • We describe how to compute the change to the result of each relational operation, given changes to its inputs • We then outline how to handle relational algebra expressions

  17. Join Operation • Consider the materialized view v = r s and an update to r • Let rold and rnew denote the old and new states of relation r • Consider the case of an insert to r: • We can write rnew s as (rold ir) s • And rewrite the above to (rold s)  (ir s) • But (rold s) is simply the old value of the materialized view, so the incremental change to the view is just ir s • Thus, for inserts vnew = vold (ir s) • Similarly for deletes vnew = vold – (dr s)

  18. Selection & Projection Operations • Selection: Consider a view v = (r). • vnew = vold (ir) • vnew = vold - (dr) • Projection is a more difficult operation • R = (A,B), and r(R) = { (a,2), (a,3)} • A(r) has a single tuple (a). • If we delete the tuple (a,2) from r, we should not delete the tuple (a) from A(r), but if we then delete (a,3) as well, we should delete the tuple • For each tuple in a projection A(r) , we will keep a count of how many times it was derived • On insert of a tuple to r, if the resultant tuple is already in A(r) we increment its count, else we add a new tuple with count = 1 • On delete of a tuple from r, we decrement the count of the corresponding tuple in A(r) • if the count becomes 0, we delete the tuple from A(r)

  19. Aggregate Operations • count : v = Agcount(B)(r) ( count of the attribute B, after grouping r by attribute A) • Whena set of tuples ir is inserted • For each tuple t in ir, if the group t.A is present in v, we increment its count, else we add a new tuple (t.A, 1) with count = 1 • When a set of tuples dr is deleted • for each tuple t in ir.we look for the group t.A in v, and subtract 1 from the count for the group. • If the count becomes 0, we delete from v the tuple for the group t.A

  20. branch_name account_number balance Perryridge Perryridge Brighton Brighton Redwood A-102 A-201 A-217 A-215 A-222 400 900 750 750 700 branch_name sum(balance) Perryridge Brighton Redwood 1300 1500 700 Example • Relation account grouped by branch-name: branch_nameg sum(balance) (account)

  21. Aggregate Operations • sum: v = Agsum (B)(r) • We maintain the sum in a manner similar to count, except we add/subtract the B value instead of adding/subtracting 1 for the count • Additionally we maintain the count in order to detect groups with no tuples. Such groups are deleted from v • Cannot simply test for sum = 0 (why?) • To handle the case of avg, we maintain the sum and count aggregate values separately, and divide at the end

  22. Aggregate Operations • min, max: v = Agmin (B) (r). • Handling insertions on r is straightforward. • Maintaining the aggregate values min and max on deletions may be more expensive.We have to look at the other tuples of r that are in the same group to find the new minimum

  23. TEST IDNO Marks T1 T1 T2 T2 T3 A-102 A-103 A-102 A-103 A-104 15 20 25 10 10 TEST Min(marks) T1 T2 15 10 Example Testg Min(marks) (student_record))

More Related