240 likes | 345 Vues
Learn how to construct complex query expressions, combine operations effectively, and apply algebraic laws, while ensuring constraints and referential integrity. Explore relational operations on bags and extended operations. Understand outer joins, natural joins, and view extensions to the relational model.
E N D
The rest of Ch. 5 – Relational Algebra • Ch. 10 – Logical query languages
Combining Operations to Form Queries • Construct more complex expressions by applying operators to sub-expressions • Use parentheses to indicate operands grouping • Multiple ways to write equivalent queries • Expression tree for visualizing complex expression • Query optimizer • Example (ref. Fig. 5.8): title,year(length≥100(Movie) studioName=‘Fox’(Movie)) title,year(length≥100 ANDstudioName=‘Fox’(Movie)) or
Algebraic Laws • Associative: e.g. R (S T) = (R S) T • Commutative: e.g. R S = S R
DependentandIndependent Operations • Set operations:union, difference,intersection • Remove parts of a relation: • Selection – eliminates rows (tuples) • Projection – eliminates columns • Combine tuples of two relations :cartesian productandjoins • Renaming: changes relation schema (i.e. relation name, and/or attribute names)
Constraints on Relations Two ways to express constraints in relational algebra: • R = Ø // R Ø • R S // R - S = Ø where R and S are relational algebra expressions
Referential Integrity Constraints • Assertion: a value appearing in one context also appears in another, related context. • Example: Movie(title, year, length, inColor, studioName, producerC#) MovieExec(name, address, cert#, networth) Constraint: the producer of every movie is a certified movie executive, i.e. appear in the MovieExec relation producerC#(Movie) cert#(MovieExec) producerC#(Movie) - cert#(MovieExec) = Ø or
Other Constraints • Domain constraints example: MovieStars(name, address, gender, birthdate) Constraint: acceptable values for the “gender” attribute are ‘F’ or ‘M’ gender’F’ AND gender‘M’(MovieStar) = Ø • Other constraints example: MovieExec(name, address, cert#, networth) Studio(name, address, presC#) Constraint: president of a movie studio must have a net worth of at least $10,000,000 networth<10000000(Studio presC#=cert# MovieExec) = Ø • Functional dependency constraints
Relational Operations on Bags • Bag • a “set” that is allowed to have more than one occurrence of an element • => duplicate tuples in a relation • Constraint representations work with bags • Reason: • For implementation efficiency when duplication is acceptable • When actual no. of tuples is needed for aggregate • Example:
Relational Operations of Bags (continue) Given: R and S are bags, and tuple t appears in R n times, and in S m times • R S : contains n + m tuple t • R – S : contains max(0, n-m) tuple t • R S : contains min(n, m) tuple t • A,B(R) : each tuple is processed independently, resulting duplicate tuples are not eliminated • c(R) : apply the selection condition to each tuple independently, resulting duplicate tuples are not eliminated
Product and Joins of Bags Given: R and S are bags, and tuple r appears in R m times, and tuple s appears in S n times • R x S : the resulting tuple rs will appear mn times. • R S : each tuple of R is compared to each tuple of S to decide if the pair tuples joins successfully, do not eliminate duplicates • R CS : each tuple of R is compared to each tuple of S to decide if the condition C is met, do not eliminate duplicates
Extended Operation to Relational Algebra • Duplicate elimination : to convert a bag to a set • Aggregation: count, sum, max, min, average • Grouping • Extended Projection • Sorting • Outerjoins
OUTERJOINS • Dangling tuples: tuples that failed to match any tuple of the other relation in the common attributes. • An operator to augment the result of a join by the dangling tuples, padded with null values. • R S : Full outerjoin of R1 and R2 is a join that includes all rows from R1 and R2 matched or not. Unmatched rows are padded with special null symbols . • LEFT outerjoin of R1 and R2 is a join that includes all rows from R1, matched or not, plus the matching values from R2. Unmatched rows are padded with . • RIGHT outerjoin of R1 and R2 is a join that includes all rows from R2, matched or not, plus the matching values from R1. Unmatched rows are padded with . • The joining may be NATURAL or theta join
Outer Join Example R S Full Outer Join Natural join
Extensions to the Relational Model • Modifications : insert, delete, update • Views : relational expression with a name to be applied real relations to produce the relation defined by the expression. Views can used as arguments to other expressions. • Null values : common interpretations: • Value unknown • Value inapplicable • Value withheld
Ch. 10 Logical Query Languages • Motivation • Datalog • Relational Algebra to Datalog • Recursion in Datalog • Negation in Recursive Rules
Motivation • Logical rules is more natural in representing recursive queries • Logical rules form the basis of many information-integration applications
A Datalog rule example • Relation: Movie(title, year, length, inColor, studioName, producerC#) LongMovie(t, y) Movie(t, y, l, c, s, p) AND l 100 head subgoals body LongMovie = title, year(length≥100(Movie))
Datalog rule • Relational Atoms : predicate followed by arguments • Arithmetic Atoms : comparison between two arithmetic expressions (e.g. x ≠ Y) • Predicate = relation name or arithmetic comparison predicates (e.g. =, <, ≠, etc) • Head – a relational atom • Body – one or more atoms (subgoals) connected by AND • Subgoals (not head) may be optionally negated by NOT • Local variables – variables in body, not in head
Datalog • A logic based data model • The underlying mathematical model of data is essentially that of the relational model • Predicate symbols denote relations • Relational algebra operations are described by rules • Query : a collection of one or more rules. • The relation in the rule head is the answer to the query
Extensional and Intensional Predicates • Extensional Predicates (EDB) The set of relations which ARE defined as part of the actual database (i.e. physically stored). e.g. R = {1} • Intensional Predicates (IDB) The set of relations which are NOT defined as part of the actual database but are instead abstracted from logical rules. e.g. P (x) Q (x) Q (x) R (x) • A predicate must be IDB or EDB but not both. • IDB predicate can appear in the body or head of a rule • EDB predicate can appear in the rule body only
3 Different Interpretations of Logical Rules • Proof-Theoretic Interpretaton • Model-Theoretic Interpretation • Computational Interpretation The following discussions will use the EDB: R = {1}
Proof-Theoretic Interpretation As axioms to be used in a proof. • From the facts in the database, see what other facts can be proved using the rules in all possible ways. • All facts derivable using the rules are derivable by applying the rules in the forward direction only • Example: P = {1}, Q = {1}
Model-Theoretic Interpretation As definition of possible worlds or models. • To be a model, an interpretation must make the rules true, no matter what assignment of values is made for the variables in each rule. • Multiple models are possible • With no negations, a unique minimal model exists that gives the same result as the proof-theoretic interpretation. • Minimal model : cannot make any true fact false and still have a model consistent with the EDB • Example: • P = {1, 2, 3}, Q = {1, 2} • P = {1}, Q={1}
Computational Interpretation By providing an algorithm for “executing” the rules to determine whether a potential fact is true or false. • E.g. Prolog – uses a particular algorithm that involves searching for proofs of the potential fact. • Drawback: • the set of facts Prolog finds a proof is not necessarily the same as the set of all facts for which a proof exists • The set of facts Prolog finds true is not necessarily a model.