260 likes | 385 Vues
This paper by Joachim W. Schmidt discusses language constructs designed for efficient database operations, proposing a shift from conventional file-based methods to a more declarative style aligned with relational algebra principles. It examines persistent data handling, optimization in query execution, and the implementation of language mechanisms that enhance data manipulation, such as selection, projection, and modification operations. The author highlights the limitations of traditional approaches and illustrates the advantages of a type-safe environment integrating relational database functionalities within programming languages like Pascal.
E N D
Some High Level Language Constructs for Data of Type Relation Author: Joachim W. Schmidt (Hamburg university, west Germany) From ACM Transactions on Database Systems, September 1977 Presented by: Shnaiderman Lila
Motivation • Fact:(almost) All Programs Use Persistent Data • File Based Approach: open, close, get, put, printf, etc. • Pain in the neck! • Relational Algebra: • Traditional: invoke SQL from Pascal/C/ … • Cumbersome. • Not type safe, checked only at runtime. • DB types may be different from Pascal types. • This Work: Language mechanisms for • Data Query (a la relational algebra). • Data Modification (insert, delete, etc). • missing in Relational Algebra. • Reminder of RA • Selection (σ) - Selects a subset of rows • Cartesian-product(X) Allows us to combine two relations. • Projection (π) - Deletes unwanted columns from relation. • Set-difference (-) Tuples in R1, but not in R2. • Union(U) Tuples in R1 or in R2. • Renaming (ρ) – renames a column in a relation
Outline • To propose language constructs for performing database operations in a declarative manner: • A repetition statement controlled by relations. • Predicates as a generalization of Boolean expressions. • A constructor for relations using predicates.
Main Concepts • Tuple - implemented as a recordwith fields of type string or scalar. • Relation - a set of tuples, while some of the fields of the relation serves as a key • Key – Uniquely determine a tuple in a relation. • Key Order: automatically defined • Natural order of each primitive type (Integer, String) • Lexicographic order if multiple keys. • Example: for key <City,Price> • Jerusalem, $12 • Jerusalem, $13 • Jerusalem, $14 • Tel-Aviv, $11
Defining a Relation in Pascal-R • Example: instructors are characterized by the attributes: ID (unique), status and name: type iRecord = record ID, Status : integer; Name : string end; iRelation = relation<ID>of iRecord; var instructors : iRelation; Just like Pascal record type relation key
Elementary Operators on Relations Example: rel1 rel2 • Elementary Altering Operations: • Insertion operator:rel1 :+ rel2; • Deletion operator:rel1 :- rel2; • Replacement operator:rel1 :& rel2; • Assignment operator:rel1 := rel2; Result: rel1
Similarity to Sets in Pascal • Pascal sets operations: • Equivalents to operations on relations: • S1 := S2 Assign a set to a set variable. • S1 + S2 Union of sets. • S1 - S2 Difference between two sets. • Other: Intersection, Symmetric difference, comparison operators, membership test
Literals of Type Relation • An elementary Relation Constructor • Constructed from a record variable : ‘rel := [rec]’ • Empty relation : ‘[]’. Example: var instructors : iRelation; i : iRecord; begin … i.ID := 222; i.Status := 1; i.Name := ‘Michal’ instructors := [i] end.
Bad Old File-Like Operations on Relations • Elementary Retrieval Operation – allows tuple-wise reading of relation variables. Defined with: • rel↑ - buffer variable for the current tuple. • low(rel) –updates the buffer with the tuple with the lowest key value. • Next(rel) – updates the buffer with the tuple with the next highest key value. • aor(rel) – returns true if all the tuples already passed else false.
Implementation programDBuser(DB) type … iRecord = … ; iRelation = … ; var DB:database instructors:iRelation; … end; result1, … : iRelation; result7 : relation<rname, rtitle>of record … end; beginwith DB do … end. • Compiler was modified to accept the syntax • Run time library was added to handle the execution of the new constructs • Database counts as an external variable • Similar to PASCAL files • Can be connected to via a parameter in the program header
Elementary Retrieval example • Find instructors with status = 2 (senior lecturer). result = []; low(instructors); whilenotaor(instructors) do begin if instructors↑. Status = 2 then result :+ [instructors↑]; next(instructors) end
Problems with the old file model • The ordering of tuple access is irrelevant • Almost no possibilities for automatic optimization. • The language construct for relation is inadequate • File concept • Only one global file pointer per relation • The notation can be more declarative.
The Repetition Statement foreach • Example: Find instructors with Status = 2 (senior lecturer). result = []; low(instructors); whilenotaor(instructors) do begin if instructors↑.Status = 2 then result :+ [instructors↑]; next(instructors) end. • Example: Find instructors with Status = 2 (senior lecturer). result = []; foreach i in instructors do if i.Status = 2 then result :+ [i] Foreach in Java 5 void printAll(Collection<instructor> c) { for (instructor i : c) i.print(); } Control record variable Range relation variable for instructor i : c foreach i in instructors statement
Nested foreach • Example : Find all those instructors who has lecture on Friday type … tRecord = record ID, CourseID, Time : integer; Day, Room : string end; tRelation = relation< ID, CourseID, Day >of tRecord; var instructors,result : iRelation; timetable : tRelation; begin … result = []; foreach i in instructors do foreacht intimetable do if (e.ID = t.ID) and (t.Day = ‘Friday’) then result :+ [i] end. • The inner loop traversed too often… • Redundant operations performed in runtime… • The inner loop used only as test condition.
foreach as ∃ logical condition • Example: Find all those instructors whose lecture of Friday var … hasLecOnFriday : boolean; begin … result = []; foreach i in instructors do begin hasLecOnFriday := false; foreacht intimetable do hasLecOnFriday := hasLecOnFriday or (i.ID = t.ID) and (t.Day = ‘Friday’) if hasLecOnFriday then result :+ [i] end end. The logical condition can be replaced with: somerecin rel (<logical expression>)
foreach as ∀ logical condition • Example: Find all instructors who give no lecture. var … hasNoLecture : boolean; begin … result = []; foreach i in instructors do begin hasNoLecture := true; foreacht intimetable do hasNoLecture := hasNoLecture and (i.ID ≠ t.ID); if hasNoLecture then result :+ [i] end end. The logical condition can be replaced with: allrecin rel (<logical expression>)
Extending Pascal with advanced logic expressions • In “regular Pascal”: • A and B • A or B • Not A • Pascal-R has First Order Predicate Logic • ∃xs.p(x) • ∀xs.p(x) Lets rewrite the previous examples Predicate calculus
some Predicate • “whilenot aor(timetable) do”– sequential tuple-wise processing ordered by key values • “foreach t in timetable do”– still tuple-wise and sequentially but with no specific order • “some t in timetable” - Gives possibility for efficient implementation (processing tuple sets in parallel) • Example : Find all those instructors whose lecture is on Friday result = []; foreach i in instructors do if somet intimetable ((i.ID = t.ID) and (t.Day = ‘Friday’)) then result :+ [i]
All predicate • Example: Find all instructors who give no lecture. result = []; foreach i in instructors do if alltintimetable (t.ID ≠ t.instructorID); thenresult :+ [i]
Nested predicates • Example: Find those instructors who give no lectures above the first year (courses with level 1) result = []; foreach i in instructors do if alltintimetable ((i.ID ≠ t.instructorID) or some c in courses ((t.CourseID = c.ID) and (c.Level = 1))) then result :+ [i] end.
Generalizing the Relation Constructor • So far all literals have one or zero records • Construction of bigger relations – by adding one record at a time. • Sequential process… • Let’s use the constructs we already created: res = [each i in instructors : i.Status = 2] This is like‘SELECT’ statement in SQL: SELECT * FROM instructors WHERE Status = 2;
General Relation Constructor examples var … ,res3,res4,res5,res6 : iRelation; begin … res3:=[each i in instructors : i.Status = 2]; res4 := [ each i in instructors : somet intimetable ((i.ID = t.instructorID) and (t.Day = ‘Friday’))]; res5 := [each i in instructors : allt intimetable (i.ID ≠ t.instructorID)]; res6 := [each i in instructors : allt intimetable ((i.ID ≠ t.instructorID) or some c in courses ((t.CouseID = c.ID) and (c.cLevel = 1)))] end. • Structural equivalence Problem • Pascal uses name equivalence method. • The assignment of “each” constructor into the target relation requires structural equivalence method… • Requires changes that contradict the spirit of Pascal… SELECT * FROM instructors WHERE Status = 2; SELECT * FROM instructors i, timetable t WHERE (i.ID = t.instructorID) AND (t.Day = ‘Friday’);
Generalizing the Relation Constructor (continue) • Example: For those instructors who give lectures, find the names and the title and year of their publications. res : relation<rname, rtitle>of record rname, rtitle : string ryear : integer end; begin … res := [each(i.Name; p.title; p.year) for i, p in instructors, papers : (i.ID = p.instructorID) and some t in timetable ((i.ID = t.instructorID)] end. • Advantages of this constructor: • Declarative • similar to RA • opens the door for optimizations. • SELECT i.Name, p.title, p.year • FROM instructors i, papers p, timetable t • WHERE (i.ID = p.instructorID) and • (i.ID = t.instructorID)]
PLAIN RIGEL Adaplex SQL Pascal-R development DBPL Codd relational Model Pascal-R 1979/ 1983 1988 1970 1978 Additional topics of interest • Unsolved issues: • Tools for altering data consistently • Taking care of simultaneous processing of a database by several users • Security issues • Error handling • Similar modification in many modern languages (like we will see in the continuation of the course)
Conclusion • What was achieved • Declarative! • Leaves a lot of space for optimization! • Includes Powerful retrieval facilities on relations (tuples sets level)! • The expressive power of the proposed constructs is satisfactory! • Type checking is possible, but requires the database schema to be known at compile time and remain unchanged in runtime… • Contradicts the spirit of Pascal…