1 / 89

Data Management: Databases and Organizations Richard Watson

Data Management: Databases and Organizations Richard Watson. Summary of Chapters 3-6 prepared by Kirk Scott. Data Modeling and SQL. Chapter 3. The Single Entity Chapter 4. The One-to-Many Relationship Chapter 5. The Many-to-Many Relationship

tiana
Télécharger la présentation

Data Management: Databases and Organizations Richard Watson

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Management: Databases and OrganizationsRichard Watson Summary of Chapters 3-6 prepared by Kirk Scott

  2. Data Modeling and SQL • Chapter 3. The Single Entity • Chapter 4. The One-to-Many Relationship • Chapter 5. The Many-to-Many Relationship • Chapter 6. One-to-One and Recursive Relationships

  3. Introduction • Large parts of these overheads will be somewhat repetitive • They cover in general terms some of the things that were specifically illustrated by concrete SQL examples • However, the repetition shouldn’t be harmful • It should put the examples into a broader context, and add new examples to flesh the ideas out • The ultimate goal is for the basic concepts and diagramming to be clear so that there will be no trouble considering design questions in unit 14

  4. Chapter 3. The Single Entity • The author starts with the entity relationship diagramming conventions and the concept of a single entity • The author represents an entity with a box containing its name in capital letters inside, at the top • Full field names are given after that in small letters • The primary key field is marked with an asterisk

  5. Different diagramming conventions are perfectly acceptable, as long as you are consistent • The name of the entity may be given above the box representing it • You may choose just to capitalize just the first letter of the name

  6. In theory, you could qualify field names, although this would be redundant, given the entity name at the top • You could also use short names for fields if space is at a premium • Primary keys could be marked with pk or underlined

  7. Chapter 4. The One-to-Many Relationship • The author uses the crow’s foot to mark a one-to-many relationship in a ER diagram • In a simple ER diagram fields may not be listed, just entity names and crow’s feet • In a more complete diagram, fields can be listed

  8. The author does not include the embedded pk/fk in the list of fields in the fk/many table because it is redundant • I do not follow this convention • I believe that in the interests of clarity it is worthwhile to include the fk in the list of fields

  9. Chapter 5. The Many-to-Many Relationship • As is known, the many-to-many relationship is the most “complicated” of the relationships • The book presents some interesting examples that arise in real situations • They illustrate ideas that are not immediately apparent from the examples that have gone before • The first example is based on a bill of sale, shown on the next overhead

  10. The Bill of Sale Example: An Interesting Case of a pk/fk Relationship

  11. The book analyzes this situation as consisting of base entities which are a sale and the items which are sold • There is a many-to-many relationship between these base entities because each sale can consist of many items • Also, each item can be present in many sales • The book’s ER for this analysis is shown on the next overhead

  12. When first introducing many-to-many relationships, I referred to the table in the middle • More formally, the book refers to an associative entity • The associative entity is the table in the middle that captures the relationship between two base entities

  13. In the ER notation for this example the + sign is used • This has not been seen before • For the purposes of understanding the book’s example, it is important to know what this means

  14. The + sign is shown over a crow’s foot • It symbolizes the fact that the embedded fk is part of the pk of the table it’s embedded in • You have seen an example of a table in the middle where the pk is the concatenation of the two embedded fk’s • This example is not the same as that

  15. In this example the saleno is the pk of the Sale table • It is embedded as a fk in the Lineitem table • A saleno value will appear in the Lineitem table as many times as there are separate lines belonging to the sale • These separate lines are identified by lineno’s • The lineno’s are not embedded fk’s based on the unique identifiers, itemno’s, of entries in the Item table

  16. An alternative way of representing the relationship would be to list the fields of the table in the middle this way: • salenopk, fk • linenopk • itemnofk • lineqty • lineprice • Note again that the saleno is both a pk and a fk, while the lineno is purely pk

  17. At first glance it may seem a little strange, but the table in the middle contains every line of every sale, listed separately • It is the saleno and the lineno together which uniquely identify the entries in the Lineitem table • This model actually reflects reality well • It differs, in particular, from the car sale example

  18. In the car sale example, there were individual cars that were sold • In the example database they were only shown as being sold once • In reality, the same car might be sold more than once • This could be modeled by making the salesdate part of the pk of the Carsale table

  19. In the Sale, Lineitem, Item example, the items are not actually individual items • An item is a kind of item, like a screw or a shovel or a microwave oven • The seller may have many of each kind of item in stock and doesn’t distinguish between individual items

  20. Multiple instances of the same (kind of) item may be sold to the same customer • Also, the same (kind of) item can be sold to more than one customer • It’s not incredibly difficult, but it’s worth emphasizing that the itemnodoes appear in the table in the middle as a fk • This tells which item that line of a sale was in reference to • However, the itemno is not part of the pk of the table in the middle

  21. In a perfect world, you might argue that each item should appear on only one line of a sale • If so, then you could dispense with individual line numbers and use the itemno as part of the pk instead • However, reality makes the given solution better

  22. When creating a data model, it should be flexible and accommodate all possibilities • Could a customer, in the middle of making a purchase, decide that more instances of a certain item were desired? • If so, do you allow this, and how do you support it?

  23. From a business point of view, few things are more destructive than a computer system whose model imposes artificial constraints on the user (seller and customer) • Of course, if a customer decides that more instances are desired you want to sell them

  24. Have you ever heard things like these: • “I’d like to let you buy more, but the computer won’t allow it.” • “I’d like to let you buy more, but it will be necessary to start a completely new bill of sale.” • “I’d like to let you buy more, but it will be necessary to go back and modify the earlier line of the sale for that item.”

  25. In any of the previous scenarios, both the customer and the salesperson want to scream • The best scenario would go like this: • “Oh, you want 20 instead of 10? We’ll just add another line here at the bottom for another 10.” • Now everybody sighs with satisfaction…

  26. Relational Division, For All, and Not Exists • The book points out that SQL, with operations like AND, OR, NOT, and so on, has qualities of algebra • Similarly, there are set operators like UNION • Although Microsoft Access SQL doesn’t support INTERSECT, some implementations do

  27. The Cartesian product represents a form of multiplication for relations • The results of a join operation are a subset of the results of a product • In an algebraic system, the existence of a multiplication operation implies the existence of a division operation

  28. As pointed out when doing the concrete SQL examples, there is no FOR ALL operator • However, double NOT EXISTS can accomplish the same thing • FOR ALL/double NOT EXISTS is roughly analogous to division in a relational system • Before we’re finished with SQL we will see queries which are actually stated in terms of division

  29. This is the point where Watson takes up the case of double not exists • The book shows a ER diagram of 3 tables capturing a many-to-many relationship • This diagram is labeled generically, but it is of the same structure as the Lineitem example

  30. It then outlines the double NOT EXISTS query that could be written for it • The fact that this models the Lineitem example is not important • The table in the middle could have a completely concatenated primary key • It could also have its own, separate primary key

  31. The important point is that the base tables are at the ends of the ER diagram • The book refers to these as target and source, respectively • The table in the middle, the associative entity, is labeled Target-Source by the book

  32. If you want to find those rows of the target which are in relation to all of the rows of the source, • Then in the double NOT EXISTS query: • The target appears first, in the outermost query • The source appears second, in the middle, in the first nested subquery • And the table in the middle appears last, in the second nested subquery • The ER diagram and the schematic query are shown on the next overhead

  33. A Design with a Cycle • The next diagram illustrates a design containing a cycle • Such designs will become especially important when considering normalization, the theory of correctness in designs • For the time being simply note that there is nothing preventing designs with cycles

  34. A Concatenated Key with Date • The next example design is one where both of the embedded foreign keys are part of the primary key of a table in the middle • However, it is more complicated than that because a date field is also included in the primary key • This allows the same pair of base values to be paired with each other more than once

  35. A Simple Concatenated Key • The next design is actually somewhat simpler • It also has two embedded pk/fk’s in the table in the middle • The table in the middle isn’t pure key though • There is also a non-key attribute field for the table in the middle

  36. The Music CD Library Example • In the overheads for chapters 3 and 4 some very primitive starting designs were given for a collection of music CD’s • At the end of chapter 5, with the capability to model many-to-many relationships, this model blossoms • On the next overhead an 8 entity design is shown • Note that 4 of the 8 entities can be classified as associative entities

More Related