This presentation delves into the evolution of databases from the early clerical systems to modern relational models developed by pioneers like Ted Codd. We will discuss the significance of databases in large-scale software engineering, their architectural models, and how they are pivotal in managing vast amounts of data. Attendees will gain insights into different database systems, their operations (like CRUD), and the intricate relationships between data as well as the technologies that have shaped today's database environments.
Software Engineering 3156 6-Nov-01 #18: Language I, CVEs, and Databases Phil Gross
Administrivia • Upcoming events • Today, 4pm, Interschool (7th floor CEPSR) • Graduate School panel • Tomorrow, 6pm, 415 CEPSR • Me doing C++ again, hopefully better this time • Monday, 11-12:15, Interschool • Phil Wadler: should be a good talk 2
Also Monday • 6pm, location TBA • Janak gets his wireless on • All you ever wanted to know about wireless comm • Phones, Ethernet, Bluetooth • Unleashing his gadget-freak side 3
Mini-Case Studies • iPod installer • Apple screws up big time • Unix is powerful, but Unix is lame • rm -rf "$2Applications/iTunes.app" 2> /dev/null • IRIX 5.1 • How to make a bad product much worse • At great expense • http://yarchive.net/risks/sgi_irix.html 4
Collaborative Virtual Environments • Many people are on-line these days • E-mail works • Chat works, but doesn’t scale too well • People want to see other people • Visual systems are highly optimized for 3D • Get vastly more data • Extra dimensions 5
Solutions? • Video conferencing • Aka Brady Bunch Model • Doesn’t scale much past two or three • Virtual environments • Avatars • Usually trivial • Environment • Usually static 6
Problem Domains • Web, web, web • http://www.cs.brown.edu/memex/ACMCSHT/51/51.html • Military simulation • All military scales, platoon to division • Mechanized and not • Distributed Software Development • CHIME • Complex back-end sources 7
Databases in a Nutshell • One of the oldest computer applications • Historically, armies of clerks filed and retrieved • Early computer makers (e.g. IBM) changed this • Founded on inventions to speed 1890 census • Further developments after intro of Social Security in 1935 (26 million records) • As always with computers, primarily military uses at first 8
First Database Systems • 1950s: Idea of a database, as hardware-independent entity • 1960: COBOL established • 1968: IBM’s IMS for Apollo project • 1971: Codasyl (Conference on Data Systems Languages) model 9
IMS and Codasyl • IMS is hierarchical • Data has parents, children, siblings • Move through a tree-like structure • Codasyl model is a network model • Think arbitrary pointers • Programmer as navigator 10
Relational Model • Ted Codd • "A Relational Model of Data for Large Shared Data Banks," 1970 • Data is independent of storage • Queries are non-procedural and specified against the entire data set • Not an immediately successful idea 11
Ah, Politics • IBM had massive investments in IMS • “Strategic” product • Codd was at IBM San Jose • Supposed to be building disk drives, not coming up with new software paradigms • 10 years pass • 1973-9: Ingres at Berkeley • Open source success story! • 1974-9: IBM develops System R and SQL 12
Modern Databases • Everybody came from Ingres • Oracle, Sybase, Informix • SQL first commercially released by Oracle • Despite having been developed by IBM • 1980: First relational database from IBM • Ingres lives on: first Postgres, then PostgreSQL • Blows out MySQL IMHO 13
So What’s It All About • Data is stored as rows in tables • Tables have columns • Table employee has columns name, address, id_number, and salary • One row is ('phil', '118th st', 3456, 120000) • Another row is ('janak', 'Long Island', 5678, 15000) • A set of tables is organized into a database 14
SQL is Easy: Four Commands • Select, Insert, Update, and Delete • Where clauses can get tricky, though • Select * from employee where salary > 20000 • Insert into employee values('Suhit', 'Roosevelt Island', 1010, 60000) • Update employee set salary = 120000 where name = 'janak' • Delete from employee where id_number = 1010 15
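To make the row-and-column picture concrete in the language this lecture turns to next, here is a minimal C sketch that treats the employee table as an array of structs and mimics `Select * from employee where salary > 20000`. The struct layout, the `employee_table` array, and the helper `select_salary_above` are assumptions made for illustration. A real database picks its own storage and query strategy, so this shows only what the where clause means, not how a database answers it.

```c
#include <stddef.h>

/* One row of the employee table from the slides (layout is assumed). */
struct employee {
    const char *name;
    const char *address;
    int id_number;
    int salary;
};

/* The three rows that appear in the slides' examples. */
const struct employee employee_table[] = {
    { "phil",  "118th st",         3456, 120000 },
    { "janak", "Long Island",      5678,  15000 },
    { "Suhit", "Roosevelt Island", 1010,  60000 },
};

/*
 * Hypothetical helper: store pointers to the rows whose salary exceeds
 * min_salary into out (at most max_out of them) and return how many
 * pointers were stored.
 */
size_t select_salary_above(const struct employee *table, size_t nrows,
                           int min_salary,
                           const struct employee **out, size_t max_out)
{
    size_t found = 0;
    for (size_t i = 0; i < nrows; i++) {
        if (table[i].salary > min_salary && found < max_out) {
            out[found++] = &table[i];
        }
    }
    return found;
}
```

Note that the result is itself a set of rows, which is the point the next slide makes about every operation returning a row set.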
Theoretically Clean • All operations are on tables • All return values are row sets • Query is actually sent as ASCII over the wire • Best strategy to answer is computed by database • Structure of database also stored in “system” tables • Table Tables, table Columns 16
Joins • Relate two tables based on common data column • Table with author-ID, author-full-name • Table with paper-title, date, author-ID • Print out paper-title, author-full-name • Normalized: author changes name, only gets changed once • Beware Social Security Numbers! 17
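The join on this slide can be sketched in C as a nested loop: for each paper row, scan the author table for the matching author-ID. The struct names and the helpers `find_author` and `print_papers_with_authors` are hypothetical, and a real database would normally choose something smarter than a linear scan (an index, for example).

```c
#include <stddef.h>
#include <stdio.h>

/* Table with author-ID, author-full-name. */
struct author {
    int author_id;
    const char *full_name;
};

/* Table with paper-title, date, author-ID. */
struct paper {
    const char *title;
    const char *date;
    int author_id;
};

/* Look up an author row by ID; returns NULL when no row matches. */
const struct author *find_author(const struct author *authors, size_t n,
                                 int author_id)
{
    for (size_t i = 0; i < n; i++) {
        if (authors[i].author_id == author_id) {
            return &authors[i];
        }
    }
    return NULL;
}

/*
 * Nested-loop join: print "paper-title, author-full-name" for every paper
 * that has a matching author row. Returns the number of joined rows.
 */
size_t print_papers_with_authors(const struct paper *papers, size_t npapers,
                                 const struct author *authors, size_t nauthors)
{
    size_t joined = 0;
    for (size_t i = 0; i < npapers; i++) {
        const struct author *a =
            find_author(authors, nauthors, papers[i].author_id);
        if (a != NULL) {
            printf("%s, %s\n", papers[i].title, a->full_name);
            joined++;
        }
    }
    return joined;
}
```

Because the full name lives only in the author table, changing it there changes it for every paper, which is the normalization benefit the slide mentions.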
Why Is This Important • Software Engineering deals with large-scale systems • Virtually all large-scale systems have a database component • Often the centerpiece of a system • And an organization • Also, DBAs have the highest salaries in the IT field 18
Also, Remember IMS • Remind you of anything? • Anything else? • Everything comes back in style • I thought IMS was dead, btw; It’s not • IMS serves 200 million end users, managing over 15 billion Gigabytes of production data and processing over 50 billion transactions every day. • One customer reports 3000 days uptime 19
Language I: C • Hello, world: #include <stdio.h> int main(void) { printf("hello, world\n"); return 0; } • #include is a preprocessor directive, not a C language statement • printf is a library function, not part of the C language either 20
To Compile: • cc hello.c • Produces program… • Called a.out • !!?? • cc -o hello hello.c will produce program called hello 21
Simple/Stupid Compilation • Compiler can’t find dependencies • Run-time can’t find dynamic libraries • Well, it can now, but not originally • Import becomes a literal include • Compiles to native code • All files resulting from compilation are physically attached to one another to create executable (static linking) 22
Like Java • Or rather, Java (and C++) are wholesale ripoffs of C • While, if-else, for, switch-case (original sin) • Operators, % for mod, ++ and --, prefix and postfix, no exponentiation operator 23
Datatypes • Integers, and a little floating point • Signed and unsigned char, int, short, long • No booleans: 0=false, nonzero=true • Different sizes • Not precisely defined! • Characters just aliases for bytes • Machine addresses are a native type • “pointers” 24
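A small sketch of what "not precisely defined" means in practice. The C standard fixes minimum ranges rather than exact sizes, so the checks below are ones that should hold on any mainstream platform, while the actual numbers differ between machines. The function names `sizes_look_sane` and `is_true` are made up for this example.

```c
#include <limits.h>

/*
 * Returns 1 if the size relationships that hold on mainstream platforms
 * are true here: char is one byte of at least 8 bits, and
 * short <= int <= long.
 */
int sizes_look_sane(void)
{
    return sizeof(char) == 1
        && CHAR_BIT >= 8
        && sizeof(short) <= sizeof(int)
        && sizeof(int) <= sizeof(long);
}

/* No boolean type: 0 is false, any nonzero value is true. */
int is_true(int value)
{
    return value ? 1 : 0;
}
```

Printing `sizeof(int)` and `sizeof(long)` on two different machines is a quick way to see the variation for yourself.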
Pointers • Can get address of any variable • The '&' operator • Can get value stored at any address • The '*' operator • Declarations: int *ip; /* ip will "point" to an int */ void *vp; /* vp will "point" to something… */ 25
More Pointers int x = 7; ip = &x; printf("%d\n", *ip); • Prints 7 *ip = 9; /* assignment */ printf("%d\n", x); • Prints 9 26
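One common reason to take an address with `&` and write through it with `*` is to let a function change its caller's variables. The sketch below uses a hypothetical `swap_ints` helper to show this; it is an illustration, not something taken from the slides.

```c
/* Exchange the two ints that a and b point at. */
void swap_ints(int *a, int *b)
{
    int tmp = *a;   /* read the value stored at address a */
    *a = *b;        /* overwrite it with the value stored at address b */
    *b = tmp;
}
```

A caller with `int x = 3, y = 4;` would write `swap_ints(&x, &y);`, passing the addresses rather than the values.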
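Pointer Arithmetic • If ip == 4, after ip++, ip == 8 • assuming ints are four bytes • This is the meaning of the “type” of a pointer • Arrays are thin: in most expressions an array name decays to a pointer to its first element int a[7]; a ≡ &a[0] a + 1 ≡ &a[1] (++a is illegal: an array name cannot be assigned to) 27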
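The stepping rule above can be sketched as a loop that walks an array with a pointer instead of an index. The names `sum_ints` and `int_stride_bytes` are hypothetical; the second function simply measures how many bytes one step of an `int *` covers on the machine it runs on.

```c
#include <stddef.h>

/* Sum n ints by advancing a pointer rather than indexing. */
int sum_ints(const int *p, size_t n)
{
    const int *end = p + n;   /* one past the last element */
    int total = 0;
    while (p < end) {
        total += *p++;        /* read *p, then step to the next int */
    }
    return total;
}

/* Number of bytes between two consecutive ints in an array. */
size_t int_stride_bytes(void)
{
    int a[2] = { 0, 0 };
    return (size_t)((const char *)(a + 1) - (const char *)a);
}
```

Passing an array such as `int a[7]` to `sum_ints(a, 7)` works because the array name decays to `&a[0]` at the call.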
C Likes Pointers • No strings, just char arrays terminated with 0 • Char pointers flying all over the place in a C program • usually pointing to strings • Note three meanings of 0 • Falseness • End-of-String char • Null pointer (guaranteed invalid) 28
The StringCopy Puzzler void strcpy(char *dst, char *src) { while (*dst++ = *src++) ; } • Assignment returns value • Post-increment binds tighter than dereference: *dst++ means *(dst++), which uses the old pointer value and then advances the pointer • Strings are 0 terminated arrays of chars • 0 ≡ false 29
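For anyone still untangling the one-liner, here is the same loop spelled out one step per line. It is a sketch under one assumption: the function is renamed to the hypothetical `string_copy` so it does not clash with the standard library's `strcpy`.

```c
/* Same behaviour as the one-line loop on the slide, one step per line. */
void string_copy(char *dst, const char *src)
{
    for (;;) {
        char c = *src;   /* dereference the current source position */
        *dst = c;        /* store it, including the final 0 */
        src++;           /* both pointers advance after every copy */
        dst++;
        if (c == 0) {    /* 0 is false: the terminator ends the loop */
            break;
        }
    }
}
```

Like the original, this trusts the caller to supply a destination big enough for the source string plus its terminator.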
C is Minimal • i.e. nearly nonexistent runtime services • No array/pointer bounds checking • Except that dereferencing 0 usually crashes the program (a segmentation fault on most systems) • Carefree casting • Reinterpret bits as you please: You’re the boss! • No exception mechanism • Certainly no garbage collection 30
Memory management • Everything is primitive by default • If you want a block of bytes, just ask void *malloc(size_t size) int *myString = (int *)malloc(1024 * sizeof(int)); • myString is all mine, until I free it • Exactly once, please • And if I double-free or don’t free? • I’m F*(&%$!ed 31
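A minimal sketch of the allocate, check, free-exactly-once discipline. The helpers `make_int_block` and `release_int_block` are hypothetical names for this example. Setting the pointer to NULL after freeing is one common defensive habit rather than anything C enforces, and the sketch does not guard against `n * sizeof(int)` overflowing.

```c
#include <stdlib.h>

/*
 * Allocate n ints, each set to fill.
 * Returns NULL if n is 0 or the allocation fails.
 */
int *make_int_block(size_t n, int fill)
{
    if (n == 0) {
        return NULL;
    }
    int *block = malloc(n * sizeof *block);
    if (block == NULL) {
        return NULL;               /* malloc can fail: always check */
    }
    for (size_t i = 0; i < n; i++) {
        block[i] = fill;           /* malloc'd memory starts uninitialized */
    }
    return block;
}

/* Free the block and null the caller's pointer so a repeat call is harmless. */
void release_int_block(int **block)
{
    free(*block);                  /* free(NULL) is defined to do nothing */
    *block = NULL;
}
```

Nulling the pointer only protects that one variable; any other copy of the address is still dangling after the free.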
Pointers Not Null-initialized • char *cp; • Copy a bunch of stuff to cp • Perfectly legal • Totally undefined • Disaster • Need to allocate space to cp 32
“Right” way • char *cp = (char *) malloc (SOMESIZE); • Now I can “safely” copy stuff to cp • Why all the quotes? • What if SOMESIZE isn’t big enough… 33
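One way to take the guesswork out of SOMESIZE is to measure the source string first and allocate exactly what it needs. The sketch below assumes the data being copied is a 0-terminated string; `copy_string` is a hypothetical helper name.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a freshly allocated copy of src, or NULL if allocation fails.
 * The caller owns the result and must free it exactly once.
 */
char *copy_string(const char *src)
{
    size_t len = strlen(src) + 1;   /* + 1 for the terminating 0 */
    char *cp = malloc(len);
    if (cp == NULL) {
        return NULL;
    }
    memcpy(cp, src, len);           /* copies the terminator too */
    return cp;
}
```

This trades the "is it big enough" question for a different obligation: someone now has to remember to free the copy.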
EZ Aggregation struct point { int x; int y; }; struct point pt; pt.x = 3; pt.y = 4; 34
Struct Meets Pointers struct point *ppt = &pt; (*ppt).x == 3 • Or…. ppt->x == 3 • Wacky “arrow” syntax • And then there are function pointers, but we won’t go there… 35
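The two spellings on this slide are interchangeable, as the short sketch below shows by using one of each inside the same function. The `translate` helper is a made-up example; `struct point` is the one defined on the previous slide.

```c
struct point {
    int x;
    int y;
};

/* Move a point in place through a pointer to it. */
void translate(struct point *ppt, int dx, int dy)
{
    ppt->x += dx;      /* arrow form */
    (*ppt).y += dy;    /* equivalent dereference-then-dot form */
}
```

Passing `&pt` instead of `pt` matters here: C passes structs by value, so a function that took `struct point` directly would only move its own copy.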