This lecture introduces fundamental concepts in the design of data structures, particularly focusing on integer lists and their implementations using inductive definitions. We explore operations such as list creation, length retrieval, and adding elements, utilizing mathematical and inductive thinking. Key topics include proving properties through induction and analyzing the efficiency of algorithms based on the number of steps required for operations. Understanding these concepts is crucial for comparing algorithm performance in computer science.
Introduction to Algorithm Analysis Concepts 15-211 Fundamental Data Structures and Algorithms Peter Lee January 15, 2004
Plan • Today • Introduction to some basic concepts in the design of data structures • Reading: • For today: Chapter 5 and 7.1-7.3 • For next time: Chapter 18 and 19
Homework 1 is available! • See the Blackboard • Due Monday, Jan. 19, 11:59pm
Lists of integers • Let’s start with a very simple data structure • Lists of integers, with operations such as: • create a new empty list • return the length of the list • add an integer to the end of the list • …
Implementing lists • How shall we implement this? • What design process could we use? • One answer: • Think mathematically • Think inductively
Induction • Recall proofs by induction: • If trying to prove that a property P(n) holds for all natural numbers 0, 1, 2, …, then • Prove the base case of P(0) • For n>0, assume P(n-1), show that P(n) holds
Inductive definitions • A great deal of computer science can be defined inductively • For example, we can define the factorial function as follows: • Base case: fact(0) = 1 • Inductive case: fact(n) = n * fact(n-1), for n>0
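The inductive definition of factorial carries over to Java almost line for line. The following is a minimal sketch; the class name FactDemo and the choice of long as the return type are assumptions made for this example, not part of the lecture.

```java
// Sketch: the inductive definition of factorial, transcribed directly.
// FactDemo is a placeholder class name chosen for this example.
class FactDemo {
    // Assumes n >= 0. A long result holds values up to 20!.
    static long fact(int n) {
        if (n == 0) return 1;       // base case: fact(0) = 1
        return n * fact(n - 1);     // inductive case: n * fact(n-1)
    }

    public static void main(String[] args) {
        System.out.println(fact(5));
    }
}
```

Each branch of the method corresponds to one clause of the definition, which is the pattern the list operations later in the lecture follow as well.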
Implementing lists • How shall we implement this? • What design process could we use? • One answer: • Think mathematically • Think inductively
Inductive definitions • An integer list is either • an empty list (base case), or • an integer paired with an integer list (inductive case)
Integer lists in Java • The inductive definition gives us guidance on ways to implement integer lists in Java • One possibility (not really the best): • An integer list is either • an empty list (use null), or • an integer paired with an integer list (define a new ListCell class)
Integer lists in Java public class List { int head; List tail; public List(int n, List l) { head = n; tail = l; } } • An integer list is either • an empty list, or • an integer paired with an integer list
Another inductive definition • The length of a list L is • 0, if L is the empty list • 1 + length of the tail of L, otherwise
Implementing length() public class ListOps { public static int length (List l) { if (l==null) return 0; else return 1 + length(l.tail); } }
The add operation • The add of n onto the end of list L is • the singleton list containing n, if L is the empty list • otherwise, a list whose head is the head of L and the tail is M • where M is the result of adding n onto the end of the tail of L
Implementing add() public class ListOps { … public static List add (int n, List l) { if (l==null) return new List(n, null); else return new List(l.head, add(n, l.tail)); } }
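To see the pieces working together, here is a minimal end-to-end sketch. The names ListDemo, IntList, and show are placeholders introduced for this example; IntList stands in for the lecture's List class so that it cannot be confused with java.util.List.

```java
// Sketch only: ListDemo, IntList, and show() are names invented for this example.
class ListDemo {
    // Stand-in for the lecture's List class: an integer paired with a list.
    static class IntList {
        int head;
        IntList tail;
        IntList(int n, IntList l) { head = n; tail = l; }
    }

    // Length: 0 for the empty list (null), otherwise 1 + length of the tail.
    static int length(IntList l) {
        if (l == null) return 0;
        return 1 + length(l.tail);
    }

    // Adds n onto the end of l by building a new list; l itself is unchanged.
    static IntList add(int n, IntList l) {
        if (l == null) return new IntList(n, null);
        return new IntList(l.head, add(n, l.tail));
    }

    // Hypothetical printing helper, not part of the lecture's code.
    static String show(IntList l) {
        StringBuilder sb = new StringBuilder("[");
        for (IntList p = l; p != null; p = p.tail) {
            sb.append(p.head);
            if (p.tail != null) sb.append(", ");
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        IntList l = add(3, add(2, add(1, null)));
        System.out.println(show(l) + " has length " + length(l));
    }
}
```

Because add() copies every cell it walks past, the original list is never modified. That is a consequence of following the inductive definition literally, and it is also why add() is not a constant-time operation.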
Running time • How much time does it take to compute length()? • and also add()?
The “step” • In order to abstract from a particular piece of hardware, operating system, and language, we will focus on counting the number of steps of an algorithm • A “step” should execute in constant time • That is, its execution time should not vary much when the size of the input varies
Constant-time operations public class ListOps { public static int length (List l) { if (l==null) return 0; else return 1 + length(l.tail); } } The recursive call length(l.tail) is the only operation in length() that does not run in a constant amount of time. Hence, we want to know how many times this operation is invoked.
Constant-time operations public static int length(List l) { if (l==null) return 0; else return 1 + length(l.tail); } Each call to length() requires at most a constant amount of time plus the time for a recursive call on the tail So, the “steps” we want are the number of recursive calls
length() • How many steps for length()? • for a list with N elements, length() makes N recursive calls, so it requires N steps • Since length() requires ~N steps for an “input” of size N, we say that length() runs in linear time
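One way to check the step count is to instrument length() with a counter. The sketch below assumes that a "step" is one recursive call, as defined on the previous slide; the names LengthSteps, IntList, range, stepsFor, and recursiveCalls are hypothetical and exist only for this example.

```java
// Sketch: counts the recursive calls made by length(). All names are illustrative.
class LengthSteps {
    static class IntList {
        int head;
        IntList tail;
        IntList(int n, IntList l) { head = n; tail = l; }
    }

    static int recursiveCalls = 0;

    static int length(IntList l) {
        if (l == null) return 0;
        recursiveCalls++;               // one "step": the recursive call on the tail
        return 1 + length(l.tail);
    }

    // Builds the list 1, 2, ..., n by prepending; it does not use add().
    static IntList range(int n) {
        IntList l = null;
        for (int i = n; i >= 1; i--) l = new IntList(i, l);
        return l;
    }

    // Number of recursive calls length() makes on a list of n elements.
    static int stepsFor(int n) {
        recursiveCalls = 0;
        length(range(n));
        return recursiveCalls;
    }

    public static void main(String[] args) {
        for (int n = 10; n <= 40; n *= 2) {
            System.out.println("n = " + n + ", steps = " + stepsFor(n));
        }
    }
}
```

Doubling the size of the list doubles the count, which is the behavior the term "linear time" describes.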
Our goal • Our goal is to compare algorithms against each other • Not compute the “wall-clock” time • We will also want to know if an algorithm is “fast”, “slow”, or maybe so slow as to be impractical
What about add()? public static List add(int n, List l) { if (l==null) return new List(n, null); else return new List(l.head, add(n, l.tail)); }
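The same counting approach can be applied to add(). This is a sketch under the same assumption that a "step" is one recursive call; AddSteps, IntList, stepsFor, and recursiveCalls are placeholder names for this example.

```java
// Sketch: counts the recursive calls made by add(). All names are illustrative.
class AddSteps {
    static class IntList {
        int head;
        IntList tail;
        IntList(int n, IntList l) { head = n; tail = l; }
    }

    static int recursiveCalls = 0;

    static IntList add(int n, IntList l) {
        if (l == null) return new IntList(n, null);
        recursiveCalls++;               // one "step": the recursive call on the tail
        return new IntList(l.head, add(n, l.tail));
    }

    // Number of recursive calls add() makes when appending to a list of n elements.
    static int stepsFor(int n) {
        IntList l = null;
        for (int i = n; i >= 1; i--) l = new IntList(i, l);
        recursiveCalls = 0;
        add(0, l);
        return recursiveCalls;
    }

    public static void main(String[] args) {
        for (int n = 10; n <= 40; n *= 2) {
            System.out.println("n = " + n + ", steps = " + stepsFor(n));
        }
    }
}
```

add() has to walk the whole list to reach its end, so its count grows with the length of the list in the same way that length() does.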
Reverse • The reversal of a list L is: • L, if L is empty • otherwise, the head of L added to the end of M • where M is the reversal of the tail of L
Implementing reverse() public static List reverse(List l) { if (l==null) return null; else { List r = reverse(l.tail); return add(l.head, r); } }
How many steps? • How many “steps” does reverse take? • Think back to the inductive definition: • The reversal of a list L is: • L, if L is empty • otherwise, the head of L added to M • where M is the reversal of the tail of L
Running time for reverse • The running time is given by the following recurrence equation: • t(0) = 0 • t(n) = n + t(n-1) • where n is the time required to add the head to the end, and t(n-1) is the time required to reverse the tail • Solving for t would tell us how many steps it takes to reverse a list
Reverse public static List reverse(List l) { if (l==null) return null; else { List r = reverse(l.tail); return add(l.head, r); } } • The base case (l==null) gives t(0) = 0 • The recursive case gives t(n) = n + t(n-1)
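The recurrence can be compared against an instrumented run. The sketch below makes one assumption about what to count: every invocation of add(), including the first call from reverse(), is one step. Under that assumption the measured count matches t(n) exactly. ReverseSteps, IntList, addCalls, t, and countedSteps are names invented for this example.

```java
// Sketch: compares counted add() invocations with the recurrence t(n).
class ReverseSteps {
    static class IntList {
        int head;
        IntList tail;
        IntList(int n, IntList l) { head = n; tail = l; }
    }

    static long addCalls = 0;

    static IntList add(int n, IntList l) {
        addCalls++;                     // count every invocation of add()
        if (l == null) return new IntList(n, null);
        return new IntList(l.head, add(n, l.tail));
    }

    static IntList reverse(IntList l) {
        if (l == null) return null;
        IntList r = reverse(l.tail);    // costs t(n-1)
        return add(l.head, r);          // r has n-1 elements, so add() runs n times
    }

    // The recurrence from the slide: t(0) = 0, t(n) = n + t(n-1).
    static long t(int n) {
        return n == 0 ? 0 : n + t(n - 1);
    }

    // Reverses the list 1..n and reports how many times add() was invoked.
    static long countedSteps(int n) {
        IntList l = null;
        for (int i = n; i >= 1; i--) l = new IntList(i, l);
        addCalls = 0;
        reverse(l);
        return addCalls;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 5; n++) {
            System.out.println("n = " + n + ": counted " + countedSteps(n) + ", t(n) = " + t(n));
        }
    }
}
```

A different choice of "step" (for example, counting only recursive calls) would shift the numbers by a lower-order term without changing the overall growth.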
Solving recurrence equations • A common first step is to use repeated substitution: • t(n) = n + t(n-1) • = n + (n-1) + t(n-2) • = n + (n-1) + (n-2) + t(n-3) • and so on… • = n + (n-1) + (n-2) + (n-3) + … + 1
Klaus says that this is easy… t(n) = n + (n-1) + (n-2) + … + 1 = n(n+1)/2 But how on earth did he come up with this beautiful little closed-form solution?
Incrementing series • By the way, this is an arithmetic series that comes up over and over again in computer science, because it characterizes many nested loops: for (i=1; i<=n; i++) { for (j=1; j<=i; j++) { f(); } }
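The link between the series and nested loops can be checked by counting. In this sketch the loop bounds are inclusive (i runs from 1 to n, j from 1 to i), which is the variant whose body runs exactly n(n+1)/2 times; the class name NestedLoopCount and the method count are placeholders for this example, and a counter increment stands in for the call to f().

```java
// Sketch: counts how often the inner body of a triangular nested loop runs.
class NestedLoopCount {
    static long count(int n) {
        long calls = 0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= i; j++) {
                calls++;                // stands in for the call to f()
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        int n = 10;
        System.out.println(count(n) + " vs n(n+1)/2 = " + (long) n * (n + 1) / 2);
    }
}
```

With strict bounds (i<n and j<i) the inner body runs (n-1)(n-2)/2 times instead, which is still the same arithmetic series over a shorter range.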
Mathematical handbooks • For really common series like this one, standard textbooks and mathematical handbooks will usually provide closed-form solutions. • So, one way is simply to look up the answer. • Another way is to try to think visually…
Visualizing it • [Figure: diagram with both axes labeled 0, 1, 2, 3, …, n. Area: n^2/2. Area of the leftovers: n/2.] • So: n^2/2 + n/2 = (n^2+n)/2 = n(n+1)/2
Proving it • Yet another approach is to start with an answer or a guess, and then verify it by induction. • Base case: t(1) = 1(1+1)/2 = 1 • Inductive case: • for n>1, assume t(n-1) = (n-1)(n-1+1)/2 = (n^2 - n)/2 • then t(n) = n + (n^2 - n)/2 • = (n^2 + n)/2 • = n(n+1)/2
Summations • Arithmetic and geometric series come up everywhere in analysis of algorithms. • Some series come up so frequently that every computer scientist should know them by heart.
Quadratic time • Very roughly speaking, • f(n) = n(n+1)/2 • grows no faster than • g(n) = n^2 • In such cases, we say that reverse() runs in quadratic time • (we’ll be more precise about this later in the course)
How about Sorting? Everybody knows how to sort an array, but we have singly linked lists. As always, think inductively: sort(nil) = nil sort(L) = insert the head into the right place in sort(tail(L))
Ordered Insert • Need to insert an element in order, into an already sorted list. • Example: inserting 12 into 2 5 10 20 50 gives 2 5 10 12 20 50
Code for ordered insert public static List order_insert(int x, List l) { if (l==null || x <= l.head) return new List(x, l); List t = order_insert(x, l.tail); return new List(l.head, t); } The running time depends on the position of x in the new list. In the worst case, x belongs at the end of the list and the insert takes n steps.
Analysis of sort() • sort(nil) = nil • sort(L) = insert the head into the right place in sort(tail(L)) • t(0) = 0 • t(n) = n + t(n-1) • which we already know to be “very roughly” n^2, or quadratic time.
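The inductive definition of sort can be written out as follows. This is a sketch, not the lecture's own code: ListSortDemo, IntList, orderInsert, and show are names chosen for this example, and the explicit empty-list check in orderInsert is an assumption added here so that the recursion stops when x is larger than every element.

```java
// Sketch: insertion sort on a singly linked list, following the inductive definition.
class ListSortDemo {
    static class IntList {
        int head;
        IntList tail;
        IntList(int n, IntList l) { head = n; tail = l; }
    }

    // Inserts x into an already sorted list, returning a new sorted list.
    static IntList orderInsert(int x, IntList l) {
        if (l == null || x <= l.head) return new IntList(x, l);
        return new IntList(l.head, orderInsert(x, l.tail));
    }

    // sort(nil) = nil; sort(L) = insert head(L) into sort(tail(L)).
    static IntList sort(IntList l) {
        if (l == null) return null;
        return orderInsert(l.head, sort(l.tail));
    }

    // Hypothetical printing helper.
    static String show(IntList l) {
        StringBuilder sb = new StringBuilder("[");
        for (IntList p = l; p != null; p = p.tail) {
            sb.append(p.head);
            if (p.tail != null) sb.append(", ");
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        IntList l = new IntList(20, new IntList(5, new IntList(50, new IntList(2, null))));
        System.out.println(show(sort(l)));
    }
}
```

The structure mirrors reverse(): one recursive call on the tail followed by a linear-time operation on the result, which is why the same recurrence applies.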
Insertion sort This is yet another example of a doubly-nested loop… for i = 2 to n do insert a[i] in the proper place in a[1:i-1]
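The array pseudocode might be realized in Java roughly as follows. This sketch uses 0-based indexing, so the outer loop starts at index 1 rather than 2; the class name ArrayInsertionSort is a placeholder for this example.

```java
import java.util.Arrays;

// Sketch: array-based insertion sort with 0-based indices.
class ArrayInsertionSort {
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int x = a[i];               // element to insert into the sorted prefix a[0..i-1]
            int j = i - 1;
            while (j >= 0 && a[j] > x) {
                a[j + 1] = a[j];        // shift larger elements one slot to the right
                j--;
            }
            a[j + 1] = x;               // drop x into the gap
        }
    }

    public static void main(String[] args) {
        int[] a = {20, 5, 50, 2, 12, 10};
        insertionSort(a);
        System.out.println(Arrays.toString(a));
    }
}
```

The outer loop picks each element in turn and the inner loop finds its place, which is the doubly-nested shape the slide refers to.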
How fast is insertion sort? We’ve essentially counted the number of computation steps in the worst case. But what happens if the elements are nearly sorted to begin with?
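The effect of the input order can be measured directly. The sketch below counts element comparisons in an array-based insertion sort; treating one comparison as one step is an assumption of this example, and NearlySorted and sortCounting are placeholder names.

```java
// Sketch: counts comparisons made by insertion sort on different inputs.
class NearlySorted {
    // Sorts a in place and returns the number of element comparisons performed.
    static long sortCounting(int[] a) {
        long comparisons = 0;
        for (int i = 1; i < a.length; i++) {
            int x = a[i];
            int j = i - 1;
            while (j >= 0) {
                comparisons++;
                if (a[j] <= x) break;   // x is already in place relative to a[j]
                a[j + 1] = a[j];        // otherwise shift a[j] right and keep looking
                j--;
            }
            a[j + 1] = x;
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int n = 10;
        int[] ascending = new int[n];
        int[] descending = new int[n];
        for (int i = 0; i < n; i++) {
            ascending[i] = i;
            descending[i] = n - i;
        }
        System.out.println("already sorted: " + sortCounting(ascending));
        System.out.println("reverse sorted: " + sortCounting(descending));
    }
}
```

On already sorted input each element needs a single comparison, so the total grows linearly; on reverse-sorted input every element is compared with the entire sorted prefix, which gives the quadratic worst case.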
A preview of some questions • Question: Insertion sort takes roughly n^2 steps in the worst case, and roughly n steps in the best case. What do we expect in the average case? What is meant by “average”? • Question: What is the fastest that we could ever hope to sort? How could we prove our answer?
Worst-case analysis • We’ll have much more to say, later in the course, about “worst-case” vs “average-case” vs “expected case” performance.
Better sorting • The sorting algorithm we have just shown is called insertion sort. • It is OK for very small data sets, but otherwise is slow. • Later we will look at several sorting algorithms that run in many fewer steps.