340 likes | 540 Vues
CSE 326 Data Structures: Complexity. Lecture 2: Wednesday, Jan 8, 2003. Overview of the Quarter. Complexity and analysis of algorithms List-like data structures Search trees Priority queues Hash tables Sorting Disjoint sets Graph algorithms Algorithm design Advanced topics.
E N D
CSE 326 Data Structures: Complexity Lecture 2: Wednesday, Jan 8, 2003
Overview of the Quarter • Complexity and analysis of algorithms • List-like data structures • Search trees • Priority queues • Hash tables • Sorting • Disjoint sets • Graph algorithms • Algorithm design • Advanced topics Midtermaround here
Complexity and analysis of algorithms • Weiss Chapters 1 and 2 • Additional material • Cormen,Leiserson,Rivest – on reserve at Eng. library • Graphical analysis • Amortized analysis
Program Analysis • Correctness • Testing • Proofs of correctness • Efficiency • Asymptotic complexity - how running times scales as function of size of input
Proving Programs Correct • Often takes the form of an inductive proof • Example: summing an array int sum(int v[], int n) { if (n==0) return 0; else return v[n-1]+sum(v,n-1); } What are the parts of an inductive proof?
Inductive Proof of Correctness int sum(int v[], int n) { if (n==0) return 0; else return v[n-1]+sum(v,n-1); } Theorem: sum(v,n) correctly returns sum of 1st n elements of array v for any n. Basis Step: Program is correct for n=0; returns 0. Inductive Hypothesis (n=k): Assume sum(v,k) returns sum of first k elements of v. Inductive Step (n=k+1): sum(v,k+1) returns v[k]+sum(v,k), which is the same of the first k+1 elements of v.
Inductive Proof of Correctness Binary search in a sorted array: v[0]v[1] ... v[n-1] Given x, find m s.t. x=v[m] Exercise 1: proof that it is correct int find(int x, int v[], int n) { int left = -1; int right = n; while (left+1 < right) { int m = (left+right) / 2; if (x == v[m]) return m; if (x < v[m]) right = m; else left = m; } return –1; /* not found */ } Exercise 2: compute m, k s.t.v[m-1] < x = v[m] = v[m+1] = ... = v[k] < v[k+1]
Proof by Contradiction • Assume negation of goal, show this leads to a contradiction • Example: there is no program that solves the “halting problem” • Determines if any other program runs forever or not Alan Turing, 1937
Program NonConformist (Program P) If ( HALT(P) = “never halts” ) Then Halt Else Do While (1 > 0) Print “Hello!” End While End If End Program • Does NonConformist(NonConformist) halt? • Yes? That means HALT(NonConformist) = “never halts” • No? That means HALT(NonConformist) = “halts” Contradiction!
Defining Efficiency • Asymptotic Complexity - how running time scales as function of size of input • Two problems: • What is the “input size” ? • How do we express the running time ? (The Big-O notation)
Input Size • Usually: length (in characters) of input • Sometimes: value of input (if it is a number) • Which inputs? • Worst case: tells us how good an algorithm works • Best case: tells us how bad an algorithm works • Average case: useful in practice, but there are technical problems here (next)
Input Size Average Case Analysis • Assume inputs are randomly distributed according to some “realistic” distribution • Compute expected running time • Drawbacks • Often hard to define realistic random distributions • Usually hard to perform math
Input Size • Recall the function: find(x, v, n) • Input size: n (the length of the array) • T(n) = “running time for size n” • But T(n) needs clarification: • Worst case T(n): it runs in at most T(n) time for any x,v • Best case T(n): it takes at least T(n) time for any x,v • Average case T(n): average time over all v and x
Input Size Amortized Analysis • Instead of a single input, consider a sequence of inputs: • This is interesting when the running time on some input depends on the result of processing previous inputs • Worst case analysis over the sequence of inputs • Determine average running time on this sequence • Will illustrate in the next lecture
Definition of Order Notation • Upper bound: T(n) = O(f(n)) Big-O Exist constants c and n’ such that T(n) c f(n) for all n n’ • Lower bound: T(n) = (g(n)) Omega Exist constants c and n’ such that T(n) c g(n) for all n n’ • Tight bound: T(n) = (f(n)) Theta When both hold: T(n) = O(f(n)) T(n) = (f(n)) Other notations: o(f), (f) - see Cormen et al.
Which Function Dominates? g(n) = 100n2 + 1000 log n 2n + 10 log n n! 1000n15 3n7 + 7n f(n) = n3 + 2n2 n0.1 n + 100n0.1 5n5 n-152n/100 82log n Question to class: is f = O(g) ? Is g = O(f) ?
Race I f(n)= n3+2n2 vs. g(n)=100n2+1000
Race II n0.1 vs. log n
Race III n + 100n0.1 vs. 2n + 10 log n
Race IV 5n5 vs. n!
Race V n-152n/100 vs. 1000n15
Race VI 82log(n) vs. 3n7 + 7n
Eliminate low order terms • Eliminate constant coefficients
Common Names Slowest Growth constant: O(1) logarithmic: O(log n) linear: O(n) log-linear: O(n log n) quadratic: O(n2) exponential: O(cn) (c is a constant > 1) hyperexponential: (a tower of n exponentials Other names: superlinear: O(nc) (c is a constant > 1) polynomial: O(nc) (c is a constant > 0) Fastest Growth
Sums and Recurrences Often the function f(n) is not explicit but expressed as: • A sum, or • A recurrence Need to obtain analytical formula first
More Sums Sometimes sums are easiest computed with integrals:
Recurrences • f(n) = 2f(n-1) + 1, f(0) = T • Telescoping f(n)+1 = 2(f(n-1)+1) f(n-1)+1 = 2(f(n-2)+1) 2 f(n-2)+1 = 2(f(n-3)+1) 22 . . . . . f(1) + 1 = 2(f(0) + 1) 2n-1 f(n)+1 = 2n(f(0)+1) = 2n(T+1) f(n) = 2n(T+1) - 1
Recurrences • Fibonacci: f(n) = f(n-1)+f(n-2), f(0)=f(1)=1 try f(n) = A cn What is c ? A cn = A cn-1 + A cn-2c2 – c – 1 = 0 Constants A, B can be determined from f(0), f(1) – not interesting for us for the Big O notation
Recurrences • f(n) = f(n/2) + 1, f(1) = T • Telescoping:f(n) = f(n/2) + 1f(n/2) = f(n/4) + 1. . .f(2) = f(1) + 1 = T + 1 f(n) = T + log n = O(log n)