
LECTURE 5: Analysis of Algorithms Efficiency (II)


Presentation Transcript


1. LECTURE 5: Analysis of Algorithms Efficiency (II)

2. In the previous lecture we saw … the basic steps in analyzing the time efficiency of an algorithm:
• Identify the input size
• Identify the dominant operation
• Count how many times the dominant operation is executed (estimate the running time)
• If this number depends on the properties of the input data, analyze:
  • Best case => lower bound of the running time
  • Worst case => upper bound of the running time
  • Average case => averaged running time

3. Today we will see that …
… the main aim of efficiency analysis is to find out how the running time increases when the problem size increases
… we don't need very detailed expressions of the running time, but we need to identify:
• The order of growth of the running time
• The efficiency class to which an algorithm belongs

4. Outline
• What is the order of growth?
• What is asymptotic analysis?
• Some asymptotic notations
• Efficiency analysis of basic processing structures
• Efficiency classes
• Empirical analysis of the algorithms

5. What is the order of growth? In the expression of the running time, one of the terms becomes significantly larger than the others when n becomes large: this is the so-called dominant term.
• T1(n) = a·n + b                 dominant term: a·n
• T2(n) = a·log n + b             dominant term: a·log n
• T3(n) = a·n² + b·n + c          dominant term: a·n²
• T4(n) = a^n + b·n + c (a > 1)   dominant term: a^n

6. What is the order of growth? Let us analyze what happens with the dominant term when the input size is multiplied by k:
• T'1(n) = a·n      =>  T'1(kn) = a·kn = k·T'1(n)
• T'2(n) = a·log n  =>  T'2(kn) = a·log(kn) = T'2(n) + a·log k
• T'3(n) = a·n²     =>  T'3(kn) = a·(kn)² = k²·T'3(n)
• T'4(n) = a^n      =>  T'4(kn) = a^(kn) = (a^n)^k = (T'4(n))^k
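For illustration, a minimal Python sketch (not part of the slides; the constant a = 2 and the values n = 20, k = 3 are arbitrary) that prints how each dominant term scales when the input size is multiplied by k:

    import math

    # Dominant terms from slide 6 (a = 2 chosen arbitrarily for illustration)
    a = 2
    terms = {
        "linear       a*n":     lambda n: a * n,
        "logarithmic  a*log n": lambda n: a * math.log2(n),
        "quadratic    a*n^2":   lambda n: a * n ** 2,
        "exponential  a^n":     lambda n: a ** n,
    }

    n, k = 20, 3  # input size n, scaling factor k
    for name, T in terms.items():
        # How much does the dominant term grow when n is multiplied by k?
        print(f"{name:22s}  T(kn)/T(n) = {T(k * n) / T(n):.2f}")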

7. What is the order of growth? The order of growth expresses how the dominant term of the running time increases with the input size:
• T'1(kn) = a·kn = k·T'1(n)                 =>  linear
• T'2(kn) = a·log(kn) = T'2(n) + a·log k    =>  logarithmic
• T'3(kn) = a·(kn)² = k²·T'3(n)             =>  quadratic
• T'4(kn) = a^(kn) = (a^n)^k = (T'4(n))^k   =>  exponential

8. How can the order of growth be interpreted? Of two algorithms, the one having a smaller order of growth is considered more efficient. However, this is true only for large enough input sizes.
Example. Let us consider
T1(n) = 10n + 10 (linear order of growth)
T2(n) = n² (quadratic order of growth)
If n <= 10 then T1(n) > T2(n), so in this case the order of growth is relevant only for n > 10.
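For illustration, a short Python sketch that locates the crossover point for this example (the search bound of 100 is arbitrary):

    # Running times from slide 8
    T1 = lambda n: 10 * n + 10   # linear
    T2 = lambda n: n ** 2        # quadratic

    # First input size where the quadratic running time exceeds the linear one
    crossover = next(n for n in range(1, 100) if T2(n) > T1(n))
    print(crossover)  # 11 -> the linear algorithm wins only for n > 10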

9. A comparison of orders of growth: the multiplicative constants in the dominant term can be ignored.

10. Comparing orders of growth. The order of growth of two running times T1(n) and T2(n) can be compared by computing the limit of T1(n)/T2(n) when n goes to infinity:
• If the limit is 0 then T1(n) has a smaller order of growth than T2(n)
• If the limit is a finite constant c (c > 0) then T1(n) and T2(n) have the same order of growth
• If the limit is infinity then T1(n) has a larger order of growth than T2(n)
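For illustration, a sketch of this rule using the SymPy library (assumed available), applied to the running times from slide 8:

    import sympy as sp

    n = sp.symbols("n", positive=True)
    T1 = 10 * n + 10   # linear running time from slide 8
    T2 = n ** 2        # quadratic running time

    # limit of T1/T2 as n -> infinity: 0 means T1 has a smaller order of growth
    print(sp.limit(T1 / T2, n, sp.oo))            # 0
    # a finite positive limit means the same order of growth
    print(sp.limit(T2 / (3 * n ** 2), n, sp.oo))  # 1/3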

11. Outline
• What is the order of growth?
• What is asymptotic analysis?
• Some asymptotic notations
• Efficiency analysis of basic processing structures
• Efficiency classes
• Empirical analysis of the algorithms

12. What is asymptotic analysis?
• The differences between orders of growth are more significant for larger input sizes
• Analyzing the running times on small inputs does not allow us to distinguish between efficient and inefficient algorithms
• Asymptotic analysis deals with analyzing the properties of the running time when the input size goes to infinity (this means a large input size)

13. What is asymptotic analysis?
• Depending on the behavior of the running time when the input size becomes large, the algorithm can belong to different efficiency classes
• There are several standard notations used in algorithm efficiency analysis: Θ (big-Theta), O (big-O), Ω (big-Omega)

14. Outline
• What is the order of growth?
• What is asymptotic analysis?
• Some asymptotic notations
• Efficiency analysis of basic processing structures
• Efficiency classes
• Empirical analysis of the algorithms

15. Θ-notation. Let f, g: N -> R+.
Definition. f(n) ∈ Θ(g(n)) iff there exist c1, c2 > 0 and n0 ∈ N such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0
Notation. f(n) = Θ(g(n)) (same order of growth as g(n))
Examples.
• T(n) = 3n + 3 => T(n) ∈ Θ(n)   (c1 = 2, c2 = 4, n0 = 3, g(n) = n)
• T(n) = n² + 10·n·lg n + 5 => T(n) ∈ Θ(n²)   (c1 = 1, c2 = 2, n0 = 40, g(n) = n²)
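For illustration, a small Python check (with the constants of the first example) that the two bounds really hold on a finite test range:

    # Theta-bound check for T(n) = 3n + 3 with g(n) = n, c1 = 2, c2 = 4, n0 = 3 (slide 15)
    T = lambda n: 3 * n + 3
    g = lambda n: n
    c1, c2, n0 = 2, 4, 3

    # c1*g(n) <= T(n) <= c2*g(n) must hold for every n >= n0
    assert all(c1 * g(n) <= T(n) <= c2 * g(n) for n in range(n0, 10_000))
    print("bounds hold on the tested range")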

16. Θ-notation. Graphical illustration: for large values of n (n ≥ n0), f(n) is bounded both above and below by g(n) multiplied by some positive constants. [Figure: f(n) = n² + 10·n·lg n + 5 lies between c1·g(n) = n² and c2·g(n) = 2n² for n ≥ n0; the behavior below n0 doesn't matter.]

17. Θ-notation. Properties
1. If T(n) = a_k·n^k + a_(k-1)·n^(k-1) + … + a_1·n + a_0 then T(n) ∈ Θ(n^k)
Proof. Since T(n) > 0 for all n, it follows that a_k > 0. Then T(n)/n^k -> a_k (as n -> ∞). Thus for every ε > 0 there exists N(ε) such that |T(n)/n^k - a_k| < ε, i.e. a_k - ε < T(n)/n^k < a_k + ε, for all n > N(ε). Choose ε such that a_k - ε > 0. Then by taking c1 = a_k - ε, c2 = a_k + ε and n0 = N(ε) one obtains c1·n^k < T(n) < c2·n^k for all n > n0, i.e. T(n) ∈ Θ(n^k).

18. Θ-notation. Properties
2. Θ(c·g(n)) = Θ(g(n)) for every constant c > 0
Proof. Let f(n) ∈ Θ(c·g(n)). Then c1·c·g(n) ≤ f(n) ≤ c2·c·g(n) for all n ≥ n0. By taking c'1 = c·c1 and c'2 = c·c2 we obtain that f(n) ∈ Θ(g(n)). Thus Θ(c·g(n)) ⊆ Θ(g(n)). Similarly we can prove that Θ(g(n)) ⊆ Θ(c·g(n)), so Θ(c·g(n)) = Θ(g(n)).
Particular cases:
a) Θ(c) = Θ(1)
b) Θ(log_a h(n)) = Θ(log_b h(n)) for all a, b > 1
The logarithm's base is not significant in specifying the efficiency class. We will use logarithms in base 2.
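The second particular case follows from the change-of-base formula for logarithms; written out (in LaTeX notation):

    \log_a h(n) = \frac{\log_b h(n)}{\log_b a} = c \cdot \log_b h(n), \qquad c = \frac{1}{\log_b a} > 0,
    \quad\text{so by property 2,}\quad \Theta(\log_a h(n)) = \Theta(c \cdot \log_b h(n)) = \Theta(\log_b h(n)).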

19. Θ-notation. Properties
3. f(n) ∈ Θ(f(n)) (reflexivity)
4. f(n) ∈ Θ(g(n)) => g(n) ∈ Θ(f(n)) (symmetry)
5. f(n) ∈ Θ(g(n)), g(n) ∈ Θ(h(n)) => f(n) ∈ Θ(h(n)) (transitivity)
6. Θ(f(n) + g(n)) = Θ(max{f(n), g(n)})

20. Θ-notation. More examples
• 3n ≤ T(n) ≤ 4n - 1 => T(n) ∈ Θ(n)   (c1 = 3, c2 = 4, n0 = 1)
• Multiplying two matrices: T(m,n,p) = 4mnp + 5mp + 4m + 2
  Extension of the definition: f(m,n,p) ∈ Θ(g(m,n,p)) iff there exist c1, c2 > 0 and m0, n0, p0 ∈ N such that c1·g(m,n,p) ≤ f(m,n,p) ≤ c2·g(m,n,p) for all m ≥ m0, n ≥ n0, p ≥ p0. Thus T(m,n,p) ∈ Θ(mnp).
• Sequential search: 6 ≤ T(n) ≤ 3(n+1). If T(n) = 6 then we cannot find c1 such that 6 ≥ c1·n for large values of n, thus T(n) does not belong to Θ(n). There exist running times which do not belong to a big-Theta class.

21. O-notation
Definition. f(n) ∈ O(g(n)) iff there exist c > 0 and n0 ∈ N such that f(n) ≤ c·g(n) for all n ≥ n0
Notation. f(n) = O(g(n)) (an order of growth at most that of g(n))
Examples.
1. T(n) = 3n + 3 => T(n) ∈ O(n)   (c = 4, n0 = 3, g(n) = n)
2. 6 ≤ T(n) ≤ 3(n+1) => T(n) ∈ O(n)   (c = 4, n0 = 3, g(n) = n)

22. O-notation. Graphical illustration: for large values of n (n ≥ n0), f(n) is bounded above by g(n) multiplied by a positive constant. [Figure: f(n) = 10·n·lg n + 5 lies below c·g(n) = n² for n ≥ n0 = 36; the behavior below n0 doesn't matter.]

23. O-notation. Properties
1. If T(n) = a_k·n^k + a_(k-1)·n^(k-1) + … + a_1·n + a_0 then T(n) ∈ O(n^d) for all d ≥ k
Proof. Since T(n) > 0 for all n, it follows that a_k > 0. Then T(n)/n^k -> a_k (as n -> ∞). Thus for every ε > 0 there exists N(ε) such that T(n)/n^k ≤ a_k + ε for all n > N(ε). Hence T(n) ≤ (a_k + ε)·n^k ≤ (a_k + ε)·n^d. Then by taking c = a_k + ε and n0 = N(ε) one obtains T(n) ≤ c·n^d for all n > n0, i.e. T(n) ∈ O(n^d).
Example. n ∈ O(n²) (correct, but it is more useful to write n ∈ O(n)).

24. O-notation. Properties
2. f(n) ∈ O(f(n)) (reflexivity)
3. f(n) ∈ O(g(n)), g(n) ∈ O(h(n)) => f(n) ∈ O(h(n)) (transitivity)
4. Θ(g(n)) is a subset of O(g(n))
Remark. The inclusion is a strict one: there exist elements of O(g(n)) which do not belong to Θ(g(n)).
Example: f(n) = 10·n·lg n + 5, g(n) = n². f(n) ≤ g(n) for all n ≥ 36 => f(n) ∈ O(g(n)). But there are no constants c > 0 and n0 such that c·n² ≤ 10·n·lg n + 5 for all n ≥ n0, so f(n) ∉ Θ(g(n)).

25. O-notation. Properties
When a worst case analysis gives T(n) ≤ g(n), we can say that the running time of the algorithm belongs to O(g(n)).
Example. Sequential search: 6 ≤ T(n) ≤ 3(n+1), thus the running time of sequential search belongs to O(n).

26. Ω-notation
Definition. f(n) ∈ Ω(g(n)) iff there exist c > 0 and n0 ∈ N such that c·g(n) ≤ f(n) for all n ≥ n0
Notation. f(n) = Ω(g(n)) (an order of growth at least that of g(n))
Examples.
1. T(n) = 3n + 3 => T(n) ∈ Ω(n)   (c = 3, n0 = 1, g(n) = n)
2. 6 ≤ T(n) ≤ 3(n+1) => T(n) ∈ Ω(1)   (c = 6, n0 = 1, g(n) = 1)

27. Ω-notation. Graphical illustration: for large values of n (n ≥ n0), f(n) is bounded below by g(n) multiplied by a positive constant. [Figure: f(n) = 10·n·lg n + 5 lies above c·g(n) = 20n for n ≥ n0 = 7; the behavior below n0 doesn't matter.]

28. Ω-notation. Properties
1. If T(n) = a_k·n^k + a_(k-1)·n^(k-1) + … + a_1·n + a_0 then T(n) ∈ Ω(n^d) for all d ≤ k
Proof. Since T(n) > 0 for all n, it follows that a_k > 0. Then T(n)/n^k -> a_k (as n -> ∞). Thus for every ε > 0 with a_k - ε > 0 there exists N(ε) such that a_k - ε ≤ T(n)/n^k for all n > N(ε). Hence (a_k - ε)·n^d ≤ (a_k - ε)·n^k ≤ T(n). Then by taking c = a_k - ε and n0 = N(ε) one obtains c·n^d ≤ T(n) for all n > n0, i.e. T(n) ∈ Ω(n^d).
Example. n² ∈ Ω(n).

29. Ω-notation. Properties
2. Θ(g(n)) ⊆ Ω(g(n))
Proof. It suffices to consider only the lower bound from the big-Theta definition.
Remark. The inclusion is a strict one: there exist elements of Ω(g(n)) which do not belong to Θ(g(n)).
Example: f(n) = 10·n·lg n + 5, g(n) = n. f(n) ≥ 5·g(n) for all n ≥ 1 => f(n) ∈ Ω(g(n)). But there are no constants c and n0 such that 10·n·lg n + 5 ≤ c·n for all n ≥ n0, so f(n) ∉ Θ(g(n)).
3. Θ(g(n)) = O(g(n)) ∩ Ω(g(n))

30. Outline
• What is the order of growth?
• What is asymptotic analysis?
• Some asymptotic notations
• Efficiency analysis of basic processing structures
• Efficiency classes
• Empirical analysis of the algorithms

31. Efficiency analysis of basic processing structures
• Sequential structure P: P1; P2; …; Pk
  P1: Θ(g1(n))   O(g1(n))   Ω(g1(n))
  P2: Θ(g2(n))   O(g2(n))   Ω(g2(n))
  …
  Pk: Θ(gk(n))   O(gk(n))   Ω(gk(n))
  ----------------------------------------------------
  P:  Θ(max{g1(n), g2(n), …, gk(n)})   O(max{g1(n), g2(n), …, gk(n)})   Ω(max{g1(n), g2(n), …, gk(n)})

32. Efficiency analysis of basic processing structures
• Conditional statement P: IF <condition> THEN P1 ELSE P2
  P1: Θ(g1(n))   O(g1(n))   Ω(g1(n))
  P2: Θ(g2(n))   O(g2(n))   Ω(g2(n))
  ---------------------------------------------------------------------------
  P:  O(max{g1(n), g2(n)})   Ω(min{g1(n), g2(n)})

33. Efficiency analysis of basic processing structures
• Loop statement
  P: FOR i ← 1, n DO P1 (with P1 ∈ Θ(1))                     =>  P ∈ Θ(n)
  P: FOR i ← 1, n DO FOR j ← 1, n DO P1 (with P1 ∈ Θ(1))     =>  P ∈ Θ(n²)
Remark: If the counting variables vary between 1 and n, the complexity is Θ(n^k), where k is the number of nested loops.
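For illustration, a Python sketch that counts executions of the dominant operation in a single and in a double loop:

    def count_single_loop(n):
        count = 0
        for i in range(1, n + 1):      # FOR i <- 1, n
            count += 1                 # P1: one Theta(1) processing step
        return count

    def count_nested_loops(n):
        count = 0
        for i in range(1, n + 1):      # FOR i <- 1, n
            for j in range(1, n + 1):  # FOR j <- 1, n
                count += 1             # P1: one Theta(1) processing step
        return count

    for n in (10, 100):
        print(n, count_single_loop(n), count_nested_loops(n))  # n and n^2 executions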

34. Efficiency analysis of basic processing structures
Remark. If the limits of the counters are modified inside the loop body then the analysis needs to be adapted.
Example:
  m ← 1
  FOR i ← 1, n DO
    m ← 3*m        {m = 3^i}
    FOR j ← 1, m DO
      processing step of cost Θ(1)
The total number of executions of the inner step is 3 + 3² + … + 3^n = (3^(n+1) - 1)/2 - 1, so the complexity is Θ(3^n).
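For illustration, a Python sketch that simulates this loop and checks the closed form of the geometric sum for a few small values of n:

    def inner_steps(n):
        """Simulate the loop from slide 34 and count the inner processing steps."""
        m, count = 1, 0
        for i in range(1, n + 1):
            m = 3 * m                  # m = 3^i
            count += m                 # the inner loop runs m times
        return count

    for n in range(1, 8):
        closed_form = (3 ** (n + 1) - 1) // 2 - 1
        assert inner_steps(n) == closed_form
    print("closed form 3 + 3^2 + ... + 3^n = (3^(n+1) - 1)/2 - 1 verified")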

35. Outline
• What is the order of growth?
• What is asymptotic analysis?
• Some asymptotic notations
• Efficiency analysis of basic processing structures
• Efficiency classes
• Empirical analysis of the algorithms

36. Efficiency classes. Some frequently encountered efficiency classes, in increasing order of growth: constant Θ(1), logarithmic Θ(log n), linear Θ(n), linearithmic Θ(n·log n), quadratic Θ(n²), cubic Θ(n³), exponential Θ(2^n), factorial Θ(n!).

37. Example. Let x[1..n] be an array with values from the set {1, …, n}. Suppose that this array either has distinct elements or there is a unique pair of indices (i, j) such that i <> j and x[i] = x[j].
Particular case: n = 8
x = [2, 1, 4, 5, 3, 8, 7, 6]   all elements are distinct
x = [2, 1, 4, 5, 3, 8, 5, 6]   there is a pair of identical elements
Find an efficient algorithm (both with respect to the running time and the memory space) to check whether all elements are distinct.

38. Example
Variant 1:
check1(x[1..n])
  i ← 1
  d ← True
  while (d = True) AND (i < n) do
    d ← NOT (search(x[i+1..n], x[i]))
    i ← i + 1
  endwhile
  return d

search(x[left..right], v)
  i ← left
  while x[i] <> v AND i < right do
    i ← i + 1
  endwhile
  if x[i] = v then return True
  else return False
  endif

Subproblem size: k = right - left + 1; 1 ≤ T'(k) ≤ k
Problem size: n; 1 ≤ T(n) ≤ T'(n-1) + T'(n-2) + … + T'(1), so 1 ≤ T(n) ≤ n(n-1)/2
Best case: x[1] = x[2]; worst case: all elements distinct
T(n) ∈ Ω(1), T(n) ∈ O(n²)
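For illustration, a Python transcription of variant 1 (a sketch; the lecture itself works in pseudocode), with the same early exit as soon as a duplicate is found:

    def search(x, left, right, v):
        """Sequential search for v in x[left..right] (1-based, inclusive bounds)."""
        i = left
        while x[i - 1] != v and i < right:
            i += 1
        return x[i - 1] == v

    def check1(x):
        """Variant 1: return True if all elements of x are distinct (O(n^2) worst case)."""
        n = len(x)
        i, distinct = 1, True
        while distinct and i < n:
            distinct = not search(x, i + 1, n, x[i - 1])
            i += 1
        return distinct

    print(check1([2, 1, 4, 5, 3, 8, 7, 6]))  # True  - all distinct
    print(check1([2, 1, 4, 5, 3, 8, 5, 6]))  # False - 5 appears twice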

39. Example
Variant 2:
check2(x[1..n])
  Integer f[1..n]    // frequencies
  f[1..n] ← 0
  for i ← 1, n do
    f[x[i]] ← f[x[i]] + 1
  endfor
  i ← 1
  while f[i] < 2 AND i < n do
    i ← i + 1
  endwhile
  if f[i] >= 2 then return False
  else return True
  endif
Problem size: n; n+3 ≤ T(n) ≤ 2n; T(n) ∈ Θ(n)

Variant 3:
check3(x[1..n])
  Integer f[1..n]    // frequencies
  f[1..n] ← 0
  i ← 1
  while i <= n do
    f[x[i]] ← f[x[i]] + 1
    if f[x[i]] >= 2 then return False endif
    i ← i + 1
  endwhile
  return True
Problem size: n; 4 ≤ T(n) ≤ 2n; T(n) ∈ O(n)
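For illustration, variant 3 transcribed to Python (a sketch):

    def check3(x):
        """Variant 3: frequency counting with early exit, O(n) time, O(n) extra space."""
        n = len(x)
        freq = [0] * (n + 1)          # f[1..n], values of x are in {1, ..., n}
        for v in x:
            freq[v] += 1
            if freq[v] >= 2:          # duplicate found -> stop immediately
                return False
        return True

    print(check3([2, 1, 4, 5, 3, 8, 7, 6]))  # True
    print(check3([2, 1, 4, 5, 3, 8, 5, 6]))  # False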

40. Example
Variants 2 and 3 need an additional memory space of size O(n). Can we solve the problem in linear time by using an additional memory space of size O(1)?
Idea: the elements are distinct if and only if the array contains all elements of the set {1, 2, …, n}. In the case when only one value is duplicated, it is enough to check that the sum of all elements is n(n+1)/2.
Variant 4:
check4(x[1..n])
  s ← 0
  for i ← 1, n do
    s ← s + x[i]
  endfor
  if s = n(n+1)/2 then return True
  else return False
  endif
Problem size: n; T(n) = n; T(n) ∈ Θ(n)
Remark. Variant 4 is better than variant 3 with respect to the size of the memory space, but the average running time is smaller in variant 3 than in variant 4.
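For illustration, variant 4 transcribed to Python (a sketch; remember it is valid only under the assumption of slide 37 that at most one pair of equal values may occur):

    def check4(x):
        """Variant 4: O(n) time, O(1) extra space.
        Valid only under the slide-37 assumption: values are from {1, ..., n}
        and at most one pair of indices holds equal values."""
        n = len(x)
        return sum(x) == n * (n + 1) // 2

    print(check4([2, 1, 4, 5, 3, 8, 7, 6]))  # True  (sum is 36 = 8*9/2)
    print(check4([2, 1, 4, 5, 3, 8, 5, 6]))  # False (sum is 34 instead of 36)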

41. Outline
• What is the order of growth?
• What is asymptotic analysis?
• Some asymptotic notations
• Efficiency analysis of basic processing structures
• Efficiency classes
• Empirical analysis of the algorithms

42. Empirical analysis of the algorithms
Sometimes the mathematical analysis of efficiency is too difficult to apply … in these cases an empirical analysis can be useful. It can be used to:
• Develop a hypothesis about the algorithm's efficiency
• Compare the efficiency of several algorithms designed to solve the same problem
• Establish the efficiency of an algorithm's implementation
• Check the accuracy of a theoretical assertion about the algorithm's efficiency

43. General plan for empirical analysis
• Establish the aim of the analysis
• Choose an efficiency measure (e.g. number of executions of some operations or the time needed to execute a sequence of processing steps)
• Decide on the characteristics of the input sample (size, range, …)
• Implement the algorithm in a programming language
• Generate a set of input data
• Execute the program for each data sample and record the results
• Analyze the obtained results

44. General plan for empirical analysis
Efficiency measure. It is chosen depending on the aim of the empirical analysis:
• If the aim is to estimate the efficiency class, an adequate efficiency measure is the number of operations
• If the aim is to analyze/compare implementations of an algorithm on a given machine, an adequate efficiency measure is the physical (wall-clock) time

45. General plan for empirical analysis
Set of input data. Different input data must be generated in order to conduct a useful empirical analysis. Some rules for generating input data:
• The input data in the set should be of different sizes and values (the entire range of values should be represented)
• All characteristics of the input data should be represented in the sample set (different configurations)
• The data should be typical (not only exceptional cases)

46. General plan for empirical analysis
Algorithm's implementation. Some monitoring processing steps should be included:
• Counting variables (when the efficiency measure is the number of executions)
• Calls to functions which return the current time (in order to estimate the time needed to execute a processing sequence)
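For illustration, a minimal instrumentation sketch in Python (standard library only; the nested-loop checker below is a simplified stand-in for variant 1): a counting variable records the number of executions of the dominant operation and a timer call measures the physical time.

    import random
    import time

    def check_distinct_counted(x):
        """All-pairs distinctness check instrumented with a comparison counter."""
        comparisons = 0
        n = len(x)
        for i in range(n - 1):
            for j in range(i + 1, n):
                comparisons += 1            # monitoring: count the dominant operation
                if x[i] == x[j]:
                    return False, comparisons
        return True, comparisons

    for n in (1_000, 2_000, 4_000):
        data = random.sample(range(1, n + 1), n)   # distinct values -> worst case
        start = time.perf_counter()                # monitoring: current time
        _, ops = check_distinct_counted(data)
        elapsed = time.perf_counter() - start
        print(f"n={n:5d}  comparisons={ops:9d}  time={elapsed:.4f}s")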

47. Next lecture will be on … basic sorting algorithms … their correctness and … their efficiency
