CSE 1342 Programming Concepts

CSE 1342 Programming Concepts Algorithmic Analysis Using Big-O Part 1

The Running Time of Programs • Most problems can be solved by more than one algorithm. So, how do you choose the best solution? • The best solution is usually based on efficiency • Efficiency of time (speed of execution) • Efficiency of space (memory usage) • In the case of a program that is infrequently run or subject to frequent modification, algorithmic simplicity may take precedence over efficiency.

The Running Time of Programs • An absolute measure of time (5.3 seconds, for example) is not a practical measure of efficiency because … • The execution time is a function of the amount of data that the program manipulates and typically grows as the amount of data increases. • Different computers will execute the same program (using the same data) at different speeds. • Depending on the choice of programming language and compiler, speeds can vary on the same computer.

The Running Time of Programs • The solution is to remove all implementation considerations from our analysis and focus on those aspects of the algorithm that most critically affect the execution time. • The most important aspect is usually the number of data elements (n) the program must manipulate. • Occasionally the magnitude of a single data element (and not the number of data elements) is the most important aspect.

The 90 - 10 Rule • The 90 - 10 rule states that, in general, a program spends 90% of its time executing the same 10% of its code. • This is due to the fact that most programs rely heavily on repetition structures (loops and recursive calls). • Because of the 90 - 10 rule, algorithmic analysis focuses on repetition structures.

Analysis of Summation Algorithms
Consider the following code segment that sums each row of an n-by-n array (version 1):
    grandTotal = 0;
    for (k = 0; k < n; k++) {
        sum[k] = 0;
        for (j = 0; j < n; j++) {
            sum[k] += a[k][j];
            grandTotal += a[k][j];
        }
    }
Requires 2n^2 additions

Analysis of Summation Algorithms
Consider the following code segment that sums each row of an n-by-n array (version 2):
    grandTotal = 0;
    for (k = 0; k < n; k++) {
        sum[k] = 0;
        for (j = 0; j < n; j++) {
            sum[k] += a[k][j];
        }
        grandTotal += sum[k];
    }
Requires n^2 + n additions

Analysis of Summation Algorithms • When we compare the number of additions performed in versions 1 and 2 we find that (n^2 + n) < (2n^2) for any n > 1. • Based on this analysis, version 2 appears to be the faster algorithm, although, as we shall see, "faster" may not have any real meaning in the real world of computation.

Analysis of Summation Algorithms • Further analysis of the two summation algorithms. • Assume a 1000 by 1000 (n = 1000) array and a computer that can execute an addition instruction in 1 microsecond. • 1 microsecond = one millionth of a second. • The version 1 algorithm (2n^2) would require 2(1000^2)/1,000,000 = 2 seconds to execute. • The version 2 algorithm (n^2 + n) would require (1000^2 + 1000)/1,000,000 = 1.001 seconds to execute. • From a user's real-time perspective the difference is insignificant.

Analysis of Summation Algorithms • Now increase the size of n. • Assume a 100,000 by 100,000 (n = 100,000) array. • The version 1 algorithm (2n^2) would require 2(100,000^2)/1,000,000 = 20,000 seconds to execute (about 5.56 hours). • The version 2 algorithm (n^2 + n) would require (100,000^2 + 100,000)/1,000,000 = 10,000.1 seconds to execute (about 2.78 hours). • From a user's real-time perspective both jobs take a long time and would need to run in a batch environment. • In terms of order of magnitude (big-O), versions 1 and 2 have the same efficiency: O(n^2).

Big-O Analysis Overview • O stands for order of magnitude. • Big-O analysis is independent of all implementation factors. • It is dependent (in most cases) on the number of data elements (n) the program must manipulate. • Big-O analysis only has significance for large values of n. • For small values of n, big-O analysis breaks down. • Big-O analysis is built around the principle that the runtime behavior of an algorithm is dominated by its behavior in its loops (90 - 10 rule).

Definition of Big-O • Let T(n) be a function that measures the running time of a program in some unknown unit of time. • Let n represent the size of the input data set that the program manipulates where n > 0. • Let f(n) be some function defined on the size of the input data set, n. • We say that “T(n) is O(f(n))” if there exists an integer n0 and a constant c, where c > 0, such that for all integers n >= n0 we have T(n) <= cf(n). • The pair n0 and c are witnesses to the fact that T(n) is O(f(n))

Simplifying Big-O Expressions • Big-O expressions are simplified by dropping constant factors and low order terms. • The total of all terms gives us the total running time of the program. For example, say that T(n) = O(f3(n) + f2(n) + f1(n)) where f3(n) = 4n^3; f2(n) = 5n^2; f1(n) = 23, or to restate T(n): T(n) = O(4n^3 + 5n^2 + 23) • After stripping out the constants and low order terms we are left with T(n) = O(n^3)

Simplifying Big-O Expressions T(n) = f1(n) + f2(n) + f3(n) + … + fk(n) • In big-O analysis, one of the terms in the T(n) expression is identified as the dominant term. • A dominant term is one that, for large values of n, becomes so large that it allows us to ignore the other terms in the expression. • The problem of big-O analysis can be reduced to one of finding the dominant term in an expression representing the number of operations required by an algorithm. • All other terms and constants are dropped from the expression.

Big-O Analysis Example 1
    for (k = 0; k < n/2; ++k) {
        for (j = 0; j < n*n; ++j) {
            statement(s)
        }
    }
• Outer loop executes n/2 times
• Inner loop executes n^2 times
• T(n) = (n/2)(n^2) = n^3/2 = 0.5(n^3)
• T(n) = O(n^3)

Big-O Analysis Example 2
    for (k = 0; k < n/2; ++k) {
        statement(s)
    }
    for (j = 0; j < n*n; ++j) {
        statement(s)
    }
• First loop executes n/2 times
• Second loop executes n^2 times
• T(n) = (n/2) + n^2 = 0.5n + n^2
• n^2 is the dominant term
• T(n) = O(n^2)

Big-O Analysis Example 3
    while (n > 1) {
        statement(s)
        n = n / 2;
    }
• The values of n will follow a logarithmic progression.
• Assuming n has the initial value of 64, the progression will be 64, 32, 16, 8, 4, 2.
• Loop executes log2 n times (6 times for n = 64)
• O(log2 n) = O(log n), because logarithms of different bases differ only by a constant factor

Big-O Comparisons

Analysis Involving if/else
    if (condition)
        loop1;    // assume O(f(n)) for loop1
    else
        loop2;    // assume O(g(n)) for loop2
• The order of magnitude for the entire if/else statement is O(max(f(n), g(n)))

An Example Involving if/else
    if (a[1][1] == 0)
        for (i = 0; i < n; ++i)         // f(n) = n^2
            for (j = 0; j < n; ++j)
                a[i][j] = 0;
    else
        for (i = 0; i < n; ++i)         // g(n) = n
            a[i][0] = 1;
• The order of magnitude for the entire if/else statement is O(max(f(n), g(n))) = O(n^2)
