
CS 420 – Design of Algorithms


Gideon




Presentation Transcript


  1. CS 420 – Design of Algorithms Basic Concepts

  2. Design of Algorithms • We need a mechanism to describe/define algorithms • Independent of the language implementation of the algorithm • Pseudo-code

  3. Algorithms • Algorithm – “any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output” – Cormen, et al.

  4. Algorithms • Algorithm – “is a procedure (a finite set of well-defined instructions) for accomplishing some task which, given an initial state, will terminate in a defined end-state.” • from Wikipedia.org • http://en.wikipedia.org/wiki/Algorithm

  5. Algorithms • Human Genome Project • Security/encryption for e-commerce • Spacecraft navigation • Pulsar searches • Search Engines

  6. Algorithms • Search engines • Search algorithms • Linear search: read-compare-read-… • run-time is a linear function of n (n = size of database) • suppose the DB has 40,000,000 records • then 40,000,000 read-compare cycles • at 1000 read-compare cycles per second = 40,000 seconds ≈ 667 minutes ≈ 11 hours
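The read-compare loop on the slide can be sketched in Python. This is an illustrative sketch, not the slide's own code; the comparison counter makes the linear cost visible: a miss over n records always costs n read-compare cycles.

```python
def linear_search(records, target):
    """Scan records one by one; return (index, comparisons), or (-1, comparisons) on a miss."""
    comparisons = 0
    for i, rec in enumerate(records):
        comparisons += 1          # one read-compare cycle
        if rec == target:
            return i, comparisons
    return -1, comparisons

# A miss must scan the entire database: n comparisons.
db = list(range(1000))
idx, cost = linear_search(db, -1)   # idx == -1, cost == 1000
```

At the slide's assumed rate of 1000 read-compare cycles per second, a miss on a 40,000,000-record database would indeed take 40,000 seconds.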

  7. Algorithms • Search Google for “house” • 730,000,000 hits in 0.1 seconds

  8. Algorithms • Binary tree search algorithm • The keyword is indexed in a set of binary indexes – is the keyword in the left or right half of the database? • Example index tree over a database spanning aaa–zxy: the root splits into aaa–mon and moo–zxy; those split again into aaa–jaa / jaa–mon and moo–tev / tew–zxy

  9. Algorithms • Binary search algorithm • So, to search a 40,000,000 record database • for a single term – • T(40,000,000) = log2(40,000,000) • ≈ 26 read-compare cycles • at 1000 read-compare cycles/sec ≈ 0.026 seconds

  10. Algorithms • Binary Search Algorithm • So, what about 730,000,000 records • Search for a single keyword – • 30 read/compare cycles • or about 0.03 seconds
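A minimal Python sketch of the binary search just described (an illustrative translation, not code from the slides). Each loop iteration is one read-compare cycle and halves the remaining range, which is where the log2 cost comes from: 2^25 < 40,000,000 ≤ 2^26, so at most 26 probes; 2^29 < 730,000,000 ≤ 2^30, so at most 30.

```python
import math

def binary_search(sorted_records, target):
    """Search a sorted list; return (index, comparisons), or (-1, comparisons) on a miss."""
    lo, hi = 0, len(sorted_records) - 1
    comparisons = 0
    while lo <= hi:
        comparisons += 1              # one read-compare cycle
        mid = (lo + hi) // 2
        if sorted_records[mid] == target:
            return mid, comparisons
        elif sorted_records[mid] < target:
            lo = mid + 1              # keyword is in the right half
        else:
            hi = mid - 1              # keyword is in the left half
    return -1, comparisons

# The cycle counts from the slides:
print(math.ceil(math.log2(40_000_000)))    # 26
print(math.ceil(math.log2(730_000_000)))   # 30
```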

  11. Pseudo-code • Like English – easily readable • Clear and consistent • Rough correspondence to language implementation • Should give a clear understanding of what the algorithm does

  12. Using Pseudo-code • Use indentation to indicate block structure: blocks of code sit at the same level of indentation • Do not use “extra” statements like begin-end • Looping constructs and conditionals are similar to Pascal (while, for, repeat, if-then-else). In for loops the loop counter retains its value after the loop terminates

  13. Using Pseudo-code • Use a consistent symbol to indicate comments. Anything on a line after this symbol is a comment, not code • Multiple assignment is allowed • Variables are local to a procedure unless explicitly declared as global • Array elements are specified by the array name followed by indices in square brackets… A[i]

  14. Pseudo-code • .. indicates a range of values: A[1..4] means elements 1, 2, 3, and 4 of array A • Compound data can be represented as objects with attributes or fields. Reference these attributes like array references: for example, a variable that is the length of the array A is written length[A]

  15. Pseudo-code • An array reference is a pointer • Parameters are passed by value • assignments to parameters within a procedure are local to the procedure • Boolean operators short-circuit • Be consistent • don’t use read in one place and input in another unless they have functionally different meanings

  16. Insertion-Sort Algorithm

  INSERTION-SORT(A)
    for j = 2 to length[A]
      do key = A[j]
         C* Insert A[j] into the sorted sequence A[1..j-1]
         i = j - 1
         while i > 0 and A[i] > key
           do A[i+1] = A[i]
              i = i - 1
         A[i+1] = key
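The pseudo-code translates almost line for line into Python; the only adjustment in this sketch is switching from the pseudo-code's 1-based indexing to Python's 0-based indexing.

```python
def insertion_sort(A):
    """In-place insertion sort; a 0-indexed translation of the pseudo-code."""
    for j in range(1, len(A)):        # pseudo-code: for j = 2 to length[A]
        key = A[j]
        # Insert A[j] into the sorted sequence A[0..j-1]
        i = j - 1
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i -= 1
        A[i + 1] = key
    return A

print(insertion_sort([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]
```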

  17. Analysis of Algorithms • Analysis may be concerned with any resource • memory • bandwidth • runtime • Need a model for describing the runtime performance of an algorithm • RAM – Random Access Machine

  18. RAM • There are other models but for now… • Assume that all instructions are sequential • All data is accessible in one step • Analyze performance (run-time) in terms of inputs • meaning of inputs varies – size of an array, number of bits, vertices and edges, etc. • Machine independent • Language independent

  19. RAM • Need to base analysis on cost of instruction execution • assign costs (run-time) to each instruction

  20. INSERTION-SORT

  21. INSERTION-SORT • Run-time = sum of products of costs (instruction runtimes) and execution occurrences • T(n) = c1·n + c2(n−1) + c4(n−1) + c5·Σ_{j=2..n} t_j + c6·Σ_{j=2..n} (t_j − 1) + c7·Σ_{j=2..n} (t_j − 1) + c8(n−1)

  22. INSERTION-SORT • Best case vs Worst case • Best case • Input array already sorted • Worst case • Input array sorted in reverse order

  23. INSERTION-SORT • For sake of discussion… • assume that all ci = 2 • then, for best case • T(n) = 10n − 8 • n = 1000, T(n) = 9,992 • for worst case … • T(n) = 3n² + 7n − 8 • n = 1000, T(n) = 3,006,992
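Plugging the cases into the cost formula from slide 21 reproduces the numbers above. This sketch evaluates that formula directly with every cost set to 2, using t_j = 1 for the best case (the while test fails immediately) and t_j = j for the worst case.

```python
def T_insertion_sort(n, best=True, c=2):
    """Evaluate T(n) = c1*n + c2(n-1) + c4(n-1) + c5*sum(t_j)
    + c6*sum(t_j - 1) + c7*sum(t_j - 1) + c8(n-1), with all c_i = c."""
    # t_j for j = 2..n: 1 per iteration if already sorted, j if reverse sorted
    tj = [1] * (n - 1) if best else list(range(2, n + 1))
    s1 = sum(tj)                    # sum of t_j
    s2 = sum(t - 1 for t in tj)     # sum of (t_j - 1)
    return c*n + c*(n-1) + c*(n-1) + c*s1 + c*s2 + c*s2 + c*(n-1)

print(T_insertion_sort(1000, best=True))    # 9992   (= 10n - 8)
print(T_insertion_sort(1000, best=False))   # 3006992 (= 3n^2 + 7n - 8)
```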

  24. Insertion-sort Performance • Best case is a linear function of n

  25. So, what are we really interested in? • the big picture • the trend in run-time performance as the problem grows • not concerned about small differences in algorithms • what happens to the algorithm as the problem gets explosively large • the order of growth

  26. Abstractions and assumptions • The cost coefficients will not vary that much… and will not contribute significantly to the growth of run-time performance • so we can set them to a constant • … and we can ignore them • remember the earlier example – • c1 = c2 = … = 2

  27. Abstractions and assumptions • In a polynomial run-time function the order of growth is controlled by the highest-order term • T(n) = 3n² + 7n − 8 • so we can ignore (discard) the lower-order terms • T(n) = 3n²

  28. Abstractions and assumptions • It turns out that with sufficiently large n the coefficient of the highest-order term is not that important in characterizing the order of growth of a run-time function • So, from that perspective, the run-time function of the Insertion-Sort algorithm (worst case) grows as • T(n) = n²
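A quick numeric check (my own illustration) of why the lower-order terms can be dropped: dividing the worst-case count by n² shows the ratio settling onto the leading coefficient 3 as n grows, so only the n² term shapes the order of growth.

```python
def T(n):
    """Worst-case instruction count from the earlier slides (all c_i = 2)."""
    return 3 * n**2 + 7 * n - 8

# T(n)/n^2 approaches the leading coefficient 3 as n grows:
for n in (10, 1_000, 100_000):
    print(n, T(n) / n**2)
# 10 -> 3.62, 1000 -> 3.006992, 100000 -> ~3.00007
```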

  29. Abstractions and assumptions • Are these abstraction assumptions correct? • for small problems – no • but for sufficiently large problems • they do a pretty good job of characterizing the run-time functions of algorithms

  30. Design of Algorithms • Incremental approach to algorithm design • Design for a very small case • expand the complexity of the problem and algorithm • Divide and Conquer • Start with a large (full) problem • Divide it into smaller problems • Solve the smaller problems • Combine results from the smaller problems

  31. Another look at Sort algorithms • Suppose: • you have an array evenly divisible by two • in each half (left and right) values are already sorted in order • but not in order across the whole array • task: sort the array so that it is in order across the entire array

  32. Merge Sorted Subarrays • 1. Split the array into two subarrays • 2. Add an end marker to each subarray • 3. Set an index to the first value of each subarray • 4. Compare the indexed (pointed-to) values of the two subarrays • 5. If either indexed value is an end marker: move all remaining values (except the end marker) from the other subarray to the output array; stop • 6. Move the smaller of the two values to the output array (sorted); increment that subarray's index • 7. Go to step 4

  33. Merge(A, p, q, r) • Where A is the array containing the values to be sorted, and each half is already sorted from smallest to largest • p is the starting index for the array A • q is the end index for the left side of array A (end of the first half… sort of) • r is the end index for array A • So, sort the values from p to r from the two halves of array A, where q marks where to split the array into subarrays

  34. Merge(A, p, q, r)

  MERGE(A, p, q, r)
    n1 = q - p + 1
    n2 = r - q
    C* create subarrays L[1..n1+1] and R[1..n2+1]
    for i = 1 to n1
      do L[i] = A[p+i-1]
    for j = 1 to n2
      do R[j] = A[q+j]
    L[n1+1] = ∞
    R[n2+1] = ∞
    i = 1
    j = 1
    for k = p to r
      do if L[i] <= R[j]
           then A[k] = L[i]
                i = i + 1
           else A[k] = R[j]
                j = j + 1
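A direct Python translation of MERGE (0-indexed, my own sketch), using `float('inf')` for the ∞ sentinel: whichever subarray runs out first exposes its sentinel, which never wins a comparison, so the rest of the other subarray drains automatically.

```python
def merge(A, p, q, r):
    """Merge the sorted runs A[p..q] and A[q+1..r] (inclusive, 0-indexed) in place."""
    L = A[p:q + 1] + [float('inf')]     # left run plus end-marker sentinel
    R = A[q + 1:r + 1] + [float('inf')] # right run plus end-marker sentinel
    i = j = 0
    for k in range(p, r + 1):
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1

A = [2, 4, 5, 7, 1, 2, 3, 6]
merge(A, 0, 3, 7)
print(A)    # [1, 2, 2, 3, 4, 5, 6, 7]
```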

  35. MERGE_SORT(A, p, r)

  MERGE_SORT(A, p, r)
    if p < r
      then q = ⌊(p+r)/2⌋
           MERGE_SORT(A, p, q)
           MERGE_SORT(A, q+1, r)
           MERGE(A, p, q, r)
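The full divide-and-conquer sort can be sketched as one self-contained Python function (the merge step is inlined here so the sketch stands alone): split at the floor of the midpoint, recursively sort each half, then merge the two sorted runs.

```python
def merge_sort(A, p, r):
    """Recursively sort A[p..r] (inclusive, 0-indexed) in place."""
    if p < r:
        q = (p + r) // 2            # floor of the midpoint
        merge_sort(A, p, q)         # sort the left half
        merge_sort(A, q + 1, r)     # sort the right half
        # Merge the two sorted runs A[p..q] and A[q+1..r]:
        L = A[p:q + 1] + [float('inf')]
        R = A[q + 1:r + 1] + [float('inf')]
        i = j = 0
        for k in range(p, r + 1):
            if L[i] <= R[j]:
                A[k] = L[i]
                i += 1
            else:
                A[k] = R[j]
                j += 1

A = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(A, 0, len(A) - 1)
print(A)    # [1, 2, 2, 3, 4, 5, 6, 7]
```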

  36. Asymptotic Notation • Big Θ (theta) • Θ(g(n)) = {f(n) : there exist positive constants c1, c2, and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0}
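As a worked example connecting back to slide 28, the Θ definition can be checked against the insertion-sort worst case f(n) = 3n² + 7n − 8; the constants c1 = 3, c2 = 10, n0 = 2 are one valid choice among many.

```latex
% Claim: f(n) = 3n^2 + 7n - 8 \in \Theta(n^2), witnessed by c_1 = 3,\ c_2 = 10,\ n_0 = 2.
\begin{align*}
\text{Lower bound: } & 7n - 8 \ge 0 \ \text{for } n \ge 2,
  \ \text{so } c_1 g(n) = 3n^2 \le 3n^2 + 7n - 8 = f(n). \\
\text{Upper bound: } & 7n \le 7n^2 \ \text{for } n \ge 1,
  \ \text{so } f(n) \le 3n^2 + 7n^2 = 10n^2 = c_2 g(n). \\
\text{Hence } & 0 \le 3n^2 \le f(n) \le 10n^2 \quad \text{for all } n \ge n_0 = 2.
\end{align*}
```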

  37. Asymptotic Notation • Big O (oh) • O(g(n)) = {f(n) : there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0}

  38. Asymptotic Notation • Big Ω (Omega) • Ω(g(n)) = {f(n) : there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0}

  39. Asymptotic Notation • Little o (oh) • o(g(n)) = {f(n) : for any positive constant c > 0 there exists a constant n0 > 0 such that 0 ≤ f(n) < c·g(n) for all n ≥ n0}

  40. Asymptotic Notation • Little ω (omega) • ω(g(n)) = {f(n) : for any positive constant c > 0 there exists a constant n0 > 0 such that 0 ≤ c·g(n) < f(n) for all n ≥ n0}
