
Quicksorting - Lecture 13

This lecture tries to improve insertsort, first with the insertelh procedure and then with sorted binary trees represented as lists, and then introduces the Quicksort algorithm.


Presentation Transcript


  1. Lecture 13: Quicksorting
      David Evans, http://www.cs.virginia.edu/evans
      CS200: Computer Science, University of Virginia Computer Science

  2. Menu
      • Improving insertsort
      • quicksort
      CS 200 Spring 2003

  3. Last time: insertsort

      (define (insertel cf el lst)
        (if (null? lst)
            (list el)
            (if (cf el (car lst))
                (cons el lst)
                (cons (car lst) (insertel cf el (cdr lst))))))

      (define (insertsort cf lst)
        (if (null? lst)
            null
            (insertel cf (car lst) (insertsort cf (cdr lst)))))

      insertsort is Θ(n²)

  4. Can we do better?

      (insertel < 88 (list 1 2 3 5 6 23 63 77 89 90))

      Suppose we had procedures (first-half lst) and (second-half lst) that quickly divided the list into two halves?

  5. Insert Halves

      (define (insertelh cf el lst)
        (if (null? lst)
            (list el)
            (let ((fh (first-half lst))
                  (sh (second-half lst)))
              (if (cf el (car fh))
                  (append (cons el fh) sh)
                  (if (null? sh)
                      (append fh (list el))
                      (if (cf el (car sh))
                          (append (insertelh cf el fh) sh)
                          (append fh (insertelh cf el sh))))))))

  6. Evaluating insertelh

      > (insertelh < 3 (list 1 2 4 5 7))
      |(insertelh #<procedure:traced-<> 3 (1 2 4 5 7))
      | (< 3 1)
      | #f
      | (< 3 5)
      | #t
      | (insertelh #<procedure:traced-<> 3 (1 2 4))
      | |(< 3 1)
      | |#f
      | |(< 3 4)
      | |#t
      | |(insertelh #<procedure:traced-<> 3 (1 2))
      | | (< 3 1)
      | | #f
      | | (< 3 2)
      | | #f
      | | (insertelh #<procedure:traced-<> 3 (2))
      | | |(< 3 2)
      | | |#f
      | | (2 3)
      | |(1 2 3)
      | (1 2 3 4)
      |(1 2 3 4 5 7)
      (1 2 3 4 5 7)

      Every time we call insertelh, the size of the list is approximately halved!

  7. How much work is insertelh?
      • Assume first-half and second-half are Θ(1)
      Each time we call insertelh, the size of lst halves. So, doubling the size of the list only increases the number of calls by 1.

      List size    Number of insertelh applications
      1            1
      2            2
      4            3
      8            4
      16           5

  8. How much work is insertelh?
      • Assume first-half and second-half are Θ(1)
      Each time we call insertelh, the size of lst halves, so doubling the size of the list only increases the number of calls by 1.
      insertelh is Θ(log₂ n)
      (log₂ a = b means 2^b = a)
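
The counts in the table can be reproduced with a small counting procedure. This is only a sketch: it assumes an idealized insertelh that always recurses on a list of half the size, and halving-calls is a hypothetical name that does not appear in the lecture code.

```scheme
#lang racket
;; Hypothetical helper: number of insertelh applications for a list of
;; n elements (n >= 1), assuming every call recurses on half the list.
(define (halving-calls n)
  (if (<= n 1)
      1
      (+ 1 (halving-calls (quotient n 2)))))

(map halving-calls (list 1 2 4 8 16)) ; evaluates to (1 2 3 4 5)
```

Doubling n adds exactly one more application, which is the behaviour described by log₂ n.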

  9. insertsorth
      Same as insertsort, except it uses insertelh:

      (define (insertsorth cf lst)
        (if (null? lst)
            null
            (insertelh cf (car lst) (insertsorth cf (cdr lst)))))

      insertsorth would be Θ(n log₂ n) if we had fast first-half and second-half procedures.

  10. Is there a fast first-half procedure?
      • No!
      • To produce the first half of a list of length n, we need to cdr down the first n/2 elements
      • So:
        • first-half is Θ(n)
        • insertelh calls first-half every time… so insertelh is Θ(n) * Θ(log₂ n) = Θ(n log₂ n)
        • insertsorth is Θ(n) * Θ(n log₂ n) = Θ(n² log₂ n)
      Yikes! We’ve done all this work, and it’s still worse than our simple bubblesort!
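
The lecture does not show first-half and second-half, so the following is only one possible sketch. It assumes a Racket-style Scheme (where null is the empty list, as in the slides) and gives the first half the extra element when the length is odd, which is what the trace on slide 6 suggests: (1 2 4 5 7) splits into (1 2 4) and (5 7). The helpers take-n and drop-n are hypothetical names.

```scheme
#lang racket
;; Hypothetical helper: copy the first k elements of lst (k cons steps).
(define (take-n lst k)
  (if (= k 0)
      null
      (cons (car lst) (take-n (cdr lst) (- k 1)))))

;; Hypothetical helper: skip the first k elements of lst (k cdr steps).
(define (drop-n lst k)
  (if (= k 0)
      lst
      (drop-n (cdr lst) (- k 1))))

;; Assumed definitions: the first half holds ceiling(n/2) elements.
(define (first-half lst)
  (take-n lst (quotient (+ (length lst) 1) 2)))

(define (second-half lst)
  (drop-n lst (quotient (+ (length lst) 1) 2)))

(first-half (list 1 2 4 5 7))  ; evaluates to (1 2 4)
(second-half (list 1 2 4 5 7)) ; evaluates to (5 7)
```

Both procedures have to walk the list: length visits all n elements, and take-n or drop-n then follows about n/2 cdrs, so neither can be Θ(1) on an ordinary list.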

  12. The Great Lambda Tree of Ultimate Knowledge and Infinite Power

  13. Sorted Binary Trees
      Each tree has an element el and two subtrees:
      • left: a tree containing all elements x such that (cf x el) is true
      • right: a tree containing all elements x such that (cf x el) is false

  14. Tree Example
      (Figure: a sorted binary tree with cf: <, holding the elements 1, 2, 3, 4, 5, 7 and 8; empty subtrees are null.)

  15. Representing Trees

      (define (make-tree left el right)   ; left and right are trees (null is a tree)
        (list left el right))

      (define (get-left tree)             ; tree must be a non-null tree
        (first tree))

      (define (get-element tree)          ; tree must be a non-null tree
        (second tree))

      (define (get-right tree)            ; tree must be a non-null tree
        (third tree))

  16. Trees as Lists
      The tree with element 5 at the root, a left subtree with element 2 (whose left subtree holds 1), and a right subtree with element 8 is built by:

      (make-tree (make-tree (make-tree null 1 null)
                            2
                            null)
                 5
                 (make-tree null 8 null))

  17. insertel-tree

      (define (insertel-tree cf el tree)
        (if (null? tree)
            (make-tree null el null)
            (if (cf el (get-element tree))
                (make-tree (insertel-tree cf el (get-left tree))
                           (get-element tree)
                           (get-right tree))
                (make-tree (get-left tree)
                           (get-element tree)
                           (insertel-tree cf el (get-right tree))))))

      If the tree is null, make a new tree with el as its element and no left or right trees.
      Otherwise, decide if el should be in the left or right subtree. Insert it into that subtree, but leave the other subtree unchanged.

  18. How much work is insertel-tree?
      Each time we call insertel-tree, the size of the tree approximately halves (as long as the tree is well balanced). So, doubling the size of the tree only increases the number of calls by 1!
      insertel-tree is Θ(log₂ n)
      (log₂ a = b means 2^b = a)

  19. insertsort-tree

      (define (insertsort cf lst)
        (if (null? lst)
            null
            (insertel cf (car lst) (insertsort cf (cdr lst)))))

      (define (insertsort-worker cf lst)
        (if (null? lst)
            null
            (insertel-tree cf (car lst) (insertsort-worker cf (cdr lst)))))

      No change… but insertsort-worker evaluates to a tree, not a list!

      (((() 1 ()) 2 ()) 5 (() 8 ()))

  20. extract-elements
      • We need to make a list of all the tree elements, from left to right.

      (define (extract-elements tree)
        (if (null? tree)
            null
            (append (extract-elements (get-left tree))
                    (cons (get-element tree)
                          (extract-elements (get-right tree))))))

  21. How much work is insertsort-tree?

      (define (insertsort-tree cf lst)
        (define (insertsort-worker cf lst)
          (if (null? lst)
              null
              (insertel-tree cf (car lst) (insertsort-worker cf (cdr lst)))))
        (extract-elements (insertsort-worker cf lst)))

      Θ(n) applications of insertel-tree, and each is Θ(log₂ n), so insertsort-tree is Θ(n log₂ n)
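
To see the pieces from slides 15 to 21 working together, here is a self-contained sketch that can be run as is. It assumes a Racket-style Scheme in which null, first, second and third are predefined, as the slides' code suggests. The only change from the slides is that insertsort-worker is defined at the top level rather than inside insertsort-tree, so the intermediate tree can be inspected. The input list is chosen so that the tree matches the one shown on slide 19.

```scheme
#lang racket
;; Tree representation, as on slide 15.
(define (make-tree left el right) (list left el right))
(define (get-left tree) (first tree))
(define (get-element tree) (second tree))
(define (get-right tree) (third tree))

;; Insert el into a sorted binary tree, as on slide 17.
(define (insertel-tree cf el tree)
  (if (null? tree)
      (make-tree null el null)
      (if (cf el (get-element tree))
          (make-tree (insertel-tree cf el (get-left tree))
                     (get-element tree)
                     (get-right tree))
          (make-tree (get-left tree)
                     (get-element tree)
                     (insertel-tree cf el (get-right tree))))))

;; List the tree elements from left to right, as on slide 20.
(define (extract-elements tree)
  (if (null? tree)
      null
      (append (extract-elements (get-left tree))
              (cons (get-element tree)
                    (extract-elements (get-right tree))))))

;; Build a tree from a list; the last list element is inserted first.
(define (insertsort-worker cf lst)
  (if (null? lst)
      null
      (insertel-tree cf (car lst) (insertsort-worker cf (cdr lst)))))

(define (insertsort-tree cf lst)
  (extract-elements (insertsort-worker cf lst)))

(insertsort-worker < (list 1 2 8 5)) ; the tree (((() 1 ()) 2 ()) 5 (() 8 ()))
(insertsort-tree < (list 1 2 8 5))   ; the list (1 2 5 8)
```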

  22. Growth of time to sort random list
      (Figure: plot comparing n² growth for bubblesort with n log₂ n growth for insertsort-tree.)

  23. Comparing sorts

      > (testgrowth bubblesort)
      n = 250, time = 110
      n = 500, time = 371
      n = 1000, time = 2363
      n = 2000, time = 8162
      n = 4000, time = 31757
      (3.37 6.37 3.45 3.89)

      > (testgrowth insertsort)
      n = 250, time = 40
      n = 500, time = 180
      n = 1000, time = 571
      n = 2000, time = 2644
      n = 4000, time = 11537
      (4.5 3.17 4.63 4.36)

      > (testgrowth insertsorth)
      n = 250, time = 251
      n = 500, time = 1262
      n = 1000, time = 4025
      n = 2000, time = 16454
      n = 4000, time = 66137
      (5.03 3.19 4.09 4.02)

      > (testgrowth insertsort-tree)
      n = 250, time = 30
      n = 500, time = 250
      n = 1000, time = 150
      n = 2000, time = 301
      n = 4000, time = 1001
      (8.3 0.6 2.0 3.3)
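
The list printed after each run appears to hold the ratio between successive times (for bubblesort, 371/110 ≈ 3.37). The lecture does not show how testgrowth is defined, so the sketch below only reproduces that ratio calculation; growth-ratios is a hypothetical name.

```scheme
#lang racket
;; Hypothetical helper: ratio of each time to the one before it.
(define (growth-ratios times)
  (if (or (null? times) (null? (cdr times)))
      null
      (cons (exact->inexact (/ (cadr times) (car times)))
            (growth-ratios (cdr times)))))

;; bubblesort times from the transcript above; the ratios round to
;; 3.37 6.37 3.45 3.89
(growth-ratios (list 110 371 2363 8162 31757))
```

Since n doubles at each step, ratios near 4 are what n² growth predicts, while n log₂ n growth predicts ratios only slightly above 2.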

  24. Can we do better?
      • Making all those trees is a lot of work
      • Can we divide the problem into two halves, without making trees?

  25. Quicksort
      • C. A. R. (Tony) Hoare, 1962
      • Divide the problem into:
        • Sorting all elements el in the rest of the list where (cf el (car lst)) is true (with cf as <, el is < the first element)
        • Sorting all elements el in the rest of the list where (not (cf el (car lst))) is true (el is >= the first element)
      • Will this do better?

  26. Quicksort

      (define (quicksort cf lst)
        (if (null? lst)
            lst
            (append
             (quicksort cf (filter (lambda (el) (cf el (car lst))) (cdr lst)))
             (list (car lst))
             (quicksort cf (filter (lambda (el) (not (cf el (car lst)))) (cdr lst))))))

  27. filter

      (define (filter f lst)
        (insertl (lambda (el rest)
                   (if (f el) (cons el rest) rest))
                 lst
                 null))
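
The definition of insertl is not shown in this lecture. The sketch below assumes it combines the elements of a list from right to left, starting from a stop value, with the argument order (f lst stopval) taken from how filter calls it. Treat this insertl as an assumption rather than the course's definition.

```scheme
#lang racket
;; Assumed definition of insertl: apply f to each element and the result
;; of processing the rest of the list, ending with stopval.
(define (insertl f lst stopval)
  (if (null? lst)
      stopval
      (f (car lst) (insertl f (cdr lst) stopval))))

;; filter as on the slide (this shadows Racket's built-in filter).
(define (filter f lst)
  (insertl (lambda (el rest)
             (if (f el) (cons el rest) rest))
           lst
           null))

(filter (lambda (el) (< el 5)) (list 8 2 5 1 9)) ; evaluates to (2 1)
```

The elements that pass the test keep their original order, which is why quicksort can simply append the two filtered sublists around the first element.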

  28. How much work is quicksort?
      • What if the input list is sorted? Worst case: Θ(n²)
      • What if the input list is random? Expected: Θ(n log₂ n)
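
The difference between the two cases can be seen by counting comparisons. The sketch below uses the quicksort from slide 26 with Racket's built-in filter; count-comparisons is a hypothetical helper that wraps < in a procedure which increments a counter on every call.

```scheme
#lang racket
;; quicksort as on slide 26, using Racket's built-in filter.
(define (quicksort cf lst)
  (if (null? lst)
      lst
      (append
       (quicksort cf (filter (lambda (el) (cf el (car lst))) (cdr lst)))
       (list (car lst))
       (quicksort cf (filter (lambda (el) (not (cf el (car lst)))) (cdr lst))))))

;; Hypothetical helper: sort lst with < and return how many times the
;; comparison procedure was applied.
(define (count-comparisons lst)
  (let ((count 0))
    (quicksort (lambda (a b)
                 (set! count (+ count 1))
                 (< a b))
               lst)
    count))

(count-comparisons (list 1 2 3 4 5 6 7 8)) ; sorted input: 56 comparisons
(count-comparisons (list 5 2 7 1 4 6 8 3)) ; mixed input: 26 comparisons
```

With sorted input the first element never splits the list, so each call only removes one element and the comparisons add up to 2 * (7 + 6 + … + 1) = 56. With the mixed input each split is closer to even and far fewer comparisons are needed.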

  29. Comparing sorts

      > (testgrowth insertsort-tree)
      n = 250, time = 20
      n = 500, time = 80
      n = 1000, time = 151
      n = 2000, time = 470
      n = 4000, time = 882
      n = 8000, time = 1872
      n = 16000, time = 9654
      n = 32000, time = 31896
      n = 64000, time = 63562
      n = 128000, time = 165261
      (4.0 1.9 3.1 1.9 2.1 5.2 3.3 2.0 2.6)

      > (testgrowth quicksort)
      n = 250, time = 20
      n = 500, time = 80
      n = 1000, time = 91
      n = 2000, time = 170
      n = 4000, time = 461
      n = 8000, time = 941
      n = 16000, time = 2153
      n = 32000, time = 5047
      n = 64000, time = 16634
      n = 128000, time = 35813
      (4.0 1.1 1.8 2.7 2.0 2.3 2.3 3.3 2.2)

      Both are Θ(n log₂ n), but the absolute time of quicksort is much faster.

  30. Good enough for VISA?
      n = 128000, time = 35813
      36 seconds to sort 128000 items with quicksort, which is Θ(n log₂ n).
      How long to sort 800M items?

      > (log 4)
      1.3862943611198906
      > (* 128000 (log 128000))
      1505252.5494914246
      > (/ (* 128000 (log 128000)) 36)
      41812.57081920624
      > (/ (* 128000 (log 128000)) 41812.6)
      35.99997487578923
      > (/ (* 800000000 (log 800000000)) 41812.6)
      392228.6064130373

      392000 seconds ~ 4.5 days

  31. Quicksorting 800M items?
      Using the same constant, 41812.6, derived from the 128000-item measurement:

      > (/ (* 1000000 (log 1000000)) 41812.6)
      330.4150078675871
      > (/ (* 100000000 (log 100000000)) 41812.6)
      44055.33438234496
      > (/ (* 800000000 (log 800000000)) 41812.6)
      392228.6064130373
      > (/ (/ (/ (/ (* 800000000 (log 800000000)) 41812.6) 60) 60) 24)
      4.539682944595339

      4.5 days with quicksort is a lot better than 20,000 years with bubblesort… but still not good enough for VISA.
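
The arithmetic above scales one measurement by the ratio of n log n values. A sketch of that calculation as a procedure follows; predict-time is a hypothetical name, and the estimate assumes the running time keeps growing as n log n all the way up to the larger size.

```scheme
#lang racket
;; Hypothetical helper: estimate the time to sort n items, given that
;; sorting n0 items took t0, assuming time grows as n log n.
(define (predict-time n0 t0 n)
  (* t0 (/ (* n (log n)) (* n0 (log n0)))))

(predict-time 128000 36 800000000)           ; seconds, roughly 392000
(/ (predict-time 128000 36 800000000) 86400) ; days, roughly 4.5
```

Here log is the natural logarithm, as in the transcript. The base does not matter because it cancels in the ratio.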

  32. Later in the course…
      • So far, we have been talking about the amount of work a procedure requires
      • In a few weeks, we will learn how to talk about the amount of work a problem requires
      • That is, how much work is the best possible sorting procedure?
      • For the general case, you can’t do better than Θ(n log₂ n)
      • VISA’s problem is simpler, so they can do much better: Θ(n)

  33. Charge
      • Monday: Tyson’s essay
      • Wednesday: Tim Koogle
        • Remember to send me a question for him
        • That is your “ticket” to come to class Weds
      • Friday: Exam Review
        • Ask any questions you want before picking up exam 1.
