Testing and Debugging (Debugging in Haskell)

Complements the previous sections.
Original authorized by: John Hughes, http://www.cs.chalmers.se/~rjmh/
Adapted by: Claudio Cesar de Sá

What’s the Difference?
Testing means trying out a program which is believed to work, in an effort to gain confidence that it is correct. If thorough testing reveals no errors, then probably relatively few errors remain.
Debugging means observing a program which is known not to work, in an effort to localise the error. When a bug is found and fixed by debugging, testing can be resumed to see if the program now works.
This lecture describes recently developed tools to help with each activity.

Debugging
Here’s a program with a bug:
    median xs = isort xs !! (length xs `div` 2)
    isort = foldr insert []
    insert x [] = [x]
    insert x (y:ys) | x<y  = x:y:ys
                    | x>=y = y:x:ys
A test reveals that median doesn’t work:
    Median> median [8,4,6,10,2,7,3,5,9,1]
    2
We start trying the functions median calls. isort doesn’t work either:
    Median> isort [8,4,6,10,2,7,3,5,9,1]
    [1,8,4,6,10,2,7,3,5,9]

Debugging Tools
The Manual Approach: we choose cases to try, and manually explore the behaviour of the program, by calling functions with various (hopefully revealing) arguments, and inspecting their outputs.
The Automated Approach: we connect a debugger to the program, which lets us observe internal values, giving us more information to help us diagnose the bug.

The Haskell Object Observation Debugger
Provides a function
    observe :: String -> a -> a
which collects observations of its second argument, tagged with the String, and returns the argument unchanged. Think of it as connecting an oscilloscope to the program: the program's behaviour is unchanged, but we see more.
You need to import the library which defines observe in order to use it. Add
    import Observe
at the start of your program.
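For readers without the Observe library at hand, the "returns the argument unchanged" idea can be approximated with Debug.Trace from base. This is only a rough sketch: probe is a hypothetical helper, not part of Observe, and unlike observe it forces the whole value through show, so it cannot report which parts of a value were actually used.

```haskell
import Debug.Trace (trace)

-- Hypothetical helper (not part of the Observe library): when the value
-- is demanded, print the label and the value to stderr, then return the
-- value unchanged.
probe :: Show a => String -> a -> a
probe label x = trace (label ++ ": " ++ show x) x

-- A sum of squares with a probe attached to each square.
sumOfSquares :: Int
sumOfSquares = sum [probe "n*n" (n * n) | n <- [1 .. 4]]
```

Loading this into an interpreter and evaluating sumOfSquares prints one traced line per square alongside the result; the result itself is the same as without the probe.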

Making sure Observe.lhs has been loaded ...
    Listas> :l listas_haskell.hs
    Reading file "listas_haskell.hs":
    Reading file "/usr/share/hugs/lib/exts/Observe.lhs":
    Reading file "listas_haskell.hs":
    Hugs session for:
    /usr/share/hugs/lib/Prelude.hs
    /usr/share/hugs/lib/exts/Observe.lhs
    listas_haskell.hs
    Listas>

What Do Observations Look Like?
We add a "probe" to the program:
    Median> sum [observe "n*n" (n*n) | n <- [1..4]]
    30
    >>>>>>> Observations <<<<<<
    n*n
      1
      4
      9
      16
The values observed are displayed, titled with the name of the observation.

Observing a List
    Median> sum (observe "squares" [n*n | n <- [1..4]])
    30
    >>>>>>> Observations <<<<<<
    squares
      (1 : 4 : 9 : 16 : [])
Now there is just one observation, the list itself. Observing the entire list also lets us see the order of the values. Lists are always observed in "cons" form.

Observing a Pipeline
We can add observers to "pipelines" -- long compositions of functions -- to see the values flowing between them.
    Median> (sum . observe "squares" . map (\x->x*x)) [1..4]
    30
    >>>>>>> Observations <<<<<<
    squares
      (1 : 4 : 9 : 16 : [])

Observing Counting Occurrences
Add observations after each stage.
    countOccurrences = map (\ws -> (head ws, length ws))
                     . observe "after groupby"
                     . groupBy (==)
                     . observe "after sort"
                     . sort
                     . observe "after words"
                     . words

Observing Counting Occurrences
    Main> countOccurrences "hello clouds hello sky"
    [("clouds",1),("hello",2),("sky",1)]
    >>>>>>> Observations <<<<<<
    after groupby
      (("clouds" : []) : ("hello" : "hello" : []) : ("sky" : []) : [])
    after sort
      ("clouds" : "hello" : "hello" : "sky" : [])
    after words
      ("hello" : "clouds" : "hello" : "sky" : [])

Observing Consumers
An observation tells us not only what value flowed past the observer -- it also tells us how that value was used!
    Main> take 3 (observe "xs" [1..10])
    [1,2,3]
    >>>>>>> Observations <<<<<<
    xs
      (1 : 2 : 3 : _)
The _ is a "don't care" value -- certainly some list appeared here, but it was never used!

Observing Length
    Main> length (observe "xs" (words "hello clouds"))
    2
    >>>>>>> Observations <<<<<<
    xs
      (_ : _ : [])
The length function did not need to inspect the values of the elements, so they were not observed!
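The "never used" parts of these observations can be cross-checked without any library by planting undefined where an observation showed _. The sketch below is an illustration added here, not part of the original slides; firstThree and lengthOnly are hypothetical names.

```haskell
-- take 3 never looks at the tail after the third element, so an
-- undefined tail is harmless.
firstThree :: [Int]
firstThree = take 3 (1 : 2 : 3 : undefined)

-- length walks the spine of the list but never inspects the elements,
-- so undefined elements are harmless too.
lengthOnly :: Int
lengthOnly = length [undefined, undefined :: Int]
```

If either function did inspect the unused part, evaluating these definitions would raise an error instead of returning a value.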

Observing Functions
We can even observe functions themselves!
    Main> observe "sum" sum [1..5]
    15
    >>>>>>> Observations <<<<<<
    sum
      { \ (1 : 2 : 3 : 4 : 5 : []) -> 15
      }
We observe "sum". sum is a function, which is applied to [1..5]. We see arguments and results, for the calls which actually were made!

Observing foldr
Recall that
    foldr (+) 0 [1..4] = 1 + (2 + (3 + (4 + 0)))
Let’s check this, by observing the addition function.
    Main> foldr (observe "+" (+)) 0 [1..4]
    10
    >>>>>>> Observations <<<<<<
    +
      { \ 4 0 -> 4
      , \ 3 4 -> 7
      , \ 2 7 -> 9
      , \ 1 9 -> 10
      }

Observing foldl
We can do the same thing to observe foldl, which behaves as
    foldl (+) 0 [1..4] = (((0 + 1) + 2) + 3) + 4
    Main> foldl (observe "+" (+)) 0 [1..4]
    10
    >>>>>>> Observations <<<<<<
    +
      { \ 0 1 -> 1
      , \ 1 2 -> 3
      , \ 3 3 -> 6
      , \ 6 4 -> 10
      }
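The same call tables can be reproduced in plain Haskell by threading a log of calls through the fold. This is a minimal sketch, not how observe works internally; Call, foldrCalls and foldlCalls are names invented for this illustration.

```haskell
-- One entry per call to (+): (first argument, second argument, result).
type Call = (Int, Int, Int)

-- foldr: each element is combined with the result for the rest of the
-- list, so the call on the last element and 0 is logged first.
foldrCalls :: [Int] -> (Int, [Call])
foldrCalls = foldr step (0, [])
  where
    step x (acc, calls) = (x + acc, calls ++ [(x, acc, x + acc)])

-- foldl: the accumulator is combined with each element from the left.
foldlCalls :: [Int] -> (Int, [Call])
foldlCalls = foldl step (0, [])
  where
    step (acc, calls) x = (acc + x, calls ++ [(acc, x, acc + x)])
```

Applied to [1..4], both functions return 10 together with a list of calls that lines up with the observations of "+" for foldr and foldl respectively.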

How Many Elements Does takeWhile Check?
takeWhile isAlpha selects the alphabetic characters from the front of the list:
    takeWhile isAlpha "hello clouds hello sky" == "hello"
How many times does takeWhile call isAlpha?

How Many Elements Does takeWhile Check?
    Main> takeWhile (observe "isAlpha" isAlpha) "hello clouds hello sky"
    "hello"
    >>>>>>> Observations <<<<<<
    isAlpha
      { \ ' ' -> False
      , \ 'o' -> True
      , \ 'l' -> True
      , \ 'l' -> True
      , \ 'e' -> True
      , \ 'h' -> True
      }
takeWhile calls isAlpha six times -- the last call tells us it’s time to stop.
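The count of six follows a simple rule: one call per element kept, plus one more call for the first element that fails the test, if there is one. A small sketch of that rule, with predicateCalls as a hypothetical helper name:

```haskell
import Data.Char (isAlpha)

-- Number of predicate calls takeWhile needs on a given list.
predicateCalls :: (a -> Bool) -> [a] -> Int
predicateCalls p xs =
  case span p xs of
    (prefix, [])    -> length prefix      -- list exhausted: no failing call
    (prefix, _ : _) -> length prefix + 1  -- one extra call returned False
```

For "hello clouds hello sky" the prefix "hello" has five characters and the space is the failing sixth call.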

Observing Recursion
    fac 0 = 1
    fac n | n>0 = n * fac (n-1)
    Main> observe "fac" fac 6
    720
    >>>>>>> Observations <<<<<<
    fac
      { \ 6 -> 720
      }
We observe this use of the function. We did not observe the recursive calls!

Observing Recursion
    fac = observe "fac" fac'
    fac' 0 = 1
    fac' n | n>0 = n * fac (n-1)
    Main> fac 6
    720
    >>>>>>> Observations <<<<<<
    fac
      { \ 6 -> 720
      , \ 5 -> 120
      , \ 4 -> 24
      , \ 3 -> 6
      , \ 2 -> 2
      , \ 1 -> 1
      , \ 0 -> 1
      }
Because fac' makes its recursive call through the observed fac, we observe all calls of the fac function.

Debugging median
    median xs = observe "isort xs" (isort xs) !! (length xs `div` 2)
    Main> median [4,2,3,5,1]
    2
    >>>>>>> Observations <<<<<<
    isort xs
      (1 : 4 : 2 : 3 : 5 : [])
Wrong answer: the median is 3. Wrong (unsorted) result from isort.

Debugging isort
    isort :: Ord a => [a] -> [a]
    isort = foldr (observe "insert" insert) []
    Main> median [4,2,3,5,1]
    2
    >>>>>>> Observations <<<<<<
    insert
      { \ 1 [] -> 1 : []
      , \ 5 (1 : []) -> 1 : 5 : []
      , \ 3 (1 : 5 : []) -> 1 : 3 : 5 : []
      , \ 2 (1 : 3 : 5 : []) -> 1 : 2 : 3 : 5 : []
      , \ 4 (1 : 2 : 3 : 5 : []) -> 1 : 4 : 2 : 3 : 5 : []
      }
All is well, except for the last case: inserting 4 gives an unsorted list.

Debugging insert
Observe the results from each case:
    insert x [] = [x]
    insert x (y:ys) | x<y  = observe "x<y" (x:y:ys)
                    | x>=y = observe "x>=y" (y:x:ys)
    Main> median [4,2,3,5,1]
    2
    >>>>>>> Observations <<<<<<
    x>=y
      (1 : 5 : [])
      (1 : 3 : 5 : [])
      (1 : 2 : 3 : 5 : [])
      (1 : 4 : 2 : 3 : 5 : [])
Only the second case was used!

The Bug!
I forgot the recursive call…
    insert x [] = [x]
    insert x (y:ys) | x<y  = x:y:ys
                    | x>=y = y:insert x ys
    Main> median [4,2,3,5,1]
    3
Bug fixed! The right answer.
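For convenience, here is the corrected program collected into one loadable file. The type signatures and the use of otherwise in place of the x>=y guard are additions made for this sketch; the definitions otherwise follow the slides.

```haskell
-- Insert an element into an already sorted list, keeping it sorted.
insert :: Ord a => a -> [a] -> [a]
insert x [] = [x]
insert x (y : ys)
  | x < y     = x : y : ys
  | otherwise = y : insert x ys  -- the recursive call that was missing

-- Insertion sort: insert the elements one by one, starting from [].
isort :: Ord a => [a] -> [a]
isort = foldr insert []

-- The element at the middle position of the sorted list.
-- Like the original, this is undefined for the empty list.
median :: Ord a => [a] -> a
median xs = isort xs !! (length xs `div` 2)
```

With this version, median [4,2,3,5,1] gives 3 and isort returns the ten-element example list in ascending order.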

Summary
• The observe function provides us with a wealth of information about how programs are evaluated, with only small changes to the programs themselves.
• That information can help us understand how programs work (foldr, foldl, takeWhile etc.).
• It can also help us see where bugs are.

Testing
Testing means trying out a program which is believed to work, in an effort to gain confidence that it is correct.
Testing accounts for more than half the development effort on a large project (I’ve heard figures anywhere from 50% to 80%).
Fixing a bug in one place often causes a failure somewhere else -- so the entire system must be retested after each change. At Ericsson, this can take three months!

"Hacking" vs Systematic Testing
"Hacking"
• Try some examples until the software seems to work.
Systematic testing
• Record test cases, so that tests can be repeated after a modification (regression testing).
• Document what has been tested.
• Establish criteria for when a test is successful -- requires a specification.
• Automate testing as far as possible, so you can test extensively and often.

QuickCheck: A Tool for Testing Haskell Programs
Based on formulating properties, which
• can be tested repeatedly and automatically
• document what has been tested
• define what is a successful outcome
• are a good starting point for proofs of correctness
Properties are tested by selecting test cases at random!

Random Testing?
Is random testing sensible? Surely carefully chosen test cases are more effective?
• QuickCheck can generate 100 random test cases in less time than it takes you to think of one!
• Random testing finds common (i.e. important!) errors effectively.
"By taking 20% more points in a random test, any advantage a partition test might have had is wiped out." (D. Hamlet)

A Simple QuickCheck Property
    prop_Sort :: [Int] -> Bool
    prop_Sort xs = ordered (sort xs)
    Main> quickCheck prop_Sort
    OK, passed 100 tests.
The property checks that the result of sort is ordered. Random values for xs are generated. The tests were passed.

Some QuickCheck Details
    import QuickCheck
    prop_Sort :: [Int] -> Bool
    prop_Sort xs = ordered (sort xs)
    Main> quickCheck prop_Sort
    OK, passed 100 tests.
• We must import the QuickCheck library.
• The type of a property must not be polymorphic.
• We give properties names beginning with "prop_" so we can easily find and test all the properties in a module.
• quickCheck is an (overloaded) higher order function!

A Property of insert
    prop_Insert :: Int -> [Int] -> Bool
    prop_Insert x xs = ordered (insert x xs)
    Main> quickCheck prop_Insert
    Falsifiable, after 4 tests:
    -2
    [5,-2,-5]
Whoops! This list isn’t ordered!

A Corrected Property of insert
    prop_Insert :: Int -> [Int] -> Property
    prop_Insert x xs = ordered xs ==> ordered (insert x xs)
    Main> quickCheck prop_Insert
    OK, passed 100 tests.
The result is no longer a simple Bool. Read ==> as "implies": if xs is ordered, then so is (insert x xs). QuickCheck discards test cases which are not ordered.
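The slides use ordered without showing its definition, so the sketch below assumes one. To stay runnable with the base library alone, it writes both versions of the property as plain Bool functions (with the implication spelled out as "not premise || conclusion") and checks them exhaustively over small inputs instead of randomly; propInsertNaive, propInsert, smallLists and holdsForSmall are hypothetical names for this illustration.

```haskell
-- Assumed definition: every element is <= its successor.
ordered :: Ord a => [a] -> Bool
ordered xs = and (zipWith (<=) xs (drop 1 xs))

-- The corrected insert from the debugging part of the lecture.
insert :: Ord a => a -> [a] -> [a]
insert x [] = [x]
insert x (y : ys)
  | x < y     = x : y : ys
  | otherwise = y : insert x ys

-- First attempt: claims the result is ordered for any input list.
propInsertNaive :: Int -> [Int] -> Bool
propInsertNaive x xs = ordered (insert x xs)

-- Corrected: only claims something when the input list is ordered.
propInsert :: Int -> [Int] -> Bool
propInsert x xs = not (ordered xs) || ordered (insert x xs)

-- Every list of length 0 to 3 with elements drawn from -2..2.
smallLists :: [[Int]]
smallLists = concatMap (\n -> sequence (replicate n [-2 .. 2])) [0 .. 3]

-- Exhaustive check over the small inputs above (not random testing).
holdsForSmall :: (Int -> [Int] -> Bool) -> Bool
holdsForSmall prop = and [prop x xs | x <- [-2 .. 2], xs <- smallLists]
```

Here holdsForSmall propInsert succeeds while holdsForSmall propInsertNaive fails, for example on x = 0 and xs = [2,1]. With QuickCheck installed, the corrected property would instead be written with ==> and passed to quickCheck, as on the slide.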

Using QuickCheck to Develop Fast Queue Operations
What we’re going to do:
• Explain what a queue is, and give slow implementations of the queue operations, to act as a specification.
• Explain the idea behind the fast implementation.
• Formulate properties that say the fast implementation is "correct".
• Test them with QuickCheck.

What is a Queue?
Join at the back. Leave from the front.
Examples
• Files to print
• Processes to run
• Tasks to perform

What is a Queue?
A queue contains a sequence of values. We can add elements at the back, and remove elements from the front.
We’ll implement the following operations:
• empty :: Queue a                   -- an empty queue
• isEmpty :: Queue a -> Bool         -- tests if a queue is empty
• add :: a -> Queue a -> Queue a     -- adds an element at the back
• front :: Queue a -> a              -- the element at the front
• remove :: Queue a -> Queue a       -- removes an element from the front

The Specification: Slow but Simple
    type Queue a = [a]
    empty = []
    isEmpty q = q==empty
    add x q = q++[x]
    front (x:q) = x
    remove (x:q) = q
Addition takes time depending on the number of items in the queue!

The Idea: Store the Front and Back Separately
Old: a b c d e f g h i j (fast to remove, slow to add)
New: a b c d e (front: fast to remove) and j i h g f (back: fast to add)
Periodically move the back to the front.

The Fast Implementation
    type Queue a = ([a],[a])
    flipQ ([],b) = (reverse b,[])
    flipQ (x:f,b) = (x:f,b)
    emptyQ = ([],[])
    isEmptyQ q = q==emptyQ
    addQ x (f,b) = (f,x:b)
    removeQ (x:f,b) = flipQ (f,b)
    frontQ (x:f,b) = x
flipQ makes sure the front is never empty when the back is not.

Relating the Two Implementations
What list does a "double-ended" queue represent?
    retrieve :: Queue a -> [a]
    retrieve (f, b) = f ++ reverse b
What does it mean to be correct?
• retrieve emptyQ == empty
• isEmptyQ q == isEmpty (retrieve q)
• retrieve (addQ x q) == add x (retrieve q)
• retrieve (removeQ q) == remove (retrieve q)
and so on.

Using Retrieve Guarantees Consistent Results
Example
       frontQ (removeQ (addQ 1 (addQ 2 emptyQ)))
    == front (retrieve (removeQ (addQ 1 (addQ 2 emptyQ))))
    == front (remove (retrieve (addQ 1 (addQ 2 emptyQ))))
    == front (remove (add 1 (retrieve (addQ 2 emptyQ))))
    == front (remove (add 1 (add 2 (retrieve emptyQ))))
    == front (remove (add 1 (add 2 empty)))

QuickChecking Properties
    prop_Remove :: Queue Int -> Bool
    prop_Remove q = retrieve (removeQ q) == remove (retrieve q)
    Main> quickCheck prop_Remove
    4
    Program error: {removeQ ([],sized_v1740 (instArbitrary_v1…
Removing from an empty queue!

Correcting the Property
    prop_Remove :: Queue Int -> Property
    prop_Remove q = not (isEmptyQ q) ==> retrieve (removeQ q) == remove (retrieve q)
    Main> quickCheck prop_Remove
    0
    Program error: {removeQ ([],[Arbitrary_arbitrary instArbitrary…
How can this be?

Making Assumptions Explicit
We assumed that the front of a queue will never be empty if the back contains elements! Let’s make that explicit:
    goodQ :: Queue a -> Bool
    goodQ ([],[]) = True
    goodQ (x:f,b) = True
    goodQ ([],x:b) = False
    prop_Remove q = not (isEmptyQ q) && goodQ q ==> retrieve (removeQ q) == remove (retrieve q)
NOW IT WORKS!

How Do We Know Only Good Queues Arise?
Queues are built by add and remove:
    addQ x (f,b) = (f,x:b)
    removeQ (x:f,b) = flipQ (f,b)
New properties:
    prop_AddGood x q = goodQ q ==> goodQ (addQ x q)
    prop_RemoveGood q = not (isEmptyQ q) && goodQ q ==> goodQ (removeQ q)

Whoops!
    Main> quickCheck prop_AddGood
    Falsifiable, after 0 tests:
    2
    ([],[])
    addQ x (f,b) = (f,x:b)
    removeQ (x:f,b) = flipQ (f,b)
See the bug?

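The Fix
The counterexample adds 2 to the empty queue ([],[]): addQ puts the new element in the back and leaves the front empty, which is not a good queue. addQ must restore the invariant with flipQ, just as removeQ does:
    addQ x (f,b) = flipQ (f,x:b)
    removeQ (x:f,b) = flipQ (f,b)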
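The pieces of the queue development can be put together in one file and checked without QuickCheck. The sketch below makes several assumptions that are not in the slides: the specification type is renamed SlowQueue so both versions can live side by side, explicit error cases and type signatures are added, isEmptyQ uses null so that no Eq constraint is needed, and the properties are plain Bool functions checked exhaustively over small queues (smallQueues and allPropertiesHold are hypothetical names).

```haskell
-- Specification: a queue is a plain list (slow add, easy to reason about).
type SlowQueue a = [a]

empty :: SlowQueue a
empty = []

add :: a -> SlowQueue a -> SlowQueue a
add x q = q ++ [x]

front :: SlowQueue a -> a
front (x : _) = x
front []      = error "front: empty queue"

remove :: SlowQueue a -> SlowQueue a
remove (_ : q) = q
remove []      = error "remove: empty queue"

-- Fast implementation: a front list and a reversed back list.
type Queue a = ([a], [a])

-- Restore the invariant: the front is empty only if the back is empty.
flipQ :: Queue a -> Queue a
flipQ ([], b) = (reverse b, [])
flipQ q       = q

emptyQ :: Queue a
emptyQ = ([], [])

isEmptyQ :: Queue a -> Bool
isEmptyQ (f, b) = null f && null b

addQ :: a -> Queue a -> Queue a
addQ x (f, b) = flipQ (f, x : b)  -- the fix: flipQ after adding

removeQ :: Queue a -> Queue a
removeQ (_ : f, b) = flipQ (f, b)
removeQ ([], _)    = error "removeQ: empty front"

frontQ :: Queue a -> a
frontQ (x : _, _) = x
frontQ ([], _)    = error "frontQ: empty front"

-- The list a fast queue represents.
retrieve :: Queue a -> [a]
retrieve (f, b) = f ++ reverse b

goodQ :: Queue a -> Bool
goodQ ([], _ : _) = False
goodQ _           = True

-- Properties as Bool functions; "p ==> q" is written "not p || q".
propAdd, propAddGood :: Int -> Queue Int -> Bool
propAdd x q     = retrieve (addQ x q) == add x (retrieve q)
propAddGood x q = not (goodQ q) || goodQ (addQ x q)

propRemove, propRemoveGood :: Queue Int -> Bool
propRemove q =
  not (not (isEmptyQ q) && goodQ q)
    || retrieve (removeQ q) == remove (retrieve q)
propRemoveGood q =
  not (not (isEmptyQ q) && goodQ q) || goodQ (removeQ q)

-- All pairs of lists of length 0 to 2 over {0,1}, good and bad alike.
smallQueues :: [Queue Int]
smallQueues = [(f, b) | f <- lists, b <- lists]
  where
    lists = concatMap (\n -> sequence (replicate n [0, 1])) [0 .. 2]

allPropertiesHold :: Bool
allPropertiesHold =
  and [propAdd x q && propAddGood x q | x <- [0, 1], q <- smallQueues]
    && all propRemove smallQueues
    && all propRemoveGood smallQueues
```

An exhaustive check over a handful of small queues is much weaker than QuickCheck's random testing over larger inputs, but it is enough to show that the corrected addQ keeps queues good; reverting addQ to (f, x:b) makes propAddGood fail on the empty queue, as in the counterexample.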

Looking Back
• Formulating properties let us define precisely how the fast queue operations should behave.
• Using QuickCheck found a bug, and revealed hidden assumptions which are now explicitly stated.
• The property definitions remain in the program, documenting exactly what testing found to hold, and providing a ready made test-bed for any future versions of the Queue library.
• We were forced to reason much more carefully about the program’s correctness, and can have much greater confidence that it really works.

Summary
• Testing is a major part of any serious software development.
• Testing should be systematic, documented, and repeatable.
• Automated tools can help a lot.
• QuickCheck is a state-of-the-art testing tool for Haskell.

The remaining slides discuss an important subtlety when using QuickCheck
