
new frontiers in formal software verification






Presentation Transcript


  1. new frontiers in formal software verification Gerard J. Holzmann gholzmann@acm.org

  2. Exploring the Solar System: 23 spacecraft and 10 science instruments (model checking, static analysis)
  Spacecraft: JASON 2, KEPLER, DAWN, WISE, CLOUDSAT, AQUARIUS, MRO, JUNO, VOYAGER 2, VOYAGER 1, CASSINI, EPOXI/DEEP IMPACT, STARDUST, ACRIMSAT, MARS ODYSSEY, GRAIL, SPITZER, JASON 1, GRACE, OPPORTUNITY, MSL, GALEX
  Instruments: Earth Science (ASTER, MISR, TES, MLS, AIRS), Planetary (MIRO, Diviner, MARSIS), Astrophysics (Herschel, Planck)

  3. formal software verification
  • after some ~30 years of development, it is still rarely used on industrial software
  • primary reasons:
  • it is (perceived to be) too difficult
  • it takes too long (months to years)
  • even in safety-critical applications, software verification is often restricted to the verification of models of software, instead of the software itself
  • goal: make software verification as simple as testing and as fast as compilation

  4. verification of the PathStar switch (flashback to 1999)
  • a commercial data/phone switch designed in Bell Labs research (for Lucent Technologies)
  • newly written code for the core call processing engine
  • the first commercial call processing code that was formally verified
  • 20 versions of the code were verified with model checking during development (1998-2000), with a fully automated procedure

  5. software structure
  • PathStar code (C): call processing (~30 KLOC, roughly 10% of the code) on top of a control kernel
  • basic call processing plus a long list of features (call waiting, call forwarding, three-way calling, etc.)
  • traditional hurdles: feature interaction, feature breakage, concurrency problems, race conditions, deadlock scenarios, non-compliance with legal requirements, etc.

  6. complex feature precedence relations: detecting undesired feature interaction is a serious problem

  7. the verification was automated in 5 steps:
  (1) code conversion of the PathStar call processing code
  (2) abstraction map
  (3) context definition
  (4) feature requirements
  (5) verification with Spin, producing message sequence charts and bug reports for property violations

  8. 1. code conversion

      ...
      @dial:  switch(op) {
              default:          /* unexpected input */
                      goto error;
              case Crdtmf:      /* digit collector ready */
                      x->drv->progress(x, Tdial);
                      time = MSEC(16000);   /* set timer */
              @:      switch(op) {
                      default:          /* unexpected input */
                              goto error;
                      case Crconn:
                              goto B@1b;
                      case Cronhook:    /* caller hangs up */
                              x->drv->disconnect(x);
                      @:      if (op != Crconn && op != Crdis)
                                      goto Aidle;
      ...etc...

  (figure: the implied PathStar state machine with control states dial, dial1, dial2, and error; each @ label in the code marks a control state, and each driver event marks a state change)
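  To make the conversion concrete, here is a hypothetical Promela rendering of the implied state machine above; the event names mirror the C fragment, but this is a sketch for illustration, not the actual Modex-generated model:

      /* hypothetical Promela sketch of the implied state machine;
         not the actual extracted PathStar model */
      mtype = { Crdtmf, Crconn, Cronhook, Crdis };
      chan evt = [0] of { mtype };

      active proctype env()            /* nondeterministic environment stub */
      {
              do
              :: evt!Crdtmf
              :: evt!Crconn
              :: evt!Cronhook
              :: evt!Crdis
              od
      }

      active proctype call()
      {
              mtype op;
      Dial:   evt?op;
              if
              :: op == Crdtmf -> goto Dial1    /* digit collector ready */
              :: else -> goto Error            /* unexpected input */
              fi;
      Dial1:  evt?op;
              if
              :: op == Crconn -> goto Idle     /* connected (goto B@1b in the C) */
              :: op == Cronhook -> goto Dial2  /* caller hangs up; disconnect */
              :: else -> goto Error
              fi;
      Dial2:  evt?op;
              if
              :: op != Crconn && op != Crdis -> goto Idle  /* Aidle in the C */
              :: else -> goto Dial                         /* remainder elided */
              fi;
      Error:  goto Dial;                       /* recovery elided in this sketch */
      Idle:   goto Dial                        /* loop back for this sketch */
      }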

  9. 2. defining abstractions
  • to verify the code we convert it into an automaton: a labeled transition system
  • the labels (transitions) are the basic statements from C
  • each statement can be converted via an abstraction, encoded as a lookup table that supports three possible conversions (illustrated in the sketch below):
  • relevant: keep (~60%)
  • partially relevant: map/abstract (~10%)
  • irrelevant to the requirements: hide (~30%)
  • the generated transition system is then checked against the formal requirements (in linear temporal logic) with the model checker
  • the program and the negated requirements are converted into ω-automata, and the model checker computes their intersection
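  As a hedged illustration of the three categories (with invented variable names, not the actual PathStar abstraction table), three statements from the dial handler might convert as follows:

      /* hypothetical illustration of keep / map / hide */
      mtype = { Tdial };
      mtype progress_sent;    /* abstract view of the progress-tone driver call */
      bool  timer_set;        /* the 16-second timer value reduced to one bit   */

      active proctype abstracted_fragment()
      {
              progress_sent = Tdial;  /* keep: x->drv->progress(x, Tdial) is
                                         directly relevant to the requirements */
              timer_set = true;       /* map:  time = MSEC(16000); the exact
                                         timeout value is irrelevant, so it is
                                         abstracted to a boolean */
              skip                    /* hide: a statement with no bearing on
                                         the requirements disappears */
      }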

  10. 3. defining the context
  (toolflow diagram: the C code and an abstraction map feed the model extractor; the extracted Spin model is combined with an environment model and a database of feature requirements; the model checker produces the bug reports)

  11. 4. defining feature requirements
  a sample property: "always when the subscriber goes offhook, a dialtone is generated"
  failure to satisfy this requirement: eventually (<>) the subscriber goes offhook, and (/\) thereafter (X) no dialtone (!dialtone) is generated until (U) the next onhook
  in Linear Temporal Logic this is written: <> (offhook /\ X (!dialtone U onhook))
  the formula is mechanically converted into an ω-automaton (the never claim used by the model checker)
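  As an illustration, the negated property can be encoded by hand as a Spin never claim; the propositions offhook, dialtone, and onhook are hypothetical stand-ins for macros over the extracted model's state, and the stub process merely closes the model so the sketch is runnable:

      /* hypothetical propositions; in the real setup these are
         defined over the extracted model's state variables */
      bool offhook  = false;
      bool dialtone = false;
      bool onhook   = true;

      active proctype stub()   /* stand-in for the extracted call model */
      {
              do
              :: atomic { offhook = true; onhook = false }
              :: atomic { onhook = true; offhook = false; dialtone = false }
              :: dialtone = true
              od
      }

      never {  /* <> (offhook && X (!dialtone U onhook)) */
      T0:     do
              :: true                   /* the witnessing offhook may come later */
              :: offhook -> goto T1     /* guess that the violation starts here */
              od;
      T1:     do
              :: !dialtone && !onhook   /* no dialtone yet, still offhook */
              :: onhook -> goto accept_all
              od;
      accept_all:
              skip                      /* claim matches: property violated */
      }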

  12. 5. verification
  (toolflow diagram: the C code, the abstraction map, and the environment model feed the model extractor; the LTL requirement is logically negated; the model checker then produces a bug report, e.g. "* no dialtone generated")
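  Once the model has been extracted, the mechanical part of such a check runs along the following lines; this is a generic Spin workflow rather than the exact PathStar scripts, and model.pml is an assumed file name:

      $ spin -a model.pml      # generate the verifier pan.c, never claim included
      $ cc -O2 -o pan pan.c    # compile the verifier
      $ ./pan -a               # search for acceptance cycles (property violations)
      $ spin -t -p model.pml   # replay the error trail as a message sequence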

  13. hardware support (1999)
  (diagram: a client/server architecture over sockets distributes the verification code and scripts across the networked machines)

  14. iterative search refinement
  each verification task is run multiple times, with increasing accuracy (figure: successive passes bounded at t=5 min., t=15 min., and t=40 min.)
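  As a hedged sketch of the idea using standard Spin options (not the original 1999 scripts): successive bitstate runs are given larger hash arrays and depth bounds, so the cheap early passes report easy bugs quickly and the later passes add accuracy:

      $ spin -a model.pml
      $ cc -O2 -DBITSTATE -o pan pan.c
      $ ./pan -w22 -m10000     # pass 1: small hash array, shallow search, fast
      $ ./pan -w26 -m50000     # pass 2: larger hash array, deeper search
      $ ./pan -w30 -m200000    # pass 3: highest accuracy of the three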

  15. performance (1999): "bugs per minute"
  (graph: percent of all bugs reported, and number of bugs reported, against minutes since the start of the check; the first bug report arrives in 2 minutes, 15 bug reports (25% of all bugs) in 3 minutes, and 50% of all bugs are reported in 8 minutes)

  16. that was 1999, can we do better now?
  • 1999: 16 networked computers running the plan9 operating system, 500 MHz clock speed (~8 GHz equiv), 16x128 MB of RAM (~2 GB equiv)
  • 2012: one 32-core off-the-shelf system running standard Ubuntu Linux (~$4K USD), 2.5 GHz clock speed (~80 GHz equiv), 64 GB of shared RAM
  • difference: approx. 10x faster, and 32x more RAM
  • does this change the usefulness of the approach?

  17. performance in 2012: "bugs per second"
  (graph: number of bugs found against seconds since the start of the check, comparing the 2012 system (32-core PC, 64 GB RAM, 2.5 GHz per core, Ubuntu Linux) with the 1999 system (16 PCs, 128 MB per PC, 500 MHz, Plan9 OS); in 2012 there are 11 bug reports after 1 second and 50% of all bugs (38 bugs) within 7 seconds, versus roughly 10 minutes in 1999)

  18. side-by-side comparison
  • 1999 (16-CPU networked system): 25% of all bugs reported in 180 seconds (15 bugs); 50% of all bugs reported in 480 seconds (30 bugs)
  • 2012 (1 desktop PC): 15% of all bugs reported in 1 second (11 bugs); 50% of all bugs reported in 7 seconds (38 bugs)

  19. generalization: swarm verification
  • goal: leverage the availability of large numbers of CPUs and/or CPU cores
  • if an application is too large to verify exhaustively, we can define a swarm of verifiers, each of which tries to check a different, randomly chosen part of the code (see the sketch below):
  • using different hash polynomials in bitstate hashing, and different numbers of polynomials
  • using different search algorithms and/or search orders
  • use iterative search refinement to dramatically speed up error reporting ("bugs per second")
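  As a hedged illustration with standard pan options (the values are illustrative, and the swarm front-end on the configuration slide below generates such run scripts automatically), two swarm members might be diversified like this:

      $ spin -a model.pml
      $ cc -O2 -DBITSTATE -DP_RAND -o pan pan.c   # randomized process ordering
      $ ./pan -w28 -h3  -k2 -RS7   &   # member 1: hash function 3, 2 bits per state
      $ ./pan -w28 -h11 -k4 -RS123 &   # member 2: different hash choice and seed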

  20. swarm search

  21. swarm configuration file:

      # range
      k 1 4            # min and max nr of hash functions
      # limits
      depth 10000      # max search depth
      cpus 128         # nr available cpus
      memory 64MB      # max memory to be used; recognizes MB,GB
      time 1h          # max time to be used; h=hr, m=min, s=sec
      vector 512       # bytes per state, used for estimates
      speed 250000     # states per second processed
      file model.pml   # the spin model
      # compilation options (each line defines a search mode)
      -DBITSTATE                    # standard dfs
      -DBITSTATE -DREVERSE          # reversed process ordering
      -DBITSTATE -DT_REVERSE        # reversed transition ordering
      -DBITSTATE -DP_RAND           # randomized process ordering
      -DBITSTATE -DT_RAND           # randomized transition ordering
      -DBITSTATE -DP_RAND -DT_RAND  # both
      # runtime options
      -c1 -x -n

  the user specifies the number of cpus to use, the memory per cpu, and the maximum time; swarm is a front-end for spin (http://spinroot.com/swarm/):

      $ swarm -F config.lib -c6 > script
      swarm: 456 runs, avg time per cpu 3599.2 sec
      $ sh ./script

  22. many small jobs do the work of one large job, but much faster and in less memory
  (graph: states reached against the number of processes, on a log scale, for the DEOS O/S model; swarm search scales linearly and reaches 100% coverage with a swarm of 100 verifiers using 0.06% of the RAM each (8 MB), compared to a single exhaustive run (13 GB))

  23. tools mentioned; thank you!
  • Spin: http://spinroot.com (1989); parallel depth-first search added 2007; parallel breadth-first search added 2012
  • Modex: http://spinroot.com/modex (1999)
  • Swarm: http://spinroot.com/swarm (2008)
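  For reference, a hedged sketch of how the two parallel search modes are typically enabled in Spin (standard compile-time flags; an 8-core machine and a model in model.pml are assumed):

      $ spin -a model.pml
      $ cc -O2 -DNCORE=8 -o pan pan.c    # multi-core depth-first search
      $ ./pan
      $ cc -O2 -DBFS_PAR -o pan pan.c    # parallel breadth-first search (later Spin versions)
      $ ./pan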
