M&S Validation in DoD versus The Scientific Method. “Food For Thought,” …or an Imperative to Action? R.W. Eberth, 13 Nov 02.
 
                
M&S Validation in DoD versus The Scientific Method “Food For Thought,” …or an Imperative to Action? R.W. Eberth 13 Nov 02
Symptoms of a Growing Problem?
• For the last several years, in M&S V&V workshops, conferences, and symposia, and even in DMSO V&V TWG meetings, I’ve noted the terms “valid” and “validity” used less and less, and the words “credible” and “credibility” used more and more in their place
• At last month’s “Foundations ’02: A V&V Workshop,” sponsored by DMSO, administered by NTSA, and held at JHU/APL, two keynote speakers each addressed the importance of the “credibility” of their simulations (one was from the J9 MC02 enterprise, the other from NASA)
…a Growing Problem? (cont’d)
• At the conclusion of the keynotes, one member of the audience tossed out a set of statements and questions:
  • Had just checked that morning before driving to the meeting: the words “credible” and “credibility” appeared nowhere in DoD’s official lexicon for M&S
  • Where are those terms defined for M&S?
  • Is “credibility” a greater or lesser standard than validity, is one a pre-condition to having the other, or are they somehow parallel standards?
  • Regardless of the answers, is it possible to have a credible simulation that isn’t valid?
…a Growing Problem? (cont’d)
• The response from one of the program co-chairs was amazing. After first stating that they probably needed to tighten up their terminology somewhat, he acknowledged that certainly we have cases of models that are credible but not valid! *
• Thus some thoughts…
* Note: The term “credibility” is in fact defined within DMSO’s VV&A Recommended Practices Guide (RPG), although not in DoD’s official M&S Glossary: “The relevance that the user sees in a model and the confidence that the user has that a model or simulation can serve his purpose.” The origin for the definition is stated as the DMSO VV&A Technical Support Team, 1999-2001. The RPG then uses “credibility” as an overarching if subjective goal, with the VV&A processes feeding it.
“Some thoughts”
• For years, the thrust of VV&A in DoD has been to “validate,” to “accredit,” our models and simulations
• We have relied increasingly on SMEs to “face validate” our sims, and have rather carefully chosen those SMEs
• Selection and use of SMEs for validation has become a pseudo-science of its own, spawning MORS and other professional papers and presentations
“Some thoughts” (cont’d)
• In short, the validation process in DoD often has become a drive to gather a body of evidentiary support for the credibility of a simulation, even when there’s little or no rigor to the evidence
• A cynic could even say that the validation process as practiced in DoD often appears designed to avoid uncovering any evidence that might refute or undermine that credibility
And a Glimmer of Light?
• At a workshop in Newport last May, a PhD from Brookings became a bit agitated when I asked him how he had validated a smallpox epidemiology simulation he had developed. It turned out he simply had no prior exposure to the topic of VV&A! As several people attempted to explain to him how validation worked in DoD, he blurted out, “But if you don’t have falsification, you have nothing!”
• My jaw dropped. He was right…and we have nothing.
• Sooo…What if…
Scientific Method
• Is really simple:
  • Observe something
  • Make a cause-and-effect hypothesis or theory about it
  • Gather the pertinent data (generally through objective experimentation)
  • Analyze the data in order to support or refute the hypothesis
• But it’s not really so simple…because the hypothesis must be falsifiable. It must be possible to find a piece of evidence that – if it exists – would refute (“falsify”) the hypothesis. And by definition, if it’s not falsifiable, it’s not scientific!
Scientific Method (cont’d)
• The beauty of falsifiability is that it’s often impossible to definitively prove a hypothesis to be true, but it’s often trivially simple to design experiments to falsify the original hypothesis, if it is in fact false.
• Consider any of the current crop of combat simulations. How could you definitively prove one to be sufficiently accurate for a future warfare scenario when the purpose of the study might be to assess the effectiveness of some future system or concept?
• Answer: you can’t, thus you use face validation – the “reasonable man” approach.
• Right?
Scientific Method (cont’d)
• Wrong. Well, at least partially wrong. Whenever we deal with future systems that have no predecessors, or completely new operational or tactical concepts, there may be a legitimate role for SMEs
Scientific Method (cont’d)
• How would we apply the Scientific Method to M&S? Trivially simple. An example: The null hypothesis is that Sim A is valid for Application X.
• The alternative hypothesis is even simpler: No, it’s not!
• Now let’s go try to disprove the null. In that kind of search, if thorough and objective, we will find things wrong – functionality problems and accuracy problems, even when accuracy criteria weren’t specifically established. But you don’t use SMEs. You don’t look at pretty pictures on a screen. You go into algorithms and sometimes even code. You may want or need to conduct experiments with the sim or portions of it to examine opaque areas of internal logic. And it’s still pretty easy! (Topic for another time.)
• At the end of the process, we may discard the sim for the intended purpose, or even completely. But if we keep it, even if we’ve had to fix problems along the way, we’ll have really strong evidence of the validity of the end product for the intended purpose.
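The slides leave the mechanics of "trying to disprove the null" for another time. As a rough illustration only, the Python sketch below shows one way such a search might be organized: each check pairs a run of the sim (or a portion of it) with a reference value from an independent source and an accuracy criterion fixed in advance. The names `Check`, `try_to_falsify`, and `travel_time`, and all of the numbers, are hypothetical and do not come from any DoD tool or from the original briefing.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Check:
    """One attempt to refute 'Sim A is valid for Application X' (hypothetical)."""
    name: str
    inputs: Dict[str, float]
    expected: float   # reference value from an independent source (assumed)
    tolerance: float  # accuracy criterion, fixed before the sim is run


def try_to_falsify(sim: Callable[..., float], checks: List[Check]) -> List[str]:
    """Return the names of the checks whose results refute the null hypothesis."""
    refuted = []
    for check in checks:
        result = sim(**check.inputs)
        if abs(result - check.expected) > check.tolerance:
            refuted.append(check.name)
    return refuted


# Hypothetical stand-in for one opaque piece of sim logic: travel time in hours.
def travel_time(distance_km: float, speed_kph: float) -> float:
    return distance_km / speed_kph


checks = [
    # Illustrative reference data: the sim should reproduce these within tolerance.
    Check("road march", {"distance_km": 100.0, "speed_kph": 40.0},
          expected=2.5, tolerance=0.1),
    Check("cross-country", {"distance_km": 60.0, "speed_kph": 20.0},
          expected=4.0, tolerance=0.1),
]

failures = try_to_falsify(travel_time, checks)
print(failures)
```

In this toy case the second check refutes the null, because the stand-in algorithm ignores whatever slowed the reference movement. A sim that survives every check is not thereby proven valid; it has only withstood the attempts made so far, which is the kind of evidence the last bullet describes.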