Part 6

Part 6 Case Studies

Course Outline Day 1 Day 2 • Part 0: Student Introduction • Paper Helicopter - Pt 0 Use what you know • Part 1: DOE Introduction • What is a Designed Experiment? • Part 2: Planning • Understand the test item’s process from start to finish • Identify Test Objective – screen, characterize, optimize, compare • Response variables • Identify key factors affecting performance • Paper Helicopter - Pt 1 Planning • Part 3: Hypothesis Testing • Random variables • Understanding hypothesis testing • Demonstrate a hypothesis test • Sample size, risk, and constraints • Seatwork Exercise 1 – Hang time Measurements • Part 4: Design and Execution • Understanding a test matrix • Choose the test space – set levels • Factorials and fractional factorials • Execution – randomization and blocking • In-Class F-18 LEX Study - Design Build • Paper Helicopter - Pt 2 Design for Power • Part 5: Analysis • Regression model building • ANOVA • Interpreting results – assess results, redesign and plan further tests • Optimization • In-Class F-18 LEX Study - Analysis • Seatwork Exercise 2 – NASCAR • Paper Helicopter – Pt 3 Execute and Analyze • Paper Helicopter – Pt 4 Multiple Response Optimization • Part 6: Case Studies

Science of Test IV: Metrics of Note • Plan Sequentially for Discovery: factors, responses, and levels • Design with Confidence and Power to Span the Battlespace: N, α, power, test matrices • Execute to Control Uncertainty: randomize, block, replicate • Analyze Statistically to Model Performance: model, predictions, bounds

A Proud Motto … But What Constitutes Adequate “Proof”? • 53W and 46 TW have a long, proud record of successful tests • GBU-15, GBU-24, EGBU-28, JDAM, SDB, F-15 Suite 4E/M, LAIRCM, and many more. • Clearly loads of good science in our tests • In line with AFSO21 – we want to do even better! • To raise the bar, introduce the “Science of Test” • Also known as “design of experiments” (DOE) • DOE principles guide following key test decisions: • How many trials to complete to control fielding risks • What factors to vary, across what range, how arranged • An objective, science-based analysis procedure (ANOVA) to separate noise from causal variation • Stopping rules when Proof is Adequate!

Vision: Science of Test … Design of Experiments (DOE) • Test design is not an art … it is a science • Test Business → Business of Test → Science of Test • Talented testers and evaluators, however … • Limited knowledge in test design … α, β, σ, δ, p, and n or N • Limited exposure to building run designs to efficiently explore multi-dimensional spaces • We make decisions too important to be left to professional opinion alone … our decisions should be based on mathematical fact • 53d Wg, AFOTEC, AFSOC, and AFFTC experience • Cheerleading DOE as a “better mousetrap” is not enough • Active leadership from organizational senior executives is key

B-1 Radar Mapping in One Mission (28th TS) Problem: Characterize accuracy of B-1 radar coordinates. Design: Response: absolute error. Conditions: angle, side of nose, tail number, target, range to target. Results: Similar accuracy throughout scan volume, across targets, and among tail numbers. Result: Single two-aircraft mission answered accuracy questions raised by 7 previous missions using conventional test methods. [Figure: Angular Error in Target Coordinates, B-1B Radar Mapping; angular error (mils) versus angle at 15 miles and 30 miles, left side and right side]

DOE Cut ESAMS Validation Sorties 80% (68th EWS) Problem: Compare aircraft-SAM engagements between M&S and live fly. Multi-organization planning group (AFSA, AFOTEC, 412 TW) called for 20 sorties. Design: Responses include tracking error, missile miss distance. Conditions include ECM, range, altitude, offset from threat, and threat type. Results: ESAMS a poor predictor of live fly performance. Proved performance in two sorties -- confirmed results with three more. ESAMS runs were similarly cut from more than 12,000 to several hundred. Result: Using only 2 of 10 planned sorties, team tested and described F-15 TEWS behavior against 2 Threats

DOE Saved 50% of U-2 Flying Hours (36th EWS) Problem: Supply RWR QRC capability to field. Design: Responses include standard RWR performance metrics (range, ID, response time). Conditions include angle, side of nose, old/new RWR, threat, and side of aircraft. Results: QRC fielded. With automated data reduction, analysis was complete 2-4 hours after data were received. Results showed one problem threat mode (resolved) and improved performance. Better-Faster-Cheaper: The U-2 team tested a QRC capability against seven threat systems in more depth than ever before … answers in hours, not weeks … with only 10 of 20 planned flight hours.

Case: AIM-9X Blk II Simulation Pk Predictions Test Objective: • Validate sim predictions with live shots to characterize performance against KPP • AIM-9X is a simulation-based acquisition • Live fire shots combine with sim to answer key capabilities • Nearly 20 variables: target, countermeasures, background, velocity … • Block II more complex still! • Results: • In work, but promising … • DOE Approach: • Data mine existing “520 set” to ID critical parameters affecting Pk (and those that are not) • Propose (possible) reduced grid granularity • Propose (certainly) reduced replicates per shot • Objective: reduce simulation CPU workload by 50-90% [Figure: notional 520 grid versus notional DOE grid]

Some Space-Filling 3D Designs • 50-point 10x10x4 Latin hypercube design • 10-point 10x10x4 sphere-packing design • 27-point 3x3x3 classical DOE factorial design • 30-point 10x10x4 uniform design
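A minimal sketch of two of the design types named on this slide, using only the Python standard library. The function names, the seed, and the unit-cube scaling are assumptions for illustration; the slide's own designs were presumably built in a DOE package and scaled to the 10x10x4 region.

```python
import itertools
import random


def full_factorial(levels_per_factor):
    """Classical factorial: every combination of level indices."""
    return list(itertools.product(*[range(n) for n in levels_per_factor]))


def latin_hypercube(n_points, n_factors, seed=0):
    """Random Latin hypercube on the unit cube.

    Each factor's range is cut into n_points equal strata and every
    stratum is used exactly once per factor.
    """
    rng = random.Random(seed)  # fixed seed (assumption) for repeatability
    columns = []
    for _ in range(n_factors):
        strata = list(range(n_points))
        rng.shuffle(strata)
        # Jitter each point uniformly inside its stratum.
        columns.append([(s + rng.random()) / n_points for s in strata])
    return list(zip(*columns))


factorial_27 = full_factorial([3, 3, 3])   # 3x3x3 classical factorial
lhs_50 = latin_hypercube(50, 3)            # 50-point space-filling design
```

The factorial puts all of its points on a 3-level grid, while the Latin hypercube gives each factor 50 distinct values, which is why space-filling designs are attractive when the response may vary in unexpected places.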

Summary • Remarkable savings possible, especially in M&S and HWIL • Questions are scientifically answerable (and debatable) • We have run into 0.00 problems where we cannot apply principles • Point is that DOE is widely applicable, yields excellent solutions, and often leads to savings; in any case, we know why N is what it is • To our good science, add the Science of Test: DOE [Figure: logit fit to binary problem]

Case: CFD for NASA CEV Test Objective: • Select geometries to minimize total drag in ascent to orbit for NASA’s new Crew Exploration Vehicle (CEV) • Experts identified 7 geometric factors to explore, including nose shape • Down-selected parameters further refined in following wind tunnel experiments • Results: • Original CFD study envisioned 1556 runs • DOE optimized parameters in 84 runs, a 95% reduction • ID’d key interaction driving drag • DOE Approach: • Two designs, with 5 and 7 factors to vary • Covered elliptic and conic nose to understand factor contributions • Both designs supported first-order polynomial models with ability to detect nonlinearities • Designs also included additional confirmation points to confirm the empirical math model in the test envelope Source: A Parametric Geometry CFD Study Utilizing DOE, Ray D. Rhew, Peter A. Parker, NASA Langley Research Center, AIAA 2007-1616

Case: GWEF Large Aircraft IR Hit Point Prediction Test Objective: • IR man-portable SAMs pose threat to large aircraft in current AOR • Dept of Homeland Security desired hit point prediction for a range of threats, needed to assess vulnerabilities • Solution was HWIL study at GWEF (ongoing) • Results: • Revealed unexpected hit point behavior • Process highly interactive (rare 4-way) • Process quite nonlinear w/ 3rd order curves • Reduced required runs 80% relative to past studies • Possible reduction of another order of magnitude to 500-800 runs • DOE Approach: • Aspect – 0-180 degrees, 7 levels • Elevation – Lo, Mid, Hi, 3 levels • Profiles – Takeoff, Landing, 2 levels • Altitudes – 800, 1200, 2 levels • Including threat – 588 cases • With usual reps nearly 10,000 runs • DOE controls replication to minimum needed [Figure: IR missile; C-5 damage]
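The case counts on this slide can be checked with a short enumeration. This is a sketch under stated assumptions: the even 30-degree aspect spacing is assumed (the slide gives only the range and the count), and the number of threats (7) and replicates (about 17) are inferred from the slide's totals rather than stated on it.

```python
import itertools

# Factor levels from the slide; aspect spacing is an assumption.
factors = {
    "aspect_deg": [0, 30, 60, 90, 120, 150, 180],  # assumed even spacing
    "elevation": ["Lo", "Mid", "Hi"],
    "profile": ["Takeoff", "Landing"],
    "altitude": [800, 1200],
}

# Full factorial over the four geometry factors.
geometry_cases = list(itertools.product(*factors.values()))

N_THREATS = 7  # inferred: 588 cases / 84 geometry cases
total_cases = len(geometry_cases) * N_THREATS

# Replicates per case implied by "nearly 10,000 runs" (inferred).
implied_reps = 10_000 / total_cases

print(len(geometry_cases), total_cases, round(implied_reps, 1))
```

The enumeration makes the cost driver visible: the case count is fixed by the factor levels, so most of the run total comes from replication, which is the part the DOE approach controls.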

Case: Wind Tunnel X-31 AOA Test Objective: • Model nonlinear aero forces in moderate and high AoA conditions for X-31 • Develop classes of experimental designs suitable for efficiently modeling 3rd order behavior as well as interactions • Results: • Usual wind tunnel trials encompass 1000+ points; this was 104 • Revealed process is nonlinear (x³ terms), complex, and interactive • Predictions accurate to within 1% • DOE Approach: • Developed six alternate response surface designs • Compared and contrasted matrix properties including orthogonality, prediction variance, power, and confidence • Ran designs, built models, made predictions, and confirmed [Figure: response surface of CL versus δa and β]

Case: Secure SATCOM for F-15E Strike Eagle Test Objective: • To achieve secure A-G comms in Iraq and Afghanistan, install ARC-210 in fighters • Characterize P(comms) across geometry, range, freq, radios, bands, modes • Other ARC-210 fighter installs show problems – caution needed here! • Results: • For higher-power radio, all good • For lower-power radio, range problems • Despite urgent timeframe and canceled missions, enough proof to field • DOE Approach: • Use embedded face-centered CCD • Gives 5 levels of geometric variables across radios and modes • Speak 5 randomly generated words and score number correct • Each Xmit/Rcv is a test event; 4 missions planned
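One way to see how an embedded face-centered design yields 5 levels per variable is sketched below. It assumes the embedded design is an outer face-centered CCD at coded levels -1, 0, +1 with a second, inner face-centered CCD nested at half scale; that construction and the function names are assumptions for illustration, not taken from the slide.

```python
import itertools


def face_centered_design(k, scale=1.0):
    """Face-centered CCD in k coded factors: corners, face centers, center."""
    corners = [tuple(scale * s for s in signs)
               for signs in itertools.product((-1, 1), repeat=k)]
    axial = []
    for i in range(k):
        for s in (-1, 1):
            point = [0.0] * k
            point[i] = scale * s
            axial.append(tuple(point))
    center = [tuple([0.0] * k)]
    return corners + axial + center


def embedded_fcd(k):
    """Outer design at full scale plus an inner copy at half scale (assumed)."""
    points = []
    for p in face_centered_design(k, 1.0) + face_centered_design(k, 0.5):
        if p not in points:  # the shared center point is kept once
            points.append(p)
    return points


design = embedded_fcd(2)
levels = sorted({p[0] for p in design})
print(len(design), levels)
```

Under this assumption each factor takes the coded values -1, -0.5, 0, 0.5, and 1, which supports fitting higher-order curvature than a plain three-level face-centered design.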

Testing to Diverse Requirements: SDB II Shot Design Test Objective: • SPO requests help: is 46 shots the right N? • Power analysis: what can we learn? • Consider integrated test with AFOTEC • What are the variables? We do not know yet … • How can we plan? • What “management reserve”? Results: • 46 shots too few to check binary values +/- 15-20% • Binary Pacq: N=200+ • Demo laser + coords: N=4 each • Prove normal mode: N=32 DOE Approach: • Partition performance questions: Pacq/Rel + laser + coords + “normal mode” • Consider total test program: HWIL + captive + live • Build 3 custom, “right-size” designs to meet objectives/risks • 32-shot factorial screens 4-8 variables to 0.5 std dev shift from KPP • Integrate 20 AFOTEC shots for “Mgt Reserve” [Figure: power versus performance shift, with power goal marked]
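A rough way to see why binary (hit/miss) responses demand large N is to compute how tightly a proportion can be estimated from a given number of shots. The sketch below uses the normal-approximation confidence interval with the worst case p = 0.5 and 95% confidence; those choices and the function name are assumptions, and this is not necessarily the power analysis behind the slide's numbers.

```python
from math import sqrt
from statistics import NormalDist


def binomial_half_width(n, p=0.5, confidence=0.95):
    """Half-width of the normal-approximation CI for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


for n_shots in (46, 200):
    print(n_shots, round(binomial_half_width(n_shots), 3))
```

Under these assumptions, 46 shots pin a proportion down only to within roughly 14 percentage points, while 200 shots tighten that to about 7, which is broadly in line with the slide's point that binary measures need far more trials than continuous ones.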

Case: Ejector Rack Forces for MAU-12 Jettison with SDB Test Objective: • AF Seek Eagle to characterize forces to jettison rack with 0 to 4 GBU-39s remaining • Forces feed simulation for safe separation • Desire robust test across multiple fleet conditions • Stores certification depends on findings • Results: • Data-mined ’99 force data with regression • Modeled effects of temp, rack, CG, etc. • CG effect insignificant • Store weight, orifice, cart lot, temperature all significant • DOE Approach: • Multiple factors: rack S/N, temperature, orifice, SDB load-out, cart lot, etc. • AFSEO used innovative bootstrap technique to choose 15 racks to characterize rack-to-rack variation • Final designs in work, but promise 80% reduction from last characterization in FY99

Case: Reduce F-16 Ventral Fin Fatigue – Continuous Response Test Objective: • blah • Results: • New design invented circa 2005 capable of efficient flight envelope search • Suitable for loads, flutter, integration, acoustic, vibration – full range of flight test • Experimental designs can increase knowledge while dramatically decreasing required runs • A full DOE toolbox enables more flexible testing • DOE Approach: • Many alternate designs for this 5-dimensional space (α, β, Mach, altitude, jets on/off) [Figure: ventral fin; face-centered CCD and embedded F-CCD compared with the 162 test points chosen by an expert]

Case: Integration of Sim-HWIL-Captive-Live Fire Events Test Objective: • Most test programs face this: AIM-9X, JSF, SDB II, etc. • Multiple simulations of reality with increasing credibility but increasing cost • Multiple test conditions to screen for most vital to performance • How to strap together these simulations with prediction and validation? • DOE Approach: • Digital sims screen 15-20 variables with fractional factorials, predict performance • HWIL confirms digital prediction (validates model) and screens 8-12 factors; predicts • In live fly, confirm prediction (validate) and test 3-5 most vital variables • Prediction discrepancies offer chance to improve sims • Results: • Approach successfully used in 53d Wing EW Group • SIL labs at Eglin/PRIMES > HWIL on MSTE Ground Mounts > live fly (MSTE/NTTR) for jammers and receivers • Trimmed live fly sorties from 40-60 to 10-20 (typical) today [Figure: rising cost ($) and credibility across three tiers: 1000’s of digital mod/sim runs (15-20 factors, predict) > 100’s of HWIL or captive runs (8-12 factors, validate and predict) > 10’s of live shots (3-5 factors, validate)]
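To illustrate how a fractional factorial can screen 15 variables cheaply, the sketch below builds a saturated two-level design: 4 base factors form a 16-run full factorial, and each of the 11 remaining factors is assigned to the column of one interaction among the base factors. The function name is hypothetical, and the slide does not say which fractions the 53d Wing actually used.

```python
import itertools


def saturated_two_level_design(n_base):
    """2^n_base runs with one +/-1 column per nonempty subset of base factors."""
    base_runs = list(itertools.product((-1, 1), repeat=n_base))
    subsets = [combo
               for size in range(1, n_base + 1)
               for combo in itertools.combinations(range(n_base), size)]
    design = []
    for run in base_runs:
        row = []
        for subset in subsets:
            value = 1
            for i in subset:  # column = product of the chosen base columns
                value *= run[i]
            row.append(value)
        design.append(row)
    return design


design = saturated_two_level_design(4)  # 16 runs, 15 factor columns
print(len(design), len(design[0]))
```

Every column is balanced and orthogonal to every other, so 15 main effects can be estimated from 16 runs. The price is that this is a resolution III design: main effects are aliased with two-factor interactions, which is why follow-on tiers are needed to confirm what the screen finds.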

Acquisition: F-15E Strike Eagle Suite 4E+ (circa 2001-02) Test Objectives: • Qualify new OFP Suite for Strikes with new radar modes, smart weapons, link 16, etc. • Test must address dumb weapons, smart weapons, comm, sensors, nav, air-to-air, CAS, Interdiction, Strike, ferry, refueling… • Suite 3 test required 600+ sorties DOE Approach: • Build multiple designs spanning: • EW and survivability • BVR and WVR air to air engagements • Smart weapons captive and live • Dumb weapons regression • Sensor performance (SAR and TP) Results: • Vast majority of capabilities passed • Wrung out sensors and weapons deliveries • Dramatic reductions in usual trials while spanning many more test points • Moderate success with teaming with Boeing on design points (maturing) Source: F-15E Secure SATCOM Test, Ms. Cynthia Zessin, Gregory Hutto, 2007 F-15 OFP CTF 53d Wing / 46 Test Wing, Eglin AFB, Florida

Reliability Testing: How Much? How Well? Problem: • Replace old unrepairable widget with new one • Claim: new widget is more reliable • Limited number of items, limited test times • Experience shows more items fail sustainment than performance • DOT&E initiative to improve suitability testing • Poor sustainment leads to long logistics tails and high operating costs per hour DOE Approach: • Each failure distribution has an associated statistical power curve (the OC curve) • We can answer test scope/risk questions: • Q1: If we test for 1 x MTBF hours, what degradation can we hope to detect? A1: (* on graph) about 2.75 failures per period • Q2: If we desire to detect twice the claimed failure rate, how many hours must we test? A2: (# on graph) about 3 x MTBF hours • Assumes Poisson failures and exponentially distributed times between failures with common parameter λ failures per period; 80% power desired [Figure: power (OC) curves versus number of MTBF periods tested; example items: ceramic bearings, LCD screens, GPS/INS unit]
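The kind of calculation behind an OC curve can be sketched with an exact one-sided Poisson test. The slide does not state its significance level or test construction, so α = 0.05 and the function names below are assumptions, and the outputs need not match the values read off the slide's graph.

```python
from math import exp


def poisson_cdf(k, mean):
    """P(X <= k) for X ~ Poisson(mean)."""
    if k < 0:
        return 0.0
    term = exp(-mean)
    total = term
    for i in range(1, k + 1):
        term *= mean / i
        total += term
    return total


def critical_count(mean_null, alpha):
    """Smallest failure count c with P(X >= c | null) <= alpha."""
    c = 0
    while 1.0 - poisson_cdf(c - 1, mean_null) > alpha:
        c += 1
    return c


def power(periods, rate_ratio, alpha=0.05):
    """Chance of rejecting the claimed rate (1 failure per MTBF period)
    when the true failure rate is rate_ratio times the claim."""
    mean_null = periods * 1.0
    c = critical_count(mean_null, alpha)
    return 1.0 - poisson_cdf(c - 1, mean_null * rate_ratio)


for periods in (1, 3, 10):
    print(periods, round(power(periods, rate_ratio=2.0), 3))
```

Sweeping `periods` or `rate_ratio` traces out the trade the slide describes: longer tests detect smaller degradations, and a short test can only detect a large one.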

Munitions Aging Test: Worn Out? Problem: • Old GAU-8 30mm training rounds aged beyond 5-year shelf life (circa 1997 Ogden problem) • Concern is age-related deterioration in response of primer to electronic firing signal • Y variable: response time, the time from signal to round leaving barrel • Are the lots of ammo safe to fire? Or de-mil? DOE Approach: • Engineers showed old Mann-barrel data of typical performance • Passing a degraded lot exposes A-10 to flight damage • Destroying a good lot wastes some money • The β error is more important: held to <0.001 • N = 600 rounds to fire (conservative) • Null world: round response 3.5 ms; alternate world: response 4.0 ms • α: 0.05, power: 0.999, δ: 0.5 ms [Figure: null and alternate response-time distributions with hard spec limit marked; A-10 GAU-8 gun and 30mm TP rounds]
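The link between α, power, δ, and N on this slide can be sketched with the standard sample-size formula for a one-sided, one-sample z-test, n = ((z_α + z_β) σ / δ)². The slide does not give the response-time standard deviation σ, so the σ values below are purely hypothetical and the sketch does not attempt to reproduce N = 600.

```python
from math import ceil
from statistics import NormalDist


def sample_size(delta, sigma, alpha=0.05, power=0.999):
    """Rounds needed to detect a mean shift of delta (one-sided z-test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


# delta = 0.5 ms from the slide; sigma values are hypothetical.
for sigma_ms in (0.5, 1.0, 2.0):
    print(sigma_ms, sample_size(delta=0.5, sigma=sigma_ms))
```

Because N grows with the square of σ/δ, a conservative (large) estimate of the spread drives the round count up quickly, which is consistent with the slide calling its N conservative.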

Case: eTOLD • System/Problem: • C-130 APC/ETOLD: C-130 Aircraft Performance Calculator Electronic Takeoff and Landing Data • Calculates C-130 aircraft performance data • Perform a qualification test and evaluation (QT&E) of the C-130 APC by ensuring calculated data match hand calculations using charts and lookup tables TE: 1Lt Damien Waden, 413FTS • Results/Impact: • This method provided mixed 2/3-way interaction coverage for the system within a number of cases feasible for the test team • The default approach would have been a random scheme of demos; factor covering arrays (FCAs) added rigor and structure to testing • Lessons learned led to development of ‘augmenting’ FCAs as an approach to handling severe nesting in use cases • DOE Approach: • Used factor covering arrays to exercise sub-configurations of the system variables using parameterized boundary values • Testers exercised the software such that, for a selected test sub-configuration from the factor covering array, the logical boundaries were interrogated

Case: SADL • System/Problem: • The Situation Awareness Data Link (SADL) integrates USAF close air support aircraft with the digitized battlefield via the US Army's Enhanced Position Location Reporting System (EPLRS). • Software-intensive testing on the SADL 11z firmware release dictates tests with multiple runs and complex factors TE: 1Lt John Reed, 46TS/OGEJ • Results/Impact: • Maintained fault-finding capability while effecting an 80% reduction in test cases • Allowed the test team to integrate the remaining test set with other portions of the test to achieve even fewer runs • Constraint issue resulted in development of ‘FCA splitting’ for constraint handling with NIST ACTS software for future C4ISR testing • DOE Approach: • Needed to reduce number of runs because of the large amount of time required to configure each run; had to include complex constraints for valid configurations • Constructed 33-case covering array to detect faults with 100% coverage of 3-factor interactions
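The idea of a covering array can be sketched with a simple greedy construction: keep adding the test case that covers the most not-yet-covered t-way level combinations until none remain. This is an illustrative stand-in, not the algorithm used by NIST ACTS, it ignores constraints, and the factor counts and levels below are hypothetical.

```python
import itertools


def required_tuples(levels, t):
    """All (columns, values) combinations a strength-t array must cover."""
    needed = set()
    for cols in itertools.combinations(range(len(levels)), t):
        for vals in itertools.product(*[range(levels[c]) for c in cols]):
            needed.add((cols, vals))
    return needed


def row_tuples(row, t):
    """The t-way combinations exercised by one test case."""
    return {(cols, tuple(row[c] for c in cols))
            for cols in itertools.combinations(range(len(row)), t)}


def greedy_covering_array(levels, t):
    """Greedy strength-t covering array (small problems only)."""
    needed = required_tuples(levels, t)
    candidates = list(itertools.product(*[range(n) for n in levels]))
    rows = []
    while needed:
        best = max(candidates, key=lambda r: len(row_tuples(r, t) & needed))
        rows.append(best)
        needed -= row_tuples(best, t)
    return rows


levels = [3, 3, 3, 3]                 # hypothetical: 4 factors, 3 levels each
array = greedy_covering_array(levels, t=2)
print(len(array), "cases instead of", 3 ** 4)
```

Every pair of levels across every pair of factors appears in at least one case, yet the array is far smaller than the 81-case full factorial. Raising `t` to 3 gives the 3-factor interaction coverage the slide describes, at the cost of more cases.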

Case: JMS (in progress) • System/Problem: • Joint Space Operations Center (JSpOC) Mission System (JMS) provides information to the Joint Functional Component Commander for Space (JFCC Space) and other C2 and support elements to enable the command and control of space forces • Ensure system meets contract functional requirements; demonstrate DOE for part TE: Barry Graham, 46TS/OGEA • DOE Approach: • Exploring feasibility of using covering arrays or space filling algorithms to test goodness of tracking algorithms across satellite catalog • Results/Impact: • In progress • Goal is to demonstrate utility of DOE for AOC-type, software-intensive weapon systems testing

Summary • By the nature of Test, we face inescapable risks in getting it right • Fielding a poor performing system • Failing to field a good system • Risks are knowable -- we want to quantify and manage these risks • DOE is not just “Statistical Queep” • DOE is a powerful test strategy: 12 steps in 4 blocks • DOE is a test design tool • DOE supplies the tool to explore multiple conditions while retaining power and confidence to get the right answer • DOE will help AAC put the Capital “E” back in “T&e” • Following these practices gives the Best tests for our resources • Partnering with 46 TW, 53d Wing, AFOTEC, AFFTC and others to implement DOE at AAC
