Automatically Extracting Configuration Constraints

Automatically Extracting Configuration Constraints. Sarah Nadi*, Thorsten Berger*, Christian Kästner+, and Krzysztof Czarnecki*. Product Line Engineering Workshop, Univ. of Waterloo, Dec. 9th 2013

Variability in real life Command navigation package Premium Package Comfort Seats, Front Black Leatherette • Integrated Garage Door Opener • Electronic Compass • Black Dacota Leather • Automatic Trunk Selection will result in: + Addition of premium package - Removal of Black Leatherette

Handling variability Build Independently Clone & Own Share Assets Software Product Lines (SPL) Product configuration Variability modeling Components DSLs Generators Preprocessors Design patterns … [Dubinsky et al., CSMR '13] Slide credits: T. Berger

But.. How can we build an SPL? P1 P2 P3 Build from scratch X … Migrate Expensive and not always possible

What is involved in migrating? Identify Configurable Features Detect Variation Points Identify Feature Dependencies Refactor Code Create Variability Model Refactor Architecture [She et al., ICSE '11]

Variability model constraints Hierarchy Constraint: MP3 => Media Cross-tree Constraint: Camera => High Res. [Benavides et al., 2010]

Can we automatically extract constraints? To what extent?

Scope • C based systems using conditional compilation • Focus on build-time variability • Identify two sources of constraints

1. Conditional build-time errors Specification 1: Every valid configuration of the system must not contain build-time errors.

Pre-processor Error if ASH && NOMMU Invalid Configuration Constraint: ASH => !NOMMU Parser Error if ASH && EDITING && !MAX_LEN Type Error if ASH && EDITING_VI && MAX_LEN && !EDITING
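The step from an error condition to a constraint can be checked mechanically. The sketch below is an illustration only, not FarCE or TypeChef code: it treats the preprocessor error condition from this slide as a Boolean function and confirms, over all configurations of the two features, that the stated constraint is exactly the negation of that condition. The helper names are hypothetical.

```python
from itertools import product

FEATURES = ["ASH", "NOMMU"]

def error_condition(cfg):
    # Condition under which the preprocessor error fires (taken from the slide).
    return cfg["ASH"] and cfg["NOMMU"]

def constraint(cfg):
    # ASH => !NOMMU, written as (not ASH) or (not NOMMU).
    return (not cfg["ASH"]) or (not cfg["NOMMU"])

def all_configs(features):
    # Enumerate every Boolean assignment of the given features.
    for values in product([False, True], repeat=len(features)):
        yield dict(zip(features, values))

# The constraint holds exactly in the configurations that avoid the error.
equivalent = all(
    constraint(cfg) == (not error_condition(cfg))
    for cfg in all_configs(FEATURES)
)
print(equivalent)
```

The same negation applies to the parser and type error conditions on this slide; each yields one constraint that every valid configuration must satisfy.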

2. Feature effect • Avoid meaningless configurations which do not add/remove parts of the code • If we add/remove a feature, we want to get different functionality • Determine under which configurations a feature has an effect on the code Specification 2: Every valid configuration should yield a lexically different program

Feature effect MAX_LEN && EDITING && ASH MAX_LEN && EDITING_VI && ASH MAX_LEN => ASH && (EDITING || EDITING_VI)
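One way to read this slide: MAX_LEN has an effect in a configuration if toggling it changes whether at least one code block is included. The sketch below assumes that reading (the slides do not spell out the computation) and checks that the resulting constraint, MAX_LEN => effect, matches the formula on the slide. The two presence conditions are the ones shown above; the function names are hypothetical.

```python
from itertools import product

FEATURES = ["MAX_LEN", "EDITING", "EDITING_VI", "ASH"]

# Presence conditions of the two code blocks shown on the slide.
def pc1(c):
    return c["MAX_LEN"] and c["EDITING"] and c["ASH"]

def pc2(c):
    return c["MAX_LEN"] and c["EDITING_VI"] and c["ASH"]

def has_effect(c, feature):
    # The feature has an effect if switching it on or off changes
    # the inclusion of at least one block (assumed definition).
    on = dict(c, **{feature: True})
    off = dict(c, **{feature: False})
    return any(pc(on) != pc(off) for pc in (pc1, pc2))

def derived(c):
    # MAX_LEN => (MAX_LEN has an effect on the code)
    return (not c["MAX_LEN"]) or has_effect(c, "MAX_LEN")

def stated(c):
    # MAX_LEN => ASH && (EDITING || EDITING_VI), as written on the slide.
    return (not c["MAX_LEN"]) or (c["ASH"] and (c["EDITING"] or c["EDITING_VI"]))

configs = [dict(zip(FEATURES, v)) for v in product([False, True], repeat=len(FEATURES))]
matches = all(derived(c) == stated(c) for c in configs)
print(matches)
```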

Extract constraints by brute force P1 P2 P3 … Pn Build Individually Not scalable: 2^n combinations. If every configuration with feature X compiles except when Y is also selected: X => !Y. If every configuration with feature Z does not change the selected code except if W is also selected: Z => W
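The brute-force idea can be sketched on a toy system. Everything here is hypothetical: `compiles` stands in for actually building a configuration, and the three features are invented for illustration. The loop enumerates all 2^n configurations, which is exactly what makes the approach infeasible for systems with hundreds or thousands of features.

```python
from itertools import product

FEATURES = ["X", "Y", "Z"]

def compiles(cfg):
    # Hypothetical build oracle: the toy system fails to build
    # only when X and Y are selected together.
    return not (cfg["X"] and cfg["Y"])

def infer_exclusions(features, oracle):
    # Enumerate all 2^n configurations of the feature set.
    configs = [
        dict(zip(features, values))
        for values in product([False, True], repeat=len(features))
    ]
    found = []
    for a in features:
        for b in features:
            if a == b:
                continue
            with_a = [c for c in configs if c[a]]
            # Report a => !b if every configuration with a compiles
            # except when b is also selected.
            if all(oracle(c) == (not c[b]) for c in with_a):
                found.append((a, b))
    return found

exclusions = infer_exclusions(FEATURES, compiles)
for a, b in exclusions:
    print(f"{a} => !{b}")
```

On this toy oracle the loop reports the exclusion in both directions (X => !Y and Y => !X), since the two formulas are logically equivalent.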

Extracting constraints in a single pass • Make use of variability-aware parsing & type checking to mimic build-time behaviour. [Kästner et al., OOPSLA '11]

Infrastructure • Developed tool • FarCE – Feature Constraints Extraction https://bitbucket.org/tberger/farce • Rely on previous work: • TypeChef – Type-Checking Ifdef Variability https://github.com/ckaestne/TypeChef • KBuildMiner http://code.google.com/p/variability/wiki/PresenceConditionsExtraction • To analyze variability models: • LVAT https://code.google.com/p/linux-variability-analysis-tools • CDLTools https://bitbucket.org/tberger/cdltools

Empirical study • Objectives: • O1: Evaluate accuracy and scalability of extraction • O2: Quantitatively and qualitatively study kinds of (extractable) constraints in real-world systems • Used four systems with existing variability models • Compare extracted constraints to existing hierarchy & crosstree edges in the model
Systems studied:
uClibc: 1,628 C files, 367 features
BusyBox: 535 C files, 844 features
eCos: 579 C files, 1,254 features
Linux Kernel: 7,691 C files, 6,559 features

Is the extraction accurate? Specification 1 is 95% accurate Specification 2 is 76% accurate

Which constraints are recovered? Hierarchy edges reflected in code nesting Crosstree edges prevent build-time errors Can automatically recover 23% of variability model constraints! 5% by Specification 1 17% by Specification 2

What about constraints not found? • Qualitative analysis of 144 unrecovered constraints
Breakdown: 29% unknown, 21% additional analyses, 19% limitation in comparison, 19% configurator-related, 9% domain knowledge, 3% limitation in extraction
Observations: Manual analysis of constraints is hard! Some constraints are non-technical & need expert knowledge. Qualitative analysis is subjective.

Challenges • Presence conditions and constraints explode • Limit complexity of constraints used • Use non-SAT based constraint combination techniques • Different ways to compare constraints • Our comparison is limited to binary constraints • Other techniques which may be used? • Understand the intent of different constraints without interviewing developers

Extracting configuration constraints from code Automatically Done Conditional Build-time errors & Feature Effect Accurate (95% & 76% respectively) Extracts substantial parts of VM An expert may still be needed (Avg. 23% & up to 65%) Questions? Sarah Nadi (finishing soon … interested in a post doc) snadi@uwaterloo.ca http://swag.uwaterloo.ca/~snadi

Pre-processor Error if ASH && NOMMU Invalid Configuration Constraint: ASH => !NOMMU Linker Error if ASH && EDITING && !INIT Parser Error if ASH && EDITING && !MAX_LEN Type Error if ASH && EDITING_VI && MAX_LEN && !EDITING

Constraint formulas: preprocessor, parser, and type-checking constraints; linker constraints; feature effect. [Figure label: conditional symbol table]

How are constraints used? • Hierarchy edges are mainly reflected in how features are used/nested in the code (Spec 2: feature effect analysis) • Cross tree edges are often used to prevent build-time errors (Spec 1: conditional build-time errors)

Partial pre-processor (lexer) Slide credits: C. Kästner

Parser Slide credits: C. Kästner

Is the analysis scalable? Can analyze Linux files in 12 hours with parallelization
