This paper discusses methods for automatically extracting configuration constraints from software product lines (SPLs). It highlights the importance of identifying configurable features and variation points, as well as the challenges in maintaining accurate variability models. The authors present empirical studies evaluating the accuracy and scalability of their extraction tool, demonstrating its capability to recover a significant portion of variability model constraints. The study also addresses complexities in constraints and the necessity for expert knowledge in some cases.
Automatically Extracting Configuration Constraints Sarah Nadi*, Thorsten Berger*, Christian Kästner+, and Krzysztof Czarnecki* Product Line Engineering Workshop, Univ. of Waterloo Dec. 9th 2013
Variability in real life Command navigation package Premium Package Comfort Seats, Front Black Leatherette • Integrated Garage Door Opener • Electronic Compass • Black Dacota Leather • Automatic Trunk Selection will result in: + Addition of Premium Package - Removal of Black Leatherette
Handling variability Build Independently Clone & Own Share Assets Software Product Lines (SPL) Product configuration Variability modeling Components DSLs Generators Preprocessors Design patterns … [Dubinsky et al., CSMR '13] Slide credits: T. Berger
But... how can we build an SPL? P1 P2 P3 Build from scratch X … Migrate Expensive and not always possible
What is involved in migrating? Identify Configurable Features Detect Variation Points Identify Feature Dependencies Refactor Code Create Variability Model Refactor Architecture [She et al., ICSE‘11]
Variability model constraints Hierarchy Constraint: MP3 => Media Cross-tree Constraint: Camera => High Res. [Benavides et al., 2010]
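The two constraints on this slide can be read as implications that a configuration either satisfies or violates. A minimal sketch, assuming a configuration is just the set of selected feature names; the function `violated` and the identifier `HighRes` (standing in for the slide's "High Res.") are illustrative names, not part of any tool from the talk:

```python
# Constraints from the slide, written as (premise, conclusion) implications.
# "HighRes" is a made-up identifier for the slide's "High Res." feature.
CONSTRAINTS = [
    ("MP3", "Media"),       # hierarchy constraint: child implies parent
    ("Camera", "HighRes"),  # cross-tree constraint
]


def violated(selected):
    """Return the implications that the given feature selection breaks."""
    return [(a, b) for a, b in CONSTRAINTS if a in selected and b not in selected]


# Selecting MP3 together with its parent Media breaks nothing.
ok_selection = violated({"Media", "MP3"})

# Selecting Camera without High Res. breaks the cross-tree constraint.
bad_selection = violated({"Camera"})
```

Here `ok_selection` is empty, while `bad_selection` lists the broken Camera constraint.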
Can we automatically extract constraints? To what extent?
Scope • C based systems using conditional compilation • Focus on build-time variability • Identify two sources of constraints
1. Conditional build-time errors Specification 1: Every valid configuration of the system must not contain build-time errors.
Pre-processor Error if ASH && NOMMU Invalid Configuration Constraint: ASH => !NOMMU Parser Error if ASH && EDITING && !MAX_LEN Type Error if ASH && EDITING_VI && MAX_LEN && !EDITING
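To make the step from an error condition to a constraint concrete, the sketch below encodes the three error conditions from this slide as predicates and checks, by enumerating all configurations of these five features, that every error-free configuration satisfies ASH => !NOMMU. This is only an illustration under the assumption that these three conditions are the complete set of errors; it is not how the extraction tool works internally.

```python
from itertools import product

FEATURES = ["ASH", "NOMMU", "EDITING", "EDITING_VI", "MAX_LEN"]

# Error conditions copied from the slide; a configuration c maps feature -> bool.
ERRORS = {
    "preprocessor": lambda c: c["ASH"] and c["NOMMU"],
    "parser": lambda c: c["ASH"] and c["EDITING"] and not c["MAX_LEN"],
    "type": lambda c: (c["ASH"] and c["EDITING_VI"] and c["MAX_LEN"]
                       and not c["EDITING"]),
}


def all_configs():
    """Yield every assignment of True/False to the features above."""
    for values in product([False, True], repeat=len(FEATURES)):
        yield dict(zip(FEATURES, values))


def error_free(c):
    """Specification 1: a valid configuration triggers no build-time error."""
    return not any(cond(c) for cond in ERRORS.values())


def ash_constraint(c):
    """The constraint from the slide: ASH => !NOMMU."""
    return (not c["ASH"]) or (not c["NOMMU"])


# The constraint holds in every configuration that builds without errors.
holds_in_all_valid = all(ash_constraint(c) for c in all_configs() if error_free(c))

# Negating the preprocessor error condition gives exactly this constraint.
same_as_negated_error = all(
    ash_constraint(c) == (not ERRORS["preprocessor"](c)) for c in all_configs()
)
```

Both checks evaluate to `True`: the constraint is simply the negation of the condition under which the error occurs.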
2. Feature effect • Avoid meaningless configurations which do not add/remove parts of the code • If we add/remove a feature, we want to get different functionality • Determine under which configurations, a feature has an effect on the code Specification 2: Every valid configuration should yield a lexically different program
Feature effect MAX_LEN && EDITING && ASH MAX_LEN && EDITING_VI && ASH MAX_LEN => ASH && (EDITING || EDITING_VI)
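One way to read this slide: MAX_LEN has an effect exactly when toggling it changes which code blocks are included. The sketch below assumes the two conditions on the slide are the presence conditions of the only blocks that mention MAX_LEN, and it formalizes "has an effect" as "some presence condition changes value when the feature is flipped". That formalization and the name `feature_effect` are assumptions made for illustration.

```python
from itertools import product

FEATURES = ["ASH", "EDITING", "EDITING_VI", "MAX_LEN"]

# Presence conditions of the two code blocks shown on the slide.
BLOCKS = [
    lambda c: c["MAX_LEN"] and c["EDITING"] and c["ASH"],
    lambda c: c["MAX_LEN"] and c["EDITING_VI"] and c["ASH"],
]


def feature_effect(feature, c):
    """True if flipping `feature` in configuration c includes or excludes a block."""
    on = dict(c, **{feature: True})
    off = dict(c, **{feature: False})
    return any(pc(on) != pc(off) for pc in BLOCKS)


def slide_condition(c):
    """Right-hand side of the slide's constraint: ASH && (EDITING || EDITING_VI)."""
    return c["ASH"] and (c["EDITING"] or c["EDITING_VI"])


configs = [dict(zip(FEATURES, v)) for v in product([False, True], repeat=len(FEATURES))]

# The computed effect condition agrees with the slide in every configuration.
matches_slide = all(feature_effect("MAX_LEN", c) == slide_condition(c) for c in configs)
```

Since `matches_slide` is `True`, requiring that MAX_LEN is only selected where it has an effect yields MAX_LEN => ASH && (EDITING || EDITING_VI).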
Extract constraints by brute force P1 P2 P3 … Pn Build individually Not scalable: 2^n combinations If every configuration with feature X compiles except when Y is also selected: X => !Y If every configuration with feature Z does not change the selected code except if W is also selected: Z => W
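The brute-force idea can be sketched in a few lines, which also shows why it does not scale: the loop visits all 2^n configurations. The function `infer_exclusions` and the toy `compiles` oracle are hypothetical stand-ins; a real oracle would have to build each configuration.

```python
from itertools import product


def infer_exclusions(features, compiles):
    """Return (pairs, number_of_builds), where each pair (x, y) means x => !y.

    A pair is reported when every configuration containing x compiles
    except those that also contain y.
    """
    configs = [dict(zip(features, v))
               for v in product([False, True], repeat=len(features))]
    found = []
    for x in features:
        with_x = [c for c in configs if c[x]]
        for y in features:
            if x != y and all(compiles(c) == (not c[y]) for c in with_x):
                found.append((x, y))
    return found, len(configs)


def toy_compiles(c):
    """Hypothetical build oracle: the build breaks only when X and Y are combined."""
    return not (c["X"] and c["Y"])


pairs, builds = infer_exclusions(["X", "Y", "Z"], toy_compiles)
```

With three features this needs 8 builds and reports both ("X", "Y") and ("Y", "X"), which are the same constraint written in two directions. With the thousands of features in the systems studied later, enumerating configurations this way is out of reach.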
Extracting constraints in a single pass • Make use of variability-aware parsing & type checking to mimic build-time behaviour. [Kästner et al., OOPSLA ‘11]
Infrastructure • Developed tool • FarCE – Feature Constraints Extraction https://bitbucket.org/tberger/farce • Rely on previous work: • TypeChef – Type-Checking Ifdef Variability https://github.com/ckaestne/TypeChef • KBuildMiner http://code.google.com/p/variability/wiki/PresenceConditionsExtraction • To analyze variability models: • LVAT https://code.google.com/p/linux-variability-analysis-tools • CDLTools https://bitbucket.org/tberger/cdltools
Empirical study • Objectives: • O1: Evaluate accuracy and scalability of extraction • O2: Quantitatively and qualitatively study kinds of (extractable) constraints in real-world systems • Used four systems with existing variability models • Compare extracted constraints to existing hierarchy & crosstree edges in the model uClibc 1,628 C files 367 features BusyBox 535 C files 844 Features eCos 579 C files 1,254 features Linux Kernel 7,691 C files 6,559 Features
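One plausible way to compare extracted constraints against model edges is to count an edge A => B as recovered when the extracted constraints entail it, that is, when no assignment satisfies all extracted constraints while selecting A without B. The sketch below checks this by enumeration on the small example formulas from the earlier slides. The two model edges are hypothetical, and the entailment criterion is an assumption about the comparison, not a description of the study's exact procedure.

```python
from itertools import product

FEATURES = ["ASH", "NOMMU", "EDITING", "EDITING_VI", "MAX_LEN"]


def implies(a, b):
    return (not a) or b


# Extracted constraints, taken from the example slides.
EXTRACTED = [
    lambda c: implies(c["ASH"], not c["NOMMU"]),
    lambda c: implies(c["MAX_LEN"], c["ASH"] and (c["EDITING"] or c["EDITING_VI"])),
]


def recovered(edge):
    """True if the extracted constraints entail the binary edge (a => b)."""
    a, b = edge
    for values in product([False, True], repeat=len(FEATURES)):
        c = dict(zip(FEATURES, values))
        if all(f(c) for f in EXTRACTED) and c[a] and not c[b]:
            return False  # counterexample: constraints hold, edge is violated
    return True


# Hypothetical model edges, used only to exercise the check.
edge_found = recovered(("MAX_LEN", "ASH"))
edge_missed = recovered(("EDITING_VI", "EDITING"))
```

The first edge follows from the extracted feature-effect constraint, so `edge_found` is `True`; nothing extracted forces EDITING_VI to require EDITING, so `edge_missed` is `False`.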
Is the extraction accurate? Specification 1 is 95% accurate Specification 2 is 76% accurate
Which constraints are recovered? Hierarchy edges reflected in code nesting Crosstree edges prevent build-time errors Can automatically recover 23% of variability model constraints! 5% by Specification 1 17% by Specification 2
What about constraints not found? • Qualitative analysis of 144 unrecovered constraints 29% unknown 21% additional analyses 19% limitation in comparison 19% configurator-related 9% domain knowledge 3% limitation in extraction Manual analysis of constraints is hard! Qualitative analysis is subjective Some constraints are non-technical & need expert knowledge
Challenges • Presence conditions and constraints explode • Limit complexity of constraints used • Use non-SAT based constraint combination techniques • Different ways to compare constraints • Our comparison is limited to binary constraints • Other techniques which may be used? • Understand the intent of different constraints without interviewing developers
Extracting configuration constraints from code Automatically Done Conditional Build-time errors & Feature Effect Accurate (95% & 76% respectively) Extracts substantial parts of VM An expert may still be needed (Avg. 23% & up to 65%) Questions? Sarah Nadi (finishing soon … interested in a post doc) snadi@uwaterloo.ca http://swag.uwaterloo.ca/~snadi
Pre-processor Error if ASH && NOMMU Invalid Configuration Constraint: ASH => !NOMMU Linker Error if ASH && EDITING && !INIT Parser Error if ASH && EDITING && !MAX_LEN Type Error if ASH && EDITING_VI && MAX_LEN && !EDITING
Constraint formulas Preprocessor, parser, and type-checking constraints Linker constraints Feature effect Conditional symbol table
How are constraints used? • Hierarchy edges are mainly reflected in how features are used/nested in the code (Spec 2: feature effect analysis) • Cross tree edges are often used to prevent build-time errors (Spec 1: conditional build-time errors)
Partial pre-processor (lexer) Slide credits: C. Kästner
Parser Slide credits: C. Kästner
Is the analysis scalable? Can analyze Linux files in 12 hours with parallelization