
An Empirical Assessment of the Crosscutting Concern Problem


Presentation Transcript


  1. An Empirical Assessment of the Crosscutting Concern Problem Marc Eaddy Department of Computer Science Columbia University

  2. Motivation • Maintenance dominates software costs: 50–90% of total software cost, 3–4× development costs (chart: maintenance vs. other development)

  3. Motivation • >50% of maintenance time spent understanding the program

  4. Motivation • >50% of maintenance time spent understanding the program • Where are the features, reqs, etc. in the code? (diagram: requirements ↔ code)

  5. Motivation • >50% of maintenance time spent understanding the program • Where are the features, reqs, etc. in the code? • What is this code for?

  6. Motivation • >50% of maintenance time spent understanding the program • Where are the features, reqs, etc. in the code? • What is this code for? • Why is it hard to understand and change the program?

  7. Main Contributions (diagram: ConcernTagger, Cerberus, PDA) • Improved state of the art of concern location • Innovative metrics and experimental methodology • Evidence of the dangers of crosscutting concerns

  8. Improving concern location • Statement Annotations for Fine-Grained Advising • ECOOP Workshop on Reflection, AOP, and Meta-Data for Software Evolution (2006) • Eaddy and Aho • Demo: Wicca 2.0 - Dynamic Weaving using the .NET 2.0 Debugging APIs • Aspect-Oriented Software Development (2007) • Eaddy • Identifying, Assigning, and Quantifying Crosscutting Concerns • ICSE Workshop on Assessment of Contemporary Modularization Techniques (2007) • Eaddy, Aho, and Murphy • Cerberus: Tracing Requirements to Source Code Using Information Retrieval, Dynamic Analysis, and Program Analysis • IEEE International Conference on Program Comprehension (2008) • Eaddy, Aho, Antoniol, and Guéhéneuc

  9. Innovative metrics & methodology • Towards Assessing the Impact of Crosscutting Concerns on Modularity • AOSD Workshop on Assessment of Aspect Techniques (2007) • Eaddy and Aho • Do Crosscutting Concerns Cause Defects? • IEEE Transactions on Software Engineering (2008) • Eaddy, Zimmermann, Sherwood, Garg, Murphy, Nagappan, and Aho

  10. Dangers of crosscutting • Do Crosscutting Concerns Cause Defects? • IEEE Transactions on Software Engineering (2008) • Eaddy, Zimmermann, Sherwood, Garg, Murphy, Nagappan, and Aho

  11. Roadmap (diagram: ConcernTagger, Cerberus, PDA) • Improved state of the art of concern location • Innovative metrics and experimental methodology • Evidence of the dangers of crosscutting concerns

  12. What is a “concern?” Anything that affects the implementation of a program • Feature, requirement, design pattern, code idiom, etc. • Raison d'être for code • Every line of code exists to satisfy some concern • Existing definitions are poor • Concern domain must be “well-defined set”

  13. Concern location problem • Concern–code relationship hard to obtain (diagram: concerns ↔ program elements)

  14. Concern location problem • Concern–code relationship hard to obtain • Concern–code relationship undocumented (diagram: concerns ↔ program elements, links unknown)

  15. Concern location problem • Concern–code relationship hard to obtain • Concern–code relationship undocumented • Reverse engineer the relationship (diagram: concerns ↔ program elements)

  16. Manual concern location • Concern–code relationship determined by a human • Existing techniques too subjective • Inaccurate, unreliable • Ideal • Code affected when concern is changed • My insight • Prune dependency rule [ACOM’07] • Code affected when concern is pruned (removed) • i.e., software pruning • Practical approximation

  17. Prune dependency rule • Code is prune dependent on a concern if pruning the concern forces the code to be removed or altered • Distinguish between removing and altering code • Easily determine change impact of removing code • Code dependent on removed code must be altered (to prevent compile errors) • Easy for a human to approximate

  18. Manual concern location • Concern–code relationship determined by a human • Existing tools impractical for analyzing all concerns of a real system • Many concerns (>100) • Many concern–code links (>10K) • Hierarchical concerns • My solution: ConcernTagger [TSE’08]

  19. ConcernTagger

  20. Automated concern location • Concern–code relationship predicted by an “expert” • Experts look for clues in docs and code • Existing techniques only consult 1 or 2 experts • My solution: Cerberus [ICPC’08] • Information retrieval • Execution tracing • Prune dependency analysis

  21. IR-based concern location • i.e., Google for code • Program entities are documents • Requirements are queries (diagram: requirement “Array.join” matched against source code elements join, Id_join, js_join())

  22. Vector space model [Salton] • Parse code and requirements doc to extract term vectors • NativeArray.js_join() method → “native,” “array,” “join” • “Array.join” requirement → “array,” “join” • My contributions • Expand abbreviations • numconns → number, connections, numberconnections • Index fields • Weigh terms (tf · idf) • Term frequency (tf) • Inverse document frequency (idf) • Similarity = cosine distance between document and query vectors
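As a rough illustration of the vector space model just described, here is a minimal sketch (not the Cerberus implementation) that weights terms by tf · idf and ranks program entities against a requirement query by cosine similarity. The second entity, NativeDate.js_format(), and its terms are invented for the example.

    import java.util.*;

    public class TfIdfSketch {
        // Term-frequency map for one "document" (identifiers/comments from a method).
        static Map<String, Double> termFreq(List<String> terms) {
            Map<String, Double> tf = new HashMap<>();
            for (String t : terms) tf.merge(t, 1.0, Double::sum);
            return tf;
        }

        // Cosine similarity between two weighted term vectors.
        static double cosine(Map<String, Double> a, Map<String, Double> b) {
            double dot = 0, na = 0, nb = 0;
            for (Map.Entry<String, Double> e : a.entrySet()) {
                dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
                na  += e.getValue() * e.getValue();
            }
            for (double v : b.values()) nb += v * v;
            return (na == 0 || nb == 0) ? 0 : dot / (Math.sqrt(na) * Math.sqrt(nb));
        }

        public static void main(String[] args) {
            // Two hypothetical program entities and one requirement "query".
            Map<String, List<String>> docs = new HashMap<>();
            docs.put("NativeArray.js_join()",  List.of("native", "array", "join"));
            docs.put("NativeDate.js_format()", List.of("native", "date", "format"));
            List<String> query = List.of("array", "join");   // requirement "Array.join"

            // Inverse document frequency: rare terms carry more weight.
            Map<String, Double> idf = new HashMap<>();
            for (List<String> d : docs.values())
                for (String t : new HashSet<>(d))
                    idf.merge(t, 1.0, Double::sum);          // document frequency first
            idf.replaceAll((t, df) -> Math.log((double) docs.size() / df));

            // Weight the query and each document by tf · idf, then rank by cosine.
            Map<String, Double> qv = termFreq(query);
            qv.replaceAll((t, tf) -> tf * idf.getOrDefault(t, 0.0));
            for (Map.Entry<String, List<String>> d : docs.entrySet()) {
                Map<String, Double> dv = termFreq(d.getValue());
                dv.replaceAll((t, tf) -> tf * idf.getOrDefault(t, 0.0));
                System.out.printf("%-28s similarity = %.3f%n", d.getKey(), cosine(qv, dv));
            }
        }
    }

Here js_join scores 1.0 against the “Array.join” query (its distinctive terms match exactly) while js_format scores 0, which is the ranking behavior the slide describes.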

  23. Tracing-based concern location • Observe elements activated when concern is exercised • Unit tests for each concern • e.g., find elements uniquely activated by a concern

  24. Tracing-based concern location • Observe elements activated when concern is exercised • Unit tests for each concern • e.g., find elements uniquely activated by a concern • Unit test for “Array.join”:

    var a = new Array(1, 2);
    if (a.join(',') == "1,2") {
        print("Test passed");
    } else {
        print("Test failed");
    }

  (call graph: the test activates js_construct and js_join)

  26. Tracing-based concern location • Elements often activated by multiple concerns • What is the “information content” of an element activation? • Element Frequency–Inverse Concern Frequency [ICPC’08]
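The exact weighting scheme is defined in the ICPC’08 paper; the sketch below is only one plausible reading of the name, transplanting tf · idf directly: an element scores highly for a concern when it is executed often in that concern's traces and is activated by few other concerns. The trace data in main() is invented.

    import java.util.*;

    public class EfIcfSketch {
        // traces.get(concern).get(element) = times the element was executed while
        // exercising that concern's unit test (hypothetical trace data).
        static double efIcf(Map<String, Map<String, Integer>> traces,
                            String concern, String element) {
            Map<String, Integer> t = traces.get(concern);
            double ef = (double) t.getOrDefault(element, 0)
                      / t.values().stream().mapToInt(Integer::intValue).sum();
            long activatedBy = traces.values().stream()
                    .filter(m -> m.getOrDefault(element, 0) > 0).count();
            double icf = Math.log((double) traces.size() / Math.max(1, activatedBy));
            return ef * icf;   // high when heavily and exclusively used by the concern
        }

        public static void main(String[] args) {
            Map<String, Map<String, Integer>> traces = Map.of(
                "Array.join", Map.of("js_join", 5, "js_construct", 1),
                "Array.sort", Map.of("js_sort", 7, "js_construct", 1));
            System.out.println(efIcf(traces, "Array.join", "js_join"));      // high: unique to the concern
            System.out.println(efIcf(traces, "Array.join", "js_construct")); // 0: shared by both concerns
        }
    }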

  27. Prune dependency analysis • Infer relevant elements based on structural relationship to relevant element e (seed) • Assumes we already have some seeds • Prune dependency analysis [ICPC’08] • Automates prune dependency rule [ACOM’07] • Find references to e • Find superclasses and subclasses of e
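A minimal sketch of the expansion step, assuming the rules are applied to a fixed point; the graph encoding and element names are invented for illustration, and the real analysis distinguishes more relationship kinds than this single edge map.

    import java.util.*;

    public class PdaSketch {
        // edges.get(e) = elements structurally related to e: elements that reference e,
        // plus e's superclasses and subclasses (hypothetical dependency graph).
        static Set<String> expand(Map<String, Set<String>> edges, Set<String> seeds) {
            Set<String> relevant = new HashSet<>(seeds);
            Deque<String> work = new ArrayDeque<>(seeds);
            while (!work.isEmpty()) {
                String e = work.pop();
                for (String related : edges.getOrDefault(e, Set.of()))
                    if (relevant.add(related))   // newly inferred as relevant
                        work.push(related);
            }
            return relevant;
        }

        public static void main(String[] args) {
            // Tiny graph mirroring the example on the next slide:
            // C.main references B, B implements (inherits from) A.
            Map<String, Set<String>> edges = Map.of(
                "B", Set.of("A", "C.main"),   // superclass of B, and code that references B
                "A", Set.of("B"));            // subclass of A
            System.out.println(expand(edges, Set.of("B"))); // seed {B} expands to {A, B, C.main}
        }
    }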

  28. PDA example • Source code:

    interface A { public void foo(); }
    public class B implements A {
        public void foo() { ... }
        public void bar() { ... }
    }
    public class C {
        public static void main() {
            B b = new B();
            b.bar();
        }
    }

  • Program dependency graph: B inherits A; A contains foo; B contains foo and bar; C contains main; main refs B and calls bar

  33. Cerberus

  34. Cerberus effectiveness • Cerberus is the most effective • PDA improves IR by 155% • PDA improves tracing by 104% (chart: effectiveness of the individual techniques vs. Cerberus)

  35. Roadmap (diagram: ConcernTagger, Cerberus, PDA) • Improved state of the art of concern location • Innovative metrics and experimental methodology • Evidence of the dangers of crosscutting concerns

  36. The crosscutting concern problem • Some concerns difficult to modularize • Code related to the concern is… • Scattered across (crosscuts) multiple files • Often tangled with other concern code (diagram: one concern mapped to elements scattered across the program)

  37. Example: Pathfinding in Goblin • Pathfinding is modularized

  38. Example: Collision detection • Collision detection not modularized

  39. How to measure scattering? • Existing metrics inadequate • My solution • Degree of scattering [ASAT’07] • Degree of tangling [ASAT’07]

  40. Degree of scattering (DOS) • Measures concern modularity, i.e., distribution of concern code across multiple classes • Average DOS – Overall modularity of concerns • Summarizes amount of crosscutting present • More insightful than traditional metrics • “class A is highly coupled” vs. “feature A is hard to change” [Wong, et al.] [ACOM’07]

  41. Insight behind DOS • More descriptive than class count • Consider two different concern implementations, each spread over 4 classes: one with DOS = 1.00 (code distributed evenly across the classes), one with DOS = 0.08 (code almost entirely in one class)
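The exact DOS formula is given in [ASAT’07]/[TSE’08]; the sketch below uses one plausible formulation, the normalized variance of the concern's concentration across classes, and should be read as an approximation of the published metric rather than a quotation of it. With an even spread over 4 classes it yields 1.0, and with an illustrative 97/1/1/1 split it yields ≈ 0.08, in line with the two examples on this slide.

    public class DosSketch {
        // linesPerClass[i] = lines of code assigned to the concern in class i (needs >= 2 classes).
        // DOS = 1 - normalized variance of the concern's concentration over classes:
        //   0.0 -> all of the concern's code sits in one class (fully localized)
        //   1.0 -> the code is spread evenly over all classes (fully scattered)
        static double degreeOfScattering(double[] linesPerClass) {
            int t = linesPerClass.length;
            double total = 0;
            for (double l : linesPerClass) total += l;
            double var = 0;
            for (double l : linesPerClass) {
                double concentration = l / total;            // CONC(concern, class)
                var += Math.pow(concentration - 1.0 / t, 2);
            }
            return 1.0 - (double) t / (t - 1) * var;
        }

        public static void main(String[] args) {
            System.out.println(degreeOfScattering(new double[]{25, 25, 25, 25})); // 1.0: even spread
            System.out.println(degreeOfScattering(new double[]{97, 1, 1, 1}));    // ~0.08: mostly one class
        }
    }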

  42. Degree of tangling (DOT) • Distribution of class code across multiple concerns • Average DOT – Overall separation of concerns [Wong, et al.] [ACOM’07]
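Under the same assumption as the DOS sketch above, DOT is the dual measure: distribute a class's code over the concerns it implements instead of a concern's code over classes. Again a sketch, not the published formula.

    public class DotSketch {
        // linesPerConcern[i] = lines of the class assigned to concern i (needs >= 2 concerns).
        //   0.0 -> the class implements a single concern; 1.0 -> its code is split evenly.
        static double degreeOfTangling(double[] linesPerConcern) {
            int c = linesPerConcern.length;
            double total = 0, var = 0;
            for (double l : linesPerConcern) total += l;
            for (double l : linesPerConcern)
                var += Math.pow(l / total - 1.0 / c, 2);
            return 1.0 - (double) c / (c - 1) * var;
        }

        public static void main(String[] args) {
            System.out.println(degreeOfTangling(new double[]{50, 50})); // 1.0: evenly tangled
            System.out.println(degreeOfTangling(new double[]{99, 1}));  // ~0.04: nearly one concern
        }
    }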

  43. Roadmap (diagram: ConcernTagger, Cerberus, PDA) • Improved state of the art of concern location • Innovative metrics and experimental methodology • Evidence of the dangers of crosscutting concerns

  44. Do crosscutting concerns cause defects? [TSE’08] • Created mappings • Requirement–code map (via ConcernTagger) • Bug–code map (via BugTagger) • Bug–requirement map (inferred)
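The slides do not spell out how the bug–requirement map was inferred; one plausible reading, sketched below purely for illustration, is to compose the two explicit maps: a bug is linked to a requirement whenever the code elements changed to fix the bug overlap the elements assigned to that requirement. The bug id and element names are made up.

    import java.util.*;

    public class MapJoinSketch {
        // bugToCode.get(bug) = elements changed to fix the bug (from BugTagger).
        // reqToCode.get(req) = elements assigned to the requirement (from ConcernTagger).
        static Map<String, Set<String>> inferBugToReq(Map<String, Set<String>> bugToCode,
                                                      Map<String, Set<String>> reqToCode) {
            Map<String, Set<String>> bugToReq = new HashMap<>();
            for (var bug : bugToCode.entrySet())
                for (var req : reqToCode.entrySet())
                    if (!Collections.disjoint(bug.getValue(), req.getValue()))
                        bugToReq.computeIfAbsent(bug.getKey(), k -> new HashSet<>())
                                .add(req.getKey());
            return bugToReq;
        }

        public static void main(String[] args) {
            Map<String, Set<String>> bugToCode = Map.of("BUG-42", Set.of("js_join"));
            Map<String, Set<String>> reqToCode = Map.of("Array.join", Set.of("js_join", "js_construct"));
            System.out.println(inferBugToReq(bugToCode, reqToCode)); // {BUG-42=[Array.join]}
        }
    }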

  45. Do crosscutting concerns cause defects? [TSE’08] • Correlated scattering and bug count • Spearman correlation • Found moderate to strong correlation between scattering and defects • As scattering increases, so do defects (chart: scattering vs. bugs)
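For reference, Spearman correlation is Pearson correlation computed on ranks, which is why it captures the monotone "as scattering increases, so do defects" relationship without assuming linearity. A minimal sketch with made-up numbers (not the study's data):

    import java.util.*;

    public class SpearmanSketch {
        // Assign each value its rank (1 = smallest); ties get consecutive ranks here for brevity.
        static double[] ranks(double[] v) {
            Integer[] idx = new Integer[v.length];
            for (int i = 0; i < v.length; i++) idx[i] = i;
            Arrays.sort(idx, Comparator.comparingDouble(i -> v[i]));
            double[] r = new double[v.length];
            for (int rank = 0; rank < idx.length; rank++) r[idx[rank]] = rank + 1;
            return r;
        }

        // Spearman's rho = Pearson correlation of the rank vectors.
        static double spearman(double[] x, double[] y) {
            double[] rx = ranks(x), ry = ranks(y);
            double mx = Arrays.stream(rx).average().orElse(0);
            double my = Arrays.stream(ry).average().orElse(0);
            double num = 0, dx = 0, dy = 0;
            for (int i = 0; i < rx.length; i++) {
                num += (rx[i] - mx) * (ry[i] - my);
                dx  += Math.pow(rx[i] - mx, 2);
                dy  += Math.pow(ry[i] - my, 2);
            }
            return num / Math.sqrt(dx * dy);
        }

        public static void main(String[] args) {
            double[] scattering = {0.1, 0.3, 0.5, 0.7, 0.9}; // DOS per concern (illustrative)
            double[] bugs       = {0,   1,   1,   3,   5};   // defects per concern (illustrative)
            System.out.println(spearman(scattering, bugs));  // 1.0: the example data is perfectly monotone
        }
    }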

  46. How widespread is the problem? • 5 case studies of OO programs • Scattering • Concerns related to 6 classes on average • OO unsuitable for representing these problem domains? • Most (86%) concerns are crosscutting to some extent • Dispels “modular base” notion • General-purpose solution needed • Tangling • Classes related to 10 concerns on average • Poor separation of concerns • Classes doing too much • Crosscutting concerns severely limit modularity

  47. Main Contributions (diagram: ConcernTagger, Cerberus, PDA) • Improved state of the art of concern location • Innovative metrics and experimental methodology • Evidence of the dangers of crosscutting concerns

  48. Future work • Further explore new concern analysis field • Techniques to reduce crosscutting • Improve concern location • Improve PDA generality, precision, and heuristics • Use machine learning to combine judgments • Incorporate smart “grep” and PDA into IDE • Gather empirical evidence • Impact of reducing crosscutting • Impact of crosscutting on maintenance effort • Impact of code tangling on quality

  49. Acknowledgements • Alfred Aho • ConcernTagger/Mapper • Vibhav Garg • Jason Scherer • John Gallagher • Martin Robillard • Frédéric Weigand-Warr • BugTagger • Thomas Zimmermann • Cerberus • Giuliano Antoniol • Yann-Gaël Guéhéneuc • Andrew Howard • Goblin • Erik Petterson • John Waugh • Hrvoje Benko • Wicca • Boriana Ditcheva • Rajesh Ramakrishnan • Adam Vartanian • Microsoft Phoenix Team

  50. Questions? Marc Eaddy Columbia University eaddy@cs.columbia.edu
