Software Remodeling: Improving Design and Implementation Quality

Static analysis (audits and metrics), dynamic analysis (run time), and refactoring. Software Remodeling: Improving Design and Implementation Quality • Al Mannarino, Lead Systems Engineer

Who is CodeGear? Evolution of Borland • Application Lifecycle Management: StarTeam, Tempo, Silk, Gauntlet, CaliberRM, Together • Developer Tools: Delphi, JBuilder, Turbo Delphi, C++Builder, InterBase, Turbo C++

CodeGear Strategy • 100% focus on Developers! • Individual developer’s speed and productivity • Team communication and collaboration • Quality and performance • Languages: Java, Delphi, C++, C#, PHP, Ruby, SQL • Platforms: Java, Windows, Web, Open Source, Mac, Linux.

CodeGear Strategy [diagram] • Value = Individual Velocity + Team Velocity • Individual Velocity: LiveSource, Visual Design, RAD, VCL, Refactoring • Team Velocity: Communication, Automation, Collaboration, Process Visibility • Platforms: Java/JEE, Windows, Open Source, Web

CodeGear Goal • Increase your individual and team productivity by 25% to 33% • for example • 1/3 faster time to market or • 1/3 more features developed or • 1/3 less development cost

AGENDA • Software Remodeling: Improving Design and Implementation Quality • Apply automated software inspection and run-time analysis tools to help identify common problems that can cause performance and scalability issues. • Discuss the problems found and demonstrate how tools can help accelerate the implementation of appropriate solutions. • Find: • Apply software quality audits (check for common coding errors and enforce standards) • Apply software quality metrics (measure object-oriented design characteristics such as complexity, cohesion, and coupling) • Reverse engineer structure and behavior (Reverse engineer class diagrams and generated sequence diagrams) • Apply Application Quality Analyzer (analyze components for performance issues, detect misuses and identify errors) • Discuss discovered problems (excessive temporary objects, dead code, highly complex methods, etc.) • Generate documentation • Fix: • Discuss principles of software remodeling (audits/metrics, design patterns, refactoring) • Create JUnit tests (code with confidence) • Apply simple refactoring (modify design without breaking existing code, verify impact of refactoring) • Apply pattern-based refactoring

Agenda • Code Reviews • Static Analysis (Audits and Metrics) • Metric-Driven Refactoring • Dynamic Analysis (run time profiling) • Code Reviews Rethought • Q & A

What is Software Remodeling? • A methodology for improving design and implementation quality using a combination of audits, metrics, and refactoring. • Formal code reviews: the “subtle science and exact art” of software engineering necessitates doing code reviews as part of a quality assurance program.

Manual Code Reviews versus Automated Code Reviews • As a code base increases in size and complexity, it becomes unlikely that a manual review process, no matter how frequent, can uncover all issues with a high degree of accuracy. • Automation is necessary for tasks that are particularly well suited to machine processing.

Code Reviews - Automation Case Studies • Task • Review by hand a small example set of code • 5 classes, 140 lines of code, 4 “experts” in 15 minutes • Average audit violations detected: 21 • Comparison • The same set of code analyzed by a machine: • Audit violations detected: 150 • Time: 2 seconds • Humans @ 84 violations/hr versus machine @ 270,000 violations/hr • Real “sample” project • 187,600 violations on a project with 1,233 classes and 199,146 lines of code. • 32,443 violations using the default set • Actual customer case studies show savings of ~400 man-hours per code review

“Agile” Software Development • The “Agile Manifesto” states that working software should be valued over comprehensive documentation. • This implies that if reviews are to be done within an agile process, source code is likely to be the only artifact deemed worth reviewing. • Taken to the extreme, XP provides a form of continuous code review, as development is always done in pairs. • Other lightweight processes, such as Feature-Driven Development (FDD), incorporate design and code reviews as part of their activities.

Audits - Enforcing Rules for Source Code Conformance • What are Audits? • Checks for conformance to standard or user-defined style, maintenance, and robustness guidelines. • Motivation • Improved readability • Improved standards-conformance • Increased performance • Reveal subtle errors that may result in bugs • Categories, such as: • Coding Style • Errors • Documentation • Superfluous Content • Performance, etc…

An Example Audit: ASWL – Append to String Within a Loop • Category: Performance • Effective Java™: Item 33 – Beware the performance of string concatenation
Wrong:
public String method() {
    String var = "var";
    for (int i = 0; i < 10; i++) {
        var += (" " + i);
    }
    return var;
}
Right:
public String method() {
    StringBuffer var = new StringBuffer("var");
    for (int i = 0; i < 10; i++) {
        var.append(" " + i);
    }
    return var.toString();
}

Why is “Append to String Within a Loop” bad? • In Java™, all String objects are immutable. • This means that when two strings are joined together, a new String object must be created (well, almost true). • In fact, when the += operator is used to join two Strings, a StringBuffer object is also created. • So, when performing multiple String concatenations (e.g., in a loop), it is more efficient to create a single StringBuffer object and use its append() method to perform the concatenation. • Using the StringBuffer class therefore minimizes the number of objects created, and subsequently garbage collected, during multiple String concatenations. • This audit violation is easy to spot, and fixing it can significantly improve application efficiency. How long would it take to find all of these violations in your code base?
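The contrast described above can be sketched as follows. This is an illustration, not the audit tool's own code: the class and method names are hypothetical, and it uses StringBuilder (the unsynchronized counterpart of StringBuffer, available since Java 5) rather than the StringBuffer shown on the slide.

```java
class ConcatDemo {
    // Flagged by the audit: every += builds a brand-new String
    // (plus a temporary buffer) on each pass through the loop.
    static String withString(int n) {
        String s = "var";
        for (int i = 0; i < n; i++) {
            s += " " + i;
        }
        return s;
    }

    // Preferred: one mutable buffer is reused for every append.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder("var");
        for (int i = 0; i < n; i++) {
            sb.append(' ').append(i);
        }
        return sb.toString();
    }
}
```

Both methods return the same text; only the number of intermediate objects differs.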

Sample audit • “Avoid float and double if exact answers are required”. • Item 31 in “Effective Java”. • Floating-point types are intended for use in scientific or mathematical equations where numbers of large magnitude are calculated. They are not intended for use in comparing small values, particularly those involving currency values. In this case, rounding in calculations may cause errors, particularly when dealing with values in the negative powers of 10. For example, • if you declare two doubles d1 = 1.03 and d2 = 1.13 and compare d1 – d2 to the expected answer of –0.1, you find that they are unequal using the == operator. • Additionally, when overriding the equals() method, comparisons to an object’s floating point type fields should for the same reasons use the Double.doubleToLongBits() and Float.floatToIntBits() methods for double and float fields, respectively.
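A minimal sketch of the comparison described above, using the slide's own values. The class and method names are hypothetical, and BigDecimal is shown as one common alternative for exact decimal arithmetic; the slide itself does not name a replacement type.

```java
import java.math.BigDecimal;

class ExactAnswers {
    // Binary floating point cannot represent 1.03, 1.13, or 0.1 exactly,
    // so the == comparison fails.
    static boolean doubleMatches() {
        double d1 = 1.03;
        double d2 = 1.13;
        return (d1 - d2) == -0.1;
    }

    // Decimal arithmetic on the string forms is exact; compareTo ignores scale.
    static boolean decimalMatches() {
        BigDecimal d1 = new BigDecimal("1.03");
        BigDecimal d2 = new BigDecimal("1.13");
        return d1.subtract(d2).compareTo(new BigDecimal("-0.1")) == 0;
    }

    // Field comparison suitable for an equals() override, as the audit suggests.
    static boolean sameDoubleField(double a, double b) {
        return Double.doubleToLongBits(a) == Double.doubleToLongBits(b);
    }
}
```

Note that doubleToLongBits treats NaN as equal to itself and distinguishes 0.0 from -0.0, which is the behavior an equals() contract needs.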

Audits - Case Studies

Metrics - Quantifying Source Code Characteristics • What are Software Metrics? • Measurements that allow for the analysis of software code and design. • Motivation • Improve the process of developing software • Identify potential defects through quantitative measures • Further understanding of software design complexity • Categories: Basic Counts, Encapsulation, Cohesion, Polymorphism, Coupling, Inheritance, Complexity, Halstead’s • How “healthy” is the source code? Do we have a good or a poor object design?

Metrics - A disclaimer • There is no “silver bullet” metric suite. • You can use metrics to guide projects, but be careful not to let them paralyze a project when developers overanalyze. • There are also reasons why automated tools can produce misleading results. • This is often due to particular language features that cannot be accounted for in a general sense. • For example, the use of reflection in the Java programming language makes static analysis of source code problematic. In this case, reflective use of a class and its methods will appear to static analysis tools as the mere utilization of the Reflection API and not the actual class involved. Metrics indicate false dependencies when they encounter the use of this feature.
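The reflection problem mentioned above can be illustrated with a small sketch. The class and method names are hypothetical. The point is that the concrete class appears only as a string, so a source-level coupling metric sees the Reflection API rather than the class actually used.

```java
import java.util.List;

class ReflectionDemo {
    // The real dependency (e.g. java.util.ArrayList) is passed in as text,
    // so static analysis of this method finds only Class and List.
    @SuppressWarnings("unchecked")
    static List<String> createList(String className) {
        try {
            Class<?> type = Class.forName(className);
            return (List<String>) type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }
}
```

Calling createList("java.util.ArrayList") yields an ArrayList at run time, although no metric computed from this source would report a coupling to ArrayList.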

Eight (8) Metrics for Java Development • A simple starting point to help answer basic questions about the size, quality of design, and ability to modify or maintain existing code. • Lines of Code (LOC) • Number of Classes (NOC) • Lack of Cohesion of Methods 3 (LOCOM3) • Cyclomatic Complexity (CC) • Weighted Methods per Class 1 (WMPC1) • Coupling Between Objects (CBO) • Number of Overridden Methods (NOOM) • True Comment Ratio (TCR)

Depth of Inheritance Hierarchy (DOIH) • Inheritance is one of the features of object-oriented programming that was much touted and then greatly abused. • It has since fallen somewhat out of favor, largely replaced by composition and interface-based design strategies. Where it is appropriately used, inheritance is a powerful design feature. As with all things, moderation is the key. • Deep inheritance hierarchies can lead to code fragility, with increased complexity and behavioral unpredictability. • An easy metric to comprehend, DOIH indicates the number of classes that are parents of a class, with the top-most class having a count of one (not zero, as some computer scientists would think).

This diagram illustrates an inheritance hierarchy where the Depth of Inheritance Hierarchy (DOIH) value for the AppException class is 5, while the DOIH value for RuntimeException is 4, and so on. [Diagram: five-level class hierarchy, labeled DOIH = 1 at the top through DOIH = 5 at the bottom]
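The counting rule can be sketched with reflection. This is an illustration under the slide's convention that the top-most class counts as one; the Doih class is hypothetical, and the nested AppException is a stand-in for the class in the diagram, assumed here to extend RuntimeException directly.

```java
class Doih {
    // Stand-in for the AppException shown in the diagram (assumed parent).
    static class AppException extends RuntimeException {
        private static final long serialVersionUID = 1L;
    }

    // Walks up the superclass chain, counting the class itself and every
    // ancestor, so java.lang.Object has a depth of 1.
    static int of(Class<?> type) {
        int depth = 0;
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            depth++;
        }
        return depth;
    }
}
```

With this rule, Object is 1, Throwable is 2, Exception is 3, RuntimeException is 4, and AppException is 5, matching the values quoted for the diagram.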

Number of Child Classes (NOCC) • For a given class, the number of classes that inherit from it is referred to by the metric Number of Child Classes (NOCC). • With increasing values of NOCC, the potential for reuse increases, while testing difficulty and potential misuse of inheritance also increase. • Generally speaking, a high level of NOCC may indicate the improper abstraction of the class. • In the Java programming language, implemented interfaces count when calculating NOCC. • For example, if an interface is implemented by a class, which has six subclasses, the NOCC value for the interface is seven, while the implementing class has a NOCC value of 6.

This diagram indicates how Number of Child Classes (NOCC) values are calculated. Starting at the bottom, each of the six *Event classes has no children and therefore receives a NOCC value of 0. Each of the six *Event classes extends the EventSupport class, giving it a NOCC value of 6. Finally, at the top of the hierarchy is the interface Event, whose implementing (child) classes total 7: EventSupport plus its six subclasses.

Coupling Between Objects (CBO) • Object-oriented design necessitates interaction between objects. • Excessive interaction between an object of one class and many objects of other classes may be detrimental to the modularity, maintenance, and testing of a system. • CBO counts the number of other classes to which a class is coupled, except for inherited classes, java.lang.* classes, and primitive types. • A decrease in the modularity of a class can be expected with high values of CBO. • However, some objects necessarily have a high degree of coupling with other objects. • In the case of factories, controllers, and some modern user interface frameworks, you should expect a higher value for CBO.

From the class declaration, you get 0, as implementation and extension do not count as couplings in the CBO metric. The class attributes int number and String string do not count: int is a primitive type and String belongs to java.lang. The List list attribute does count (1), but only once, even though a List type is also passed as an argument to method one(). The Map m passed as the second argument counts, creating the second coupling (2). Finally, in the method two(), the throws AppException declaration counts as a coupling (3), in addition to the local File file variable (4). Again, the List type has already been counted, so CBO = 4.
import java.io.File;
import java.util.List;
import java.util.Map;
// Base, Metric, and AppException are placeholder types assumed to exist.
public class CBO extends Base implements Metric {
    private int number;        // primitive: not counted
    private List list;         // coupling 1
    private String string;     // java.lang: not counted
    public void one(List l, Map m) {   // Map: coupling 2
        list = l;
    }
    private List two() throws AppException {   // AppException: coupling 3
        File file;                             // File: coupling 4
        //...
        return list;   // List is an interface, so "new List()" would not compile
    }
}

Response for Class (RFC) • The number of methods, internal and external, available to a class. • The enumeration of methods belonging to a class and its parent, as well as those called on other classes, indicates a degree of complexity for the class. • RFC differs from CBO in that a class does not need to use a large number of remote objects to have a high RFC value. • Because RFC is directly related to complexity, the effort required to test, debug, and maintain a class increases with an increase in RFC.

Cyclomatic Complexity (CC) • Counts the number of linearly independent paths through a method. • In other words, it uses if, for, and while statements as path indicators to count the paths through the body of a method. • Consider a method represented as a control-flow graph, that is, a directed graph that contains a cycle wherever the method loops. CC equals the number of edges e, minus the number of nodes n, plus two times the number of connected components p: CC = V(G) = e – n + 2p

Cyclomatic Complexity example • In this method, there is an if statement containing a while loop, with a for loop in the else portion. What’s the CC for this method?
public void one() {
    if (true) {
        while (false) {
            two();
        }
    } else {
        for (int i = 0; i < 10; i++) {
            two();
        }
    }
}

Considering the distinct paths through the method one() shown above, there are four, as the simplified illustration shows: CC = V(G) = e – n + 2p = 8 – 6 + 2(1) = 4, where e is the number of edges, n is the number of nodes, and p is the number of connected components.
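The arithmetic above can be reproduced from an explicit edge list. The sketch below is an assumption about what the simplified illustration contains: the node numbering and the Cyclomatic class are hypothetical, chosen so that the graph of one() has the 8 edges and 6 nodes quoted on the slide.

```java
import java.util.HashSet;
import java.util.Set;

class Cyclomatic {
    // Assumed control-flow graph for one():
    // 0 = if test, 1 = while test, 2 = while body,
    // 3 = for test, 4 = for body, 5 = method exit.
    static final int[][] EDGES = {
        {0, 1}, {0, 3},          // if branches to the while or to the for
        {1, 2}, {2, 1}, {1, 5},  // while: enter body, loop back, leave
        {3, 4}, {4, 3}, {3, 5}   // for: enter body, loop back, leave
    };

    // CC = e - n + 2p, with n taken as the number of distinct nodes
    // that appear in the edge list.
    static int complexity(int[][] edges, int components) {
        Set<Integer> nodes = new HashSet<>();
        for (int[] edge : edges) {
            nodes.add(edge[0]);
            nodes.add(edge[1]);
        }
        return edges.length - nodes.size() + 2 * components;
    }
}
```

The same result follows from the shortcut of counting decision points (if, while, for) and adding one: 3 + 1 = 4.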

Weighted Methods Per Class (WMPC) • If you were to sum the values for CC for every method in a class, you would have the WMPC value. • Alternatively, a simple count of the methods and their formal parameters also gives WMPC. • In tools, WMPC1 implements the former, while WMPC2 implements the latter. • WMPC indicates the time and effort needed to develop and maintain a class, and the potential for reuse of the class.

Weighted Methods Per Class (WMPC) • In equation form, WMPC = \sum_{i=1}^{n} c_i, where n is the number of methods and c_i is a measure of the complexity of method i. • As the WMPC value for a class increases, the time and effort for development and maintenance increases. • Also, the potential reuse of the class may decrease as its WMPC value increases. • In measuring WMPC, inherited methods are not included, while getter, setter, and private methods are included.

An Example Metric: Weighted Methods Per Class • Provides the sum of the static complexities of all methods in a class. • Let n be the number of methods • Let c_i be a measure of the complexity of method i • Then WMPC = \sum_{i=1}^{n} c_i • Note that complexity is normally a measure of Cyclomatic Complexity (CC), but may also be taken as unity for each method.

Metrics - Weighted Methods Per Class – An Example
public class WMPC {
    public void one() {
        if (true) {
            two();
        } else {
        }
        if (true && !false) {
        }
    }
    public void two() {
        if (true) {}
    }
    public void three(int i) {}
}
Method          cc    mc
one()           3     1
two()           2     1
three(int i)    1     2
Total           6     4
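The two totals in the example can be recomputed with a short sketch. The WmpcDemo class is hypothetical, and the per-method inputs are the cc values and parameter counts read off the example rather than values measured by a tool.

```java
class WmpcDemo {
    // WMPC1: sum of the cyclomatic complexity of every method in the class.
    static int wmpc1(int[] cyclomaticComplexities) {
        int total = 0;
        for (int cc : cyclomaticComplexities) {
            total += cc;
        }
        return total;
    }

    // WMPC2: one point per method plus one point per formal parameter.
    static int wmpc2(int[] parameterCounts) {
        int total = 0;
        for (int params : parameterCounts) {
            total += 1 + params;
        }
        return total;
    }
}
```

For one(), two(), and three(int i), the inputs {3, 2, 1} and {0, 0, 1} give the totals 6 and 4 from the example.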

Metrics - Weighted Methods Per Class (continued) • In general, as WMPC increases: • Time and effort for development increases • Impact on children classes increases • Testing difficulty increases • Reuse of the class may decrease • WMPC does not include inherited methods. • WMPC includes getter, setter and private methods. • WMPC1 sums the CC for each method, while WMPC2 sums the number of methods and parameters.

Metrics - Common Business Impacts of High Complexity • Impede Delivery • Productivity drops as developers spend more time understanding code and making changes. • Reduce Quality • Complexity strongly correlates with higher defect density. • Increase Risk • Adequate testing coverage becomes more elusive as the number of paths through an application increases. • Raise Spend • Because defects are harder to catch, they are caught later, leading to higher costs to fix.

Lack of Cohesion of Methods (LOCOM) • The most complex metric in the default set, LOCOM measures the dissimilarity of the methods in a class according to the attributes they use. • Fundamentally, it indicates the cohesiveness of the class, although the values increase as the cohesiveness decreases.

Lack of Cohesion of Methods (LOCOM) • Computes a measure of similarity for a number of methods m accessing a set of attributes A_j (j = 1, ..., a), where a is the number of attributes. • With an increase in LOCOM, the encapsulation characteristics of the class are also brought into question. • Also, complexity increases and the reuse potential of the class decreases. • This makes intuitive sense, as a class that has methods operating on separate attributes should bring the Extract Class refactoring to mind. • The calculation of this metric alone should convince you that manual means of gathering measurements for metric analysis are not feasible!
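The slide's formula did not survive in this transcript. One formulation that matches its symbols is the Henderson-Sellers variant, LOCOM = ((1/a) Σ μ(A_j) – m) / (1 – m), where μ(A_j) is the number of methods that access attribute A_j. Treating that as the intended definition is an assumption, and tools may scale the result (for example, to a percentage). The Locom class below is a hypothetical sketch of that formula.

```java
class Locom {
    // accesses[i][j] is true when method i reads or writes attribute j.
    // Returns 0 when every method uses every attribute (fully cohesive)
    // and 1 when each attribute is used by exactly one method.
    static double hendersonSellers(boolean[][] accesses) {
        int m = accesses.length;
        if (m < 2 || accesses[0].length == 0) {
            throw new IllegalArgumentException("need at least 2 methods and 1 attribute");
        }
        int a = accesses[0].length;
        double accessCount = 0;
        for (int j = 0; j < a; j++) {
            for (int i = 0; i < m; i++) {
                if (accesses[i][j]) {
                    accessCount++;   // contributes to mu(A_j)
                }
            }
        }
        double meanMethodsPerAttribute = accessCount / a;
        return (meanMethodsPerAttribute - m) / (1.0 - m);
    }
}
```

A class whose two methods each touch a different attribute scores 1, which is the situation that suggests the Extract Class refactoring.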

Metric interpretation • A collection of metrics data generated by static source code analysis is not very valuable in raw form. • Just as the UML™ provides for the visualization of your design in the form of class diagrams, for example, metric results are best displayed graphically for interpretation: • Bar graphs, • Kiviat charts, and • Distribution graphs. • It is also important to identify trends in metric results over a period of time.

Metrics - Interpretation & Outcome • Visualizations: • Bar Graph • Kiviat Graph • Distribution Graph • Metrics Results Comparison • Metric-Driven Refactoring

Metrics Visualization – Kiviat Graph • Scaled axis for each metric • Red circle represents upper limit

Metrics Visualization – Kiviat with UML Class Diagram

Metrics Trend Analysis • Comparison of results: increased and decreased values

Metrics Case Studies • PetStore • Ant

Metrics - SUMMARY • Metrics are indeed “deep and muddy waters”. • A firm grasp of object-oriented design concepts and an understanding of what a metric is designed to detect are key components of a successful metrics program. • If you are new to metrics, it is best to start with a small subset, such as those described here, and not get too caught up in the details. • In other words, take an “agile” approach to implementing a metrics program: focus on the results you can obtain by using metrics effectively, not on using them to fulfill a process requirement.

Refactoring • It’s one of the buzzwords in conversations about software development. • Made famous by a book of the same title [Fowler], it is a term for what has always occurred during the development of software. • Basically, refactoring involves changing source code in a way that does not change its behavior but that improves its structure. • The goal is to make software easier to understand and maintain through these usually small changes in its design and implementation. • How does a developer know what to refactor on a large, somewhat unfamiliar code base?

Refactoring - Metric-Driven • Configure Metrics sets that are aimed at certain refactorings: • Extract Class • CBO, NOIS, NOC, LOCOM3, FO, NOAM, LOC, NOA, NOO, NORM, HPLen • Extract Subclass • NOA, NOM, NOO, RFC, WMPC1, NOAM, WMPC2, DAC, FO, NOCC • Extract Method • MSOO, NOCON, RFC, MNOL, CC, NORM, WMPC1, MNOP, WMPC2, CBO • Extract Interface • NOOM • Inheritance-Based Refactorings • DOIH, NOCC

Refactoring - Some Considerations • Use Audits to target certain refactorings: • Reverse Conditional, Remove Double Negative • NOIS (Negation Operator in ‘if’ Statement) • Encapsulate Field • PPA (Public and Package Attribute) • Renaming • NC (Naming Conventions – uses regex) • Run Audits before Metrics • UPCM and ULVFP before WMPC2, NOA etc. • ILC before NOIS • Format Code before Metrics, as may impact: • LOC (Lines Of Code) – consider using HPLen • (CR) Comment Ratio & (TCR) True Comment Ratio
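As an illustration of the first pairing above, the sketch below shows the kind of code a NOIS (Negation Operator in ‘if’ Statement) audit would flag and the result of the Reverse Conditional refactoring. The class, method names, and return values are hypothetical.

```java
class ReverseConditionalDemo {
    // Flagged form: the condition is negated, so the reader must
    // mentally invert it to follow the main path.
    static String before(boolean valid) {
        if (!valid) {
            return "reject";
        } else {
            return "accept";
        }
    }

    // After Reverse Conditional: the branches are swapped and the
    // negation is gone; behavior is unchanged.
    static String after(boolean valid) {
        if (valid) {
            return "accept";
        } else {
            return "reject";
        }
    }
}
```

Because the refactoring must preserve behavior, a unit test comparing both forms for each input is a cheap safety net.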

Metric-driven refactoring • Refactoring can be a double-edged sword. • Be careful not to “bite off more than you can chew,” so to speak, while refactoring a code base. • The process can work as follows: • Run unit tests, audits, and formatting tools. • Run and analyze metric results on your code. • Navigate to the offending areas, looking for problem areas. • Refactor the code, keeping design patterns in mind to potentially apply. • Rerun the audits, metrics, and unit tests.

Metric-driven refactoring example • Using Tomcat 5.5.20 as an example, let’s demonstrate metric-driven refactorings. • Lines of Code: 181,429 • Number of Classes: 1,819
