
Example application: source code analysis



Presentation Transcript


  1. Example application: source code analysis
     125 file types; 8029 files; 4689 non-Java; 1112 svn revisions

  2. Querying Software Artefacts
     [Diagram. Labels: source code, query engine, IDE plugin, version history, parsers, developer, bug reports, build scripts, dashboard, software repository, databases, manager, spreadsheets, Excel add-in, config files, web pages, analyst]

  3. The problem
     design query language and engine for accessing vast repository of different types of source artefact
     libraries of queries: tailor framework to different types of artefact

  4. Tough problem!
     Dozens of attempts, in industry and academia since 1984: databases, Prolog, domain-specific query languages
     • Difficulties:
       - does not scale
       - efficient queries extremely hard to write
       - specific to one kind of source artefact
     18 man-years of research at University of Oxford 1996-2005 to discover ingredients of solution
     15 man-years to implement an industrial product
     3 patents pending, several more in pipeline

  5. SemmleCode: the power of .QL

  6. The query language .QL
     • Object-oriented, for creating libraries of queries
     • Recursive queries, as in logic programming
     • Familiar syntax to Java and SQL developers
     • On top of any traditional relational database
     • Syntax-highlighting, error-checking and auto-completion

  7. How it works
     [Diagram. Labels: XML files, RDBMS, .QL library, java / jar, .QL query, bytecode for search, procedural SQL, template for RDBMS, Semmle optimiser]

  8. Demo
     • The source we shall explore:
       - Alfresco: Enterprise Content Management
       - Spring: Java/JEE Application Framework
       - Builds on Tomcat, JBoss, …
     Vital statistics: 50553 Java methods; 6647 Java types; 516 XML files
     • Demo parts:
       - out-of-the-box
       - writing your own queries
       - querying XML config files

  9. Using SemmleCode out-of-the-box
     115 pre-packaged queries
     Find common bug patterns: e.g. compareTo/equals, cloning, serialisation, internationalization
     Compute metrics: 42 different metrics, including Robert Martin’s package metrics
     Examine dependencies: e.g. cyclic package dependencies
     • Visualization:
       - pie charts, bar charts, tables, graphs, warnings/errors
       - easy navigation to source
       - exportable for generating reports

  10. Writing queries of your own: select
      from Method m
      where m.fromSource() and
            m.hasName("compareTo") and
            not m.getDeclaringType().getAMethod().hasName("equals")
      select m, "missing equals?"
      In general:
      from <variable-declarations>
      where <conditions>
      select <results>
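To make the from/where/select reading concrete, here is a minimal Python sketch of the same query evaluated over a tiny in-memory model. The `Method` record, the sample data and the helper `methods_of` are hypothetical stand-ins for the relations a .QL query ranges over; they are not SemmleCode's schema or API.

```python
from dataclasses import dataclass

# Hypothetical toy model: these records stand in for the database
# relations that a .QL query ranges over.
@dataclass(frozen=True)
class Method:
    name: str
    declaring_type: str
    from_source: bool

METHODS = [
    Method("compareTo", "Version", True),
    Method("equals", "Version", True),
    Method("compareTo", "Money", True),      # no equals: should be reported
    Method("compareTo", "LibDate", False),   # not from source: ignored
]

def methods_of(type_name):
    """All methods declared by a type (plays the role of getAMethod)."""
    return [m for m in METHODS if m.declaring_type == type_name]

def missing_equals():
    """from Method m where ... select m, "missing equals?" as a comprehension."""
    return [
        (m, "missing equals?")                      # select
        for m in METHODS                            # from
        if m.from_source                            # where
        and m.name == "compareTo"
        and not any(o.name == "equals" for o in methods_of(m.declaring_type))
    ]

results = missing_equals()
for method, message in results:
    print(method.declaring_type, method.name, message)
```

The comprehension mirrors the three clauses one for one: the `for` is the `from`, the `if` chain is the `where`, and the tuple is the `select`.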

  11. Writing queries of your own: aggregates
      select sum(CompilationUnit cu | cu.fromSource() | cu.getNumberOfLinesOfCode())
      In general: agg(T1 x1, …, Tn xn | condition | expr)
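The three-part aggregate form (variables | condition | expression) can be read as filter, then map, then combine. The Python sketch below illustrates that reading only; the `CompilationUnit` tuple, the sample figures and the helper `agg` are assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical sample data; the field names are chosen for this sketch.
CompilationUnit = namedtuple("CompilationUnit", "name from_source lines_of_code")

UNITS = [
    CompilationUnit("NodeService.java", True, 120),
    CompilationUnit("Version.java", True, 45),
    CompilationUnit("Vendor.class", False, 300),   # not from source
]

def agg(combine, rows, condition, expr):
    """agg(T x | condition | expr): keep rows satisfying the condition,
    evaluate expr on each, and combine the values."""
    return combine(expr(row) for row in rows if condition(row))

# select sum(CompilationUnit cu | cu.fromSource() | cu.getNumberOfLinesOfCode())
total_loc = agg(sum, UNITS, lambda cu: cu.from_source, lambda cu: cu.lines_of_code)

# The same shape works for other aggregates, for example max.
largest = agg(max, UNITS, lambda cu: cu.from_source, lambda cu: cu.lines_of_code)

print(total_loc, largest)
```

Because the range, the condition and the expression all travel inside the aggregate, no separate group-by clause is needed, and an aggregate can appear inside another one.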

  12. Writing queries of your own: recursion
      from RefType s, RefType t, RefType it
      where it.hasName("PasswordInputTag") and
            it.hasSupertype*(s) and
            it.hasSupertype*(t) and
            t.hasSupertype(s)
      select t, s
      In general, can write recursive predicate definitions
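The starred call denotes the reflexive transitive closure of the supertype relation. One way to picture its evaluation is as a fixpoint computation, sketched below in Python over a made-up hierarchy. The type names other than `PasswordInputTag` and the function names are hypothetical, and this is not a description of how the Semmle engine evaluates recursion.

```python
# Hypothetical direct-supertype relation as (subtype, supertype) pairs.
DIRECT_SUPERTYPE = {
    ("PasswordInputTag", "InputTag"),
    ("InputTag", "HtmlTag"),
    ("HtmlTag", "Object"),
    ("Button", "HtmlTag"),       # unrelated branch, must not be reported
}

def supertypes_star(start):
    """Reflexive transitive closure: start plus everything reachable
    through direct supertype edges, computed as a fixpoint."""
    reached = {start}
    frontier = {start}
    while frontier:
        frontier = {s for (t, s) in DIRECT_SUPERTYPE if t in frontier} - reached
        reached |= frontier
    return reached

def hierarchy_edges(type_name):
    """Pairs (t, s) where both lie above type_name and s is a direct supertype of t."""
    above = supertypes_star(type_name)
    return sorted((t, s) for (t, s) in DIRECT_SUPERTYPE if t in above and s in above)

edges = hierarchy_edges("PasswordInputTag")
for t, s in edges:
    print(t, "->", s)
```

The result is the set of edges needed to draw the inheritance graph above the chosen type.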

  13. Queries in .QL
      from-where-select: autocompletion, typechecking, emptiness tests
      aggregates: arbitrary nesting, no group-by needed
      recursion: implicit with chaining; or explicit

  14. Defining new classes in .QL
      class ClassAttribute extends XMLAttribute {
        ClassAttribute() { this.getName() = "class" }
        string getClassName() { this.getValue() = result }
        RefType getType() { result.getQualifiedName() = this.getClassName() }
        predicate noType() { not exists(this.getType()) }
      }
      from ClassAttribute ca
      where ca.noType() and
            ca.getClassName().matches("org.alfresco%")
      select ca, ca.getClassName() + " not found"
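Read as logic, the class above is a filter on XML attributes plus a few derived relations. The following Python sketch shows that reading on invented data; `XMLAttribute`, `KNOWN_TYPES` and the function names are assumptions for the example, and `startswith` stands in for the `matches("org.alfresco%")` pattern.

```python
from dataclasses import dataclass

# Hypothetical data: XML attributes from config files and the set of
# qualified type names known to the code base.
@dataclass(frozen=True)
class XMLAttribute:
    name: str
    value: str

ATTRIBUTES = [
    XMLAttribute("class", "org.alfresco.repo.NodeService"),
    XMLAttribute("class", "org.alfresco.repo.Missing"),
    XMLAttribute("class", "org.example.Other"),
    XMLAttribute("id", "org.alfresco.repo.Missing"),
]
KNOWN_TYPES = {"org.alfresco.repo.NodeService"}

def is_class_attribute(attr):
    """The "constructor": the property that makes an attribute a ClassAttribute."""
    return attr.name == "class"

def get_type(attr):
    """A method as a relation: the set of results may be empty."""
    return {t for t in KNOWN_TYPES if t == attr.value}

def no_type(attr):
    """A predicate: holds when get_type has no result."""
    return not get_type(attr)

report = [
    (attr, attr.value + " not found")
    for attr in ATTRIBUTES
    if is_class_attribute(attr)
    and no_type(attr)
    and attr.value.startswith("org.alfresco")
]

for _, message in report:
    print(message)
```

Only the `class` attribute naming an unknown `org.alfresco` type is reported; the `id` attribute with the same value never qualifies as a `ClassAttribute`.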

  15. Classes in .QL
      classes are logical properties: “constructor” specifies characteristic property
      methods: body is relation between this, result and parameters; more than one result allowed
      predicates: methods without a result; body is relation between this and parameters
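The distinction between methods with several results and predicates with none can be sketched in Python with a generator and a boolean function. The `DECLARES` relation and the function names are hypothetical, chosen only to illustrate the relational reading.

```python
# Hypothetical relation: (type name, method name) pairs.
DECLARES = {
    ("Version", "compareTo"),
    ("Version", "equals"),
    ("Money", "compareTo"),
}

def get_a_method(type_name):
    """A method with many results: yields every name related to type_name.
    Yielding nothing is a valid outcome, not an error."""
    for declaring_type, method_name in sorted(DECLARES):
        if declaring_type == type_name:
            yield method_name

def has_name(value, name):
    """A predicate: it has no result, it simply holds or does not."""
    return value == name

# Chaining: the condition holds if some result of get_a_method satisfies has_name.
types_with_equals = {
    declaring_type
    for declaring_type, _ in DECLARES
    if any(has_name(m, "equals") for m in get_a_method(declaring_type))
}

print(list(get_a_method("Version")), types_with_equals)
```

In this reading a chained call such as `getDeclaringType().getAMethod().hasName("equals")` succeeds when at least one combination of intermediate results satisfies the final predicate.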

  16. The key points of .QL
      designed for creating libraries of queries
      • classes are predicates
      • inheritance is implication
      • nondeterministic expressions
      recursion with super-simple semantics
      syntax familiar to SQL and Java programmers
      excellent error checking and IDE integration

  17. Concluding remarks

  18. Couldn’t you use LINQ instead of .QL?
      • Different design goals: ORM versus libraries of queries
      • LINQ does not provide recursion
      • LINQ cannot do the optimisations across multiple queries that are key to efficiency in .QL
      “Fortunately, there is light in the darkness. Based on decades of programming language research, the brilliant team at Semmle has created an elegant, industrial strength object-oriented query language called .QL with full support for recursive queries and aggregation… .QL has all the requisites to become a runaway success.” (Erik Meijer, Creator of LINQ, Microsoft)

  19. Too good to be true?
      Jeff Ullman, 1991: It is not possible for a query language to be seriously logical and seriously object-oriented at the same time.
      Key breakthroughs are Semmle’s proprietary technology:
      - design of .QL
      - optimisations on “bytecode for search”

  20. Wrapping up
      Java is not enough: source code analysis tools must process a multitude of artefacts
      libraries of queries: a means to achieve such heterogeneous tools
      .QL: object-oriented queries over trees and graphs, made fast and easy
