Semantic Technologies for Software Analysis

Samad Paydar, Web Technology Lab., Ferdowsi University of Mashhad, 10th August 2011. This is a review of the paper: Semantic Web Enabled Software Analysis. Jonas Tappolet, Christoph Kiefer, Abraham Bernstein. Dynamic and Distributed Information Systems, University of Zurich, Switzerland. Journal of Web Semantics, 2010

Outline • Introduction • Software ontology models • Semantic web query methods for software analysis • Experimental evaluation • Conclusion

Introduction • For software to be developed, maintained and evolved, it must be understood • How the code works • Developers’ decisions • Some reasons why this understanding is lost • Development team changes • Programmers forget what they have done • Undocumented code • Outdated comments • Multiple versions

Introduction • Therefore a code comprehension framework is needed • It consists of two major steps • Converting source code to an internal representation • Performing queries on that representation
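The two steps above can be illustrated with a minimal sketch. It is not the approach of the reviewed paper, which works on Java code and OWL; it assumes Python source as input, uses the standard `ast` module for step 1, and stores facts as plain (subject, predicate, object) tuples. The predicate names `isA` and `hasMethod` and the class `Comparer` are hypothetical.

```python
import ast


def extract_facts(source):
    """Step 1: convert source code into (subject, predicate, object) facts."""
    facts = set()
    tree = ast.parse(source)
    for cls in (n for n in tree.body if isinstance(n, ast.ClassDef)):
        facts.add((cls.name, "isA", "Class"))
        for fn in (n for n in cls.body if isinstance(n, ast.FunctionDef)):
            method = f"{cls.name}.{fn.name}"
            facts.add((method, "isA", "Method"))
            facts.add((cls.name, "hasMethod", method))
    return facts


def query(facts, subject=None, predicate=None, obj=None):
    """Step 2: return the facts matching a pattern; None is a wildcard."""
    return sorted(
        f for f in facts
        if (subject is None or f[0] == subject)
        and (predicate is None or f[1] == predicate)
        and (obj is None or f[2] == obj)
    )


# Hypothetical input program used only for this illustration.
SOURCE = '''
class Comparer:
    def compare(self, a, b):
        return a == b

    def reset(self):
        pass
'''

facts = extract_facts(SOURCE)
methods = query(facts, subject="Comparer", predicate="hasMethod")
print(methods)
```

Once the code is available as facts, a comprehension question such as "which methods does this class have?" becomes a pattern match rather than a manual reading of the source.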

Introduction • Further • Open source movement • Software complexity • Libraries that depend on other libraries • Software that is developed locally is a node in a world-wide network of interlinked source code • Global Call Graph

Introduction • Each node in this cloud should exhibit its information in an open, accessible and uniquely identifiable way • Therefore “we propose the usage of semantic technologies such as OWL, RDF and SPARQL as a software comprehension framework with the abilities to be interlinked with other projects”

Software ontology model • Three models for different aspects of code • Software Ontology Model (SOM) • Bug Ontology Model (BOM) • Version Ontology Model (VOM) • Connected to related ontologies • DOAP • SIOC • FOAF • WF

SOM: Software Ontology Model • Based on FAMIX (FAMOOS Information Exchange Model) • A programming language independent model for representing object-oriented source code

VOM: Version Ontology Model • For specifying the relations between files, releases, and revisions of software projects • Based on the data model of Subversion

BOM: Bug Ontology Model • Based on the bug-tracking system Bugzilla

Query Methods • Two non-standard extensions of SPARQL • iSPARQL (Imprecise SPARQL) • SPARQL-ML (SPARQL Machine Learning)

iSPARQL • Introduces the idea of “virtual triples” • Virtual triples are not matched against the underlying ontology graph, but are used to configure similarity joins • They specify which pairs of variables should be joined and compared, and with which type of similarity measure
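The idea of a similarity join can be sketched in plain Python. This is not iSPARQL syntax or its implementation: it assumes two lists of class names from two releases, uses `difflib.SequenceMatcher` as a stand-in similarity measure, and keeps every pair whose score reaches a threshold. The class names and the threshold of 0.7 are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in similarity measure: ratio of matching characters in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def similarity_join(left, right, threshold):
    """Join two collections on similarity instead of exact equality."""
    pairs = []
    for l in left:
        for r in right:
            score = similarity(l, r)
            if score >= threshold:
                pairs.append((l, r, round(score, 2)))
    # Most similar pairs first.
    return sorted(pairs, key=lambda p: -p[2])


# Hypothetical class names from two releases of the same project.
RELEASE_A = ["CompareEditor", "DiffViewer", "PatchWizard"]
RELEASE_B = ["CompareEditorInput", "DiffViewer", "MergeTool"]

pairs = similarity_join(RELEASE_A, RELEASE_B, threshold=0.7)
for left, right, score in pairs:
    print(left, right, score)
```

In this sketch the arguments of `similarity_join` play the role that virtual triples play in a query: they say which values to compare, with which measure, and how strict the match must be.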

SPARQL-ML • An extension of SPARQL with knowledge discovery capabilities • A tool for efficient relational data mining on Semantic Web data • Enables Statistical Relational Learning (SRL) methods such as Relational Probability Trees (RPTs) and Relational Bayesian Classifiers (RBCs)

SPARQL-ML • Learning phase (building prediction model)

SPARQL-ML • Test phase (making prediction)
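The split into a learning phase and a test phase can be sketched as follows. This is not SPARQL-ML and not an RPT or RBC: it is a plain categorical naive Bayes classifier used as a stand-in, and the features `many_revisions` and `past_bugs`, the labels, and the training rows are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def learn(rows):
    """Learning phase: build a prediction model from (features, label) rows."""
    label_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)  # (label, feature) -> counts per value
    for features, label in rows:
        for name, value in features.items():
            value_counts[(label, name)][value] += 1
    return label_counts, value_counts


def predict(model, features):
    """Test phase: apply the learned model to an unseen instance."""
    label_counts, value_counts = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)
        for name, value in features.items():
            seen = value_counts[(label, name)]
            # Laplace smoothing, assuming every feature is binary (2 values).
            score += math.log((seen[value] + 1) / (count + 2))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: one row per class of an earlier release.
TRAIN = [
    ({"many_revisions": True, "past_bugs": True}, "buggy"),
    ({"many_revisions": True, "past_bugs": False}, "buggy"),
    ({"many_revisions": False, "past_bugs": True}, "buggy"),
    ({"many_revisions": False, "past_bugs": False}, "clean"),
    ({"many_revisions": False, "past_bugs": False}, "clean"),
    ({"many_revisions": True, "past_bugs": False}, "clean"),
]

model = learn(TRAIN)
risky = predict(model, {"many_revisions": True, "past_bugs": True})
quiet = predict(model, {"many_revisions": False, "past_bugs": False})
print(risky, quiet)
```

The model is built once and then reused for every prediction, which mirrors the separation between the two phases on the slides.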

Experimental Evaluation • Four years (2004-2007) of the proceedings of the ICSE Workshop on Mining Software Repositories (MSR) were surveyed • The most actively investigated software analysis tasks were identified

Experimental Evaluation • Dataset: 206 releases of the org.eclipse.compare plug-in for Eclipse (average of about 150 Java classes per version) + bug tracking information • Exported to OWL

Experimental Evaluation • Task 1: software evolution analysis • Applicability of iSPARQL to software evolution visualization (i.e. visualization of code changes for a certain time span) • All classes of one major release were compared with those of another major release, using different similarity strategies

Experimental Evaluation • Task 2: computing source code metrics • Calculating OO software design metrics

Experimental Evaluation • Changing methods (CM) and changing classes (CC) • A method that is invoked by many other methods has a higher risk of causing defects in the presence of change
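As a rough illustration, CM and CC can be computed from a list of invocations. The sketch assumes the common definitions (CM: the distinct methods that call the measured method; CC: the classes in which those callers are defined), which the slides do not spell out. The `Class.method` naming scheme and the invocation data are hypothetical.

```python
def changing_methods(invocations, target):
    """CM: distinct methods that invoke the target method."""
    return {
        caller
        for caller, callee in invocations
        if callee == target and caller != target
    }


def changing_classes(invocations, target):
    """CC: classes in which the callers of the target method are defined."""
    return {caller.split(".")[0] for caller in changing_methods(invocations, target)}


# Hypothetical (caller, callee) pairs; method ids follow "Class.method".
INVOCATIONS = [
    ("Editor.open", "Differ.diff"),
    ("Editor.refresh", "Differ.diff"),
    ("Viewer.show", "Differ.diff"),
    ("Viewer.show", "Editor.open"),
    ("Editor.open", "Differ.diff"),  # repeated call, counted once
]

cm = changing_methods(INVOCATIONS, "Differ.diff")
cc = changing_classes(INVOCATIONS, "Differ.diff")
print(len(cm), len(cc))
```

A change to `Differ.diff` in this toy data could affect three calling methods spread over two classes, which is the kind of risk the two metrics are meant to expose.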

Experimental Evaluation • Number of methods (NOM) and number of attributes (NOA) • As indicators of God classes
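A minimal sketch of how NOM and NOA could flag God class candidates is given below. It assumes members are available as (class, kind, name) tuples; the class names and both thresholds are hypothetical and are not taken from the paper.

```python
from collections import Counter


def count_members(members, kind):
    """Count members of the given kind ("method" or "attribute") per class."""
    return Counter(cls for cls, k, _ in members if k == kind)


def god_class_candidates(members, min_methods, min_attributes):
    """Classes whose NOM and NOA both reach the given thresholds."""
    nom = count_members(members, "method")
    noa = count_members(members, "attribute")
    return sorted(
        cls for cls in nom
        if nom[cls] >= min_methods and noa[cls] >= min_attributes
    )


# Hypothetical members: one large class and one small class.
MEMBERS = (
    [("Manager", "method", f"m{i}") for i in range(6)]
    + [("Manager", "attribute", f"a{i}") for i in range(4)]
    + [
        ("Point", "method", "move"),
        ("Point", "method", "reset"),
        ("Point", "attribute", "x"),
        ("Point", "attribute", "y"),
    ]
)

nom = count_members(MEMBERS, "method")
noa = count_members(MEMBERS, "attribute")
candidates = god_class_candidates(MEMBERS, min_methods=5, min_attributes=3)
print(nom, noa, candidates)
```

The thresholds are a design choice: they would need to be tuned to the project, since a count that is large for one code base may be ordinary for another.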

Experimental Evaluation • Number of bugs (NOB) and number of revisions (NOR)

Experimental Evaluation • Task 3: detection of code smells • Task 4: defect and evolution density • Task 5: bug prediction

Conclusion • A novel approach to analyzing software systems using Semantic Web technologies • EvoOnt provides the basis for representing source code and metadata in OWL • This representation reduces analysis tasks to simple queries in SPARQL (or its extensions) • A limitation: some information is lost because the ontology model is based on FAMIX

Conclusion • Language constructs like if-else are not modeled • Measurements cannot be conducted at the level of statements • One of the greatest impediments to widespread use of EvoOnt: the current lack of high-performance, industrial-strength triple stores and reasoning engines
