Review of the paper "Semantic Web Enabled Software Analysis", which proposes a software comprehension framework built on Semantic Web technologies such as OWL, RDF, and SPARQL for code analysis and visualization. Covers the experimental evaluation tasks and concludes with the benefits and limitations of using the Semantic Web for software analysis.
Samad Paydar, Web Technology Lab., Ferdowsi University of Mashhad, 10th August 2011. This is a review of the paper: "Semantic Web Enabled Software Analysis", Jonas Tappolet, Christoph Kiefer, Abraham Bernstein, Dynamic and Distributed Information Systems, University of Zurich, Switzerland. Journal of Web Semantics, 2010.
Outline • Introduction • Software ontology models • Semantic web query methods for software analysis • Experimental evaluation • Conclusion
Introduction • For software to be developed, maintained, and evolved, it must be understood • How the code works • Developers’ decisions • Some reasons why this is hard • Development team changes • Programmers forget what they have done • Undocumented code • Outdated comments • Multiple versions
Introduction • Therefore a code comprehension framework is needed • Mainly composed of two major steps • Converting source code to an internal representation • Performing queries
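The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it parses a Python snippet with the standard `ast` module rather than using the paper's Java/FAMIX tooling, it stores facts as plain (subject, predicate, object) tuples rather than OWL, and the predicate names (`isA`, `hasMethod`), the sample classes, and the helper functions are hypothetical.

```python
import ast

# Hypothetical sample source; the paper analyzes Java, not Python.
SOURCE = '''
class Account:
    def deposit(self, amount):
        pass
    def withdraw(self, amount):
        pass

class Audit:
    def log(self, msg):
        pass
'''

def to_triples(source):
    """Step 1: convert source code into an internal representation (triples)."""
    triples = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            triples.add((node.name, "isA", "Class"))
            for item in node.body:
                if isinstance(item, ast.FunctionDef):
                    triples.add((node.name, "hasMethod", item.name))
    return triples

def query(triples, subject=None, predicate=None, obj=None):
    """Step 2: match a triple pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )

triples = to_triples(SOURCE)
methods_of_account = [o for _, _, o in query(triples, "Account", "hasMethod")]
```

Here `query(triples, "Account", "hasMethod")` plays the role of a single triple pattern; a real triple store would evaluate full SPARQL graph patterns instead.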
Introduction • Further • Open source movement • Software complexity • Libraries depend on other libraries • Software that is developed locally is a node in a world-wide network of interlinked source code • Global Call Graph
Introduction • Each node in this cloud should exhibit its information in an open, accessible and uniquely identifiable way • Therefore “we propose the usage of semantic technologies such as OWL, RDF and SPARQL as a software comprehension framework with the abilities to be interlinked with other projects”
Software ontology model • Three models for different aspects of code • Software Ontology Model (SOM) • Bug Ontology Model (BOM) • Version Ontology Model (VOM) • Connected to related ontologies • DOAP • SIOC • FOAF • WF
SOM: Software Ontology Model • Based on FAMIX (FAMOOS Information Exchange Model) • A programming language independent model for representing object-oriented source code
VOM: Version Ontology Model • For specifying the relations between files, releases, and revisions of software projects • Based on the data model of Subversion
BOM: Bug Ontology Model • Based on the bug-tracking system Bugzilla
Query Methods • Two non-standard extensions of SPARQL • iSPARQL (Imprecise SPARQL) • SPARQL-ML (SPARQL Machine learning)
iSPARQL • Introduces the idea of “virtual triples” • Are not matched against the underlying ontology graph, but used to configure similarity joins • Which pairs of variables should be joined and compared using a certain type of similarity measure
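The idea of a similarity join can be sketched as follows. This is not iSPARQL syntax and does not reproduce its similarity measures: the sketch assumes a string-similarity measure from Python's standard `difflib`, and the class names, the threshold, and the function names are hypothetical. The `threshold` and the choice of `similarity` function stand in for what a virtual triple would configure.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Assumed measure: difflib's ratio in [0, 1]; iSPARQL offers its own measures."""
    return SequenceMatcher(None, a, b).ratio()

def similarity_join(left, right, threshold):
    """Pair every left item with every right item whose similarity reaches the threshold."""
    pairs = []
    for a in left:
        for b in right:
            score = similarity(a, b)
            if score >= threshold:
                pairs.append((a, b, score))
    # Most similar pairs first.
    return sorted(pairs, key=lambda p: p[2], reverse=True)

# Hypothetical class names from two releases.
release_a = ["CompareEditor", "DiffViewer", "PatchWizard"]
release_b = ["CompareEditorInput", "DiffViewer", "HistoryPage"]

matches = similarity_join(release_a, release_b, threshold=0.8)
```

Unlike an exact join, the unchanged class and the renamed-looking class are both matched, while unrelated names fall below the threshold.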
SPARQL-ML • An extension of SPARQL with knowledge discovery capabilities • A tool for efficient relational data mining on Semantic Web data • Enables Statistical Relational Learning (SRL) methods such as Relational Probability Trees (RPTs) and Relational Bayesian Classifiers (RBCs)
SPARQL-ML • Learning phase (building prediction model)
SPARQL-ML • Test phase (making prediction)
Experimental Evaluation • Four years (2004-2007) of the proceedings of the ICSE Workshop on Mining Software Repositories (MSR) are surveyed • The most actively investigated software analysis tasks are determined
Experimental Evaluation • Dataset: 206 releases of the org.eclipse.compare plug-in for Eclipse (average of about 150 Java classes per version) + bug tracking information • Exported to OWL
Experimental Evaluation • Task 1: software evolution analysis • Applicability of iSPARQL to software evolution visualization (i.e. visualization of code changes for a certain time span) • Compared all the classes of one major release with those of another major release using different similarity strategies
Experimental Evaluation • Task 2: computing source code metrics • Calculating OO software design metrics
Experimental Evaluation • Changing methods (CM) and changing classes (CC) • A method that is invoked by many other methods has a higher risk of causing defects in the presence of change
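A minimal sketch of these two metrics, assuming CM counts the distinct methods that invoke a given method and CC counts the distinct classes those callers belong to (the slides do not spell out the definitions). The invocation facts and the function name are hypothetical; in the paper such facts would come from the ontology rather than a Python list.

```python
# Hypothetical call facts: (caller class, caller method, callee method).
invocations = [
    ("Editor", "open", "Buffer.read"),
    ("Editor", "save", "Buffer.write"),
    ("Viewer", "show", "Buffer.read"),
    ("Viewer", "refresh", "Buffer.read"),
    ("Editor", "open", "Buffer.read"),  # repeated call site, counted once
]

def changing_metrics(invocations, callee):
    """Return (CM, CC) for one callee under the assumed definitions."""
    callers = {(cls, meth) for cls, meth, target in invocations if target == callee}
    cm = len(callers)                       # distinct calling methods
    cc = len({cls for cls, _ in callers})   # distinct classes of those callers
    return cm, cc

cm, cc = changing_metrics(invocations, "Buffer.read")
```

A method with high CM and CC is one whose change would ripple to many callers, which is the risk the slide describes.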
Experimental Evaluation • Number of methods (NOM) and number of attributes (NOA) • As indicators of God classes
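Counting NOM and NOA amounts to a grouped count over class membership facts, sketched below. The triples, the predicate names (`hasMethod`, `hasAttribute`), and the thresholds are hypothetical and chosen only for illustration; the slides do not state which cut-off values flag a God class.

```python
from collections import Counter

# Hypothetical facts in (subject, predicate, object) form.
triples = [
    ("Manager", "hasMethod", "load"),
    ("Manager", "hasMethod", "save"),
    ("Manager", "hasMethod", "render"),
    ("Manager", "hasMethod", "validate"),
    ("Manager", "hasAttribute", "cache"),
    ("Manager", "hasAttribute", "config"),
    ("Manager", "hasAttribute", "session"),
    ("Point", "hasMethod", "norm"),
    ("Point", "hasAttribute", "x"),
    ("Point", "hasAttribute", "y"),
]

def count_by_class(triples, predicate):
    """Count how many triples with the given predicate each class has."""
    return Counter(s for s, p, _ in triples if p == predicate)

def god_class_candidates(triples, nom_threshold, noa_threshold):
    """Classes whose NOM and NOA both reach the (assumed) thresholds."""
    nom = count_by_class(triples, "hasMethod")
    noa = count_by_class(triples, "hasAttribute")
    return sorted(
        c for c in set(nom) | set(noa)
        if nom[c] >= nom_threshold and noa[c] >= noa_threshold
    )

nom = count_by_class(triples, "hasMethod")
noa = count_by_class(triples, "hasAttribute")
candidates = god_class_candidates(triples, nom_threshold=3, noa_threshold=3)
```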
Experimental Evaluation • Number of bugs (NOB) and number of revisions (NOR)
Experimental Evaluation • Task 3: detection of code smells • Task 4: defect and evolution density • Task 5: bug prediction
Conclusion • A novel approach to analyzing software systems using Semantic Web technologies • EvoOnt provides the basis for representing source code and metadata in OWL • This representation reduces analysis tasks to simple queries in SPARQL (or its extensions) • A limitation: loss of some information due to the use of a FAMIX-based ontology model
Conclusion • Language constructs like if-else are not modeled • Measurements cannot be conducted at the level of statements • One of the greatest impediments to widespread use of EvoOnt: the current lack of high-performance, industrial-strength triple stores and reasoning engines