Searching through Source Code using CVS Comments Annie Chen, Eric Chou, Joshua Wong, Andrew Yao, Qing Zhang, Shao Zhang, Amir Michail University of New South Wales
Motivation • Current methods of searching for source code have a number of drawbacks: • They usually require some knowledge of the source code • They rely on programming style • They rely on good documentation of the code • They match only individual lines, not sections of code
CVS – Concurrent Versions System • Allows multiple developers to work together on a central repository • Developers make their changes and commit the files to the repository along with a description of the changes made • CVS keeps track of these changes • Widely used in the open source community
CVS Comments - Advantages • A good source of documentation • Each comment is tied to the lines of code it refers to • A comment describes not only the change but also its purpose (added features, bug fixes) • A comment holds for many future versions • A fragment of code can be involved in multiple commits, which provides multiple descriptions of it
CVSSearch • Searches for source code through CVS comments • Returns the source code fragment, in the most recent version of the file, that the matched comment refers to
Algorithm Overview • For each revision i with comment C, find the lines l modified in i in each file of that revision • Propagate l to the corresponding lines in the most recent version of the file • Associate C with those lines in the most recent version
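The three steps above can be sketched in a few lines of Python. This is only an illustration, not the CVSSearch implementation: it assumes each revision is already available as a (comment, lines) pair, uses the standard difflib module in place of cvs diff, and assumes that a modified line starts over with only the new comment. The function name propagate_comments is hypothetical.

```python
import difflib

def propagate_comments(revisions):
    """Map each line of the newest version to the comments that touched it.

    revisions: list of (comment, lines) pairs for one file, oldest first.
    Returns a list with one entry per line of the newest version; each
    entry is the list of comments associated with that line.
    """
    prev_lines, prev_comments = [], []
    for comment, lines in revisions:
        cur_comments = [[] for _ in lines]
        matcher = difflib.SequenceMatcher(a=prev_lines, b=lines, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines carry their earlier comments forward.
                for offset in range(i2 - i1):
                    cur_comments[j1 + offset] = list(prev_comments[i1 + offset])
            elif tag in ("replace", "insert"):
                # Assumption: lines added or modified in this revision are
                # associated with this revision's comment only.
                for j in range(j1, j2):
                    cur_comments[j] = [comment]
            # "delete": the lines are gone, so their comments are dropped.
        prev_lines, prev_comments = lines, cur_comments
    return prev_comments

history = [
    ("Initial import", ["a", "b", "c"]),
    ("Fix bug in b", ["a", "b2", "c"]),
    ("Add feature d", ["a", "b2", "c", "d"]),
]
for line, comments in zip(history[-1][1], propagate_comments(history)):
    print(line, comments)
```

Walking the history oldest to newest means every comment ends up attached to whatever its lines have become in the latest version, which is the version a search result would show.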
Useful CVS commands • cvs log – displays all revisions and their comments for each file • cvs diff – shows the differences between versions of a file
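To show how the revision comments might be pulled out of cvs log output, here is a hedged Python sketch. The sample text is made up for illustration and only approximates the layout cvs log prints for a single file; the separator lengths, the file path, the author names, and the function name parse_cvs_log are all assumptions, not part of CVSSearch.

```python
import re

# Assumed separators: a dashed line between revisions and a line of
# equals signs at the end of each file's log.
REV_SEP = "-" * 28
FILE_SEP = "=" * 77

# Illustrative text resembling what "cvs log src/main.c" might print.
SAMPLE_LOG = "\n".join([
    "RCS file: /cvsroot/demo/src/main.c,v",
    "Working file: src/main.c",
    "head: 1.2",
    "description:",
    REV_SEP,
    "revision 1.2",
    "date: 2001/03/02 10:15:00;  author: alice;  state: Exp;  lines: +3 -1",
    "Fix crash when the input file is empty",
    REV_SEP,
    "revision 1.1",
    "date: 2001/02/20 09:00:00;  author: bob;  state: Exp;",
    "Initial revision",
    FILE_SEP,
    "",
])

def parse_cvs_log(text):
    """Return (revision, comment) pairs for one file, in log order."""
    body = text.split(FILE_SEP)[0]
    # The first chunk is the file header, so skip it.
    chunks = body.split("\n" + REV_SEP + "\n")[1:]
    entries = []
    for chunk in chunks:
        lines = chunk.strip("\n").split("\n")
        match = re.match(r"revision (\S+)", lines[0])
        if not match:
            continue
        # lines[1] is the date/author line; the remainder is the comment.
        comment = "\n".join(lines[2:]).strip()
        entries.append((match.group(1), comment))
    return entries

for revision, comment in parse_cvs_log(SAMPLE_LOG):
    print(revision, "->", comment)
```

Each revision number found this way could then be paired with its predecessor in a cvs diff call to find the lines that the comment describes.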
Searching • The comments for each line are stored in the MG (Managing Gigabytes) system • MG provides stemming and ranks results against the query using cosine similarity • Queries are passed to MG, which returns the matching lines in ranked order • We group the lines and sort the groups by their overall ranking
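The ranking and grouping steps above can be illustrated with a small stand-in for MG. This sketch does not use MG: it scores each line with a plain term-frequency cosine similarity, omits stemming, treats runs of adjacent matching lines as a group, and assumes a group's overall ranking is the sum of its line scores. The function names are hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(line_comments, query):
    """Rank groups of adjacent lines whose comments match the query.

    line_comments: entry i is the comment text attached to line i (0-based).
    Returns (score, first_line, last_line) tuples, best group first.
    """
    q = Counter(query.lower().split())
    scores = [cosine(q, Counter(text.lower().split())) for text in line_comments]
    groups, start = [], None
    # The trailing 0.0 is a sentinel that closes a group ending on the last line.
    for i, score in enumerate(scores + [0.0]):
        if score > 0 and start is None:
            start = i
        elif score == 0 and start is not None:
            # Assumption: a group's overall score is the sum of its line scores.
            groups.append((sum(scores[start:i]), start, i - 1))
            start = None
    return sorted(groups, key=lambda g: g[0], reverse=True)

comments = [
    "initial import",
    "fix parser bug",
    "fix parser bug",
    "initial import",
    "add parser option",
]
for score, first, last in search(comments, "parser bug"):
    print(f"lines {first}-{last}: {score:.3f}")
```

Grouping returns a fragment of code rather than isolated lines, which addresses the drawback of tools that match only single lines.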
Problems with the Tool • Troublesome to set up and install • Does not update when the repository changes
Obtaining CVSSearch • Open source, released under the GPL • http://cvssearch.sourceforge.net/
Related Work • Lexical tools: grep, awk, lex • Natural language documentation • CVS tools: bonsai, WebCVS
Future Work / Work in Progress • Program understanding feature that describes arbitrary lines in the code using CVS comments collectively • Mining CVS commits to look for class/function usage patterns
People Involved • Amir Michail • Andrew Y. Yao • Eric Chou • Joshua Wong • Qing Zhang • Shao Zhang • Annie Chen