Who Copied Who?

Who Copied Who? Gordon Lingard, School of Software, University of Technology, Sydney, glingard@it.uts.edu.au. The Problem: Students copying computer code off other students within a subject is a significant problem. This is different from the problem of students copying from an external source.

Presentation Transcript


  1. Who Copied Who? Gordon Lingard School of Software University of Technology, Sydney glingard@it.uts.edu.au

  2. The Problem • Students copying computer code off other students within a subject is a significant problem. • This is different from the problem of students copying from an external source. • Programs exist for determining whether code is a copy. • They don’t answer the question of who created the code and who copied it. • This presentation outlines a solution to this problem.

  3. Presentation Outline • What is Computer Programming • Detection System • Assignment Submission System • Combining the Systems • Results and Conclusions • Questions

  4. What Is Computer Programming? Computer Code • Computer programs are written in a formal programming language that looks like a cross between mathematics and natural language. • They have a very strict syntax structure. • The language is used to construct a large set of carefully orchestrated instructions that become the program. • Student programs are typically less than a thousand instructions. Commercial programs can be tens of thousands to millions of instructions. • Larger programs are of staggering complexity.

  5. What Is Computer Programming? C++ Code Example

  6. What Is Computer Programming? Why Learning to Program is Hard • Learning issues students face • Learning the language. • Learning how to use the language to create a program to do a specified task. • Managing the complexity as programs grow in size. • In the face of these issues, many students are overwhelmed and resort to copying.

  7. Detection System: Problems of detection • Disguise • Simple transformations that change the look of the code without changing what it does. • Combinatorics • n assignments create p = n(n-1)/2 pairs. • 100 assignments = 4950 pairs. • Code Overlap • Two pieces of code designed to do the same thing – about 50% of the code will be common. • Boilerplate code creates many false positives.
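
The pair count on this slide can be checked with a short sketch. The function name `pair_count` and the student IDs are hypothetical, introduced only for illustration; nothing here is taken from the author's detection system.

```python
from itertools import combinations


def pair_count(n: int) -> int:
    """Number of unordered pairs among n assignments: n(n-1)/2."""
    return n * (n - 1) // 2


# Hypothetical submission IDs, used only to enumerate the pairs explicitly.
submissions = [f"student_{i:03d}" for i in range(100)]
pairs = list(combinations(submissions, 2))

print(pair_count(100), len(pairs))
```

Because the count grows quadratically, doubling the class size roughly quadruples the number of comparisons, which is why a cheap per-program summary is attractive.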

  8. Detection System: Complexity Numbers • Tokenise Code. • Generate Complexity Numbers.

     Program Instructions | Tokenised Instructions | Complexity Numbers
     if (x > y) {         | if(>) {                | 98592
     a[x] = b[1][y];      | [] = [][];             | 112142
     foo(&x, *y);         | (&, *);                | 147716
     :                    | :                      | :
     instr n-1            | tokenised n-1          | complex n-1
     instr n              | tokenised n            | complex n
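
The tokenising step can be sketched roughly as follows. This is a minimal illustration that assumes tokenising means dropping identifiers and numeric literals while keeping keywords and punctuation, which is what the slide's three examples suggest. The `KEYWORDS` set, the regular expression, and the function name `tokenise` are assumptions, and the slides do not say how complexity numbers are derived from the tokens, so that step is not attempted here.

```python
import re

# Assumed subset of C++ keywords to keep; a real tool would need the full list.
KEYWORDS = {"if", "else", "for", "while", "do", "switch", "case", "return"}

# Matches an identifier/keyword, a numeric literal, or any single
# non-space character (operators and punctuation, one character at a time).
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\S")


def tokenise(line: str) -> list:
    """Drop identifiers and numbers; keep keywords and punctuation."""
    kept = []
    for tok in TOKEN_RE.findall(line):
        if tok[0].isalpha() or tok[0] == "_":
            if tok in KEYWORDS:
                kept.append(tok)      # keywords carry structure, so keep them
        elif tok[0].isdigit():
            continue                  # numeric literals are discarded
        else:
            kept.append(tok)          # operators, brackets, semicolons
    return kept


for line in ["if (x > y) {", "a[x] = b[1][y];", "foo(&x, *y);"]:
    print("".join(tokenise(line)))
```

Renaming variables leaves the token stream unchanged, so `if (first > second) {` and `if (x > y) {` tokenise identically. That is the property that defeats the simple disguises described on the previous slide.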

  9. Detection System: Comparing Complexity Numbers • Determine the percentage of numbers common between two programs.
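
One possible reading of this comparison is sketched below. The slide does not say which total the percentage is taken against. This sketch assumes it is the number of unique complexity numbers in the smaller program, and the function name `percent_common` is hypothetical. The sample lists reuse numbers shown elsewhere in the slides purely as illustrative values.

```python
def percent_common(numbers_a, numbers_b) -> float:
    """Percentage of unique complexity numbers shared by two programs.

    Assumption: the percentage is relative to the smaller program's set.
    """
    set_a, set_b = set(numbers_a), set(numbers_b)
    if not set_a or not set_b:
        return 0.0
    shared = set_a & set_b
    return 100.0 * len(shared) / min(len(set_a), len(set_b))


# Illustrative values only; not results from the author's system.
program_a = [98592, 112142, 147716, 62145]
program_b = [98592, 112142, 147716, 57063]
print(percent_common(program_a, program_b))
```

Measuring against the smaller set means a short program copied wholesale into a longer one still scores highly. Other denominators, such as the union of both sets, would be equally defensible.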

  10. Submission System • Used for a number of years in parallel with the detection system. • A formative assessment tool. • Runs students' programs with a suite of tests. • Analyses their code for poor programming practices. • The students can use the results from the tests to refine their assignments and re-submit as often as they like. • The submission system becomes a development environment.

  11. Combining the Systems: Overview • Extract information from the detection system to create a digital fingerprint of an assignment. • The fingerprint helps to uniquely identify a piece of code while being unaffected by minor changes to the code. • Append the fingerprint, along with time and date, to a log of submissions for each student. • Analyse logs to see if fingerprints are appearing between students and use the date/time to determine order of development.

  12. Combining the Systems: Digital Fingerprints • A fingerprint is created by extracting the 6 largest unique complexity numbers from all the numbers a piece of code generates. • These represent the 6 most complicated pieces of the code.

      Assignment code: if (x > y) x = x * 6; else y = x + y; : : *z = a->b[x];
      Complexity numbers: 62145 87219 14067 57063 : : 112103
      Digital fingerprint (6 largest unique complexity numbers in sorted order): 68018 68682 72172 87219 97843 112103
      The fingerprint and date/time are then appended to the log.
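
The fingerprint rule stated on this slide is simple enough to sketch directly. The function name `fingerprint` is hypothetical. The slide elides most of the complexity numbers, so the input list below is invented to be consistent with the numbers that are visible, and is not the author's data.

```python
def fingerprint(complexity_numbers, size=6):
    """Return the `size` largest unique complexity numbers in sorted order."""
    unique_sorted = sorted(set(complexity_numbers))
    return tuple(unique_sorted[-size:])


# Invented input: the visible slide numbers plus the fingerprint values,
# with one duplicate (87219) to show that repeats are counted once.
numbers = [62145, 87219, 14067, 57063, 68018, 68682,
           72172, 97843, 112103, 87219]
print(fingerprint(numbers))
```

Returning a tuple keeps the fingerprint hashable, so it can be used as a dictionary key when logs are compared later.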

  13. Combining the Systems: Submission Logs (figure of example submission logs; not reproduced in the transcript)

  14. Combining the Systems: Comparing Logs • Comparing summary of logs. • The time frames in the comparison make it clear who originated the code, who copied it and when.
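
A log comparison of this kind might look like the sketch below. It assumes each student's log is a list of (timestamp, fingerprint) entries. The function names, student IDs, dates, and the second fingerprint are all hypothetical, and the actual log format used by the author's system is not shown in the transcript.

```python
from datetime import datetime


def first_seen(log):
    """Map each fingerprint to the earliest time it appears in one log."""
    earliest = {}
    for when, fp in log:
        if fp not in earliest or when < earliest[fp]:
            earliest[fp] = when
    return earliest


def shared_fingerprints(logs):
    """Find fingerprints present in more than one student's log.

    logs: {student: [(datetime, fingerprint), ...]}
    Returns {fingerprint: [(first_time, student), ...]} sorted by time.
    """
    seen = {}
    for student, log in logs.items():
        for fp, when in first_seen(log).items():
            seen.setdefault(fp, []).append((when, student))
    return {fp: sorted(hits) for fp, hits in seen.items() if len(hits) > 1}


FP_A = (68018, 68682, 72172, 87219, 97843, 112103)
FP_B = (51200, 60311, 68682, 72172, 87219, 97843)  # invented fingerprint

logs = {  # hypothetical students and timestamps
    "student_a": [(datetime(2024, 3, 1, 10, 0), FP_B),
                  (datetime(2024, 3, 2, 15, 30), FP_A)],
    "student_b": [(datetime(2024, 3, 5, 23, 10), FP_A)],
}

for fp, hits in shared_fingerprints(logs).items():
    print(fp, [(student, when.isoformat()) for when, student in hits])
```

In this invented data, `student_a` shows an earlier, different fingerprint before reaching `FP_A`, while `student_b` first appears with `FP_A` days later. An earliest timestamp on its own only suggests an order of development; the surrounding history of submissions is what makes the case.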

  15. Who Copied Who? Results • Rarely is there collaboration. It is students copying other students. • In cases of copying, the logs almost always make a very clear statement of what has happened and when. • The copying usually involves one student copying off another, sometimes two but rarely more. • Frequently, it is not the final submission that gives away the copying, but earlier submissions. This can be seen in the logs and confirmed by examining the earlier submissions.

  16. Who Copied Who? Conclusions • The system has proved extremely successful in presenting misconduct cases to the Faculty. • The sheer weight of evidence the logs produce often saves time, as students don’t try to bluff their way through the allegation. • This allows the Faculty to shift the focus away from penalty and towards remedial action.

  17. Questions ?
