A Code Clone Oracle

Title: A Code Clone Oracle
Publication Type: Conference Paper
Year of Publication: 2014
Authors: Krutz, DE; Le, W
Secondary Title: Proceedings of the 11th Working Conference on Mining Software Repositories
Place Published: New York, NY, USA
ISBN Number: 978-1-4503-2863-0
Keywords: clone, Clone Oracle, Code Clone Detection, msr data showcase, software engineering

Code clones are functionally equivalent code segments. Detecting code clones is important for identifying bugs, fixes, and software reuse. Code clone detection is also essential for developing fast and precise code search algorithms. However, the challenge of such research is verifying that the detected clones are indeed functionally equivalent, considering that the majority of clones are not textually or even syntactically identical. The goal of this work is to generate a set of method-level code clones with high confidence that future code clone detection and code search tools can use to evaluate their techniques. We selected three open source programs, Apache, Python, and PostgreSQL, and randomly sampled a total of 1536 function pairs. To confirm whether these function pairs are clones and which types of clones they are, we recruited three experts with experience in code clone research and four students with programming experience for manual inspection. To increase confidence in the data, the experts consulted multiple code clone detection tools in reaching consensus. To assist manual inspection, we built a tool that automatically loads function pairs of interest and records the manual inspection results. We found that none of the 66 pairs are textually identical type-1 clones, and 9 pairs are type-4 clones. Our data is available at: http://phd.gccis.rit.edu/weile/data/cloneoracle/.
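The sampling and labeling workflow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state how pairs were drawn or how reviewer judgments were combined, so uniform sampling without replacement and a strict-majority vote are assumptions made here for illustration only. The function names `sample_function_pairs` and `consensus_label`, the label strings, and the placeholder function names are hypothetical and do not come from the authors' tool.

```python
import itertools
import random
from collections import Counter


def sample_function_pairs(functions, k, seed=0):
    """Draw k distinct unordered function pairs uniformly at random.

    Assumption: uniform sampling without replacement; the paper's actual
    sampling procedure is not described in the abstract.
    """
    all_pairs = list(itertools.combinations(functions, 2))
    rng = random.Random(seed)  # seeded so the sample is reproducible
    return rng.sample(all_pairs, k)


def consensus_label(votes):
    """Return the label chosen by a strict majority of reviewers, else None.

    Assumption: strict-majority voting stands in for whatever consensus
    process the reviewers actually followed.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None


# Hypothetical usage: ten placeholder function names, five sampled pairs,
# and three made-up reviewer votes per pair.
functions = ["func_%d" % i for i in range(10)]
pairs = sample_function_pairs(functions, 5, seed=42)
votes = {pair: ["not-clone", "not-clone", "type-4"] for pair in pairs}
oracle = {pair: consensus_label(v) for pair, v in votes.items()}
```

Returning `None` when no label has a strict majority keeps disputed pairs visibly unresolved, so they can be sent back for further inspection instead of being labeled arbitrarily.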

Full Text
clone_oracle.pdf (205.31 KB)