A Code Clone Oracle

Title: A Code Clone Oracle
Publication Type: Conference Paper
Year of Publication: 2014
Authors: Krutz, DE, Le, W
Secondary Title: Proceedings of the 11th Working Conference on Mining Software Repositories
Pagination: 388–391
Publisher: ACM
Place Published: New York, NY, USA
ISBN Number: 978-1-4503-2863-0
Keywords: clone, Clone Oracle, Code Clone Detection, msr data showcase, software engineering
Abstract

Code clones are functionally equivalent code segments. Detecting code clones is important for determining bugs, fixes and software reuse. Code clone detection is also essential for developing fast and precise code search algorithms. However, the challenge of such research is to verify that the detected clones are indeed functionally equivalent, considering that the majority of clones are not textually or even syntactically identical. The goal of this work is to generate a set of method-level code clones with high confidence to help evaluate future code clone detection and code search tools. We selected three open source programs, Apache, Python and PostgreSQL, and randomly sampled a total of 1536 function pairs. To confirm whether or not these function pairs indicate a clone and what types of clones they belong to, we recruited three experts who have experience in code clone research and four students who have experience in programming for manual inspection. To increase confidence in the data, the experts consulted multiple code clone detection tools to reach a consensus. To assist manual inspection, we built a tool to automatically load function pairs of interest and record the manual inspection results. We found that none of the 66 pairs are textually identical type-1 clones, and 9 pairs are type-4 clones. Our data is available at: http://phd.gccis.rit.edu/weile/data/cloneoracle/.

URL: http://doi.acm.org/10.1145/2597073.2597127
DOI: 10.1145/2597073.2597127
Full Text
Attachment: clone_oracle.pdf (205.31 KB)