Title: Project Entity Matching across FLOSS Repositories
Publication Type: Conference Paper
Year of Publication: 2007
Authors: Conklin, M
Secondary Title: OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13)
Pagination: 45-57
Date Published: 2007
ISSN Number: 978-0-387-72485-0

Much of the data about free, libre, and open source software (FLOSS) development comes from studies of the code repositories used to manage projects. This paper presents a method for integrating data about open source projects by matching projects (entities) and removing duplicates across multiple code repositories. After a review of the relevant literature, a few of the methods are chosen and applied to the FLOSS domain, including a simple scoring system for confidence in pairwise project matches. Finally, the paper describes the limitations of this approach and offers recommendations for future work.
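To make the idea of a confidence score for pairwise project matches concrete, the following is a minimal Python sketch. It is not the scoring system from the paper, whose details are not given in this abstract. The field names (`name`, `homepage`, `developers`), the point values, and the 0.8 similarity threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase a project name and keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def match_score(a, b):
    """Return an additive confidence score that records a and b describe
    the same project. Fields and weights are hypothetical, not the paper's."""
    score = 0

    name_a, name_b = normalize(a["name"]), normalize(b["name"])
    if name_a == name_b:
        score += 2  # identical names after normalization
    elif SequenceMatcher(None, name_a, name_b).ratio() >= 0.8:
        score += 1  # similar but not identical names

    # A shared, non-empty homepage URL counts as strong evidence.
    if a.get("homepage") and a.get("homepage") == b.get("homepage"):
        score += 2

    # At least one developer in common counts as weaker evidence.
    if set(a.get("developers", [])) & set(b.get("developers", [])):
        score += 1

    return score


# Hypothetical records, as they might be collected from two repositories.
repo1_project = {
    "name": "Foo-Bar",
    "homepage": "http://example.org/foobar",
    "developers": ["alice"],
}
repo2_project = {
    "name": "foobar",
    "homepage": "http://example.org/foobar",
    "developers": ["alice", "bob"],
}

print(match_score(repo1_project, repo2_project))
```

A higher score indicates more agreeing evidence. In a sketch like this, pairs above some chosen cutoff would be treated as the same project and merged, and borderline pairs would be left for manual review.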

