A Dataset for Maven Artifacts and Bug Patterns Found in Them

Publication Type: Conference Paper
Year of Publication: 2014
Authors: Saini, V, Sajnani, H, Ossher, J, Lopes, CV
Secondary Title: Proceedings of the 11th Working Conference on Mining Software Repositories
Place Published: New York, NY, USA
ISBN Number: 978-1-4503-2863-0
Keywords: Empirical Research, Empirical software engineering, findbugs, maven, software quality

In this paper, we present data downloaded from Maven, one of the most popular component repositories. The data includes the binaries of 186,392 components, along with source code for 161,025 of them. We identify and organize these components into groups, where each group contains all the versions of a library. In order to assess the quality of these components, we make available the reports generated by the FindBugs tool on 64,574 components. The information is also made available in the form of a database that stores the total number, type, and priority of the bug patterns found in each component, along with its defect density. We also describe how this dataset can be useful in software engineering research.
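
As a rough illustration of the grouping step described in the abstract, the sketch below collects all versions of a library under one key. It is not the authors' implementation: it assumes each component is identified by a three-part Maven coordinate string of the form groupId:artifactId:version, and the function name `group_versions` and the sample coordinates are hypothetical.

```python
from collections import defaultdict


def group_versions(coordinates):
    """Group component versions by library.

    Assumption: each coordinate is a 'groupId:artifactId:version' string,
    and a library is identified by its (groupId, artifactId) pair.
    """
    groups = defaultdict(list)
    for coord in coordinates:
        # Raises ValueError if the coordinate does not have exactly three parts.
        group_id, artifact_id, version = coord.split(":")
        groups[(group_id, artifact_id)].append(version)
    return dict(groups)


# Hypothetical sample coordinates, not taken from the dataset.
sample = [
    "org.example:lib-a:1.0",
    "org.example:lib-a:1.1",
    "org.example:lib-b:2.0",
]
library_groups = group_versions(sample)
```

Keying on the (groupId, artifactId) pair keeps versions in the order they were encountered; sorting them by version number would need a separate, Maven-aware comparison.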

