The Maven Repository Dataset of Metrics, Changes, and Dependencies

Publication Type: Conference Proceedings
Year of Publication: 2013
Authors: Raemaekers, S, van Deursen, A, Visser, J
Refereed Designation: Refereed
Secondary Title: 10th Working Conference on Mining Software Repositories
Date Published: 05/2013

We present the Maven Dependency Dataset (MDD), containing metrics, changes and dependencies of 148,253 jar files. Metrics and changes have been calculated at the level of individual methods, classes and packages of multiple library versions. We also present a complete call graph that includes call, inheritance, containment and historical relationships between all units of the entire repository. In this paper, we describe our dataset and the methodology used to obtain it. We present different conceptual views of MDD and describe the limitations and data quality issues that researchers using this data should be aware of.

