An Empirical Analysis of Build Failures in the Continuous Integration Workflows of Java-Based Open-Source Software

Submitted by msquire on Thu, 2017-05-25 15:50

Title	An Empirical Analysis of Build Failures in the Continuous Integration Workflows of Java-Based Open-Source Software
Publication Type	Conference Proceedings
Year of Publication	2017
Authors	Rausch, T, Hummer, W, Leitner, P, Schulte, S
Secondary Title	2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR)
Pagination	345-355
Date Published	05/2017
Keywords	build errors, continuous integration, correlation analysis, msr
Abstract	Continuous Integration (CI) has become a common practice in both industrial and open-source software development. While CI has evidently improved aspects of the software development process, errors during CI builds pose a threat to development efficiency. As an increasing amount of time goes into fixing such errors, failing builds can significantly impair the development process and become very costly. We perform an in-depth analysis of build failures in CI environments. Our approach links repository commits to data of corresponding CI builds. Using data from 14 open-source Java projects, we first identify 14 common error categories. Besides test failures, which are by far the most common error category (up to >80% per project), we also identify noisy build data, e.g., induced by transient Git interaction errors, or general infrastructure flakiness. Second, we analyze which factors impact the build results, taking into account general process and specific CI metrics. Our results indicate that process metrics have a significant impact on the build outcome in 8 of the 14 projects on average, but the strongest influencing factor across all projects is overall stability in the recent build history. For 10 projects, more than 50% (up to 80%) of all failed builds follow a previous build failure. Moreover, the fail ratio of the last k=10 builds has a significant impact on build results for all projects in our dataset.
Notes	"empirical study of CI build failures in 14 Java-based OSS projects. We extract and analyze data from publicly available GitHub repositories and Travis-CI build logs"
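The abstract's two build-history measures (the fail ratio of the last k builds, and the share of failed builds that directly follow another failure) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the representation of a build history as a chronological list of booleans (True meaning the build failed), and the handling of builds with no history are all assumptions made for illustration.

```python
from typing import List, Optional


def recent_fail_ratio(failed: List[bool], index: int, k: int = 10) -> Optional[float]:
    """Fraction of failed builds among the (up to) k builds before build `index`.

    `failed[i]` is True if build i failed; builds are in chronological order.
    Returns None when the build has no predecessors (an assumed convention).
    """
    window = failed[max(0, index - k):index]
    if not window:
        return None
    return sum(window) / len(window)


def share_of_failures_after_failure(failed: List[bool]) -> Optional[float]:
    """Share of failed builds whose immediate predecessor also failed.

    The first build is skipped because it has no predecessor.
    Returns None if no failed build has a predecessor.
    """
    candidates = [i for i in range(1, len(failed)) if failed[i]]
    if not candidates:
        return None
    following = sum(1 for i in candidates if failed[i - 1])
    return following / len(candidates)


# Hypothetical build history: pass, fail, fail, pass, fail
history = [False, True, True, False, True]
ratio_before_last = recent_fail_ratio(history, index=4, k=3)
repeat_share = share_of_failures_after_failure(history)
```

With k=10 as in the abstract, `recent_fail_ratio` would be evaluated for each build and used as one candidate predictor of that build's outcome; the exact windowing used in the paper may differ.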