The Impact of Tangled Code Changes

Title: The Impact of Tangled Code Changes
Publication Type: Conference Proceedings
Year of Publication: 2013
Authors: Herzig, K., Zeller, A.
Refereed Designation: Refereed
Secondary Title: 10th Working Conference on Mining Software Repositories
Date Published: 05/2013
Keywords: bias, data quality, history, java, mining software repositories, noise, tangled code changes, version control

When interacting with version control systems, developers often commit unrelated or loosely related code changes
in a single transaction. In analyses of the version history, such
tangled changes make all changes to all modules appear
related, which can compromise the results through
noise and bias. In an investigation of five open-source Java
projects, we found up to 15% of all bug fixes to consist of multiple
tangled changes. Using a multi-predictor approach to untangle
changes, we show that on average at least 16.6% of all source
files are incorrectly associated with bug reports. We recommend
better change organization to limit the impact of tangled changes.
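
To make the effect described above concrete, the following is a minimal sketch of how a tangled commit can produce incorrect associations between files and bug reports. It is not the multi-predictor untangling approach from the paper: the commit data, the per-file "purpose" labels, and the function names are all hypothetical, and the labels are assumed to be known in advance rather than inferred.

```python
# Hypothetical data: each commit names the bug report it claims to fix and
# records, per file, the purpose of the change made to that file.
# These labels are assumed for illustration; a real analysis would not have them.
commits = [
    {"bug": "BUG-1", "files": {"Parser.java": "fix", "Logger.java": "refactoring"}},
    {"bug": "BUG-2", "files": {"Cache.java": "fix"}},
    {"bug": "BUG-3", "files": {"Parser.java": "fix", "Cache.java": "fix", "Ui.java": "feature"}},
]


def naive_links(commits):
    """Link every file touched by a bug-fixing commit to its bug report."""
    return {(c["bug"], f) for c in commits for f in c["files"]}


def untangled_links(commits):
    """Link only the files whose change actually belongs to the fix."""
    return {
        (c["bug"], f)
        for c in commits
        for f, purpose in c["files"].items()
        if purpose == "fix"
    }


naive = naive_links(commits)
clean = untangled_links(commits)

# Associations that exist only because unrelated changes shared a commit.
false_links = naive - clean
false_rate = len(false_links) / len(naive)

print(sorted(false_links))
print(f"{false_rate:.1%} of file-to-bug links are spurious in this toy data")
```

In this toy data set, two of the three commits are tangled, and the files changed for refactoring or feature work end up linked to bug reports they have nothing to do with. The numbers carry no empirical meaning; they only illustrate the mechanism.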

Full Text
msr2013-untangling.pdf (1.56 MB)