How does Context affect the Distribution of Software Maintainability Metrics?

Title: How does Context affect the Distribution of Software Maintainability Metrics?
Publication Type: Conference Paper
Year of Publication: 2013
Authors: Zhang, Feng, Mockus Audris, Zou Ying, Khomh Foutse, and Hassan Ahmed E.
Secondary Title: Proceedings of the 29th IEEE International Conference on Software Maintenance
Keywords: benchmark, context, contextual factor, flossmole, large scale, metrics, mining software repositories, sampling, software maintainability, sourceforge, static metrics

Software metrics have many uses, e.g., defect prediction, effort estimation, and benchmarking an organization against peers and industry standards. In all these cases, metrics may depend on the context, such as the programming language. Here we aim to investigate whether the distributions of commonly used metrics do, in fact, vary with six context factors: application domain, programming language, age, lifespan, the number of changes, and the number of downloads. For this preliminary study we select 320 nontrivial software systems from SourceForge. These software systems are randomly sampled from nine popular application domains of SourceForge. For each software system, we calculate 39 metrics commonly used to assess software maintainability, and we use the Kruskal-Wallis test and the Mann-Whitney U test to determine whether there are significant differences among the distributions with respect to each of the six context factors. We use Cliff’s delta to measure the magnitude of the differences and find that all six context factors affect the distribution of 20 metrics and the programming language factor affects 35 metrics. We also briefly discuss how each context factor may affect the distribution of metric values. We expect our results to help software benchmarking and other software engineering methods that rely on these commonly used metrics to be tailored to a particular context.
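
The effect-size step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sample values, and the magnitude thresholds (0.147, 0.33, 0.474, which are commonly cited in the literature) are assumptions made here for illustration, and Python is used only for convenience. Cliff's delta is the proportion of pairs in which a value from one group exceeds a value from the other, minus the proportion in which it is smaller.

```python
from itertools import product


def cliffs_delta(xs, ys):
    """Cliff's delta: P(x > y) - P(x < y) over all pairs (x, y)."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x, y in product(xs, ys) if x > y)
    less = sum(1 for x, y in product(xs, ys) if x < y)
    return (greater - less) / (len(xs) * len(ys))


def magnitude(delta):
    """Label |delta| with commonly cited thresholds (an assumption here)."""
    d = abs(delta)
    if d < 0.147:
        return "negligible"
    if d < 0.33:
        return "small"
    if d < 0.474:
        return "medium"
    return "large"


# Hypothetical values of one metric for systems in two languages.
java_values = [12, 15, 9, 20, 18]
c_values = [7, 8, 10, 6, 11]

delta = cliffs_delta(java_values, c_values)
print(f"delta = {delta:.2f} ({magnitude(delta)})")  # delta = 0.84 (large)
```

In a study like the one described, such an effect size would typically be reported only for pairs of groups that a significance test has already flagged as different; the pairwise loop above is quadratic in the sample sizes, which is acceptable for a few hundred systems.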


"FLOSSMole [25] is another data source, from
where we download descriptions (i.e., application domain)
of SourceForge software systems. Furthermore, we download
latest application domain information and monthly download
data of studied software systems directly from SourceForge."

icsm2013_contextstudy.pdf (248.73 KB)