Characterization and prediction of issue-related risks in software projects

Title: Characterization and prediction of issue-related risks in software projects
Publication Type: Conference Proceedings
Year of Publication: 2015
Authors: Choetkiertikul, M, Dam, HK, Tran, T, Ghose, A
Secondary Title: 12th Working Conference on Mining Software Repositories (MSR 2015)
Date Published: 05/2015

Identifying risks relevant to a software project and planning measures to deal with them are critical to the success of the project. Current practices in risk assessment mostly rely on high-level, generic guidance or the subjective judgements of experts. In this paper, we propose a novel approach to risk assessment using historical data associated with a software project. Specifically, our approach identifies patterns of past events that caused project delays, and uses this knowledge to identify risks in the current state of the project. A set of risk factors characterizing “risky” software tasks (in the form of issues) was extracted from five open source projects: Apache, Duraspace, JBoss, Moodle, and Spring. In addition, we performed feature selection using a sparse logistic regression model to select risk factors with good discriminative power. Based on these risk factors, we built predictive models to predict if an issue will cause a project delay. Our predictive models are able to predict both the risk impact (i.e., the extent of the delay) and the likelihood of a risk occurring. The evaluation results demonstrate the effectiveness of our predictive models, achieving on average 48%–81% precision, 23%–90% recall, 29%–71% F-measure, and 70%–92% Area Under the ROC Curve. Our predictive models also have low error rates: 0.39–0.75 for Macro-averaged Mean Cost-Error and 0.7–1.2 for Macro-averaged Mean Absolute Error.
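
The abstract reports Macro-averaged Mean Absolute Error without defining it. A common definition for ordinal classification computes the mean absolute error separately for each true class and then averages those per-class values, so that a dominant class (for example, issues that cause no delay) does not mask errors on rare classes. The following is a minimal sketch under that assumption; the function name `macro_averaged_mae`, the integer encoding of delay classes, and the sample labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def macro_averaged_mae(y_true, y_pred):
    """Average the per-class mean absolute error over the classes present in y_true.

    Assumes ordinal classes encoded as integers (hypothetical encoding).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Group absolute errors by the true class of each sample.
    errors_by_class = defaultdict(list)
    for true_label, predicted_label in zip(y_true, y_pred):
        errors_by_class[true_label].append(abs(predicted_label - true_label))

    # Mean error within each class, then an unweighted mean across classes.
    per_class_mae = [sum(errs) / len(errs) for errs in errors_by_class.values()]
    return sum(per_class_mae) / len(per_class_mae)


# Hypothetical labels: 0 = no delay, 1 = minor delay, 2 = major delay.
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 2, 0]

macro_mae = macro_averaged_mae(y_true, y_pred)
plain_mae = sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"macro-averaged MAE: {macro_mae:.3f}, plain MAE: {plain_mae:.3f}")
```

In this toy data the single misclassified major-delay issue contributes a full third of the macro-averaged score, whereas the plain MAE dilutes it across all seven samples.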

Full Text
PDF: msr-2015-preprint.pdf (322.54 KB)