Putting it All Together: Using Socio-Technical Networks to Predict Failures

Title: Putting it All Together: Using Socio-Technical Networks to Predict Failures
Publication Type: Conference Paper
Year of Publication: 2009
Authors: Bird, C, Nagappan, N, Devanbu, P, Gall, H, Murphy, B
Secondary Title: Proceedings of the 17th International Symposium on Software Reliability Engineering
Keywords: eclipse, microsoft, social network, vista, windows

Studies have shown that social factors in development organizations have a dramatic effect on software quality. Separately, program dependency information has also been used successfully to predict which software components are more fault prone. Interestingly, the influence of these two phenomena has only been studied separately. Intuition and practical experience suggest, however, that task assignment (i.e., who worked on which components and how much) and dependency structure (which components have dependencies on others) together interact to influence the quality of the resulting software. We study the influence of combined socio-technical software networks on the fault-proneness of individual software components within a system. The network properties of a software component in this combined network are able to predict if an entity is failure prone with greater accuracy than prior methods that use dependency or contribution information in isolation. We evaluate our approach in different settings by using it on Windows Vista and across six releases of the Eclipse development environment, including using models built from one release to predict failure-prone components in the next release. We compare this to previous work. In every case, our method performs as well as or better than previous work and is able to more accurately identify those software components that have more post-release failures, with precision and recall rates as high as 85%.


First, we build each type of network separately and use network analysis on both to gather metrics for use in a predictive model. Second, we build a socio-technical network which combines the nodes and edges from both the dependency network and the contribution network and use metrics gathered from this network in a predictive model.
We evaluate our approach by collecting data from Microsoft Windows Vista and ECLIPSE development and using logistic regression analysis.
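
The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the component names, developer names, and commit counts are hypothetical toy data, edges are treated as undirected and unweighted for simplicity, and degree centrality stands in for whichever network measures a real study would gather. The resulting per-component metrics are the kind of features that could then be passed to a logistic regression model.

```python
from collections import defaultdict

# Hypothetical toy data (not from the paper).
# Dependency network: component -> component it depends on.
dependencies = [("ui.dll", "core.dll"), ("net.dll", "core.dll")]
# Contribution network: (developer, component, number of commits).
contributions = [
    ("alice", "ui.dll", 12),
    ("alice", "core.dll", 3),
    ("bob", "core.dll", 7),
    ("bob", "net.dll", 5),
]


def build_network(edge_lists):
    """Build an undirected adjacency map from several lists of (u, v) edges."""
    adjacency = defaultdict(set)
    for edges in edge_lists:
        for u, v in edges:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


def degree_centrality(adjacency):
    """Fraction of the other nodes in the network that each node touches."""
    n = len(adjacency)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


# Simplification: commit counts are dropped, so contribution edges are unweighted.
contribution_edges = [(dev, comp) for dev, comp, _ in contributions]

# Step 1: build each network separately.
dependency_net = build_network([dependencies])
contribution_net = build_network([contribution_edges])
# Step 2: the socio-technical network is the union of both node and edge sets.
socio_technical_net = build_network([dependencies, contribution_edges])

dep_metric = degree_centrality(dependency_net)
con_metric = degree_centrality(contribution_net)
soc_metric = degree_centrality(socio_technical_net)

# One feature row per software component; developers are nodes, not rows.
components = sorted({c for edge in dependencies for c in edge})
features = {c: (dep_metric[c], con_metric[c], soc_metric[c]) for c in components}

for component, row in features.items():
    print(component, row)
```

In this toy example, `core.dll` is connected to every other node in the combined network, which neither separate network shows on its own for the contribution side; that is the kind of signal the combined network is meant to expose.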

Full Text
bird2009pat.pdf (460.43 KB)