<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="6.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Phadke, Amit A.</style></author><author><style face="normal" font="default" size="100%">Allen, Edward B.</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Predicting risky modules in open-source software for high-performance computing</style></title><secondary-title><style face="normal" font="default" size="100%">Proceedings of the second international workshop on Software engineering for high performance computing system applications</style></secondary-title><tertiary-title><style face="normal" font="default" size="100%">SE-HPCS '05</style></tertiary-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">C4.5</style></keyword><keyword><style  face="normal" font="default" size="100%">decision trees</style></keyword><keyword><style  face="normal" font="default" size="100%">empirical case study</style></keyword><keyword><style  face="normal" font="default" size="100%">high performance computing</style></keyword><keyword><style  face="normal" font="default" size="100%">logistic regression</style></keyword><keyword><style  face="normal" font="default" size="100%">Open-source software</style></keyword><keyword><style  face="normal" font="default" size="100%">PETSc</style></keyword><keyword><style  face="normal" font="default" size="100%">software metrics</style></keyword><keyword><style  face="normal" font="default" size="100%">software quality model</style></keyword><keyword><style  face="normal" font="default" size="100%">software reliability</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2005</style></year></dates><urls><web-urls><url><style face="normal" font="default" 
size="100%">http://doi.acm.org/10.1145/1145319.1145337</style></url></web-urls></urls><publisher><style face="normal" font="default" size="100%">ACM</style></publisher><pub-location><style face="normal" font="default" size="100%">New York, NY, USA</style></pub-location><pages><style face="normal" font="default" size="100%">60–64</style></pages><isbn><style face="normal" font="default" size="100%">1-59593-117-1</style></isbn><abstract><style face="normal" font="default" size="100%">This paper presents the position that software-quality modeling of open-source software for high-performance computing can identify modules that have a high risk of bugs. Given the source code for a recent release, a model can predict which modules are likely to have bugs, based on data from past releases. If a user knows which software modules correspond to functionality of interest, then risks to operations become apparent. If the risks are too great, the user may prefer not to upgrade to the most recent release. Of course, such predictions are never perfect. After release, bugs are discovered. Some bugs are missed by the model, and some predicted errors do not occur. A successful model will be accurate enough for informed management action at the time of the predictions. As evidence for this position, this paper summarizes a case study of the Portable Extensible Toolkit for Scientific Computation (PETSc), which is a mathematical library for high-performance computing. Data was drawn from source code and configuration management logs. The accuracy of logistic-regression and decision-tree models indicated that the methodology is promising. The case study also illustrated several modeling issues.</style></abstract></record></records></xml>