%0 Conference Paper
%B Proceedings of the 2006 international workshop on Mining software repositories
%D 2006
%T Predicting defect densities in source code files with decision tree learners
%A Knab, Patrick
%A Pinzger, Martin
%A Bernstein, Abraham
%K change analysis
%K data mining
%K decision tree learner
%K defect density
%K defect prediction
%K mozilla
%K prediction
%K release history
%K scm
%K source code
%K version control
%X With the advent of open source software repositories the data available for defect prediction in source files increased tremendously. Although traditional statistics turned out to derive reasonable results the sheer amount of data and the problem context of defect prediction demand sophisticated analysis such as provided by current data mining and machine learning techniques. In this work we focus on defect density prediction and present an approach that applies a decision tree learner on evolution data extracted from the Mozilla open source web browser project. The evolution data includes different source code, modification, and defect measures computed from seven recent Mozilla releases. Among the modification measures we also take into account the change coupling, a measure for the number of change-dependencies between source files. The main reason for choosing decision tree learners, instead of for example neural nets, was the goal of finding underlying rules which can be easily interpreted by humans. To find these rules, we set up a number of experiments to test common hypotheses regarding defects in software entities. Our experiments showed, that a simple tree learner can produce good results with various sets of input data.
%S MSR '06
%I ACM
%C New York, NY, USA
%P 119–125
%@ 1-59593-397-2
%U http://doi.acm.org/10.1145/1137983.1138012
%R http://doi.acm.org/10.1145/1137983.1138012
%> https://flosshub.org/sites/flosshub.org/files/119Predicting.pdf

%0 Journal Article
%J IEEE Trans. Software Eng.
%D 2005
%T The FreeBSD Project: A Replication Case Study of Open Source Development
%A Trung T. Dinh-Trong
%A James M. Bieman
%K apache
%K bug reports
%K contributors
%K core
%K cvs
%K defect density
%K developers
%K email
%K email archive
%K freebsd
%K mailing list
%K scm
%K source code
%K users
%X Case studies can help to validate claims that open source software development produces higher quality software at lower cost than traditional commercial development. One problem inherent in case studies is external validity—we do not know whether or not results from one case study apply to another development project. We gain or lose confidence in case study results when similar case studies are conducted on other projects. This case study of the FreeBSD project, a long-lived open source project, provides further understanding of open source development. The paper details a method for mining repositories and querying project participants to retrieve key process information. The FreeBSD development process is fairly well-defined with proscribed methods for determining developer responsibilities, dealing with enhancements and defects, and managing releases. Compared to the Apache project, FreeBSD uses 1) a smaller set of core developers—developers who control the code base—that implement a smaller percentage of the system, 2) a larger set of top developers to implement 80 percent of the system, and 3) a more well-defined testing process. FreeBSD and Apache have a similar ratio of core developers to people involved in adapting and debugging the system and people who report problems.
Both systems have similar defect densities and the developers are also users in both systems.
%B IEEE Trans. Software Eng.
%V 31
%P 481-494
%R 10.1109/TSE.2005.73
%> https://flosshub.org/sites/flosshub.org/files/DinhTrungBieman.pdf

%0 Journal Article
%J ACM Transactions on Software Engineering and Methodology
%D 2002
%T Two case studies of open source software development: Apache and Mozilla
%A Audris Mockus
%A Roy Fielding
%A Herbsleb, J. D.
%K apache
%K bug fixing
%K bug reports
%K bugzilla
%K change history
%K core
%K defect density
%K email
%K email archives
%K mailing list
%K mozilla
%K ownership
%K participation
%K productivity
%K scm
%K source code
%X According to its proponents, open source style software development has the capacity to compete successfully, and perhaps in many cases displace, traditional commercial development methods. In order to begin investigating such claims, we examine data from two major open source projects, the Apache web server and the Mozilla browser. By using email archives of source code change history and problem reports we quantify aspects of developer participation, core team size, code ownership, productivity, defect density, and problem resolution intervals for these OSS projects. We develop several hypotheses by comparing the Apache project with several commercial projects. We then test and refine several of these hypotheses, based on an analysis of Mozilla data. We conclude with thoughts about the prospects for high-performance commercial/open source process hybrids.
%B ACM Transactions on Software Engineering and Methodology
%V 11
%P 309-346
%G eng
%M WOS:000177759000002
%1 software engineering
%2 case study
%> https://flosshub.org/sites/flosshub.org/files/mockusFieldingHerbsleb2002.pdf

%0 Journal Article
%J Proceedings of the International Conference on Software Engineering (ICSE 2000)
%D 2000
%T A Case Study of Open Source Software Development: The Apache Server
%A Audris Mockus
%A Roy Fielding
%A Herbsleb, James
%K apache
%K bug fix revisions
%K bugs
%K core
%K cvs
%K defect density
%K developers
%K email archives
%K participation
%K productivity
%K revision control
%K revision history
%K roles
%K scm
%K source code
%K team size
%X According to its proponents, open source style software development has the capacity to compete successfully, and perhaps in many cases displace, traditional commercial development methods. We examine the development process of a major open source application, the Apache web server. By using email archives of source code change history and problem reports we quantify aspects of developer participation, core team size, code ownership, productivity, defect density, and problem resolution interval for this OSS project. This analysis reveals a unique process, which performs well on important measures.
%B Proceedings of the International Conference on Software Engineering (ICSE 2000)
%8 June
%G eng
%> https://flosshub.org/sites/flosshub.org/files/mockusapache.pdf