Year of Publication: 2008
Authors: Koch, S
Secondary Title: Information Economics and Policy (Empirical Issues in Open Source Software)
Date Published: 12/2008
Keywords: cvs, developers, email, email archives, gnome, lines of code, scm, Software repository mining, source code, sourceforge

This paper develops models for programmer participation and effort estimation in open source software projects and employs the results to assess the efficiency of open source software creation. Successful development of such models will be important for decision makers of various kinds. We propose hypotheses based on a prior case study on manpower function and effort modeling. A large data set retrieved from a project repository is used to test these hypotheses. The main results are that if Norden-Rayleigh-based approaches are used, they need to be complemented in order to account for the addition of new features during a product life cycle, and that programmer-participation based effort models result in distinctly lower estimations of effort than those based on output metrics, such as lines of code.
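For readers unfamiliar with the Norden-Rayleigh approach named in the abstract, the following is a minimal sketch of the standard textbook form of the manpower curve, m(t) = 2·K·a·t·exp(-a·t²), and its cumulative effort, E(t) = K·(1 - exp(-a·t²)). This record does not state the parameterization used in the paper, so the function names, the parameter names `total_effort` (K) and `a` (shape), and the example values are assumptions for illustration only.

```python
import math


def rayleigh_manpower(t, total_effort, a):
    """Staffing level at time t under the standard Norden-Rayleigh curve.

    total_effort (K) and a (shape) are illustrative parameter names,
    not taken from the paper.
    """
    return 2.0 * total_effort * a * t * math.exp(-a * t * t)


def rayleigh_cumulative_effort(t, total_effort, a):
    """Effort expended from time 0 up to time t: K * (1 - exp(-a * t^2))."""
    return total_effort * (1.0 - math.exp(-a * t * t))


def peak_time(a):
    """Time at which staffing peaks, found by setting dm/dt = 0."""
    return 1.0 / math.sqrt(2.0 * a)


if __name__ == "__main__":
    # Hypothetical values: 100 person-months in total, shape parameter 0.5.
    K, shape = 100.0, 0.5
    for month in (0.5, 1.0, 2.0, 3.0):
        print(month, rayleigh_manpower(month, K, shape),
              rayleigh_cumulative_effort(month, K, shape))
```

Because the curve rises to a single peak and then decays, it describes one burst of work. That is consistent with the abstract's point that such models need to be complemented when new features keep arriving over a product's life cycle.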


"Using a two-step approach, first a detailed case study on one project, GNOME, will be undertaken, then a large data set retrieved from a project hosting site, SourceForge.net, will be used to validate the results."

CVS was the main source of data.

"e-mails sent to the different project discussion lists were identified as an additional source of information especially on communication and coordination besides the CVS-repository"

Basic counts were calculated to measure developer discussion levels.
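The notes do not say which counts were computed. As one plausible illustration, the sketch below tallies the number of list postings per sender address from raw e-mail text. The function name `count_posts_by_sender` and the sample messages are hypothetical, and this is not the paper's actual procedure.

```python
from collections import Counter
from email import message_from_string
from email.utils import parseaddr


def count_posts_by_sender(raw_messages):
    """Count messages per sender address (assumed metric, for illustration)."""
    counts = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        # parseaddr splits "Name <addr>" into (name, addr).
        _, address = parseaddr(msg.get("From", ""))
        if address:
            counts[address.lower()] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample messages; a real archive would be read from disk.
    sample = [
        "From: Alice <alice@example.org>\nSubject: build\n\nIt fails.",
        "From: bob@example.org\nSubject: Re: build\n\nFixed.",
        "From: Alice <ALICE@example.org>\nSubject: Re: build\n\nThanks.",
    ]
    for sender, posts in count_posts_by_sender(sample).most_common():
        print(sender, posts)
```

Lower-casing the address merges postings from the same mailbox written with different capitalization; matching one developer across several addresses would need extra handling not shown here.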

