%0 Conference Proceedings %B 11th Working Conference on Mining Software Repositories %D 2014 %T Estimating Development Effort in Free/Open Source Software Projects by Mining Software Repositories: A Case Study of OpenStack %A Gregorio Robles %A González-Barahona, Jesus M %A Cervigón, Carlos %A Capiluppi, Andrea %K effort estimation %K openstack %X Because of the distributed and collaborative nature of free/open source software (FOSS) projects, the development effort invested in a project is usually unknown, even after the software has been released. However, this information is becoming of major interest, especially - but not only - because of the growth in the number of companies for which FOSS has become relevant for their business strategy. In this paper we present a novel approach to estimate effort by considering data from source code management repositories. We apply our model to the OpenStack project, a FOSS project with more than 1,000 authors, in which several tens of companies cooperate. Based on data from its repositories and together with the input from a survey answered by more than 100 developers, we show that the model offers a simple, but sound way of obtaining software development estimations with bounded margins of error. %B 11th Working Conference on Mining Software Repositories %8 05/2014 %U http://gsyc.urjc.es/~grex/repro/2014-msr-effort/msr14-robles-estimating-effort.pdf %> https://flosshub.org/sites/flosshub.org/files/msr14-robles-estimating-effort.pdf %0 Conference Paper %B Proceedings of the 11th Working Conference on Mining Software Repositories %D 2014 %T Estimating Development Effort in Free/Open Source Software Projects by Mining Software Repositories: A Case Study of OpenStack %A Gregorio Robles %A González-Barahona, Jesús M. 
%A Cervigón, Carlos %A Capiluppi, Andrea %A Izquierdo-Cortázar, Daniel %K effort estimation %K free software %K mining software repositories %K open source %K openstack %X Because of the distributed and collaborative nature of free / open source software (FOSS) projects, the development effort invested in a project is usually unknown, even after the software has been released. However, this information is becoming of major interest, especially ---but not only--- because of the growth in the number of companies for which FOSS has become relevant for their business strategy. In this paper we present a novel approach to estimate effort by considering data from source code management repositories. We apply our model to the OpenStack project, a FOSS project with more than 1,000 authors, in which several tens of companies cooperate. Based on data from its repositories and together with the input from a survey answered by more than 100 developers, we show that the model offers a simple, but sound way of obtaining software development estimations with bounded margins of error. %B Proceedings of the 11th Working Conference on Mining Software Repositories %S MSR 2014 %I ACM %C New York, NY, USA %P 222–231 %@ 978-1-4503-2863-0 %U http://doi.acm.org/10.1145/2597073.2597107 %R 10.1145/2597073.2597107 %> https://flosshub.org/sites/flosshub.org/files/robles_0.pdf %0 Journal Article %J Empirical Software Engineering %D 2011 %T Effort estimation of FLOSS projects: a study of the Linux kernel %A Capiluppi, Andrea %A Izquierdo-Cortázar, Daniel %K complexity %K effort estimation %K Effort models %K mining software repositories %K open source software %X Empirical research on Free/Libre/Open Source Software (FLOSS) has shown that developers tend to cluster around two main roles: “core” contributors differ from “peripheral” developers in terms of a larger number of responsibilities and a higher productivity pattern. 
A further, cross-cutting characterization of developers could be achieved by associating developers with “time slots”, and different patterns of activity and effort could be associated to such slots. Such analysis, if replicated, could be used not only to compare different FLOSS communities, and to evaluate their stability and maturity, but also to determine within projects, how the effort is distributed in a given period, and to estimate future needs with respect to key points in the software life-cycle (e.g., major releases). This study analyses the activity patterns within the Linux kernel project, at first focusing on the overall distribution of effort and activity within weeks and days; then, dividing each day into three 8-hour time slots, and focusing on effort and activity around major releases. Such analyses have the objective of evaluating effort, productivity and types of activity globally and around major releases. They enable a comparison of these releases and patterns of effort and activities with traditional software products and processes, and in turn, the identification of company-driven projects (i.e., working mainly during office hours) among FLOSS endeavors. The results of this research show that, overall, the effort within the Linux kernel community is constant (albeit at different levels) throughout the week, signalling the need of updated estimation models, different from those used in traditional 9am–5pm, Monday to Friday commercial companies. It also becomes evident that the activity before a release is vastly different from after a release, and that the changes show an increase in code complexity in specific time slots (notably in the late night hours), which will later require additional maintenance efforts. %B Empirical Software Engineering %P 1-29 %U http://www.springerlink.com/content/612r616k8t52m867/fulltext.html %! 
Empir Software Eng %R 10.1007/s10664-011-9191-7 %0 Journal Article %J IEEE Transactions on Software Engineering %D 2008 %T An Empirical Study on the Relationship Between Software Design Quality, Development Effort and Governance in Open Source Projects %A Capra, E. %A Francalanci, C. %A Merlo, F. %K effort estimation %K governance %K quality %K source code %X The relationship among software design quality, development effort, and governance practices is a traditional research problem. However, the extent to which consolidated results on this relationship remain valid for open source (OS) projects is an open research problem. An emerging body of literature contrasts the view of open source as an alternative to proprietary software and explains that there exists a continuum between closed and open source projects. This paper hypothesizes that as projects approach the OS end of the continuum, governance becomes less formal. In turn a less formal governance is hypothesized to require a higher-quality code as a means to facilitate coordination among developers by making the structure of code explicit and facilitate quality by removing the pressure of deadlines from contributors. However, a less formal governance is also hypothesized to increase development effort due to a more cumbersome coordination overhead. The verification of research hypotheses is based on empirical data from a sample of 75 major OS projects. Empirical evidence supports our hypotheses and suggests that software quality, mainly measured as coupling and inheritance, does not increase development effort, but represents an important managerial variable to implement the more open governance approach that characterizes OS projects which, in turn, increases development effort. %B IEEE Transactions on Software Engineering %V 34 %P 765 - 782 %8 11/2008 %N 6 %! IEEE Trans. Software Eng. 
%R 10.1109/TSE.2008.68 %0 Conference Paper %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %D 2007 %T Correlating Social Interactions to Release History during Software Evolution %A Baysal, Olga %A Malton, Andrew J. %K ant %K apache %K change management %K developers %K discussion %K effort estimation %K lsedit %K mailing lists %K scm %K source code %X In this paper, we propose a method to reason about the nature of software changes by mining and correlating discussion archives. We employ an information retrieval approach to find correlation between source code change history and history of social interactions surrounding these changes. We apply our correlation method on two software systems, LSEdit and Apache Ant. The results of these exploratory case studies demonstrate the evidence of similarity between the content of free-form text emails among developers and the actual modifications in the code. We identify a set of correlation patterns between discussion and changed code vocabularies and discover that some releases referred to as minor should instead fall under the major category. These patterns can be used to give estimations about the type of a change and time needed to implement it. %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %I IEEE %C Minneapolis, MN, USA %P 7 - 7 %@ 0-7695-2950-X %R 10.1109/MSR.2007.4 %> https://flosshub.org/sites/flosshub.org/files/28300007.pdf %0 Conference Paper %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %D 2007 %T How Long Will It Take to Fix This Bug? %A Weiss, Cathrin %A Premraj, Rahul %A Zimmermann, Thomas %A Zeller, Andreas %K bug fixing %K bug reports %K effort estimation %K jboss %K lucene %K prediction %K time %X Predicting the time and effort for a software problem has long been a difficult task. 
We present an approach that automatically predicts the fixing effort, i.e., the person-hours spent on fixing an issue. Our technique leverages existing issue tracking systems: given a new issue report, we use the Lucene framework to search for similar, earlier reports and use their average time as a prediction. Our approach thus allows for early effort estimation, helping in assigning issues and scheduling stable releases. We evaluated our approach using effort data from the JBoss project. Given a sufficient number of issues reports, our automatic predictions are close to the actual effort; for issues that are bugs, we are off by only one hour, beating naive predictions by a factor of four. %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %I IEEE %C Minneapolis, MN, USA %P 1 %@ 0-7695-2950-X %R 10.1109/MSR.2007.13 %> https://flosshub.org/sites/flosshub.org/files/28300001.pdf %0 Conference Paper %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %D 2007 %T Predicting Defects and Changes with Import Relations %A Schroter, Adrian %K defects %K eclipse %K effort estimation %K mining challenge %K msr challenge %K prediction %X Lowering the number of defects and estimating the development time of a software project are two important goals of software engineering. To predict the number of defects and changes we train models with import relations. This enables us to decrease the number of defects by more efficient testing and to assess the effort needed in respect to the number of changes. 
%B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %I IEEE %C Minneapolis, MN, USA %P 31 - 31 %@ 0-7695-2950-X %R 10.1109/MSR.2007.24 %> https://flosshub.org/sites/flosshub.org/files/28300031.pdf %0 Conference Paper %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %D 2007 %T Predicting Eclipse Bug Lifetimes %A Panjer, Lucas D. %K bug fixing %K bugzilla %K classification %K eclipse %K effort estimation %K mining challenge %K msr challenge %K prediction %K weka %X In non-trivial software development projects planning and allocation of resources is an important and difficult task. Estimation of work time to fix a bug is commonly used to support this process. This research explores the viability of using data mining tools to predict the time to fix a bug given only the basic information known at the beginning of a bug's lifetime. To address this question, a historical portion of the Eclipse Bugzilla database is used for modeling and predicting bug lifetimes. A bug history transformation process is described and several data mining models are built and tested. Interesting behaviours derived from the models are documented. The models can correctly predict up to 34.9% of the bugs into a discretized log scaled lifetime class. %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %I IEEE %C Minneapolis, MN, USA %P 29 - 29 %@ 0-7695-2950-X %R 10.1109/MSR.2007.25 %> https://flosshub.org/sites/flosshub.org/files/28300029.pdf %0 Conference Paper %B Proceedings of the 2006 International Workshop on Economics Driven Software Engineering Research %D 2006 %T Effort Estimation by Characterizing Developer Activity %A Amor, Juan Jose %A Gregorio Robles %A Jesus M. 
Gonzalez-Barahona %K developer characterization %K effort estimation %K mining software repositories %K open source software %K software economics %X During the latest years libre (free, open source) software has gained a lot of attention from the industry. Following this interest, the research community is also studying it. For instance, many teams are performing quantitative analysis on the large quantity of data which is publicly available from the development repositories maintained by libre software projects. However, not much of this research is focused on cost or effort estimations, despite its importance (for instance, for companies developing libre software or collaborating with libre software projects), and the availability of some data which could be useful for this purpose. Our position is that classical effort estimation models can be improved from the study of these data, at least when applied to libre software. In this paper, we focus on the characterization of developer activity, which we argue can improve effort estimation. This activity can be traced with a lot of detail, and the resulting data can also be used for validation of any effort estimation model. %B Proceedings of the 2006 International Workshop on Economics Driven Software Engineering Research %S EDSER '06 %I ACM %C New York, NY, USA %P 3–6 %@ 1-59593-396-4 %U http://doi.acm.org/10.1145/1139113.1139116 %R 10.1145/1139113.1139116 %0 Journal Article %J Electronic Markets %D 2004 %T Profiling an Open Source Project Ecology and Its Programmers %A Koch, Stefan %K affiliation network %K brooks law %K cocomo %K effort estimation %K evolution %K productivity %K project success %K scm %K size %K time %K version control %X While many successful and well-known open source projects produce output of high quality, a general assessment of this development paradigm is still missing. 
In this paper, an online community of both small and large, successful and failed projects and their programmers is analysed mainly using the version-control data of each project, also according to their productivity and estimation of expended effort. As the results show, there are indeed significant differences between this cooperative development model and the commercial organization of work in the areas explored. Both open source software projects in their size and their programmers' effort differ significantly, and the evolution of projects' size over time seems in part to contradict the laws of software evolution proposed for commercial systems. Both the inequality of effort distribution between programmers and an increasing number of developers in a project do not lead to a decrease in productivity, opposing Brooks's Law. Effort estimation based on the COCOMO model for commercial organizations shows a large amount of effort expended for the projects, while a more general Norden-Rayleigh modeling shows a distinctly smaller expenditure. This proposes that either a highly efficient development is achieved by this self-organizing cooperative and highly decentralized form of work, or that the participation of users besides programming tasks is enormous and constitutes an economic factor of large proportions. %B Electronic Markets %V 14 %P 77 - 88 %8 6/2004 %N 2 %! Electronic Markets %R 10.1080/10196780410001675031 %0 Journal Article %J Information Systems Journal %D 2002 %T Effort, co-operation and co-ordination in an open source software project: GNOME %A Koch, Stefan %A Schneider, Georg %K cvs %K discussion %K effort estimation %K gnome %X This paper presents results from research into open source projects from a software engineering perspective. The research methodology employed relies on public data retrieved from the CVS repository of the GNOME project and relevant discussion groups. 
This methodology is described, and results concerning the special characteristics of open source software development are given. These data are used for a first approach to estimating the total effort to be expended. %B Information Systems Journal %V 12 %P 27 - 42 %8 01/2002 %N 1 %! Inform Syst J %R 10.1046/j.1365-2575.2002.00110.x