%0 Conference Paper %B Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering %D 2017 %T Using Metrics to Track Code Review Performance %A Izquierdo-Cortazar, Daniel %A Sekitoleko, Nelson %A Jesus M. Gonzalez-Barahona %A Kurth, Lars %K code review %K data mining %K Software development analytics %X During 2015, some members of the Xen Project Advisory Board became worried about the performance of their code review process. The Xen Project is a free, open source software project developing one of the most popular virtualization platforms in the industry. They use a pre-commit peer review process similar to that in the Linux kernel, based on email messages. They had observed a large increase over time in the number of messages related to code review, and were worried about how this could be a signal of problems with their code review process. To address these concerns, we designed and conducted, with their continuous feedback, a detailed analysis focused on finding these problems, if any. During the study, we dealt with the methodological problems of Linux-like code review, and with the deeper issue of finding metrics that could uncover the problems they were worried about. For having a benchmark, we run the same analysis on a similar project, which uses very similar code review practices: the Linux Netdev (Netdev) project. As a result, we learned how in fact the Xen Project had some problems, but at the moment of the analysis those were already under control. We found as well how different the Xen and Netdev projects were behaving with respect to code review performance, despite being so similar from many points of view. In this paper we show the results of both analyses, and propose a comprehensive methodology, fully automated, to study Linux-style code review. We discuss also the problems of getting significant metrics to track improvements or detect problems in this kind of code review. 
%B Proceedings of the 21st International Conference on Evaluation and Assessment in Software Engineering %S EASE'17 %I ACM %C New York, NY, USA %P 214–223 %@ 978-1-4503-4804-1 %U http://doi.acm.org/10.1145/3084226.3084247 %R 10.1145/3084226.3084247 %0 Journal Article %J IEEE Software %D 2013 %T Understanding How Companies Interact with Free Software Communities %A Jesus M. Gonzalez-Barahona %A Izquierdo-Cortazar, Daniel %A Maffulli, Stefano %A Gregorio Robles %X When free, open source software development communities work with companies that use their output, it's especially important for both parties to understand how this collaboration is performing. The use of data analytics techniques on software development repositories can improve factual knowledge about performance metrics. %B IEEE Software %V 30 %P 38 - 45 %8 9/2013 %N 5 %! IEEE Softw. %R 10.1109/MS.2013.95 %0 Conference Paper %B 2012 3rd International Workshop on Emerging Trends in Software Metrics (WETSoM) %D 2012 %T Modification and developer metrics at the function level: Metrics for the study of the evolution of a software project %A Gregorio Robles %A Herraiz, Israel %A Daniel M. German %A Izquierdo-Cortazar, Daniel %X Software evolution, and particularly its growth, has been mainly studied at the file (also sometimes referred as module) level. In this paper we propose to move from the physical towards a level that includes semantic information by using functions or methods for measuring the evolution of a software system. We point out that use of functions-based metrics has many advantages over the use of files or lines of code. We demonstrate our approach with an empirical study of two Free/Open Source projects: a community-driven project, Apache, and a company-led project, Novell Evolution. 
We discovered that most functions never change; when they do, their number of modifications is correlated with their size, and very few authors modify each of them; finally, we show that the departure of a developer from a software project slows the evolution of the functions that she authored. %B 2012 3rd International Workshop on Emerging Trends in Software Metrics (WETSoM) %I IEEE %C Zurich, Switzerland %P 49 - 55 %@ 978-1-4673-1763-4 %R 10.1109/WETSoM.2012.6226993 %0 Journal Article %J International Journal of Open Source Software and Processes %D 2011 %T Are Developers Fixing Their Own Bugs? %A Izquierdo-Cortazar, Daniel %A Capiluppi, Andrea %A Jesus M. Gonzalez-Barahona %K bug fixing %K developers %K loc %K scm %X The process of fixing software bugs plays a key role in the maintenance activities of a software project. Ideally, code ownership and responsibility should be enforced among developers working on the same artifacts, so that those introducing buggy code could also contribute to its fix. However, especially in FLOSS projects, this mechanism is not clearly understood: in particular, it is not known whether those contributors fixing a bug are the same introducing and seeding it in the first place. This paper analyzes the comm-central FLOSS project, which hosts part of the Thunderbird, SeaMonkey, Lightning extensions and Sunbird projects from the Mozilla community. The analysis is focused at the level of lines of code and it uses the information stored in the source code management system. The results of this study show that in 80% of the cases, the bug-fixing activity involves source code modified by at most two developers. It also emerges that the developers fixing the bug are only responsible for 3.5% of the previous modifications to the lines affected; this implies that the other developers making changes to those lines could have made that fix.
In most of the cases the bug fixing process in comm-central is not carried out by the same developers as those who seeded the buggy code. %B International Journal of Open Source Software and Processes %V 3 %P 23 - 42 %N 2 %R 10.4018/jossp.2011040102 %0 Journal Article %J International Journal of Open Source Software and Processes %D 2010 %T Repositories with Public Data about Software Development %A Jesus M. Gonzalez-Barahona %A Izquierdo-Cortazar, Daniel %A Squire, Megan %X Empirical research on software development based on data obtained from project repositories and code forges is increasingly gaining attention in the software engineering research community. The studies in this area typically start by retrieving or monitoring some subset of data found in the repository or forge, and this data is later analyzed to find interesting patterns. However, retrieving information from these locations can be a challenging task. Meta-repositories providing public information about software development are useful tools that can simplify and streamline the research process. Public data repositories that collect and clean the data from other project repositories or code forges can help ensure that research studies are based on good quality data. This paper provides some insight as to how these meta-repositories (sometimes called a “repository of repositories”, RoR) of data about open source projects should be used to help researchers. This paper describes in detail two of the most widely used collections of data about software development: FLOSSmole and FLOSSMetrics.
%B International Journal of Open Source Software and Processes %V 2 %P 1 - 13 %8 04/2010 %N 2 %R 10.4018/jossp.2010040101 %> https://flosshub.org/sites/flosshub.org/files/ijossp2010.pdf %0 Conference Paper %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %D 2009 %T Assessing FLOSS Communities: An Experience Report from the QualOSS Project %A Izquierdo-Cortazar, Daniel %A Gregorio Robles %A González-Barahona, Jesús %A Deprez, Jean-Christophe %X This paper presents work done in the QualOSS (Quality of Open Source Software) research project, which aims at building a methodology and tools to help in the assessment of the quality of FLOSS (free, libre, open source software) endeavors. In particular, we introduce the research done to evaluate the FLOSS endeavor communities. Following the Goal-Question-Metric paradigm, QUALOSS describes goals, the associated questions and then metrics that allow to answer the questions. %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %S IFIP Advances in Information and Communication Technology %I Springer %V 299/2009 %P 364 - 364 %8 2009/// %G eng %& 38 %R http://dx.doi.org/10.1007/978-3-642-02032-2_38 %> https://flosshub.org/sites/flosshub.org/files/Assessing%20FLOSS%20Communities.pdf %0 Journal Article %J International Journal of Open Source Software and Processes %D 2009 %T Tools for the Study of the Usual Data Sources found in Libre Software Projects %A Gregorio Robles %A González-Barahona, Jesús M. %A Izquierdo-Cortazar, Daniel %A Herraiz, Israel %K bug tracking systems %K data sources %K mailing lists %K scm %K tools %X Due to the open nature of Free/Libre/Open Source software projects, researchers have gained access to a rich set of development-related information. Although this information is publicly available on the Internet, obtaining and analyzing it in a convenient way is not an easy task and many considerations have to be taken into account.
In this paper we present the most important data sources that can be found in libre software projects and that are studied by the research community: source code, source code management systems, mailing lists and bug tracking systems. We will give advice for the problems that can be found when retrieving and preparing the data sources for a posterior analysis, as well as provide information about the tools that support these tasks. %B International Journal of Open Source Software and Processes %V 1 %P 24 - 45 %8 31/2009 %N 1 %R 10.4018/jossp.2009010102 %> https://flosshub.org/sites/flosshub.org/files/robles.pdf %0 Journal Article %J 2009 42nd Hawaii International Conference on System Sciences (HICSS 2009) %D 2009 %T Using Software Archaeology to Measure Knowledge Loss in Software Projects Due to Developer Turnover %A Izquierdo-Cortazar, Daniel %A Gregorio Robles %A Ortega, Felipe %A Jesus M. Gonzalez-Barahona %K attrition %K case study %K developers %K evince %K evolution %K gimp %K growth %K knowledge collaboration %K lines of code %K nautilus %K quality %K sloc %K turnover %X Developer turnover can result in a major problem when developing software. When senior developers abandon a software project, they leave a knowledge gap that has to be managed. In addition, new (junior) developers require some time in order to achieve the desired level of productivity. In this paper, we present a methodology to measure the effect of knowledge loss due to developer turnover in software projects. For a given software project, we measure the quantity of code that has been authored by developers that do not belong to the current development team, which we define as orphaned code. Besides, we study how orphaned code is managed by the project. Our methodology is based on the concept of software archaeology, a derivation of software evolution. 
As case studies we have selected four FLOSS (free, libre, open source software) projects, from purely driven by volunteers to company-supported. The application of our methodology to these case studies will give insight into the turnover that these projects suffer and how they have managed it and shows that this methodology is worth being augmented in future research. %B 2009 42nd Hawaii International Conference on System Sciences (HICSS 2009) %I IEEE Computer Society %C Los Alamitos, CA, USA %P 1-10 %@ 978-0-7695-3450-3 %R http://doi.ieeecomputersociety.org/10.1109/HICSS.2009.1014 %> https://flosshub.org/sites/flosshub.org/files/07-07-08.pdf %0 Conference Paper %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %D 2009 %T What Does It Take to Develop a Million Lines of Open Source Code? %A Fernandez-Ramil, Juan %A Izquierdo-Cortazar, Daniel %A Mens, Tom %X This article presents a preliminary and exploratory study of the relationship between size, on the one hand, and effort, duration and team size, on the other, for 11 Free/Libre/Open Source Software (FLOSS) projects with current size ranging between 0.6 and 5.3 million lines of code (MLOC). Effort was operationalised based on the number of active committers per month. The extracted data did not fit well an early version of the closed-source cost estimation model COCOMO for proprietary software, overall suggesting that, at least to some extent, FLOSS communities are more productive than closed-source teams. This also motivated the need for FLOSS-specific effort models. As a first approximation, we evaluated 16 linear regression models involving different pairs of attributes. One of our experiments was to calculate the net size, that is, to remove any suspiciously large outliers or jumps in the growth trends. The best model we found involved effort against net size, accounting for 79 percent of the variance.
This model was based on data excluding a possible outlier (Eclipse), the largest project in our sample. This suggests that different effort models may be needed for certain categories of FLOSS projects. Incidentally, for each of the 11 individual FLOSS projects we were able to model the net size trends with very high accuracy (R² ≥ 0.98). Of the 11 projects, 3 have grown superlinearly, 5 linearly and 3 sublinearly, suggesting that in the majority of the cases accumulated complexity is either well controlled or does not constitute a growth-constraining factor. %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %S IFIP Advances in Information and Communication Technology %I Springer %V 299/2009 %P 170 - 184 %8 2009/// %G eng %& 16 %R http://dx.doi.org/10.1007/978-3-642-02032-2_16 %> https://flosshub.org/sites/flosshub.org/files/What%20Does%20it%20Take%20to%20Develop.pdf