%0 Conference Proceedings
%B 2017 IEEE/ACM 39th International Conference on Software Engineering: Software Engineering in Practice Track
%D 2017
%T Practices and Perceptions of UML Use in Open Source Projects
%A Truong Ho-Quang
%A Hebig, Regina
%A Gregorio Robles
%A Chaudron, Michel R. V.
%A Miguel Angel Fernandez
%K architecture documentation
%K communication
%K effectiveness of UML
%K github
%K MOTIVATION
%K UML
%X Context: Open Source is getting more and more collaborative with industry. At the same time, modeling is today playing a crucial role in development of, e.g., safety critical software. Goal: However, there is a lack of research about the use of modeling in Open Source. Our goal is to shed some light into the motivation and benefits of the use of modeling and its use within project teams. Method: In this study, we perform a survey among Open Source developers. We focus on projects that use the Unified Modeling Language (UML) as a representative for software modeling. Results: We received 485 answers of contributors of 458 different Open Source projects. Conclusion: Collaboration seems to be the most important motivation for using UML. It benefits new contributors and contributors who do not create models. Teams use UML during communication and planning of joint implementation efforts.
%P 203-212
%8 05/2017

%0 Conference Paper
%B Proceedings of the 11th Working Conference on Mining Software Repositories
%D 2014
%T FLOSS 2013: A Survey Dataset About Free Software Contributors: Challenges for Curating, Sharing, and Combining
%A Gregorio Robles
%A Reina, Laura Arjona
%A Serebrenik, Alexander
%A Vasilescu, Bogdan
%A González-Barahona, Jesús M.
%K anonymization
%K data combining
%K data sharing
%K ethics
%K free software
%K microdata
%K msr data showcase
%K open data
%K open source
%K privacy
%K Survey
%X In this data paper we describe a data set obtained by means of performing an on-line survey to over 2,000 Free Libre Open Source Software (FLOSS) contributors. The survey includes questions related to personal characteristics (gender, age, civil status, nationality, etc.), education and level of English, professional status, dedication to FLOSS projects, reasons and motivations, involvement and goals. We describe as well the possibilities and challenges of using private information from the survey when linked with other, publicly available data sources. In this regard, an example of data sharing will be presented and legal, ethical and technical issues will be discussed.
%S MSR 2014
%I ACM
%C New York, NY, USA
%P 396–399
%@ 978-1-4503-2863-0
%U http://doi.acm.org/10.1145/2597073.2597129
%R 10.1145/2597073.2597129
%> https://flosshub.org/sites/flosshub.org/files/msr14gregorio.pdf

%0 Journal Article
%J 2009 42nd Hawaii International Conference on System Sciences (HICSS 2009)
%D 2009
%T Using Software Archaeology to Measure Knowledge Loss in Software Projects Due to Developer Turnover
%A Izquierdo-Cortazar, Daniel
%A Gregorio Robles
%A Ortega, Felipe
%A Jesus M. Gonzalez-Barahona
%K attrition
%K case study
%K developers
%K evince
%K evolution
%K gimp
%K growth
%K knowledge collaboration
%K lines of code
%K nautilus
%K quality
%K sloc
%K turnover
%X Developer turnover can result in a major problem when developing software. When senior developers abandon a software project, they leave a knowledge gap that has to be managed. In addition, new (junior) developers require some time in order to achieve the desired level of productivity.
In this paper, we present a methodology to measure the effect of knowledge loss due to developer turnover in software projects. For a given software project, we measure the quantity of code that has been authored by developers that do not belong to the current development team, which we define as orphaned code. Besides, we study how orphaned code is managed by the project. Our methodology is based on the concept of software archaeology, a derivation of software evolution. As case studies we have selected four FLOSS (free, libre, open source software) projects, from purely driven by volunteers to company-supported. The application of our methodology to these case studies will give insight into the turnover that these projects suffer and how they have managed it and shows that this methodology is worth being augmented in future research.
%B 2009 42nd Hawaii International Conference on System Sciences (HICSS 2009)
%I IEEE Computer Society
%C Los Alamitos, CA, USA
%P 1-10
%@ 978-0-7695-3450-3
%R http://doi.ieeecomputersociety.org/10.1109/HICSS.2009.1014
%> https://flosshub.org/sites/flosshub.org/files/07-07-08.pdf

%0 Journal Article
%J International Journal of Information Technology and Web Engineering
%D 2006
%T Applying Social Network Analysis Techniques to Community-Driven Libre Software Projects
%A López-Fernández, L.
%A Gregorio Robles
%A Jesus M. Gonzalez-Barahona
%A Herraiz, I.
%K apache
%K conway's law
%K cvs
%K gnome
%K kde
%K scm
%K social network analysis
%K source code
%X Source code management repositories of large, long-lived libre (free, open source) software projects can be a source of valuable data about the organizational structure, evolution, and knowledge exchange in the corresponding development communities. Unfortunately, the sheer volume of the available information renders it almost unusable without applying methodologies which highlight the relevant information for a given aspect of the project.
Such a methodology is proposed in this article, based on well known concepts from the social networks analysis field, which can be used to study the relationships among developers and how they collaborate in different parts of a project. It is also applied to data mined from some well known projects (Apache, GNOME, and KDE), focusing on the characterization of their collaboration network architecture. These cases help to understand the potentials of the methodology and how it is applied, but also show some relevant results which open new paths in the understanding of the informal organization of libre software development communities.
%B International Journal of Information Technology and Web Engineering
%V 1
%G eng
%> https://flosshub.org/sites/flosshub.org/files/06_Lopez_ijitwe_sna.pdf

%0 Conference Paper
%B OSS2006: Open Source Systems (IFIP 2.13)
%D 2006
%T Contributor Turnover in Libre Software Projects
%A Gregorio Robles
%A Gonzalez-Barahona, Jesus
%K apache
%K committers
%K core
%K cvs
%K cvsanaly
%K developers
%K evolution
%K freebsd
%K gimp
%K gnome
%K kde
%K mono
%K mozilla
%X A common problem that management faces in software companies is the high instability of their staff. In libre (free, open source) software projects, the permanence of developers is also an open issue, with the potential of causing problems amplified by the self-organizing nature that most of them exhibit. Hence, human resources in libre software projects are even more difficult to manage: developers are in most cases not bound by a contract and, in addition, there is not a real management structure concerned about this problem. This raises some interesting questions with respect to the composition of development teams in libre software projects, and how they evolve over time. There are projects led by their original founders (some sort of “code gods”), while others are driven by several different developer groups over time (i.e. the project “regenerates” itself).
In this paper, we propose a quantitative methodology, based on the analysis of the activity in the source code management repositories, to study how these processes (developers leaving, developers joining) affect libre software projects. The basis of it is the analysis of the composition of the core group, the group of developers most active in a project, for several time lapses. We will apply this methodology to several large, well-known libre software projects, and show how it can be used to characterize them. In addition, we will discuss the lessons that can be learned, and the validity of our proposal.
%S IFIP International Federation for Information Processing
%I Springer
%P 273 - 286
%G eng
%R http://dx.doi.org/10.1007/0-387-34226-5_28
%> https://flosshub.org/sites/flosshub.org/files/Contributor%20Turnover%20in%20Libre%20Software%20Projects.pdf

%0 Conference Paper
%B Proceedings of the 2005 international workshop on Mining software repositories
%D 2005
%T Developer identification methods for integrated data from various sources
%A Gregorio Robles
%A Jesus M. Gonzalez-Barahona
%K anonymization
%K bug tracker
%K developers
%K email
%K email address
%K gnome
%K identity
%K mailing list
%K privacy
%K source code
%K version control
%X Studying a software project by mining data from a single repository has been a very active research field in software engineering during the last years. However, few efforts have been devoted to perform studies by integrating data from various repositories, with different kinds of information, which would, for instance, track the different activities of developers. One of the main problems of these multi-repository studies is the different identities that developers use when they interact with different tools in different contexts. This makes them appear as different entities when data is mined from different repositories (and in some cases, even from a single one).
In this paper we propose an approach, based on the application of heuristics, to identify the many identities of developers in such cases, and a data structure for allowing both the anonymized distribution of information, and the tracking of identities for verification purposes. The methodology will be presented in general, and applied to the GNOME project as a case example. Privacy issues and partial merging with new data sources will also be considered and discussed.
%S MSR '05
%I ACM
%C New York, NY, USA
%P 106-110
%@ 1-59593-123-6
%U http://doi.acm.org/10.1145/1082983.1083162
%R http://doi.acm.org/10.1145/1082983.1083162
%> https://flosshub.org/sites/flosshub.org/files/106DeveloperIdentification.pdf

%0 Generic
%D 2004
%T Applying Social Network Analysis to the Information in CVS Repositories
%A López-Fernández, L.
%A Gregorio Robles
%A Jesus M. Gonzalez-Barahona
%K apache
%K complex networks
%K cvs
%K gnome
%K kde
%K libre software engineering
%K source code
%K source code repositories
%K visualization techniques
%K vizualization
%X The huge quantities of data available in the CVS repositories of large, long-lived libre (free, open source) software projects, and the many interrelationships among those data offer opportunities for extracting large amounts of valuable information about their structure, evolution and internal processes. Unfortunately, the sheer volume of that information renders it almost unusable without applying methodologies which highlight the relevant information for a given aspect of the project. In this paper, we propose the use of a well known set of methodologies (social network analysis) for characterizing libre software projects, their evolution over time and their internal structure. In addition, we show how we have applied such methodologies to real cases, and extract some preliminary conclusions from that experience.
%B International Workshop on Mining Software Repositories (MSR 2004)
%P 101-105
%> https://flosshub.org/sites/flosshub.org/files/101ApplyingSocial.pdf

%0 Conference Proceedings
%B Proceedings of the 4th ICSE Workshop on Open Source
%D 2004
%T Community structure of modules in the Apache project
%A Jesus M. Gonzalez-Barahona
%A Luis Lopez
%A Gregorio Robles
%K apache
%K cvs
%K source code
%X The relationships among modules in a software project of a certain size can give us much information about its internal organization and a way to control and monitor development activities and evolution of large libre software projects. In this paper, we show how information available in CVS repositories can be used to study the structure of the modules in a project when they are related by the people working in them, and how techniques taken from the social networks fields can be used to highlight the characteristics of that structure. As a case example, we also show some results of applying this methodology to the Apache project in several points in time. Among other facts, it is shown how the project evolves and is self-structuring, with developer communities of modules corresponding to semantically related families of modules.
%P 44-48
%> https://flosshub.org/sites/flosshub.org/files/gonzalezBarahona44-48.pdf