%0 Journal Article %J Applied Economics %D 2016 %T Is there a wage premium for volunteer OSS engagement? – signalling, learning and noise %A Bitzer, Jürgen %A Geishecker, Ingo %A Schröder, Philipp J. H. %K open source software %K peer production %K signalling %K voluntary work %K wage formation %X Volunteer-based open-source production has become a significant new model for the organization of software development. Economics often pictures this phenomenon as a case of signaling: Individuals engage in the volunteer programming of open-source software (OSS) as a labor-market signal resulting in a wage premium. Yet, this explanation could not so far be empirically tested. The present paper fills this gap by estimating an upper-bound composite wage premium of voluntary OSS contributions and by separating the potential signaling effect of OSS engagement from other effects. Although some 70% of OSS contributors believe that OSS involvement benefits their careers, we find no actual labor market premium for OSS engagement. The presence of other motives such as fun of play or altruism renders OSS contributions too noisy to function as a signal. %B Applied Economics %I Routledge %P 1 - 16 %8 09/2016 %! Applied Economics %R 10.1080/00036846.2016.1218427 %0 Conference Paper %B Proceedings of the 11th Working Conference on Mining Software Repositories %D 2014 %T A Dataset of Feature Additions and Feature Removals from the Linux Kernel %A Passos, Leonardo %A Czarnecki, Krzysztof %K evolution %K linux %K msr data showcase %K Traceability %K Version Control History %X This paper describes a dataset of feature additions and removals in the Linux kernel evolution history, spanning over seven years of kernel development. Features, in this context, denote configurable system options that users select when creating customized kernel images. 
The provided dataset is the largest corpus we are aware of capturing feature additions and removals, allowing researchers to assess the kernel evolution from a feature-oriented point-of-view. Furthermore, the dataset can be used to better understand how features evolve over time, and how different artifacts change as a result. One particular use of the dataset is to provide a real-world case to assess existing support for feature traceability and evolution. In this paper, we detail the dataset extraction process, the underlying database schema, and example queries. The dataset is directly available at our Bitbucket repository: https://bitbucket.org/lpassos/kconfigdb %B Proceedings of the 11th Working Conference on Mining Software Repositories %S MSR 2014 %I ACM %C New York, NY, USA %P 376–379 %@ 978-1-4503-2863-0 %U http://doi.acm.org/10.1145/2597073.2597124 %R 10.1145/2597073.2597124 %> https://flosshub.org/sites/flosshub.org/files/kernel.pdf %0 Conference Paper %B Proceedings of The International Symposium on Open Collaboration %D 2014 %T Drupal As a Commons-Based Peer Production Community: A Sociological Perspective %A Rozas, David %K Activity Theory %K Commons-Based Peer Production %K drupal %K Free/Libre Open Source Software %K Virtual Ethnography %X The aim of this research consists of extracting a set of insights related to the dynamics, group decision making procedures, motivations to contribute and mechanisms employed in the coordination of Commons-Based Peer Production communities, using as a case study the community responsible for the development of the Free/Libre Open Source Software Drupal. A sociological perspective is taken for this purpose, and a set of social research qualitative and quantitative methods employed for the study of online communities (virtual ethnography) are being used. 
%B Proceedings of The International Symposium on Open Collaboration %S OpenSym '14 %I ACM %C New York, NY, USA %P 36:1–36:2 %@ 978-1-4503-3016-9 %U http://doi.acm.org/10.1145/2641580.2641624 %R 10.1145/2641580.2641624 %0 Conference Paper %B Proceedings of The International Symposium on Open Collaboration %D 2014 %T Volunteer Attraction and Retention in Open Source Communities %A Barcomb, Ann %K Community Management %K FLOSS %K open source %K Recruitment %K Service Duration %K Volunteer Management %K Volunteer Retention %K Volunteers %X The importance of volunteers in open source has led to the position of community manager becoming more common in foundations and projects. Yet the advice for volunteer management and retention is fragmented, incomplete, contradictory, and has not been empirically examined. Our aim is to fill this gap by creating a comprehensive guidebook of best practices drawing from open source practitioner guides and general literature on volunteering, and to subject a subset of practices to empirical study. A method for evaluating volunteer attrition in terms of value to the organization will also be developed. %B Proceedings of The International Symposium on Open Collaboration %S OpenSym '14 %I ACM %C New York, NY, USA %P 40:1–40:2 %@ 978-1-4503-3016-9 %U http://doi.acm.org/10.1145/2641580.2641628 %R 10.1145/2641580.2641628 %0 Book Section %B Open Source Software: Mobile Open Source Technologies %D 2014 %T When Are OSS Developers More Likely to Introduce Vulnerable Code Changes? A Case Study %A Bosu, Amiangshu %A Carver, Jeffrey C. %A Hafiz, Munawar %A Hilley, Patrick %A Janni, Derek %E Corral, Luis %E Sillitti, Alberto %E Succi, Giancarlo %E Vlasenko, Jelena %E Wasserman, Anthony I. 
%K FOSS %K open source %K OSS %K security %K vulnerability %X We analyzed peer code review data of the Android Open Source Project (AOSP) to understand whether code changes that introduce security vulnerabilities, referred to as vulnerable code changes (VCC), occur at certain intervals. Using a systematic manual analysis process, we identified 60 VCCs. Our results suggest that AOSP developers were more likely to write VCCs prior to AOSP releases, while during the post-release period they wrote fewer VCCs. %B Open Source Software: Mobile Open Source Technologies %S IFIP Advances in Information and Communication Technology %I Springer Berlin Heidelberg %V 427 %P 234-236 %@ 978-3-642-55127-7 %U http://dx.doi.org/10.1007/978-3-642-55128-4_37 %R 10.1007/978-3-642-55128-4_37 %0 Conference Proceedings %B 10th Working Conference on Mining Software Repositories %D 2013 %T The Impact of Tangled Code Changes %A Kim Herzig %A Zeller, Andreas %K bias %K data quality %K history %K java %K mining software repositories %K noise %K tangled code changes %K version control %X When interacting with version control systems, developers often commit unrelated or loosely related code changes in a single transaction. When analyzing the version history, such tangled changes will make all changes to all modules appear related, possibly compromising the resulting analyses through noise and bias. In an investigation of five open-source JAVA projects, we found up to 15% of all bug fixes to consist of multiple tangled changes. Using a multi-predictor approach to untangle changes, we show that on average at least 16.6% of all source files are incorrectly associated with bug reports. We recommend better change organization to limit the impact of tangled changes. 
%B 10th Working Conference on Mining Software Repositories %8 05/2013 %U http://www.kim-herzig.de/wp-content/uploads/2013/03/msr2013-untangling.pdf %> https://flosshub.org/sites/flosshub.org/files/msr2013-untangling.pdf %0 Thesis %D 2012 %T Software Libre y abierto: comunidades y redes de producción digital de bienes comunes %A Tania E. Turner Sen %K bienes comunes %K commons %K comunidades virtuales %K FLOSS %K flossmole %K hackers %K redes virtuales %K repositories %K repositorios %K Software libre y abierto %K virtual communities %K virtual networks %X This thesis examines a collective form of production that has expanded and strengthened in the global high-technology market: FLOSS production. The study takes into account that technologies are not neutral; they emerge as strategies and mechanisms of political and economic interests. Although FLOSS production is embedded in the capitalist context, the collective work of the communities and networks that produce it is based on ideas of freedom and solidarity. The rules and organization of labour within these communities have developed a kind of product that is well categorized as part of the new commons. The conclusions of this work aim to offer a clear account of the dynamics of FLOSS production networks within the virtual infrastructure. Specifically, they describe the interactions and forms of cooperation, as well as the individual and collective schemas that motivate individuals to cooperate. %I Universidad Nacional Autónoma de México %C Ciudad de México, México %P 269 pages %U http://132.248.9.195/ptd2012/agosto/406008604/Index.html %> https://flosshub.org/sites/flosshub.org/files/Tesis.pdf %0 Conference Proceedings %B Open Source Systems: Grounding Research (OSS 2011) %D 2011 %T Cliff Walls: An Analysis of Monolithic Commits Using Latent Dirichlet Allocation %A Pratt, Landon J. %A MacLean, Alexander C. %A Knutson, Charles D. 
%A Ringger, Eric K. %K artifacts %K commit %K cvs %K LDA %K lines of code %K log files %K scm %K sloc %K sourceforge %K version control %X Artifact-based research provides a mechanism whereby researchers may study the creation of software yet avoid many of the difficulties of direct observation and experimentation. However, there are still many challenges that can affect the quality of artifact-based studies, especially those studies examining software evolution. Large commits, which we refer to as “Cliff Walls,” are one significant threat to studies of software evolution because they do not appear to represent incremental development. We used Latent Dirichlet Allocation to extract topics from over 2 million commit log messages, taken from 10,000 SourceForge projects. The topics generated through this method were then analyzed to determine the causes of over 9,000 of the largest commits. We found that branch merges, code imports, and auto-generated documentation were significant causes of large commits. We also found that corrective maintenance tasks, such as bug fixes, did not play a significant role in the creation of large commits. %B Open Source Systems: Grounding Research (OSS 2011) %I Springer %P 282-298 %8 10/2011 %0 Conference Proceedings %B Open Source Systems: Grounding Research (OSS 2011) %D 2011 %T Impact of Stakeholder Type and Collaboration on Issue Resolution Time in OSS Projects %A Duc, Anh Nguyen %A Cruzes, Daniela S. %A Ayala, Claudia %A Conradi, Reidar %K COLLABORATION %K companies %K coordination %K defects %K feature requests %K geronimo %K jira %K qpid %K qt %K social network analysis %K volunteer %X Initiated by the collective contributions of volunteer developers, open source software (OSS) attracts increasing involvement from commercial firms. Many OSS projects are composed of a mixed group of firm-paid and volunteer developers, with different motivations, collaboration practices and working styles. 
As OSS development is collaborative in nature, it is important to know whether these differences affect collaboration between different types of stakeholders and, in turn, the project outcomes. In this paper, we empirically investigate the firm-paid participation in resolving OSS evolution issues, the stakeholder collaboration and its impact on OSS issue resolution time. The results suggest that though a firm-paid assigned developer resolves many more issues than a volunteer developer does, there is no difference in issue resolution time between them. Besides, the more important factor that influences the issue resolution time comes from the collaboration among stakeholders rather than from individual characteristics. %B Open Source Systems: Grounding Research (OSS 2011) %I Springer %P 1-16 %8 10/2011 %0 Conference Paper %B Proceedings of the 8th working conference on Mining software repositories - MSR '11 %D 2011 %T Java generics adoption %A Christian Bird %A Murphy-Hill, Emerson %A Parnin, Chris %Y van Deursen, Arie %Y Xie, Tao %Y Zimmermann, Thomas %K commits %K generics %K java %K source code %K version history %X Support for generic programming was added to the Java language in 2004, representing perhaps the most significant change to one of the most widely used programming languages today. Researchers and language designers anticipated this addition would relieve many long-standing problems plaguing developers, but surprisingly, no one has yet measured whether generics actually provide such relief. In this paper, we report on the first empirical investigation into how Java generics have been integrated into open source software by automatically mining the history of 20 popular open source Java programs, traversing more than 500 million lines of code in the process. We evaluate five hypotheses, each based on assertions made by prior researchers, about how Java developers use generics. 
For example, our results suggest that generics do not significantly reduce the number of type casts and that generics are usually adopted by a single champion in a project, rather than all committers. %B Proceedings of the 8th working conference on Mining software repositories - MSR '11 %I ACM Press %C New York, New York, USA %P 3-12 %8 05/2011 %@ 9781450305747 %! MSR '11 %R 10.1145/1985441.1985446 %0 Journal Article %J Information and Software Technology %D 2011 %T Sociomaterial bricolage: The creation of location-spanning work practices by global software developers %A Johri, Aditya %K Global software development %K Interpretive analysis %K interviews %K Qualitative field study %K Sociomaterial bricolage %K Virtual teams %K Work practices %X Context Studies on global software development have documented severe coordination and communication problems among coworkers due to geographic dispersion and consequent dependency on technology. These problems are exacerbated by increase in the complexity of work undertaken by global teams. However, despite these problems, global software development is on the rise and firms are adopting global practices across the board, raising the question: What does successful global software development look like and what can we learn from its practitioners? Objective This study draws on practice-based studies of work to examine successful work practices of global software developers. The primary aim of this study was to understand how workers develop practices that allow them to function effectively across geographically dispersed locations. Method An ethnographically-informed field study was conducted with data collection at two international locations of a firm. Interview, observation and archival data were collected. A total of 42 interviews and 3 weeks of observations were conducted. Results Teams spread across different locations around the world developed work practices through sociomaterial bricolage. 
Two facets of technology use were necessary for the creation of these practices: multiplicity of media and relational personalization at dyadic and team levels. New practices were triggered by the need to achieve a work-life balance, which was disturbed by global development. Reflecting on my role as a researcher, I underscore the importance of understanding researchers’ own frames of reference and using research practices that mirror informants’ work practices. Conclusion Software developers on global teams face unique challenges which necessitate a shift in their work practices. Successful teams are able to create practices that span locations while still being tied to location based practices. Inventive use of material and social resources is central to the creation of these practices. %B Information and Software Technology %V 53 %P 955 - 968 %8 9/2011 %U http://www.sciencedirect.com/science/article/pii/S0950584911000437 %N 9 %! Information and Software Technology %R 10.1016/j.infsof.2011.01.014 %0 Conference Paper %B Proceedings of the 8th working conference on Mining software repositories - MSR '11 %D 2011 %T System compatibility analysis of Eclipse and Netbeans based on bug data %A Baik, Eilwoo %A Devanbu, Premkumar %A Wang, Xinlei (Oscar) %Y van Deursen, Arie %Y Xie, Tao %Y Zimmermann, Thomas %K bug tracking system %K bugzilla %K eclipse %K msr challenge %K netbeans %K version history %X Eclipse and Netbeans are two top of the line Integrated Development Environments (IDEs) for Java development. Both of them provide support for a wide variety of development tasks and have a large user base. This paper provides an analysis and comparison for the compatibility and stability of Eclipse and Netbeans on the three most commonly used operating systems, Windows, Linux and Mac OS. Both IDEs are programmed in Java and use a Bugzilla issue tracker to track reported bugs and feature requests. 
We looked into the Bugzilla repository databases of these two IDEs, which contain the bug records and histories of these two IDEs. We used some basic data mining techniques to analyze some historical statistics of the bug data. Based on the analysis, we try to answer certain stability-comparison oriented questions in the paper, so that users can have a better idea which of these two IDEs is designed better to work on different platforms. %B Proceedings of the 8th working conference on Mining software repositories - MSR '11 %I ACM Press %C Waikiki, Honolulu, HI, USA; New York, New York, USA %P 230-233 %8 05/2011 %@ 9781450305747 %! MSR '11 %R 10.1145/1985441.1985479 %0 Journal Article %J Journal of the Association for Information Systems %D 2011 %T Validity Issues in the Use of Social Network Analysis with Digital Trace data %A Howison, James %A Andrea Wiggins %A Kevin Crowston %K information system %K Online Communities %K social network analysis %K Virtuality %X There is an exciting natural match between social network analysis methods and the growth of data sources produced by social interactions via information technologies, from online communities to corporate information systems. Information Systems researchers have not been slow to embrace this combination of method and data. Such systems increasingly provide "digital trace data" that provide new research opportunities. Yet digital trace data are substantively different from the survey and interview data for which network analysis measures and interpretations were originally developed. This paper examines ten validity issues associated with the combination of digital trace data and social network analysis methods, with examples from the IS literature, to provide recommendations for improving the validity of research using this combination. 
%B Journal of the Association for Information Systems %V 12 %N 12 %& Article 2 %> https://flosshub.org/sites/flosshub.org/files/HowisonSNADigitalTraceData-WorkingPaper.pdf %0 Conference Proceedings %B Open Source Systems: Grounding Research (OSS 2011) %D 2011 %T Virtual Health Information Infrastructures: A Scalable Regional Model %A Seror, Ann %K Bireme %K Communities Of Practice %K culture %K open source systems %K virtual infrastructures %X Integrating research, education and evidence-based medical practice requires complex infrastructures and network linkages among these critical activities. This research examines communities of practice and open source software tools in development of scalable virtual infrastructures for the regional Virtual Health Library of the Latin American and Caribbean Health Sciences System (Bireme) and embedded national cases. Virtual infrastructures refer to an environment characterized by overlapping distribution networks accessible through Internet portals and websites designed to facilitate integrated use of available resources. Case analysis shows engagement of interdisciplinary communities of practice for scalable virtual infrastructure design. This research program considers theory and methods for study of transferability of the Latin American model to large health care systems in other cultures. 
%B Open Source Systems: Grounding Research (OSS 2011) %I Springer %P 316-319 %8 10/2011 %0 Conference Paper %B Proceedings of the 8th working conference on Mining software repositories - MSR '11 %D 2011 %T Visualizing collaboration and influence in the open-source software community %A Marschner, Eli %A Rosenfeld, Evan %A Heer, Jeffrey %A Heller, Brandon %Y van Deursen, Arie %Y Xie, Tao %Y Zimmermann, Thomas %K COLLABORATION %K data exploration %K geography %K geoscatter %K github %K graph %K mapping %K metadata %K open source %K social graph %K user profiles %K visualization %X We apply visualization techniques to user profiles and repository metadata from the GitHub source code hosting service. Our motivation is to identify patterns within this development community that might otherwise remain obscured. Such patterns include the effect of geographic distance on developer relationships, social connectivity and influence among cities, and variation in project-specific contribution styles (e.g., centralized vs. distributed). Our analysis examines directed graphs in which nodes represent users' geographic locations and edges represent (a) follower relationships, (b) successive commits, or (c) contributions to the same project. We inspect this data using a set of visualization techniques: geo-scatter maps, small multiple displays, and matrix diagrams. Using these representations, and tools based on them, we develop hypotheses about the larger GitHub community that would be difficult to discern using traditional lists, tables, or descriptive statistics. These methods are not intended to provide conclusive answers; instead, they provide a way for researchers to explore the question space and communicate initial insights. %B Proceedings of the 8th working conference on Mining software repositories - MSR '11 %I ACM Press %C New York, New York, USA %P 223-226 %8 05/2011 %@ 9781450305747 %U http://vis.stanford.edu/files/2011-GotHub-MSR.pdf %! 
MSR '11 %R 10.1145/1985441.1985476 %0 Journal Article %J Information and Software Technology %D 2010 %T Analysis of virtual communities supporting OSS projects using social network analysis %A Toral, S.L. %A Martínez-Torres, M.R. %A Barrero, F. %K arm %K email %K Knowledge brokers %K linux %K mailing list %K open source software %K social network analysis %K virtual communities %X This paper analyses the behaviour of virtual communities for Open Source Software (OSS) projects. The development of OSS projects relies on virtual communities, which are built on relationships among members whose final objective is sharing knowledge and improving the underlying project. This study addresses the interactive collaboration in these kinds of communities applying social network analysis (SNA). In particular, SNA techniques will be used to identify those members playing a middle-man role among other community members. Results will illustrate the importance of this role to achieve successful virtual communities. %B Information and Software Technology %V 52 %P 296 - 303 %8 3/2010 %U http://www.sciencedirect.com/science/article/pii/S0950584909001888 %N 3 %! Information and Software Technology %R 10.1016/j.infsof.2009.10.007 %0 Conference Paper %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %D 2010 %T Assessment of issue handling efficiency %A Luijten, Bart %A Visser, Joost %A Zaidman, Andy %K bug reports %K bug tracking %K classification %K gnome %K msr challenge %K visualization %X We mined the issue database of GNOME to assess how issues are handled. How many issues are submitted and resolved? Does the backlog grow or decrease? How fast are issues resolved? Does issue resolution speed increase or decrease over time? In which subproject are issues handled most efficiently? 
To answer such questions, we apply several visualization and quantification instruments to the raw issue data. In particular, we aggregate issues into four risk categories, based on their resolution time. These categories are the basis both for visualizing and ranking, which are used in concert for issue database exploration. %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %I IEEE %C Cape Town, South Africa %P 94 - 97 %@ 978-1-4244-6802-7 %R 10.1109/MSR.2010.5463292 %> https://flosshub.org/sites/flosshub.org/files/94bluijtenMSR2010.pdf %0 Conference Paper %B ACM Conference on Computer-Human Interaction (CHI) %D 2010 %T Lurking? Cyclopaths? A Quantitative Lifecycle Analysis of User Behavior in a Geowiki %A Panciera, K. %A Priedhorsky, R. %A Erickson, T. %A Terveen, L. %K content %K geographic %K geowiki %K information %K lurking %K open %K volunteer %K volunteered %K Wiki %K work %B ACM Conference on Computer-Human Interaction (CHI) %I Association for Computing Machinery %C Atlanta, GA %8 04/2010 %G eng %0 Conference Paper %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %D 2010 %T When process data quality affects the number of bugs: Correlations in software engineering datasets %A Bachmann, Adrian %A Bernstein, Abraham %K apache %K bug reports %K eclipse %K gnome %K log files %K mozilla %K netbeans %K openoffice.org %K version control %X Software engineering process information extracted from version control systems and bug tracking databases is widely used in empirical software engineering. In prior work, we showed that these data are plagued by quality deficiencies, which vary in their characteristics across projects. In addition, we showed that those deficiencies in the form of bias do impact the results of studies in empirical software engineering. 
While these findings affect software engineering researchers, the impact on practitioners has not yet been substantiated. In this paper we, therefore, explore (i) if the process data quality and characteristics have an influence on the bug fixing process and (ii) if the process quality as measured by the process data has an influence on the product (i.e., software) quality. Specifically, we analyze six Open Source as well as two Closed Source projects and show that process data quality and characteristics have an impact on the bug fixing process: the high rate of empty commit messages in Eclipse, for example, correlates with the bug report quality. We also show that the product quality - measured by number of bugs reported - is affected by process data quality measures. These findings have the potential to prompt practitioners to increase the quality of their software process and its associated data quality. %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %I IEEE %C Cape Town, South Africa %P 62 - 71 %@ 978-1-4244-6802-7 %R 10.1109/MSR.2010.5463286 %> https://flosshub.org/sites/flosshub.org/files/62bachmann-msr10.pdf %0 Conference Paper %B 6th IEEE Working Conference on Mining Software Repositories %D 2009 %T Amassing and indexing a large sample of version control systems: towards the census of public source code history %A Audris Mockus %K bazaar %K cvs %K flossmole %K git %K mercurial %K source code %K sourceforge %K subversion %K version control %X The source code and its history represent the output and process of software development activities and are an invaluable resource for study and improvement of software development practice. 
While individual projects and groups of projects have been extensively analyzed, some fundamental questions, such as the spread of innovation or genealogy of the source code, can be answered only by considering the entire universe of publicly available source code and its history. We describe methods we developed over the last six years to gather, index, and update an approximation of such a universal repository for publicly accessible version control systems and for the source code inside a large corporation. While challenging, the task is achievable with limited resources. The bottlenecks in network bandwidth, processing, and disk access can be dealt with using inherent parallelism of the tasks and suitable tradeoffs between the amount of storage and computations, but a completely automated discovery of public version control systems may require enticing participation of the sampled projects. Such a universal repository would allow studies of global properties and origins of the source code that are not possible through other means. %B 6th IEEE Working Conference on Mining Software Repositories %8 May 16–17 %G eng %> https://flosshub.org/sites/flosshub.org/files/11amassing.pdf %0 Conference Paper %B Proceedings of the 17th International Symposium on Software Reliability Engineering %D 2009 %T Putting it All Together: Using Socio-Technical Networks to Predict Failures %A Christian Bird %A Nachiappan Nagappan %A Devanbu, Premkumar %A Gall, Harald %A Brendan Murphy %K eclipse %K microsoft %K social network %K vista %K windows %X Studies have shown that social factors in development organizations have a dramatic effect on software quality. Separately, program dependency information has also been used successfully to predict which software components are more fault prone. Interestingly, the influence of these two phenomena has only been studied separately. Intuition and practical experience suggest, however, that task assignment (i.e. 
who worked on which components and how much) and dependency structure (which components have dependencies on others) together interact to influence the quality of the resulting software. We study the influence of combined socio-technical software networks on the fault-proneness of individual software components within a system. The network properties of a software component in this combined network are able to predict if an entity is failure prone with greater accuracy than prior methods which use dependency or contribution information in isolation. We evaluate our approach in different settings by using it on Windows Vista and across six releases of the Eclipse development environment including using models built from one release to predict failure prone components in the next release. We compare this to previous work. In every case, our method performs as well or better and is able to more accurately identify those software components that have more post-release failures, with precision and recall rates as high as 85%. %B Proceedings of the 17th International Symposium on Software Reliability Engineering %> https://flosshub.org/sites/flosshub.org/files/bird2009pat.pdf %0 Journal Article %J AMCIS 2009 Proceedings %D 2009 %T Security of Open Source and Closed Source Software: An Empirical Comparison of Published Vulnerabilities %A Schryen, Guido %K closed source software %K empirical comparison %K open source software %K security %K Vulnerabilities %X Reviewing literature on open source and closed source security reveals that the discussion is often determined by biased attitudes toward one of these development styles. The discussion specifically lacks appropriate metrics, methodology and hard data. This paper contributes to solving this problem by analyzing and comparing published vulnerabilities of eight open source software and nine closed source software packages, all of which are widely deployed. 
Thereby, it provides an extensive empirical analysis of vulnerabilities in terms of mean time between vulnerability disclosures, the development of disclosure over time, and the severity of vulnerabilities, and allows for validating models provided in the literature. The investigation reveals that (a) the mean time between vulnerability disclosures was lower for open source software in half of the cases, while the other cases show no differences, (b) in contrast to literature assumption, 14 out of 17 software packages showed a significant linear or piecewise linear correlation between time and the number of published vulnerabilities, and (c) regarding the severity of vulnerabilities, no significant differences were found between open source and closed source. %B AMCIS 2009 Proceedings %P 387 %U http://epub.uni-regensburg.de/21296/1/Schryen_-_AMCIS_09_-_Security_of_open_source_and_closed_source_software_-_Web_version.pdf %> https://flosshub.org/sites/flosshub.org/files/Schryen_-_AMCIS_09_-_Security_of_open_source_and_closed_source_software_-_Web_version.pdf %0 Conference Paper %B Proceedings of the Warm Up Workshop for ACM/IEEE ICSE 2010 %D 2009 %T Trust issues in open source software development %A Orsila, Heikki %A Geldenhuys, Jaco %A Ruokonen, Anna %A Hammouda, Imed %K ffmpeg %K trust %K version control %K zlib %X Open source software and the associated development model hold great promise, but the issue of trust is a major challenge. This applies to companies wishing to adopt the open source model but also within open source projects. We investigate this issue by data mining open source repositories to study two related phenomena: update propagation and distributed version control. 
%B Proceedings of the Warm Up Workshop for ACM/IEEE ICSE 2010 %S WUP '09 %I ACM %C New York, NY, USA %P 9–12 %@ 978-1-60558-565-9 %U http://doi.acm.org/10.1145/1527033.1527037 %R 10.1145/1527033.1527037 %0 Conference Paper %B Proceedings of the 27th international conference on Human factors in computing systems %D 2009 %T Understanding how and why open source contributors use diagrams in the development of Ubuntu %A Yatani, Koji %A Chung, Eunyoung %A Jensen, Carlos %A Truong, Khai N. %K developers %K diagramming %K interviews %K open source software (oss) %K software development %K Ubuntu %K visual representation %X Some of the most interesting differences between Open Source Software (OSS) development and commercial co-located software development lie in the communication and collaboration practices of these two groups of developers. One interesting practice is that of diagramming. Though well studied and important in many aspects of co-located software development (including communication and collaboration among developers), its role in OSS development has not been thoroughly studied. In this paper, we report our investigation on how and why Ubuntu contributors use diagrams in their work. Our study shows that diagrams are not actively used in many scenarios where they commonly would be in co-located software development efforts. We describe differences in the use and practices of diagramming, their possible reasons, and present design considerations for potential systems aimed at better supporting diagram use in OSS development.
%B Proceedings of the 27th international conference on Human factors in computing systems %S CHI '09 %I ACM %C New York, NY, USA %P 995–1004 %@ 978-1-60558-246-7 %U http://doi.acm.org/10.1145/1518701.1518853 %R http://doi.acm.org/10.1145/1518701.1518853 %0 Journal Article %J 42nd Hawaii International Conference on System Sciences (HICSS 2009) %D 2009 %T Understanding the Nature and Production Model of Hybrid Free and Open Source Systems: The Case of Varnish %A Zegaye Seifu Wubishet %K case study %K organizational sponsorship %K peer production %K varnish %X This is a detailed interpretive case study analysis of an open source software project, called Varnish. The conceptual framework is based on the literature covering issues of commons based production models and the organization of open source projects. The comparative analysis reveals that Varnish is a hybrid project, encompassing the features of open source software while managed by a company as a proprietary project would. It is also hybrid in the sense that it employs a combination of hierarchical and commons based peer production model features. This mix of characters addresses a variety of problems related to each of the aforementioned categories. %B 42nd Hawaii International Conference on System Sciences (HICSS 2009) %I IEEE Computer Society %C Los Alamitos, CA, USA %P 1-11 %@ 978-0-7695-3450-3 %R http://doi.ieeecomputersociety.org/10.1109/HICSS.2009.998 %> https://flosshub.org/sites/flosshub.org/files/07-07-02.pdf %0 Journal Article %J Information & Management %D 2009 %T Virtual organizational learning in open source software development projects %A Yoris A. Au %A Darrell Carpenter %A Xiaogang Chen %A Jan G. Clark %K bug fixing %K bugs %K learning %K Project performance %K sourceforge %K team size %K teams %K virtual organization %X We studied virtual organizational learning in open source software (OSS) development projects. 
Specifically, our research focused on learning effects of OSS projects and the factors that affect the learning process. The number and percentage of resolved bugs and bug resolution time of 118 SourceForge.net OSS projects were used to measure the learning effects. Projects were characterized by project type, number and experience of developers, number of bugs, and bug resolution time. Our results provided evidence of virtual organizational learning in OSS development projects and support for several factors as determinants of performance. Team size was a significant predictor, with mid-sized project teams functioning best. Teams of three to seven developers exhibited the highest efficiency over time and teams of eight to 15 produced the lowest mean time for bug resolution. Increasing the percentage of bugs assigned to specific developers or boosting developer participation in other OSS projects also improved performance. Furthermore, project type introduced variability in project team performance. %B Information & Management %V 46 %P 9 - 15 %U http://www.sciencedirect.com/science/article/B6VD0-4V1D7NT-1/2/a3bbf7652c674f753398160b8f05f6e9 %R 10.1016/j.im.2008.09.004 %0 Conference Paper %B 2009 6th IEEE International Working Conference on Mining Software Repositories (MSR) %D 2009 %T Visualizing Gnome with the Small Project Observatory %A Lungu, Mircea %A Malnati, Jacopo %A Lanza, Michele %K bugzilla %K contributions %K gnome %K msr challenge %K spo %K visualization %X We analyzed the gnome family of systems with the small project observatory, our online ecosystem visualization platform. We begin by briefly introducing the model of SPO. We then observe and discuss several phases in the activity of the gnome ecosystem. We follow and look at how the contributors are distributed between writing source code and doing other activities such as internationalization.
We end with a visual overview of the activity of more than 900 contributors in the 10 years of existence of gnome. %B 2009 6th IEEE International Working Conference on Mining Software Repositories (MSR) %I IEEE %C Vancouver, BC, Canada %P 103 - 106 %@ 978-1-4244-3493-0 %R 10.1109/MSR.2009.5069487 %> https://flosshub.org/sites/flosshub.org/files/103Lung2009a.pdf %0 Journal Article %J Information & Management %D 2009 %T Volunteers' involvement in online community based software development %A Bo Xu %A Donald R. Jones %A Bingjia Shao %K age %K developers %K effectiveness %K function points %K ideology %K leadership %K MOTIVATION %K scm %K sourceforge %K status %K Survey %K team size %K Volunteers %X We sought to gain understanding of voluntary developers' involvement in open source software (OSS) projects. Data were collected from voluntary developers working on open source projects. Our findings indicated that a voluntary developer's involvement was very important to his or her performance and that involvement was dependent on individual motivations (personal software needs, reputation and skills gaining expectation, enjoyment in open source coding) and project community factors (leadership effectiveness, interpersonal relationship, community ideology). Our work contributes theoretically and empirically to the body of OSS research and has practical implications for OSS project management.
%B Information & Management %V 46 %P 151 - 158 %U http://www.sciencedirect.com/science/article/B6VD0-4VP1CN0-1/2/8e1c7be4fcedd1419209c5c843ffa923 %R 10.1016/j.im.2008.12.005 %0 Journal Article %J Information Economics and Policy %D 2008 %T The allocation of collaborative efforts in open-source software %A den Besten, Matthijs %A Jean-Michel Dalle %A Galia, Fabrice %K age %K apache %K complexity %K cvs %K division of labor %K functions %K gaim %K gcc %K ghostscript %K lines of code %K loc %K log files %K mozilla %K netbsd %K openssh %K postgresql %K python %K revision control %K scm %K size %K source code %K Stigmergy %K version control %X The article investigates the allocation of collaborative efforts among core developers (maintainers) of open-source software by analyzing on-line development traces (logs) for a set of 10 large projects. Specifically, we investigate whether the division of labor within open-source projects is influenced by characteristics of software code. We suggest that the collaboration among maintainers tends to be influenced by different measures of code complexity. We interpret these findings by providing preliminary evidence that the organization of open-source software development would self-adapt to characteristics of the code base, in a 'stigmergic' manner. %B Information Economics and Policy %V 20 %P 316 - 322 %U http://www.sciencedirect.com/science/article/B6V8J-4SSG4PN-1/2/88b3824c30a31c18929d8a5ca6d64f62 %R 10.1016/j.infoecopol.2008.06.003 %0 Conference Paper %B Proceedings of the 2008 international workshop on Mining software repositories - MSR '08 %D 2008 %T Branching and merging in the repository %A Spacco, Jamie %A Williams, Chadd C. %Y Hassan, Ahmed E. %Y Lanza, Michele %Y Godfrey, Michael W.
%K argouml %K changes %K cvs2svn %K diffj %K revision %K scm %K source code %K version control %X Two of the most complex operations version control software allows a user to perform are branching and merging. Branching provides the user the ability to create a copy of the source code to allow changes to be stored in version control but outside of the trunk. Merging provides the user the ability to copy changes from a branch to the trunk. Performing a merge can be a tedious operation and one that may be error prone. In this paper, we compare file revisions found on branches with those found on the trunk to determine when a change that is applied to a branch is moved to the trunk. This will allow us to study how developers use merges and to determine if merges are in fact more error prone than other commits. %B Proceedings of the 2008 international workshop on Mining software repositories - MSR '08 %I ACM Press %C New York, New York, USA %P 19-22 %8 05/2008 %@ 9781605580241 %! MSR '08 %R 10.1145/1370750.1370754 %> https://flosshub.org/sites/flosshub.org/files/p19-williams.pdf %0 Conference Paper %B 3rd Workshop on Public Data about Software Development (WoPDaSD 2008) %D 2008 %T Collecting data from distributed FOSS projects %A Fagerholm, Fabian %A Taina, Juha %K bitkeeper %K bug tracking system %K cvs %K distributed %K email archive %K fork rate %K git %K life cycle %K linux %K linux kernel %K mailing list %K merge rate %K subversion %K svn %K version control %X A key trait of Free and Open Source Software (FOSS) development is its distributed nature. Nevertheless, two project-level operations, the fork and the merge of program code, are among the least well understood events in the lifespan of a FOSS project. Some projects have explicitly adopted these operations as the primary means of concurrent development.
In this study, we examine the effect of highly distributed software development, as found in the Linux kernel project, on collection and modelling of software development data. We find that distributed development calls for sophisticated temporal modelling techniques where several versions of the source code tree can exist at once. Attention must be turned towards the methods of quality assurance and peer review that projects employ to manage these parallel source trees. Our analysis indicates that two new metrics, fork rate and merge rate, could be useful for determining the role of distributed version control systems in FOSS projects. The study presents a preliminary data set consisting of version control and mailing list data. %B 3rd Workshop on Public Data about Software Development (WoPDaSD 2008) %P 8-13 %8 2009 %> https://flosshub.org/sites/flosshub.org/files/fagerholm.pdf %0 Journal Article %J Industrial and Corporate Change %D 2008 %T Dynamics of innovation in an "open source" collaboration environment: lurking, laboring, and launching FLOSS projects on SourceForge %A David, P. A. %A Rullani, F. %K contributors %K core %K developers %K roles %K SFnetDataset %K sourceforge %K users %K virtual communities %K virtual organization %K virtual organizations %X A systems analysis perspective is adopted to examine the critical properties of the Free/Libre/Open Source Software (FLOSS) mode of innovation, as reflected on the SourceForge platform (SF.net). This approach re-scales March's (1991) framework and applies it to characterize the “innovation system” of a “distributed organization” of interacting agents in a virtual collaboration environment, rather than to innovation within a firm. March (1991) views the process of innovation at the organizational level as the coupling of sub-processes of exploration and exploitation.
Correspondingly, the innovation system of the virtual collaboration environment represented by SF.net is an emergent property of two “coupled” processes: one involves the interactions among agents searching the locale for information and knowledge resources to use in designing novel software products (i.e., exploration), and the other involves the mobilization of individuals’ capabilities for application in the software development projects that become established on the platform (i.e., exploitation). The micro-dynamics of this system are studied empirically by constructing transition probability matrices representing the movements of 222,835 SF.net users among seven different activity states, which range from “lurking” (not contributing or contributing to projects without becoming a member) to “laboring” (joining one or more projects as members), and to “launching” (founding one or more projects) within each successive 6-month interval. The estimated probabilities are found to form first-order Markov chains describing ergodic processes. This makes possible the computation of the equilibrium distribution of agents among the states, thereby suppressing transient effects and revealing persisting patterns of project joining and project launching. The latter show the FLOSS innovation process on SF.net to be highly dissipative: a very large proportion of the registered “developers” fail to become even minimally active on the platform. There is nevertheless an active core of mobile project joiners, and a (still smaller) core of project founders who persist in creating new projects. The structure of these groups’ interactions (as displayed within the 3-year period examined) is investigated in detail, and it is shown that it would be sufficient to sustain both the exploration and exploitation phases of the platform's global dynamics. %B Industrial and Corporate Change %V 17 %P 647 - 710 %8 07/2008 %N 4 %!
Industrial and Corporate Change %R 10.1093/icc/dtn026 %0 Conference Paper %B Proceedings of the 2008 international working conference on Mining software repositories %D 2008 %T Evaluation of source code copy detection methods on freebsd %A Chang, Hung-Fu %A Audris Mockus %K clone %K cloning %K code copying %K freebsd %K version control %X Studies have shown that substantial code reuse is common in open source and in commercial projects. However, the precise extent of reuse and its impact on productivity and quality are not well investigated in the open source context. Previously, we have introduced a simple-to-use method that needs only a set of file pathnames to identify directories that share filenames and partially validated its performance on a set of closed-source projects. To evaluate this method and to improve reuse detection at the file level, we apply it and four additional file copy detection methods that utilize the underlying content of multiple versions of the source code on the FreeBSD project. The evaluation quantified unique advantages of each method and showed that the filename method detected roughly half of all reuse cases. We are still faced with a challenge to scale the content based methods to large repositories containing all versions of open source files.
%B Proceedings of the 2008 international working conference on Mining software repositories %S MSR '08 %I ACM %C New York, NY, USA %P 61–66 %8 05/2008 %@ 978-1-60558-024-1 %U http://doi.acm.org/10.1145/1370750.1370766 %R http://doi.acm.org/10.1145/1370750.1370766 %> https://flosshub.org/sites/flosshub.org/files/p61-chang.pdf %0 Journal Article %J Science Studies %D 2008 %T The Material and Social Dynamics of Motivation: Contributions to Open Source Language Technology Development %A Stephanie Freeman %K contributions %K developers %K email %K email archives %K mailing list %K MOTIVATION %K openoffice %K openoffice.org %K secondary data %K Volunteers %X Volunteer motivation has been a central theme in Free/Libre/Open Source Software (FLOSS) literature. This research has been largely dominated by economists who rely in their surveys on the distinction between intrinsic and extrinsic motivations and the "hacker ethic" for profit juxtaposition. The paper argues that survey-based analytical frameworks and research designs have led to a focus on some motivational attributions at the expense of others. It then presents a case study that explores dynamic, non individualistic and content-sensitive aspects of motivations. The approach is based on socio-cultural psychology and the author's observations of a hybrid firm-community FLOSS project, OpenOffice.org. Instead of separating intrinsic motivations from extrinsic ones, it is argued that complex and changing patterns of motivations are tied to changing objects and personal histories prior to and during participation. The boundary between work and hobby in an individual's participation path is blurred and shifting. %B Science Studies %G eng %> https://flosshub.org/sites/flosshub.org/files/Freeman.pdf %0 Conference Paper %B Proceedings of the 30th International Conference on Software Engineering (ICSE 2008) %D 2008 %T Open source software peer review practices: a case study of the apache server %A Peter C. Rigby %A Daniel M. 
German %A Storey, Margaret-Anne %K apache %K cvs %K email %K inspection %K mining software repositories (email) %K open source software %K peer review %K version control %X Peer review is seen as an important quality assurance mechanism in both industrial development and the open source software (OSS) community. The techniques for performing inspections have been well studied in industry; in OSS development, peer reviews are less well understood. We examine the two peer review techniques used by the successful, mature Apache server project: review-then-commit and commit-then-review. Using archival records of email discussion and version control repositories, we construct a series of metrics that produces measures similar to those used in traditional inspection experiments. Specifically, we measure the frequency of review, the level of participation in reviews, the size of the artifact under review, the calendar time to perform a review, and the number of reviews that find defects. We provide a comparison of the two Apache review techniques as well as a comparison of Apache review to inspection in an industrial project. We conclude that Apache reviews can be described as (1) early, frequent reviews (2) of small, independent, complete contributions (3) conducted asynchronously by a potentially large, but actually small, group of self-selected experts (4) leading to an efficient and effective peer review technique. %B Proceedings of the 30th International Conference on Software Engineering (ICSE 2008) %S ICSE '08 %I ACM %C New York, NY, USA %P 541–550 %@ 978-1-60558-079-1 %U http://doi.acm.org/10.1145/1368088.1368162 %R 10.1145/1368088.1368162 %> https://flosshub.org/sites/flosshub.org/files/p541-rigby.pdf %0 Conference Paper %B Proceedings of the 2008 international working conference on Mining software repositories %D 2008 %T Understanding bug fix patterns in verilog %A Sudakrishnan, Sangeetha %A Madhavan, Janaki %A Whitehead,Jr., E. 
James %A Renau, Jose %K bug fixing %K error classification %K hdl %K verilog %K VHDL %X Today, many electronic systems are developed using a hardware description language, a kind of software that can be converted into integrated circuits or programmable logic devices. Like traditional software projects, hardware projects have bugs, and significant developer time is spent fixing them. A useful first step toward reducing bugs in hardware is developing an understanding of the frequency of different types of errors. Once the most common types are known, it is then possible to focus attention on eliminating them. As most hardware projects use software configuration management repositories, these can be mined for the textual bug fix changes. In this project, we analyze the bug fix history of four hardware projects written in Verilog and manually define 25 bug fix patterns. The frequency of each bug type is then computed for all projects. We find that 29–55% of the bug fix pattern instances in Verilog involve assignment statements, while 18–25% are related to if statements. %B Proceedings of the 2008 international working conference on Mining software repositories %S MSR '08 %I ACM %C New York, NY, USA %P 39–42 %@ 978-1-60558-024-1 %U http://doi.acm.org/10.1145/1370750.1370761 %R http://doi.acm.org/10.1145/1370750.1370761 %> https://flosshub.org/sites/flosshub.org/files/p39-sudakrishnan.pdf %0 Journal Article %J Information & Management %D 2007 %T Investigating recognition-based performance in an open content community: A social capital perspective %A Okoli, C. %A Oh, Wonseok %K open content %K recognition-based performance %K social capital %K social networks %K social status %K virtual communities %X As the open source movement grows, it becomes important to understand the dynamics that affect the motivation of participants who contribute their time freely to such projects.
One important motivation that has been identified is the desire for formal recognition in the open source community. We investigated the impact of social capital in participants' social networks on their recognition-based performance; i.e., the formal status they are accorded in the community. We used a sample of 465 active participants in the Wikipedia open content encyclopedia community to investigate the effects of two types of social capital and found that network closure, measured by direct and indirect ties, had a significant positive effect on increasing participants' recognition-based performance. Structural holes had mixed effects on participants' status, but were generally a source of social capital. %B Information & Management %V 44 %P 240-252 %8 Apr %@ 0378-7206 %G eng %M ISI:000247156800002 %1 management %2 SNA %0 Journal Article %J International Economics and Economic Policy %D 2007 %T Open source software: Motivation and restrictive licensing %A Fershtman, Chaim %A Gandal, Neil %K contributions %K contributors %K developers %K incentives %K license analysis %K licenses %K lines of code %K loc %K MOTIVATION %K restrictive %K scm %K size %K status %K version history %X Open source software (OSS) is an economic paradox. Development of open source software is often done by unpaid volunteers and the source code is typically freely available. Surveys suggest that status, signaling, and intrinsic motivations play an important role in inducing developers to invest effort. Contribution to an OSS project is rewarded by adding one’s name to the list of contributors which is publicly observable. Such incentives imply that programmers may have little incentive to contribute beyond the threshold level required for being listed as a contributor. Using a unique data set we empirically examine this hypothesis.
We find that the output per contributor in open source projects is much higher when licenses are less restrictive and more commercially oriented. These results indeed suggest a status, signaling, or intrinsic motivation for participation in OSS projects with restrictive licenses. %B International Economics and Economic Policy %I Springer Berlin / Heidelberg %V 4 %P 209-225 %U http://dx.doi.org/10.1007/s10368-007-0086-4 %0 Conference Paper %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %D 2007 %T Release Pattern Discovery via Partitioning: Methodology and Case Study %A Hindle, Abram %A Godfrey, Michael W. %A Holt, Richard C. %K bitkeeper %K bt2csv %K cvs %K evolution %K mysql %K releases %K revision history %K scm %K softchange %K version control %X The development of Open Source systems produces a variety of software artifacts such as source code, version control records, bug reports, and email discussions. Since the development is distributed across different tool environments and developer practices, any analysis of project behavior must be inferred from whatever common artifacts happen to be available. In this paper, we propose an approach to characterizing a project's behavior around the time of major and minor releases; we do this by partitioning the observed activities, such as artifact check-ins, around the dates of major and minor releases, and then look for recognizable patterns. We validate this approach by means of a case study on the MySQL database system; in this case study, we found patterns which suggested MySQL was behaving consistently within itself. These patterns included testing and documenting that took place more before a release than after and that the rate of source code changes dipped around release time. 
%B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %I IEEE %C Minneapolis, MN, USA %P 19 - 19 %@ 0-7695-2950-X %R 10.1109/MSR.2007.28 %> https://flosshub.org/sites/flosshub.org/files/28300019.pdf %0 Conference Paper %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %D 2007 %T Visual Data Mining in Software Archives to Detect How Developers Work Together %A Weissgerber, Peter %A Pohl, Mathias %A Burch, Michael %K change %K coordination %K cvs %K developers %K junit %K modules %K scm %K source code %K svn %K teams %K tomcat %K visualization %X Analyzing the check-in information of open source software projects which use a version control system such as CVS or SUBVERSION can yield interesting and important insights into the programming behavior of developers. As in every major project tasks are assigned to many developers, the development must be coordinated between these programmers. This paper describes three visualization techniques that help to examine how programmers work together, e.g. if they work as a team or if they develop their part of the software separate from each other. Furthermore, phases of stagnation in the lifetime of a project can be uncovered and thus, possible problems are revealed. To demonstrate the usefulness of these visualization techniques we performed case studies on two open source projects. In these studies interesting patterns of developers' behavior, e.g. the specialization on a certain module can be observed. Moreover, modules that have been changed by many developers can be identified as well as such ones that have been altered by only one programmer.
%B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %I IEEE %C Minneapolis, MN, USA %P 9 - 9 %@ 0-7695-2950-X %R 10.1109/MSR.2007.34 %> https://flosshub.org/sites/flosshub.org/files/28300009.pdf %0 Conference Paper %B Proceedings of the 2006 international workshop on Mining software repositories %D 2006 %T Applying the evolution radar to PostgreSQL %A D'Ambros, Marco %A Lanza, Michele %K cvs %K documentation %K evolution %K evolution radar %K logical coupling %K makefile %K mining challenge %K msr challenge %K postgresql %K re-engineering %K refactoring %K release history %K rhdb %K source code %K version control %K visualization %B Proceedings of the 2006 international workshop on Mining software repositories %S MSR '06 %I ACM %C New York, NY, USA %P 177–178 %@ 1-59593-397-2 %U http://doi.acm.org/10.1145/1137983.1138029 %R http://doi.acm.org/10.1145/1137983.1138029 %> https://flosshub.org/sites/flosshub.org/files/177ApplyingEvolution.pdf %0 Conference Paper %B Proceedings of the 2006 international workshop on Mining software repositories %D 2006 %T Are refactorings less error-prone than other changes? %A Weißgerber, Peter %A Diehl, Stephan %K argouml %K bug reports %K bugs %K change history %K jedit %K junit %K re-engineering %K refactoring %K reverse engineering %K software evolution %K version control %X Refactorings are program transformations which should preserve the program behavior. Consequently, we expect that during phases when there are mostly refactorings in the change history of a system, only few new bugs are introduced. For our case study we analyzed the version histories of several open source systems and reconstructed the refactorings performed. Furthermore, we obtained bug reports from various sources depending on the system. Based on this data we identify phases when the above hypothesis holds and those when it doesn't.
%B Proceedings of the 2006 international workshop on Mining software repositories %S MSR '06 %I ACM %C New York, NY, USA %P 112–118 %@ 1-59593-397-2 %U http://doi.acm.org/10.1145/1137983.1138011 %R http://doi.acm.org/10.1145/1137983.1138011 %> https://flosshub.org/sites/flosshub.org/files/112AreRefactorings.pdf %0 Conference Paper %B Proceedings of the 28th international conference on Software engineering %D 2006 %T A case study of a corporate open source development model %A Gurbani, Vijay K. %A Garvert, Anita %A Herbsleb, James D. %K architecture %K case study %K open source %K session initiation protocol %K software development %K vkg %X Open source practices and tools have proven to be highly effective for overcoming the many problems of geographically distributed software development. We know relatively little, however, about the range of settings in which they work. In particular, can corporations use the open source development model effectively for software projects inside the corporate domain? Or are these tools and practices incompatible with development environments, management practices, and market-driven schedule and feature decisions typical of a commercial software house? We present a case study of open source software development methodology adopted by a significant commercial software project in the telecommunications domain. We extract a number of lessons learned from the experience, and identify open research questions. 
%B Proceedings of the 28th international conference on Software engineering %S ICSE '06 %I ACM %C New York, NY, USA %P 472–481 %@ 1-59593-375-1 %U http://doi.acm.org/10.1145/1134285.1134352 %R 10.1145/1134285.1134352 %0 Conference Paper %B Proceedings of the 2006 international workshop on Mining software repositories %D 2006 %T Co-change visualization applied to PostgreSQL and ArgoUML: (MSR challenge report) %A Beyer, Dirk %K argouml %K ccvisu %K cvs %K force-directed graph layout %K graph %K mining challenge %K msr challenge %K postgresql %K software clustering %K software structure analysis %K software visualization %K version control %K visualization %X Co-change visualization is a method to recover the subsystem structure of a software system from the version history, based on common changes and visual clustering. This paper presents the results of applying the tool CCVisu, which implements co-change visualization, to the two open-source software systems PostgreSQL and ArgoUML. The input of the method is the co-change graph, which can be easily extracted by CCVisu from a CVS version repository. The output is a graph layout that places software artifacts that were often commonly changed at close positions, and artifacts that were rarely co-changed at distant positions. This property of the layout is due to the clustering property of the underlying energy model, which evaluates the quality of a produced layout. The layout can be displayed on the screen, or saved to a file in SVG or VRML format.
%B Proceedings of the 2006 international workshop on Mining software repositories %S MSR '06 %I ACM %C New York, NY, USA %P 165–166 %@ 1-59593-397-2 %U http://doi.acm.org/10.1145/1137983.1138023 %R http://doi.acm.org/10.1145/1137983.1138023 %> https://flosshub.org/sites/flosshub.org/files/165Co-Change.pdf %0 Conference Paper %B Proceedings of the 2006 international workshop on Mining software repositories %D 2006 %T The evolution radar: visualizing integrated logical coupling information %A D'Ambros, Marco %A Lanza, Michele %A Lungu, Mircea %K change management %K cvs %K evolution %K logical coupling %K mozilla %K scm %K source code %K thunderbird %K tinderbox %K visualization %X In software evolution research logical coupling has extensively been used to recover the hidden dependencies between source code artifacts. They would otherwise go lost because of the file-based nature of current versioning systems. Previous research has dealt with low-level couplings between files, leading to an explosion of data to be analyzed, or has abstracted the logical couplings to module level, leading to a loss of detailed information. In this paper we propose a visualization-based approach which integrates both file-level and module-level logical coupling information. This not only facilitates an in-depth analysis of the logical couplings at all granularity levels, it also leads to a precise characterization of the system modules in terms of their logical coupling dependencies. 
%B Proceedings of the 2006 international workshop on Mining software repositories %S MSR '06 %I ACM %C New York, NY, USA %P 26–32 %@ 1-59593-397-2 %U http://doi.acm.org/10.1145/1137983.1137992 %R http://doi.acm.org/10.1145/1137983.1137992 %> https://flosshub.org/sites/flosshub.org/files/26TheEvolutionRadar.pdf %0 Conference Paper %B Proceedings of the 2006 international workshop on Mining software repositories %D 2006 %T Fine grained indexing of software repositories to support impact analysis %A Canfora, Gerardo %A Cerulo, Luigi %K argouml %K change analysis %K Firefox %K gedit %K impact analysis %K mining software repositories %K scm %K source code %K version control %X Versioned and bug-tracked software systems provide a huge amount of historical data regarding source code changes and issues management. In this paper we deal with impact analysis of a change request and show that data stored in software repositories are a good descriptor on how past change requests have been resolved. A fine grained analysis method of software repositories is used to index code at different levels of granularity, such as lines of code and source files, with free text contained in software repositories. The method exploits information retrieval algorithms to link the change request description and code entities impacted by similar past change requests. We evaluate such approach on a set of three open-source projects. 
%B Proceedings of the 2006 international workshop on Mining software repositories %S MSR '06 %I ACM %C New York, NY, USA %P 105–111 %@ 1-59593-397-2 %U http://doi.acm.org/10.1145/1137983.1138009 %R http://doi.acm.org/10.1145/1137983.1138009 %> https://flosshub.org/sites/flosshub.org/files/105FineGrained.pdf %0 Conference Paper %B Proceedings of the 2006 international workshop on Mining software repositories %D 2006 %T Mining software repositories with CVSgrab %A Voinea, Lucian %A Telea, Alexandru %K argouml %K cvs %K cvsgrab %K evolution %K mining challenge %K msr challenge %K postgresql %K software visualization %K source code %K team %K visualization %B Proceedings of the 2006 international workshop on Mining software repositories %S MSR '06 %I ACM %C New York, NY, USA %P 167–168 %@ 1-59593-397-2 %U http://doi.acm.org/10.1145/1137983.1138024 %R http://doi.acm.org/10.1145/1137983.1138024 %> https://flosshub.org/sites/flosshub.org/files/167MiningSoftware.pdf %0 Journal Article %J Management Science %D 2006 %T Motivation, Governance, and the Viability of Hybrid Forms in Open Source Software Development %A Shah, Sonali K. %K email %K email archives %K governance %K INNOVATION %K interview %K mailing list %K MOTIVATION %K open source software development %K Volunteers %X Open source software projects rely on the voluntary efforts of thousands of software developers, yet we know little about why developers choose to participate in this collective development process. This paper inductively derives a framework for understanding participation from the perspective of the individual software developer based on data from two software communities with different governance structures. In both communities, a need for software-related improvements drives initial participation. The majority of participants leave the community once their needs are met, however, a small subset remains involved. For this set of developers, motives evolve over time and participation becomes a hobby. 
These hobbyists are critical to the long-term viability of the software code: They take on tasks that might otherwise go undone and work to maintain the simplicity and modularity of the code. Governance structures affect this evolution of motives. Implications for firms interested in implementing hybrid strategies designed to combine the advantages of open source software development with proprietary ownership and control are discussed. %B Management Science %V 52 %P 1000 - 1014 %8 07/2006 %U http://faculty.washington.edu/skshah/Shah%20-%20Motivation,%20Governance,%20Hybrid%20Forms.pdf %N 7 %! Management Science %R 10.1287/mnsc.1060.0553 %> https://flosshub.org/sites/flosshub.org/files/Shah%20-%20Motivation%2C%20Governance%2C%20Hybrid%20Forms.pdf %0 Conference Paper %B Proceedings of the 2006 international workshop on Mining software repositories %D 2006 %T Predicting defect densities in source code files with decision tree learners %A Knab, Patrick %A Pinzger, Martin %A Bernstein, Abraham %K change analysis %K data mining %K decision tree learner %K defect density %K defect prediction %K mozilla %K prediction %K release history %K scm %K source code %K version control %X With the advent of open source software repositories the data available for defect prediction in source files increased tremendously. Although traditional statistics turned out to derive reasonable results, the sheer amount of data and the problem context of defect prediction demand sophisticated analysis such as provided by current data mining and machine learning techniques. In this work we focus on defect density prediction and present an approach that applies a decision tree learner on evolution data extracted from the Mozilla open source web browser project. The evolution data includes different source code, modification, and defect measures computed from seven recent Mozilla releases. 
Among the modification measures we also take into account the change coupling, a measure for the number of change-dependencies between source files. The main reason for choosing decision tree learners, instead of for example neural nets, was the goal of finding underlying rules which can be easily interpreted by humans. To find these rules, we set up a number of experiments to test common hypotheses regarding defects in software entities. Our experiments showed that a simple tree learner can produce good results with various sets of input data. %B Proceedings of the 2006 international workshop on Mining software repositories %S MSR '06 %I ACM %C New York, NY, USA %P 119–125 %@ 1-59593-397-2 %U http://doi.acm.org/10.1145/1137983.1138012 %R http://doi.acm.org/10.1145/1137983.1138012 %> https://flosshub.org/sites/flosshub.org/files/119Predicting.pdf %0 Conference Paper %B Proceedings of the 2006 international workshop on Mining software repositories %D 2006 %T Using evolutionary annotations from change logs to enhance program comprehension %A Daniel M. German %A Peter C. Rigby %A Storey, Margaret-Anne %K annotations %K apache %K bug tracking %K change history %K eclipse %K evolutionary %K log files %K mailing lists %K mining software repositories %K software evolution %K version control %X Evolutionary annotations are descriptions of how source code evolves over time. Typical source comments, given their static nature, are usually inadequate for describing how a program has evolved over time; instead, source code comments are typically a description of what a program currently does. We propose the use of evolutionary annotations as a way of describing the rationale behind changes applied to a given program (for example "These lines were added to ..."). 
Evolutionary annotations can assist a software developer in the understanding of how a given portion of source code works by showing him how the source has evolved into its current form. In this paper we describe a method to automatically create evolutionary annotations from change logs, defect tracking systems and mailing lists. We describe the design of a prototype for Eclipse that can filter and present these annotations alongside their corresponding source code and in workbench views. We use Apache as a test case to demonstrate the feasibility of this approach. %B Proceedings of the 2006 international workshop on Mining software repositories %S MSR '06 %I ACM %C New York, NY, USA %P 159–162 %@ 1-59593-397-2 %U http://doi.acm.org/10.1145/1137983.1138020 %R http://doi.acm.org/10.1145/1137983.1138020 %> https://flosshub.org/sites/flosshub.org/files/159UsingEvolutionary.pdf %0 Conference Paper %B Proceedings of the 2005 international workshop on Mining software repositories %D 2005 %T Accelerating cross-project knowledge collaboration using collaborative filtering and social networks %A Ohira, Masao %A Ohsugi, Naoki %A Ohoka, Tetsuya %A Matsumoto, Ken-ichi %K collaborative filtering %K developers %K knowledge collaboration %K projects %K social networks %K sourceforge %K visualization tool %X Vast numbers of free/open source software (F/OSS) development projects use hosting sites such as Java.net and SourceForge.net. These sites provide each project with a variety of software repositories (e.g. repositories for source code sharing, bug tracking, discussions, etc.) as a media for communication and collaboration. They tend to focus on supporting rich collaboration among members in each project. However, a majority of hosted projects are relatively small projects consisting of few developers and often need more resources for solving problems. 
In order to support cross-project knowledge collaboration in F/OSS development, we have been developing tools to collect data of projects and developers at SourceForge, and to visualize the relationship among them using the techniques of collaborative filtering and social networks. The tools help a developer identify “who should I ask?” and “what can I ask?” and so on. In this paper, we report a case study of applying the tools to F/OSS projects data collected from SourceForge and how effective the tools can be used for helping cross-project knowledge collaboration. %B Proceedings of the 2005 international workshop on Mining software repositories %S MSR '05 %I ACM %C New York, NY, USA %P 111-115 %@ 1-59593-123-6 %U http://doi.acm.org/10.1145/1082983.1083163 %R http://doi.acm.org/10.1145/1082983.1083163 %> https://flosshub.org/sites/flosshub.org/files/111Accelerating.pdf %0 Conference Paper %B Proceedings of the 2005 international workshop on Mining software repositories %D 2005 %T Developer identification methods for integrated data from various sources %A Gregorio Robles %A Jesus M. Gonzalez-Barahona %K anonymization %K bug tracker %K developers %K email %K email address %K gnome %K identity %K mailing list %K privacy %K source code %K version control %X Studying a software project by mining data from a single repository has been a very active research field in software engineering during the last years. However, few efforts have been devoted to perform studies by integrating data from various repositories, with different kinds of information, which would, for instance, track the different activities of developers. One of the main problems of these multi-repository studies is the different identities that developers use when they interact with different tools in different contexts. This makes them appear as different entities when data is mined from different repositories (and in some cases, even from a single one). 
In this paper we propose an approach, based on the application of heuristics, to identify the many identities of developers in such cases, and a data structure for allowing both the anonymized distribution of information, and the tracking of identities for verification purposes. The methodology will be presented in general, and applied to the GNOME project as a case example. Privacy issues and partial merging with new data sources will also be considered and discussed. %B Proceedings of the 2005 international workshop on Mining software repositories %S MSR '05 %I ACM %C New York, NY, USA %P 106-110 %@ 1-59593-123-6 %U http://doi.acm.org/10.1145/1082983.1083162 %R http://doi.acm.org/10.1145/1082983.1083162 %> https://flosshub.org/sites/flosshub.org/files/106DeveloperIdentification.pdf %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Evolution of Volunteer Participation in Libre Software Projects: Evidence from Debian %A Gregorio Robles %A Jesus M. Gonzalez-Barahona %A Martin Michlmayr %K contributors %K debian %K maintainers %K PopCon %K popularity %K Volunteers %X Most libre software projects rely on the work of volunteers. Therefore, attracting people who contribute their time and technical skills is of paramount importance, both in technical and economic terms. This reliance on volunteers leads to some fundamental management challenges: volunteer contributions are inherently difficult to predict, plan and manage, especially in the case of large projects. In this paper we analyze the evolution in time of the human resources of one of the largest and most complex libre software projects composed primarily of volunteers, the Debian project. Debian currently has around 1300 volunteers working on several tasks: much activity is focused on packaging software applications and libraries, but there is also major work related to the maintenance of the infrastructure needed to sustain the development. 
We have performed a quantitative investigation of data from almost seven years, studying how volunteer involvement has affected the software... %B OSS2005: Open Source Systems %P 100-107 %U http://pascal.case.unibz.it/handle/2038/857 %> https://flosshub.org/sites/flosshub.org/files/robles_barahona_michlmayr-evolution_participation.pdf %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Quality Improvement in Volunteer Free Software Projects: Exploring the Impact of Release Management %A Martin Michlmayr %K free software %K open source %K process improvement %K quality assurance %K release management %K volunteer projects %X Even though free software has achieved great popularity and success in recent years, there are a number of product quality challenges facing the open source development model. There is significant room for further quality improvement and one area that deserves special attention is release management. This research will identify problems with current release practices, verify possible advantages of an increasingly popular release model, and develop interventions to improve release management in free software projects. The research also aims to answer the fundamental question as to how volunteer projects can deliver predictable and high quality software. %B OSS2005: Open Source Systems %P 309-310 %U http://pascal.case.unibz.it/handle/2038/1429 %0 Conference Paper %B Proceedings of the 2005 international workshop on Mining software repositories %D 2005 %T Understanding source code evolution using abstract syntax tree matching %A Neamtiu, Iulian %A Foster, Jeffrey S. %A Hicks, Michael %K abstract syntax trees %K apache %K bind %K evolution %K linux %K openssh %K software evolution %K source code %K source code analysis %K vsftpd %X Mining software repositories at the source code level can provide a greater understanding of how software evolves. We present a tool for quickly comparing the source code of different versions of a C program. 
The approach is based on partial abstract syntax tree matching, and can track simple changes to global variables, types and functions. These changes can characterize aspects of software evolution useful for answering higher level questions. In particular, we consider how they could be used to inform the design of a dynamic software updating system. We report results based on measurements of various versions of popular open source programs, including BIND, OpenSSH, Apache, Vsftpd and the Linux kernel. %B Proceedings of the 2005 international workshop on Mining software repositories %S MSR '05 %I ACM %C New York, NY, USA %P 2-6 %@ 1-59593-123-6 %U http://doi.acm.org/10.1145/1082983.1083143 %R http://doi.acm.org/10.1145/1082983.1083143 %> https://flosshub.org/sites/flosshub.org/files/2Understanding.pdf %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Unreliable Collaborators: Coordination in distributed volunteer teams %A Howison, James %K coordination %K floss organization %K MOTIVATION %K open source %K volunteer teams %X Drawing together the interest, skills and resources of individuals to pursue productive activity is the cornerstone of wealth creation. Recently, new forms of productive activity have emerged that draw together highly motivated, often volunteer, participants to collaborate through low-cost information systems to produce high quality products that rival those produced by wealthy firms and markets. Examples include free (libre) and open source software (FLOSS), such as Linux, and collaboratively edited texts, such as Wikipedia and the Open Directory. There is an opportunity to study these novel activities, to understand their organization, in order to both further their continued success and to assess whether and which of their novel organization techniques might be used in wider domains of human collaborative activity. 
%B OSS2005: Open Source Systems %P 305-306 %U http://pascal.case.unibz.it/handle/2038/1552 %0 Generic %D 2004 %T Applying Social Network Analysis to the Information in CVS Repositories %A López-Fernández, L. %A Gregorio Robles %A Jesus M. Gonzalez-Barahona %K apache %K complex networks %K cvs %K gnome %K kde %K libre software engineering %K source code %K source code repositories %K visualization techniques %K visualization %X The huge quantities of data available in the CVS repositories of large, long-lived libre (free, open source) software projects, and the many interrelationships among those data offer opportunities for extracting large amounts of valuable information about their structure, evolution and internal processes. Unfortunately, the sheer volume of that information renders it almost unusable without applying methodologies which highlight the relevant information for a given aspect of the project. In this paper, we propose the use of a well known set of methodologies (social network analysis) for characterizing libre software projects, their evolution over time and their internal structure. In addition, we show how we have applied such methodologies to real cases, and extract some preliminary conclusions from that experience. 
%B International Workshop on Mining Software Repositories (MSR 2004) %P 101-105 %> https://flosshub.org/sites/flosshub.org/files/101ApplyingSocial.pdf %0 Conference Paper %B Proceedings of the 2004 international workshop on Mining software repositories - MSR '04 %D 2004 %T Four Interesting Ways in Which History Can Teach Us About Software %A Michael Godfrey %A Xinyi Dong %A Cory Kapser %A Lijie Zou %K ant %K apache %K change analysis %K clone %K clone detection %K cvs %K evolution %K gcc %K growth %K kepler %K linux %K midworld %K mycore %K postgresql %K source code %K version control %X In this position paper, we outline four kinds of studies that we have undertaken in trying to understand various aspects of a software system’s evolutionary history. In each instance, the studies have involved detailed examination of real software systems based on “facts” extracted from various kinds of source artifact repositories, as well as the development of accompanying tools to aid in the extraction, abstraction, and comprehension processes. We briefly discuss the goals, results, and methodology of each approach. %B Proceedings of the 2004 international workshop on Mining software repositories - MSR '04 %P 58-62 %8 05/2004 %> https://flosshub.org/sites/flosshub.org/files/58FourInterestingWays.pdf %0 Journal Article %J Electronic Markets %D 2004 %T Managing Conflicts in Open Source Communities %A Ruben van Wendel de Joode %K abiword %K apache %K conflict %K covalent %K interviews %K organizational sponsorship %K Volunteers %X An increasing number of companies adopt open source software. These companies will typically pay programmers to participate in the design, development and maintenance of open source software. The programmers, however, are reported to have different interests compared to the voluntary programmers who dominate most open source communities. The diversity of interest will inevitably result in conflicts. 
To ensure that their interests are achieved, companies should understand how conflicts between their programmers and the voluntary programmers can be managed. The aim of this paper is to identify and discuss mechanisms that are currently present to manage conflicts in open source communities. The mechanisms identified in this paper are based on an explorative literature study and on 48 semi-structured interviews with programmers from a variety of open source communities. Four mechanisms have been identified and their relevance in the management of conflicts are discussed. They are: third-party intervention; modularity; parallel software development lines; and the exit option. The paper ends with an example of Covalent, which deploys parallel software development lines to manage conflicts in the Apache community. %B Electronic Markets %V 14 %P 104-113 %0 Journal Article %D 2004 %T Managing Volunteer Activity in Free Software Projects %A Martin Michlmayr %K debian %K volunteer %K volunteer teams %X During the last few years, thousands of volunteers have created a large body of free software. Even though this accomplishment shows that the free software development model works, there are some drawbacks associated with this model. Due to the volunteer nature of most free software projects, it is impossible to fully rely on participants. Volunteers may become busy and neglect their duties. This may lead to a steady decrease of quality as work is not being carried out. The problem of inactive volunteers is intensified by the fact that most free software projects are distributed, which makes it hard to quickly identify volunteers who neglect their duties. This paper shows Debian's approach to inactive volunteers. Insights presented here can be applied to other free software projects in order to implement effective quality assurance strategies. 
%8 July %G eng %> https://flosshub.org/sites/flosshub.org/files/michlmayr-mia.pdf %0 Conference Paper %B International Workshop on Mining Software Repositories (MSR 2004) %D 2004 %T Mining version control systems for FACs (frequently applied changes) %A Van Rysselberghe, F. %A Demeyer, S. %K ccfinder %K change analysis %K change history %K clone %K clone detection %K cvs %K maintenance %K tomcat %K version control %X Today, programmers are forced to maintain a software system based on their gut feeling and experience. This paper makes an attempt to turn the software maintenance craft into a more disciplined activity, by mining for frequently applied changes in a version control system. Next to some initial results, we show how this technique allows us to recover and study successful maintenance strategies, adopted for the redesign of long-lived systems. %B International Workshop on Mining Software Repositories (MSR 2004) %I IEE %C Edinburgh, Scotland, UK %V 2004 %P 48 - 52 %R 10.1049/ic:20040475 %> https://flosshub.org/sites/flosshub.org/files/48MiningVersion.pdf %0 Journal Article %J Electronic Markets %D 2004 %T Profiling an Open Source Project Ecology and Its Programmers %A Koch, Stefan %K affiliation network %K brooks law %K cocomo %K effort estimation %K evolution %K productivity %K project success %K scm %K size %K time %K version control %X While many successful and well-known open source projects produce output of high quality, a general assessment of this development paradigm is still missing. In this paper, an online community of both small and large, successful and failed projects and their programmers is analysed mainly using the version-control data of each project, also according to their productivity and estimation of expended effort. As the results show, there are indeed significant differences between this cooperative development model and the commercial organization of work in the areas explored. 
Both open source software projects in their size and their programmers' effort differ significantly, and the evolution of projects' size over time seems in part to contradict the laws of software evolution proposed for commercial systems. Both the inequality of effort distribution between programmers and an increasing number of developers in a project do not lead to a decrease in productivity, opposing Brooks's Law. Effort estimation based on the COCOMO model for commercial organizations shows a large amount of effort expended for the projects, while a more general Norden-Rayleigh modeling shows a distinctly smaller expenditure. This proposes that either a highly efficient development is achieved by this self-organizing cooperative and highly decentralized form of work, or that the participation of users besides programming tasks is enormous and constitutes an economic factor of large proportions. %B Electronic Markets %V 14 %P 77 - 88 %8 6/2004 %N 2 %! Electronic Markets %R 10.1080/10196780410001675031 %0 Journal Article %J Computers & Security %D 2003 %T The availability of source code in relation to timely response to security vulnerabilities %A John Reinke %A Hossein Saiedian %K bugtraq %K cert %K email %K email archives %K mailing list %K security %K vulnerability %X Once a vulnerability has been found in an application or service that runs on a computer connected to the Internet, fixing that exploit in a timely fashion is of the utmost importance. There are two parts to fixing vulnerability: a party acting on behalf of the application's vendor gives instructions to fix it or makes a patch available that can be downloaded; then someone using that information fixes the computer or application in question. This paper considers the effects of proprietary software versus non-proprietary software in determining the speed with which a security fix is made available, since this can minimize the amount of time that the computer system remains vulnerable. 
%B Computers & Security %V 22 %P 707 - 724 %U http://www.sciencedirect.com/science/article/B6V8G-4B9CV31-C/2/a218fccfaef185af5c122f118b252703 %R 10.1016/S0167-4048(03)00011-7 %0 Journal Article %J Organization Science %D 2003 %T From a Firm-Based to a Community-Based Model of Knowledge Creation: The Case of the Linux Kernel Development %A Lee, Gwendolyn K. %A Cole, Robert E. %K credits %K developers %K email %K email archives %K knowledge creation %K linux kernel %K mailing list %K maintainers %K scm %K source code %K Survey %K Volunteers %X We propose a new model of knowledge creation in purposeful, loosely coordinated, distributed systems, as an alternative to a firm-based one. Specifically, using the case of the Linux kernel development project, we build a model of community-based, evolutionary knowledge creation to study how thousands of talented volunteers, dispersed across organizational and geographical boundaries, collaborate via the Internet to produce a knowledge-intensive, innovative product of high quality. By comparing and contrasting the Linux model with the traditional/commercial model of software development and firm-based knowledge creation efforts, we show how the proposed model of knowledge creation expands beyond the boundary of the firm. Our model suggests that the product development process can be effectively organized as an evolutionary process of learning driven by criticism and error correction. We conclude by offering some theoretical implications of our community-based model of knowledge creation for the literature of organizational learning, community life, and the uses of knowledge in society. %B Organization Science %I INFORMS %V 14 %P
633-649 %U http://www.jstor.org/stable/4135125 %0 Conference Paper %B Proceedings of the 2nd ICSE Workshop on Open Source %D 2002 %T Characterizing the OSS process %A Capiluppi, Andrea %A Patricia Lago %A Maurizio Morisio %K bugs %K change log %K classification %K cvs %K downloads %K freshmeat %K metadata %K patches %K popularity %K project success %K release history %K sourceforge %K vitality %X The Open Source model of software development has gained the attention of both the business, the practitioners’ and the research communities. The Open Source process has been described by the seminal paper by Eric Raymond [4] and [5]. However, sound empirical studies are still very limited [3], [6]. Our goal is to investigate the OS process by empirical means, to analyze, characterize it, and possibly model it with quantitative models. It should be noted that the Open Source process provides open process and product data, and therefore is a rare opportunity for empirical research. Our initial research focus is on the characterization of the process, starting from the evolution of OS projects. In traditional projects, a significant number of releases in a short time is usually considered an instability factor [7] and [8], while in the OSS community, it is an evidence of vitality, shows the commitment of the authors and the power of attraction of other programmers [9]. Is it possible to characterize the vitality of projects? And, can vitality be traced to some other characteristics of a project? 
%B Proceedings of the 2nd ICSE Workshop on Open Source %> https://flosshub.org/sites/flosshub.org/files/CapiluppiLagoMorisio.pdf %0 Conference Proceedings %B The Twenty-Third International Conference on Information Systems %D 2002 %T Economic incentives for participating in open source software projects %A Il-Horn Hann %A Jeff Roberts %A Sandra Slaughter %A Roy Fielding %K apache %K contributions %K email %K email archives %K mailing list %K organizational sponsorship %K participation %K patch %K scm %K source code %K Survey %K version control %X Using the Internet as a basis for communication, collaboration, and storage of artifacts, the open source community is producing software of a quality that was previously thought to be achievable only by professional engineers following strict software development paradigms. This accomplishment is even more astounding as developers contribute to the source code without any remuneration. Open source leaders as well as academics have proposed theories about the motivation of open source developers that are rooted in diverse fields such as social psychology and anthropology. However, Lerner and Tirole (2000) argue that developer participation in open source projects may, in part, be explained by existing economic theory regarding career concerns. This research seeks to confirm or disconfirm the existence of economic returns to participation in open source development. Our findings suggest that greater open source participation per se, as measured in contributions made, is not associated with wage increases. However, a higher status in a merit-based ranking within the Apache Project is associated with significantly higher wages. This suggests that employers do not reward the gain in experience through open source participation as an increase in human capital. The results are also consistent with the notion that a high rank within the Apache Software Foundation is a credible signal of the productive capacity of a programmer. 
%B The Twenty-Third International Conference on Information Systems %P 365–372 %G eng %> https://flosshub.org/sites/flosshub.org/files/42.pdf %0 Journal Article %J IEE Proceedings Software %D 2002 %T Open Source Software Projects as Virtual Organizations: Competency Rallying for Software Development %A Kevin Crowston %A Barbara Scozzi %K competencies %K competency rallying %K coordination %K project success %K sourceforge %K virtual organizations %X The contribution of this paper is the identification and testing of factors important for the success of Open Source Software (OSS) projects. We present an analysis of OSS communities as virtual organizations and apply Katzy and Crowston's (2000) competency rallying (CR) theory to the case of OSS development projects. CR theory suggests that project participants must develop necessary competencies, identify and understand market opportunities, marshal competencies to meet the opportunity and manage a short-term cooperative process. Using data collected from 7477 OSS projects hosted by the SourceForge system (http://sourceforge.net/), we formulate and test a set of specific hypotheses derived from CR theory. %B IEE Proceedings Software %V 149 %P 3–17 %G eng %> https://flosshub.org/sites/flosshub.org/files/crowston.pdf %0 Journal Article %J Software, {IEE} Proceedings - %D 2002 %T Trust and vulnerability in open source software %A Hissam, S. A. %A Plakosh, D. %A Weinstock, C. %K closed source software %K community of software developers %K critical infrastructures %K cyber criminal %K open source software %K PITAC %K predictably reliable systems %K predictably secure systems %K software components %K trust %K users %K vulnerability %X Software plays an ever increasing role in the critical infrastructures that run our cities, manage our economies, and defend our nations. 
In 1999, the President's Information Technology Advisory Committee (PITAC) reported to the United States President the need for software components that are reliable, tested, modelled and secure, supporting the development of predictably reliable and secure systems that underscore our critical infrastructures. Open source software (OSS) constitutes a viable source for software components. Some believe that OSS is more reliable and more secure than closed source software (CSS), due to a phenomenon dubbed 'many eyeballs', but is this truly the case? Or does OSS give the cyber criminal an edge that he would otherwise not have? We explore OSS from the perspective of the cyber criminal and discuss what the community of software developers and users alike can do to increase their trust in both open source software and closed source software. %B Software, {IEE} Proceedings - %V 149 %P 47–51 %8 02/2002 %N 1 %& 47 %R 10.1049/ip-sen:20020208 %0 Conference Paper %B Proceedings of the 2nd ICSE Workshop on Open Source %D 2002 %T Version Control: A Case Study in the Challenges and Opportunities for Open Source Software Development %A Chu-Carroll, M.C. %A Shields, D. %A Wright, J. %K cvs %K kernel %K linux %K linux kernel %K version control %X The growth of the worldwide open source development effort, driven in part by the recent entrance of large corporations into the open source arena, offers new opportunities to improve the software engineering tools available for that effort. Indeed, the increasing difficulty of managing large open source projects, as well as that of integrating related efforts into new programming environments, represents a challenge that must be met if the rapid growth of open source software is to continue. This position paper addresses these issues in the context of software version control. 
%B Proceedings of the 2nd ICSE Workshop on Open Source %> https://flosshub.org/sites/flosshub.org/files/ChuCarrollShieldsWright.pdf %0 Conference Paper %B Proceedings of the 2nd ICSE Workshop on Open Source %D 2002 %T Why Do Developers Contribute to Open Source Projects? First Evidence of Economic Incentives %A Il-Horn Hann %A Jeff Roberts %A Sandra Slaughter %A Roy Fielding %K apache %K contributions %K cvs %K developers %K ECONOMICS %K email %K email archives %K financial %K Human capital %K mailing list %K MOTIVATION %K participation %K source code %K version control %X The availability of commercial quality, free software products such as the Apache HTTP (web) server or the Linux operating system has focused significant attention on the open source development process by which these products were created. One of the more perplexing aspects of open source software projects is why developers freely devote their time and energy to these projects. While many open source participants cite idealistic motives for participation, Lerner and Tirole (2000) argue that developer participation in open source projects may, in part, be explained by existing economic theory regarding career concerns. This research seeks to confirm or disconfirm the existence of economic returns to participation in open source development. Preliminary results of our empirical investigation suggest that greater open source participation per se, as measured in contributions made, does not lead to wage increases. However, a higher status in a merit-based ranking within the Apache Project does lead to significantly higher wages. This suggests that employers do not reward the gain in experience through open source participation as an increase in human capital. The results are also consistent with the notion that a high rank within the Apache Software Foundation is a credible signal of the productive capacity of a programmer. 
%B Proceedings of the 2nd ICSE Workshop on Open Source %> https://flosshub.org/sites/flosshub.org/files/HannRobertsSlaughterFielding.pdf %0 Conference Paper %B Proceedings of the 4th International Workshop on Principles of Software Evolution (IWPSE 2001) %D 2001 %T Growth, evolution, and structural change in open source software %A Michael Godfrey %A Tu, Qiang %K agile methods %K beagle %K cloning %K evolution %K fetchmail %K gcc %K growth %K kernel %K lehman's laws %K lines of code %K linux %K linux kernel %K loc %K open source software %K software architecture %K software evolution %K source code %K structural change %K supporting environments %K vim %X Our recent work has addressed how and why software systems evolve over time, with a particular emphasis on software architecture and open source software systems [2, 3, 6]. In this position paper, we present a short summary of two recent projects. First, we have performed a case study on the evolution of the Linux kernel [3], as well as some other open source software (OSS) systems. We have found that several OSS systems appear not to obey some of "Lehman's laws" of software evolution [5, 7], and that Linux in particular is continuing to grow at a geometric rate. Currently, we are working on a detailed study of the evolution of one of the subsystems of the Linux kernel: the SCSI drivers subsystem. We have found that cloning, which is usually considered to be an indicator of lazy development and poor process, is quite common and is even considered to be a useful practice. Second, we are developing a tool called Beagle to aid software maintainers in understanding how large systems have changed over time. Beagle integrates data from various static analysis and metrics tools and provides a query engine as well as navigable visualizations. Of particular note, Beagle aims to provide help in modelling long term evolution of systems that have undergone architectural and structural change. 
%B Proceedings of the 4th International Workshop on Principles of Software Evolution (IWPSE 2001) %S IWPSE '01 %I ACM %C New York, NY, USA %P 103–106 %@ 1-58113-508-4 %U http://doi.acm.org/10.1145/602461.602482 %R 10.1145/602461.602482 %> https://flosshub.org/sites/flosshub.org/files/tu2001.pdf %0 Conference Paper %B 1st Workshop on Open Source Software Engineering at ICSE 2001 %D 2001 %T Software Engineering Research in the Bazaar %A Hassan, Ahmed E. %A Godfrey, Michael W. %A Holt, Richard C. %K apache %K architecture %K gcc %K kernel %K linux %K linux kernel %K mozilla %K open source software %K software architecture %K Software Engineering Research %K source code %K vim %X During the last five years, our research group has studied the architecture and evolution of several large open source systems (including Linux, GCC, VIM, Mozilla, and Apache), and we have found that open source software systems often exhibit interesting differences when compared to similar commercially-developed systems. Our investigations of these systems have involved the creation of software architecture models, software architecture repair, the creation of a reference architecture for web servers, the study of evolution and growth of open source systems, and the modelling of architectural properties of systems that are apparent only at build time. %B 1st Workshop on Open Source Software Engineering at ICSE 2001 %> https://flosshub.org/sites/flosshub.org/files/hassangodfreyholt.pdf %0 Journal Article %J Information Systems Journal %D 2001 %T Striking a balance between trust and control in a virtual organization: a content analysis of open source software case studies %A Gallivan, M. J. 
%K apache %K case studies %K Control %K fetchmail %K jun %K linux %K linux kernel %K McDonaldization %K mozilla %K networked organization %K perl %K rationalization %K trust %K virtual organization %X Many organization theorists have predicted the emergence of the networked or virtual firm as a model for the design of future organizations. Researchers have also emphasized the importance of trust as a necessary condition for ensuring the success of virtual organizations. This paper examines the open source software (OSS) 'movement' as an example of a virtual organization and proposes a model that runs contrary to the belief that trust is critical for virtual organizations. Instead, I argue that various control mechanisms can ensure the effective performance of autonomous agents who participate in virtual organizations. Borrowing from the theory of the 'McDonaldization' of society, I argue that, given a set of practices to ensure the control, efficiency, predictability and calculability of processes and outcomes in virtual organizations, effective performance may occur in the absence of trust. As support for my argument, I employ content analysis to examine a set of published case studies of OSS projects. My results show that, although trust is rarely mentioned, ensuring control is an important criterion for effective performance within OSS projects. The case studies feature few references to other dimensions of 'McDonaldization' (efficiency, predictability and calculability), however, and I conclude that the OSS movement relies on many other forms of social control and self-control, which are often unacknowledged in OSS projects. Through these implicit forms of control, OSS projects are able to secure the cooperation of the autonomous agents that participate in project teams. I conclude by extrapolating from these case studies to other virtual organizations. 
%B Information Systems Journal %V 11 %P 277–304 %G eng %M WOS:000172198800003 %1 information systems %2 case study