%0 Conference Proceedings %B 2017 IEEE/ACM 12th International Workshop on Software Engineering for Science (SE4Science) %D 2017 %T Advancing Open Science with Version Control and Blockchains %A Jonathan Bell %A Thomas D. LaToza %A Foteini Baldimtsi %A Angelos Stavrou %K blockchain %K replication %K reproducible %X The scientific community is facing a crisis of reproducibility: confidence in scientific results is damaged by concerns regarding the integrity of experimental data and the analyses applied to that data. Experimental integrity can be compromised inadvertently when researchers overlook some important component of their experimental procedure, or intentionally by researchers or malicious third-parties who are biased towards ensuring a specific outcome of an experiment. The scientific community has pushed for “open science” to add transparency to the experimental process, asking researchers to publicly register their data sets and experimental procedures. We argue that the software engineering community can leverage its expertise in tracking traceability and provenance of source code and its related artifacts to simplify data management for scientists. Moreover, by leveraging smart contract and blockchain technologies, we believe that it is possible for such a system to guarantee end-to-end integrity of scientific data and results while supporting collaborative research. %B 2017 IEEE/ACM 12th International Workshop on Software Engineering for Science (SE4Science) %P 13-14 %8 05/2017 %0 Conference Proceedings %B Open Source Systems: Towards Robust Practices 13th International Conference on Open Source Systems %D 2017 %T Challenges in Validating FLOSS Configuration %A Raab, M %A Barany, G %X Developers invest much effort into validating configuration during startup of free/libre and open source software (FLOSS) applications. Nevertheless, hardly any tools exist to validate configuration files to detect misconfigurations earlier. 
This paper aims at understanding the challenges to provide better tools for configuration validation. We use a mixed methodology: (1) We analyzed 2,683 run-time configuration accesses in the source-code of 16 applications comprising 50 million lines of code. (2) We conducted a questionnaire survey with 162 FLOSS contributors completing the survey. We report our experiences about building up a FLOSS community that tackles the issues by unifying configuration validation with an external configuration access specification. We discovered that information necessary for validation is often missing in the applications and FLOSS developers dislike dependencies on external packages for such validations. %B Open Source Systems: Towards Robust Practices 13th International Conference on Open Source Systems %S IFIP Advances in Information and Communication Technology %I Springer %V 496 %P 101-114 %8 05/2017 %U https://link.springer.com/chapter/10.1007/978-3-319-57735-7_11 %R 10.1007/978-3-319-57735-7_11 %0 Conference Proceedings %B 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR) %D 2017 %T Classifying code comments in Java open-source software systems %A Luca Pascarella %A Bacchelli, Alberto %K java %K Survey %X Code comments are a key software component containing information about the underlying implementation. Several studies have shown that code comments enhance the readability of the code. Nevertheless, not all the comments have the same goal and target audience. In this paper, we investigate how six diverse Java OSS projects use code comments, with the aim of understanding their purpose. Through our analysis, we produce a taxonomy of source code comments; subsequently, we investigate how often each category occurs by manually classifying more than 2,000 code comments from the aforementioned projects. 
In addition, we conduct an initial evaluation on how to automatically classify code comments at line level into our taxonomy using machine learning; initial results are promising and suggest that an accurate classification is within reach. %B 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR) %P 227-237 %8 05/2017 %0 Conference Paper %B Proceedings of the 2017 ACM SIGCSE Technical Symposium on Computer Science Education %D 2017 %T Community Engagement with Free and Open Source Software %A Murphy, Christian %A Buffardi, Kevin %A Dehlinger, Josh %A Lambert, Lynn %A Veilleux, Nanette %K free and open source software (FOSS) %K humanitarian free and open source software (HFOSS) %K localized free and open source software (LFOSS) %K under-represented minorities (URM) %X A common refrain from Senior Exit Surveys and Alumni Surveys is the desire to work on "real-world," "practical" and "hands-on" projects using industry-ready tools and development environments. To assuage this, institutions have moved towards adopting Free and Open Source Software (FOSS) as an avenue to provide meaningful, applied learning interventions to students. Through these experiences, students benefit from engagement with various communities including: the community of contributors to the FOSS project; the community of local software developers; the community of citizens who reside in the local area; the community of students at their institution and others; and, the community of people impacted by the FOSS project. These engagements motivate students, enhance their communication and technical skills, allow them to grow and become more confident, help them form professional networks, and provide the "real-world" projects they seek. 
In this panel, we will discuss our experiences in engaging students with five different types of communities as part of incorporating FOSS into our courses, focusing on how other educators can provide the same benefits to their students as well. In order to satisfy the time constraints of the panel, the last two authors will present together. %B Proceedings of the 2017 ACM SIGCSE Technical Symposium on Computer Science Education %S SIGCSE '17 %I ACM %C New York, NY, USA %P 669–670 %@ 978-1-4503-4698-6 %U http://doi.acm.org/10.1145/3017680.3017682 %R 10.1145/3017680.3017682 %0 Journal Article %J Computer Languages, Systems & Structures %D 2017 %T Empirical analysis of search based algorithms to identify change prone classes of open source software %A Bansal, Ankita %K Change proneness; Metrics; Object oriented paradigm; Search based algorithms; Software quality; Empirical validation %X There are numerous reasons leading to change in software such as changing requirements, changing technology, increasing customer demands, fixing of defects etc. Thus, identifying and analyzing the change-prone classes of the software during software evolution is gaining wide importance in the field of software engineering. This would help software developers to judiciously allocate the resources used for testing and maintenance. Software metrics can be used for constructing various classification models which can be used for timely identification of change prone classes. Search based algorithms which form a subset of machine learning algorithms can be utilized for constructing prediction models to identify change prone classes of software. Search based algorithms use a fitness function to find the best optimal solution among all the possible solutions. In this work, we analyze the effectiveness of hybridized search based algorithms for change prediction. 
In other words, the aim of this work is to find whether search based algorithms are capable of constructing accurate models to predict change prone classes. We have also constructed models using machine learning techniques and compared the performance of these models with the models constructed using Search Based Algorithms. The validation is carried out on two open source Apache projects, Rave and Commons Math. The results prove the effectiveness of hybridized search based algorithms in predicting change prone classes of software. Thus, they can be utilized by the software developers to produce efficient and better-developed software. %B Computer Languages, Systems & Structures %V 47 %P 211 - 231 %8 01/2017 %! Computer Languages, Systems & Structures %R 10.1016/j.cl.2016.10.001 %0 Book Section %B Advances in Ubiquitous Networking 2: Proceedings of the UNet'16 %D 2017 %T Knowledge Flows Within Open Source Software Projects: A Social Network Perspective %A Kerzazi, Noureddine %A El Asri, Ikram %E El-Azouzi, Rachid %E Menasche, Daniel Sadoc %E Sabir, Essaïd %E De Pellegrini, Francesco %E Benjillali, Mustapha %K expertise %K Knowledge flows %K open source %K SNA %X Developing software is a knowledge-intensive activity, requiring extensive technical knowledge and awareness. The abstract part of development is the social interactions that drive knowledge flows between contributors, especially for Open Source Software (OSS). This study investigated knowledge sharing and propagation from a social perspective using social network analysis (SNA). We mined and analyzed the issue and review histories of three OSS projects from GitHub. Particular attention has been paid to the socio-interactions through comments from contributors on reviews. We aim at explaining the propagation and density of knowledge flows within contributor networks. 
The results show that review requests flow from the core contributors toward peripheral contributors and comments on reviews are in a continuous loop from the core teams to the peripherals and back; and the core contributors leverage their awareness and technical knowledge to increase their notoriety by playing the role of communication brokers supported by comments on work items. %B Advances in Ubiquitous Networking 2: Proceedings of the UNet'16 %I Springer Singapore %C Singapore %P 247–258 %@ 978-981-10-1627-1 %U http://dx.doi.org/10.1007/978-981-10-1627-1_19 %R 10.1007/978-981-10-1627-1_19 %0 Conference Proceedings %B 2017 IEEE/ACM 39th International Conference on Software Engineering %D 2017 %T Machine Learning-Based Detection of Open Source License Exceptions %A Vendome, Christopher %A Mario Linares-Vasquez %A Bavota, Gabriele %A Di Penta, Massimiliano %A Daniel M. German %A Poshyvanyk, Denys %K classifier %K empirical studies %K license %K machine learning %X From a legal perspective, software licenses govern the redistribution, reuse, and modification of software as both source and binary code. Free and Open Source Software (FOSS) licenses vary in the degree to which they are permissive or restrictive in allowing redistribution or modification under licenses different from the original one(s). In certain cases developers may modify the license by appending to it an exception to specifically allow reuse or modification under a particular condition. These exceptions are an important factor to consider for license compliance analysis since they modify the standard (and widely understood) terms of the original license. In this work, we first perform a large-scale empirical study on the change history of over 51k FOSS systems aimed at quantitatively investigating the prevalence of known license exceptions and identifying new ones. Subsequently, we performed a study on the detection of license exceptions by relying on machine learning. 
We evaluated the license exception classification with four different supervised learners and sensitivity analysis. Finally we present a categorization of license exceptions and explain their implications. %B 2017 IEEE/ACM 39th International Conference on Software Engineering %P 118-129 %8 05/2017 %R 10.1109/ICSE.2017.19 %0 Journal Article %J Journal of Systems and Software %D 2017 %T Predicting bug-fixing time: A replication study using an open source software project %A Akbarinasaji, Shirin %A Caglayan, Bora %A Bener, Ayse %K Replication study; Bug fixing time; Effort estimation; Software maintainability; Deferred bugs %X Background: On projects with tight schedules and limited budgets, it may not be possible to resolve all known bugs before the next release. Estimates of the time required to fix known bugs (the “bug fixing time”) would assist managers in allocating bug fixing resources when faced with a high volume of bug reports. Aim: In this work, we aim to replicate a model for predicting bug fixing time with open source data from Bugzilla Firefox. Method: To perform the replication study, we follow the replication guidelines put forth by Carver [J. C. Carver, Towards reporting guidelines for experimental replications: a proposal, in: 1st International Workshop on Replication in Empirical Software Engineering, 2010.]. Similar to the original study, we apply a Markov-based model to predict the number of bugs that can be fixed monthly. In addition, we employ Monte-Carlo simulation to predict the total fixing time for a given number of bugs. We then use the k-nearest neighbors algorithm to classify fixing times into slow and fast. Result: The results of the replicated study on Firefox are consistent with those of the original study. The results show that there are similarities in the bug handling behaviour of both systems. 
Conclusion: We conclude that the model that estimates the bug fixing time is robust enough to be generalized, and we can rely on this model for our future research. %B Journal of Systems and Software %8 2/2017 %! Journal of Systems and Software %R 10.1016/j.jss.2017.02.021 %0 Journal Article %J IEEE Transactions on Software Engineering %D 2017 %T Process Aspects and Social Dynamics of Contemporary Code Review: Insights from Open Source Development and Industrial Practice at Microsoft %A Bosu, Amiangshu %A Carver, Jeffrey C. %A Christian Bird %A Orbeck, Jonathan %A Chockley, Christopher %K code review %K commercial projects %K peer impressions %K Survey %X Many open source and commercial developers practice contemporary code review, a lightweight, informal, tool-based code review process. To better understand this process and its benefits, we gathered information about code review practices via surveys of open source software developers and developers from Microsoft. The results of our analysis suggest that developers spend approximately 10-15 percent of their time in code reviews, with the amount of effort increasing with experience. Developers consider code review important, stating that in addition to finding defects, code reviews offer other benefits, including knowledge sharing, community building, and maintaining code quality. The quality of the code submitted for review helps reviewers form impressions about their teammates, which can influence future collaborations. We found a large amount of similarity between the Microsoft and OSS respondents. One interesting difference is that while OSS respondents view code review as an important method of impression formation, Microsoft respondents found knowledge dissemination to be more important. Finally, we found little difference between distributed and co-located Microsoft teams. 
Our findings identify the following key areas that warrant focused research: 1) exploring the non-technical benefits of code reviews, 2) helping developers in articulating review comments, and 3) assisting reviewers’ program comprehension during code reviews. %B IEEE Transactions on Software Engineering %V 43 %P 56 - 75 %8 1/2017 %U https://amiangshu.com/papers/CodeReview-TSE-2016.pdf %N 1 %! IEEE Trans. Software Eng. %R 10.1109/TSE.2016.2576451 %> https://flosshub.org/sites/flosshub.org/files/CodeReview-TSE-2016.pdf %0 Conference Proceedings %B Open Source Systems: Towards Robust Practices 13th International Conference on Open Source Systems %D 2017 %T Technical Lag in Software Compilations: Measuring How Outdated a Software Deployment Is %A González-Barahona, J.M. %A Sherwood, P. %A Robles, G. %A Izquierdo, D. %E Balaguer, Federico %E Di Cosmo, Roberto %E Garrido, Alejandra %E Kon, Fabio %E Gregorio Robles %E Zacchiroli, Stefano %X Large software compilations based on free, open source software (FOSS) packages are the basis for many software systems. When they are deployed in production, specific versions of the packages in the compilation are selected for installation. Over time, those versions become outdated with respect to the upstream software from which they are produced, and from the components available in the compilations as well. The fact that deployed components are outdated is not a problem in itself, but there is a price to pay for not being "as much updated as reasonable". This includes bug fixes and new features that could, at least potentially, be interesting for the deployed system. Therefore, a balance has to be maintained between "being up-to-date" and "keeping the good old working versions". This paper proposes a theoretical model (the "technical lag") for measuring how outdated a system is, with the aim of assisting in the decisions about upgrading in production. 
The paper explores several ways in which technical lag can be implemented, depending on requirements. As an illustration, it presents as well some specific cases in which the evolution of technical lag is computed. %B Open Source Systems: Towards Robust Practices 13th International Conference on Open Source Systems %S IFIP Advances in Information and Communication Technology %I Springer International Publishing %V 496 %P 182 - 192 %8 05/2017 %@ 978-3-319-57735-7 %U https://link.springer.com/chapter/10.1007/978-3-319-57735-7_17 %N 235 %R 10.1007/978-3-319-57735-7_17 %0 Conference Proceedings %B 2017 IEEE/ACM 39th International Conference on Software Engineering %D 2017 %T Understanding the Impressions, Motivations, and Barriers of One Time Code Contributors to FLOSS Projects: A Survey %A Amanda Lee %A Carver, Jeffrey C. %A Bosu, Amiangshu %K newcomers %K One Time Contributors %K Qualitative Research %K Survey %X Successful Free/Libre Open Source Software (FLOSS) projects must attract and retain high-quality talent. Researchers have invested considerable effort in the study of core and peripheral FLOSS developers. To this point, one critical subset of developers that have not been studied are One-Time code Contributors (OTC) – those that have had exactly one patch accepted. To understand why OTCs have not contributed another patch and provide guidance to FLOSS projects on retaining OTCs, this study seeks to understand the impressions, motivations, and barriers experienced by OTCs. We conducted an online survey of OTCs from 23 popular FLOSS projects. Based on the 184 responses received, we observed that OTCs generally have positive impressions of their FLOSS project and are driven by a variety of motivations. Most OTCs primarily made contributions to fix bugs that impeded their work and did not plan on becoming long term contributors. Furthermore, OTCs encounter a number of barriers that prevent them from continuing to contribute to the project. 
Based on our findings, there are some concrete actions FLOSS projects can take to increase the chances of converting OTCs into long-term contributors. %B 2017 IEEE/ACM 39th International Conference on Software Engineering %P 187-197 %8 05/2017 %0 Book Section %B Open Source Systems: Integrating Communities: 12th IFIP WG 2.13 International Conference, OSS 2016, Gothenburg, Sweden, May 30 - June 2, 2016, Proceedings %D 2016 %T A Bayesian Belief Network for Modeling Open Source Software Maintenance Productivity %A Bibi, Stamatia %A Apostolos Ampatzoglou %A Ioannis Stamelos %E Kevin Crowston %E Hammouda, Imed %E Lundell, Björn %E Gregorio Robles %E Gamalielsson, Jonas %E Juho Lindman %X Maintenance is one of the most effort consuming activities in the software development lifecycle. Efficient maintenance within short release cycles depends highly on the underlying source code structure, in the sense that complex modules are more difficult to maintain. In this paper we attempt to unveil and discuss relationships between maintenance productivity, the structural quality of the source code and process metrics like the type of a release and the number of downloads. To achieve this goal, we developed a Bayesian Belief Network (BBN) involving several maintainability predictors and three managerial indices for maintenance (i.e., duration, production, and productivity) on 20 open source software projects. The results suggest that maintenance duration depends on inheritance, coupling, and process metrics. On the other hand maintenance production and productivity depend mostly on code quality metrics. 
%B Open Source Systems: Integrating Communities: 12th IFIP WG 2.13 International Conference, OSS 2016, Gothenburg, Sweden, May 30 - June 2, 2016, Proceedings %I Springer International Publishing %C Cham %P 32–44 %@ 978-3-319-39225-7 %U http://dx.doi.org/10.1007/978-3-319-39225-7_3 %& A Bayesian Belief Network for Modeling Open Source Software Maintenance Productivity %R 10.1007/978-3-319-39225-7_3 %0 Journal Article %J Empirical Software Engineering %D 2016 %T License usage and changes: a large-scale study on gitHub %A Vendome, Christopher %A Bavota, Gabriele %A Di Penta, Massimiliano %A Linares-Vásquez, Mario %A German, Daniel %A Poshyvanyk, Denys %X Open source software licenses determine, from a legal point of view, under which conditions software can be integrated and redistributed. The reason why developers of a project adopt (or change) a license may depend on various factors, e.g., the need for ensuring compatibility with certain third-party components, the perspective towards redistribution or commercialization of the software, or the need for protecting against somebody else’s commercial usage of the software. This paper reports a large empirical study aimed at quantitatively and qualitatively investigating when and why developers adopt or change software licenses. Specifically, we first identify license changes in 1,731,828 commits, representing the entire history of 16,221 Java projects hosted on GitHub. Then, to understand the rationale of license changes, we perform a qualitative analysis on 1,160 projects written in seven different programming languages, namely C, C++, C#, Java, Javascript, Python, and Ruby—following an open coding approach inspired by grounded theory—on commit messages and issue tracker discussions concerning licensing topics, and whenever possible, try to build traceability links between discussions and changes. 
On one hand, our results highlight how, in different contexts, license adoption or changes can be triggered by various reasons. On the other hand, the results also highlight a lack of traceability of when and why licensing changes are made. This can be a major concern, because a change in the license of a system can negatively impact those that reuse it. In conclusion, results of the study trigger the need for better tool support in guiding developers in choosing/changing licenses and in keeping track of the rationale of license changes. %B Empirical Software Engineering %! Empir Software Eng %R 10.1007/s10664-016-9438-4 %0 Book Section %B Open Source Systems: Integrating Communities: 12th IFIP WG 2.13 International Conference, OSS 2016, Gothenburg, Sweden, May 30 - June 2, 2016, Proceedings %D 2016 %T An Open Continuous Deployment Infrastructure for a Self-driving Vehicle Ecosystem %A Berger, Christian %E Kevin Crowston %E Hammouda, Imed %E Lundell, Björn %E Gregorio Robles %E Gamalielsson, Jonas %E Juho Lindman %X Self-driving vehicles are an ongoing research and engineering topic even though first automotive OEMs started to deploy such features to their premium vehicles. Chalmers University of Technology and University of Gothenburg are operating and maintaining a vehicle laboratory comprising 1/10 scaled cars, a Volvo XC90, and a Volvo FH truck to conduct studies with automated driving. This laboratory is used both from researchers from different disciplines and in education. The experimental software for all these platforms is powered by the same software environment for different hardware architectures. Therefore, maintaining and deploying new features and bugfixes to the users of this laboratory in a fast way needs to be organized in a reproducible yet easily maintainable manner. This paper outlines our open approach to encapsulate our build, test, and deployment process using VirtualBox, Docker, and Jenkins. 
%B Open Source Systems: Integrating Communities: 12th IFIP WG 2.13 International Conference, OSS 2016, Gothenburg, Sweden, May 30 - June 2, 2016, Proceedings %I Springer International Publishing %C Cham %P 177–183 %@ 978-3-319-39225-7 %U http://dx.doi.org/10.1007/978-3-319-39225-7_14 %& An Open Continuous Deployment Infrastructure for a Self-driving Vehicle Ecosystem %R 10.1007/978-3-319-39225-7_14 %0 Book Section %B Advanced Information Systems Engineering: 28th International Conference, CAiSE 2016, Ljubljana, Slovenia, June 13-17, 2016. Proceedings %D 2016 %T OSSAP – A Situational Method for Defining Open Source Software Adoption Processes %A López, Lidia %A Costal, Dolors %A Ralyté, Jolita %A Franch, Xavier %A Méndez, Lucía %A Annosi, Maria Carmela %E Nurcan, Selmin %E Soffer, Pnina %E Bajec, Marko %E Eder, Johann %X Organizations are increasingly becoming Open Source Software (OSS) adopters, either as a result of a strategic decision or just as a consequence of technological choices. The strategy followed to adopt OSS shapes organizations’ businesses; therefore methods to assess such impact are needed. In this paper, we propose OSSAP, a method for defining OSS Adoption business Processes, built using a Situational Method Engineering (SME) approach. We use SME to combine two well-known modelling methods, namely goal-oriented models (using i*) and business process models (using BPMN), with a pre-existing catalogue of goal-oriented OSS adoption strategy models. First, we define a repository of reusable method chunks, including the guidelines to apply them. Then, we define OSSAP as a composition of those method chunks to help organizations to improve their business processes in order to integrate the best fitting OSS adoption strategy. We illustrate it with an example of application in a telecommunications company. %B Advanced Information Systems Engineering: 28th International Conference, CAiSE 2016, Ljubljana, Slovenia, June 13-17, 2016. 
Proceedings %I Springer International Publishing %C Cham %P 524–539 %@ 978-3-319-39696-5 %U http://dx.doi.org/10.1007/978-3-319-39696-5_32 %R 10.1007/978-3-319-39696-5_32 %0 Conference Paper %B Proceedings of the ACM/IEEE 19th International Conference on Model Driven Engineering Languages and Systems - MODELS '16 %D 2016 %T The quest for open source projects that use UML %A Fernandez, Miguel Angel %A Hebig, Regina %A Quang, Truong Ho %A Chaudron, Michel R. V. %Y Benoit Baudry %Y Combemale, Benoit %X Context: While industrial use of UML was studied intensely, little is known about UML use in Free/Open Source Software (FOSS) projects. Goal: We aim at systematically mining GitHub projects to answer the question when models, if used, are created and updated throughout the whole project’s life-span. Method: We present a semi-automated approach to collect UML stored in images, .xmi, and .uml files and scanned ten percent of all GitHub projects (1.24 million). Our focus was on number and role of contributors that created/updated models and the time span during which this happened. Results: We identified and studied 21 316 UML diagrams within 3 295 projects. Conclusion: Creating/updating of UML happens most often during a very short phase at the project start. For 12% of the models duplicates were found, which are on average spread across 1.88 projects. Finally, we contribute a list of GitHub projects that include UML files. %B Proceedings of the ACM/IEEE 19th International Conference on Model Driven Engineering Languages and Systems - MODELS '16 %I ACM Press %C New York, New York, USA %P 173 - 183 %@ 9781450343213 %U https://www.researchgate.net/publication/308869547_The_quest_for_open_source_projects_that_use_UML_mining_GitHub %! 
MODELS '16 %R 10.1145/2976767.2976778 %0 Conference Paper %B Proceedings of the 38th International Conference on Software Engineering (ICSE 2016) %D 2016 %T The Sky is Not the Limit: Multitasking Across GitHub Projects %A Vasilescu, Bogdan %A Blincoe, Kelly %A Xuan, Qi %A Casalnuovo, Casey %A Damian, Daniela %A Devanbu, Premkumar %A Filkov, Vladimir %K github %K multitasking %K productivity %X Software development has always inherently required multitasking: developers switch between coding, reviewing, testing, designing, and meeting with colleagues. The advent of software ecosystems like GitHub has enabled something new: the ability to easily switch between projects. Developers also have social incentives to contribute to many projects; prolific contributors gain social recognition and (eventually) economic rewards. Multitasking, however, comes at a cognitive cost: frequent context-switches can lead to distraction, sub-standard work, and even greater stress. In this paper, we gather ecosystem-level data on a group of programmers working on a large collection of projects. We develop models and methods for measuring the rate and breadth of a developer's context-switching behavior, and we study how context-switching affects their productivity. We also survey developers to understand the reasons for and perceptions of multitasking. We find that the most common reason for multitasking is interrelationships and dependencies between projects. Notably, we find that the rate of switching and breadth (number of projects) of a developer's work matter. Developers who work on many projects have higher productivity if they focus on few projects per day. Developers that switch projects too much during the course of a day have lower productivity as they work on more projects overall. Despite these findings, developers' perceptions of the benefits of multitasking are varied. 
%B Proceedings of the 38th International Conference on Software Engineering (ICSE 2016) %S ICSE '16 %I ACM %C New York, NY, USA %P 994–1005 %@ 978-1-4503-3900-1 %U http://doi.acm.org/10.1145/2884781.2884875 %R 10.1145/2884781.2884875 %0 Journal Article %J Applied Economics %D 2016 %T Is there a wage premium for volunteer OSS engagement? – signalling, learning and noise %A Bitzer, Jürgen %A Geishecker, Ingo %A Schröder, Philipp J. H. %K open source software %K peer production %K signalling %K voluntary work %K wage formation %X Volunteer-based open-source production has become a significant new model for the organization of software development. Economics often pictures this phenomenon as a case of signaling: Individuals engage in the volunteer programming of open-source software (OSS) as a labor-market signal resulting in a wage premium. Yet, this explanation could so far not be empirically tested. The present paper fills this gap by estimating an upper-bound composite wage premium of voluntary OSS contributions and by separating the potential signaling effect of OSS engagement from other effects. Although some 70% of OSS contributors believe that OSS involvement benefits their careers, we find no actual labor market premium for OSS engagement. The presence of other motives such as fun of play or altruism render OSS contributions too noisy to function as a signal. %B Applied Economics %I Routledge %P 1 - 16 %8 09/2016 %! Applied Economics %R 10.1080/00036846.2016.1218427 %0 Book Section %B Open Source Systems: Integrating Communities: 12th IFIP WG 2.13 International Conference, OSS 2016, Gothenburg, Sweden, May 30 - June 2, 2016, Proceedings %D 2016 %T Towards Open Source/Data in the Context of Higher Education: Pragmatic Case Studies Deployed in Romania %A Coman, Alexandru %A Cîtea, Alexandru %A Buraga, Sabin C. 
%E Kevin Crowston %E Hammouda, Imed %E Lundell, Björn %E Gregorio Robles %E Gamalielsson, Jonas %E Juho Lindman %X The open source ideology is unfortunately not so popular in Romania. This subject represents, to this day, an untackled problem especially in various local educational areas. The paper describes an interesting initiative taken this year by the Faculty of Computer Science, University of Iaşi, Romania to change the collective opinion by progressively pushing the new generations of students through a binding process with the ideas involved in the open source philosophy. Three ongoing initiatives addressing this problem are detailed, including the results we have obtained so far through them, and also the steps that are planned to be taken soon on the matter. %B Open Source Systems: Integrating Communities: 12th IFIP WG 2.13 International Conference, OSS 2016, Gothenburg, Sweden, May 30 - June 2, 2016, Proceedings %I Springer International Publishing %C Cham %P 184–191 %@ 978-3-319-39225-7 %U http://dx.doi.org/10.1007/978-3-319-39225-7_15 %& Towards Open Source/Data in the Context of Higher Education: Pragmatic Case Studies Deployed in Romania %R 10.1007/978-3-319-39225-7_15 %0 Conference Paper %B 2015 3rd International Conference on Information and Communication Technology (ICoICT) %D 2015 %T Big data analytics on large-scale socio-technical software engineering archives %A Bayati, Shahabedin %A Parsons, David %A Susnjak, Teo %A Heidary, Marzieh %X Given the fast growing nature of software engineering data in online software repositories and open source communities, it would be helpful to analyse these assets to discover valuable information about the software engineering development process and other related data. 
Big Data Analytics (BDA) techniques and frameworks can be applied on these data resources to achieve a high-performance and relevant data collection and analysis. Software engineering is a socio-technical process which needs development team collaboration and technical knowledge to develop a high-quality application. GitHub, as an online social coding foundation, contains valuable information about the software engineers' communications and project life cycles. In this paper, unsupervised data mining techniques are applied on the data collected by general Big Data approaches to analyse GitHub projects, source codes and interactions. Source codes and projects are clustered using features and metrics derived from historical data in repositories, object oriented programming metrics and the influences of developers on source codes. %B 2015 3rd International Conference on Information and Communication Technology (ICoICT) %I IEEE %C Nusa Dua, Bali, Indonesia %P 65 - 69 %R 10.1109/ICoICT.2015.7231398 %0 Conference Proceedings %B 12th Working Conference on Mining Software Repositories (MSR 2015) %D 2015 %T A Dataset For API Usage %A Anand Ashok Sawant %A Bacchelli, Alberto %X An Application Programming Interface (API) provides a specific set of functionalities to a developer. The main aim of an API is to encourage the reuse of already existing functionality. There has been some work done into API popularity trends, API evolution and API usage. For all the aforementioned research avenues there has been a need to mine the usage of an API in order to perform any kind of analysis. Each one of the approaches that has been employed in the past involved a certain degree of inaccuracy as there was no type check that takes place. We introduce an approach that takes type information into account while mining API method invocations and annotation usages.
This approach accurately makes a connection between a method invocation and the class of the API to which the method belongs to. We try collecting as many usages of an API as possible, this is achieved by targeting projects hosted on GitHub. Additionally, we look at the history of every project to collect the usage of an API from earliest version onwards. By making such a large and rich dataset public, we hope to stimulate some more research in the field of APIs with the aid of accurate API usage samples. %B 12th Working Conference on Mining Software Repositories (MSR 2015) %I IEEE %8 05/2015 %U http://sback.it/publications/msr2015data.pdf %> https://flosshub.org/sites/flosshub.org/files/msr2015data.pdf %0 Conference Paper %B 12th Working Conference on Mining Software Repositories %D 2015 %T Ecosystems in GitHub and a Method for Ecosystem Identification using Reference Coupling %A Blincoe, Kelly %A Harrison, Francis %A Damian, Daniela %X Software projects are not developed in isolation. Recent research has shifted to studying software ecosystems, communities of projects that depend on each other and are developed together. However, identifying technical dependencies at the ecosystem level can be challenging. In this paper, we propose a new method, known as reference coupling, for detecting technical dependencies between projects. The method establishes dependencies through user-specified cross-references between projects. We use our method to identify ecosystems in GitHub-hosted projects, and we identify several characteristics of the identified ecosystems. We find that most ecosystems are centered around one project and are interconnected with other ecosystems. The predominant type of ecosystems are those that develop tools to support software development. We also found that the project owners’ social behaviour aligns well with the technical dependencies within the ecosystem, but project contributors’ social behaviour does not align with these dependencies.
We conclude with a discussion on future research that is enabled by our reference coupling method. %B 12th Working Conference on Mining Software Repositories %S MSR %I IEEE %8 05/2015 %U http://kblincoe.github.io/publications/2015_MSR_Ecosystems_CameraReady.pdf %> https://flosshub.org/sites/flosshub.org/files/2015_MSR_Ecosystems_CameraReady.pdf %0 Conference Proceedings %B 12th Working Conference on Mining Software Repositories (MSR 2015) %D 2015 %T An Empirical Study of Architectural Change in Open-Source Software Systems %A Duc Minh Le %A Pooyan Behnamghader %A Joshua Garcia %A Daniel Link %A Arman Shahbazian %A Nenad Medvidovic %K architectural change %K architecture recovery %K open-source systems %K software architecture %K software evolution %X From its very inception, the study of software architecture has recognized architectural decay as a regularly occurring phenomenon in long-lived systems. Architectural decay is caused by repeated changes to a system during its lifespan. Despite decay’s prevalence, there is a relative dearth of empirical data regarding the nature of architectural changes that may lead to decay, and of developers’ understanding of those changes. In this paper, we take a step toward addressing that scarcity by conducting an empirical study of changes found in software architectures spanning several hundred versions of 14 open-source systems. Our study reveals several new findings regarding the frequency of architectural changes in software systems, the common points of departure in a system’s architecture during maintenance and evolution, the difference between system-level and component-level architectural change, and the suitability of a system’s implementation-level structure as a proxy for its architecture.
%B 12th Working Conference on Mining Software Repositories (MSR 2015) %I IEEE %8 05/2015 %U http://softarch.usc.edu/~pooyan/publications/emparch_msr15.pdf %> https://flosshub.org/sites/flosshub.org/files/emparch_msr15.pdf %0 Journal Article %J Information and Software Technology %D 2015 %T An empirically-based characterization and quantification of information seeking through mailing lists during open source developers’ software evolution %A Sharif, Khaironi Y %A English, Michael %A Ali, Nour %A Exton, Chris %A Collins, JJ %A Buckley, Jim %K Information seeking %K Software maintenance %K Open source software %K Qualitative empirical study %X Context Several authors have proposed information seeking as an appropriate perspective for studying software evolution. Empirical evidence in this area suggests that substantial time delays can accrue, due to the unavailability of required information, particularly when this information must travel across geographically distributed sites. Objective As a first step in addressing the time delays that can occur in information seeking for distributed Open Source (OS) programmers during software evolution, this research characterizes the information seeking of OS developers through their mailing lists. Method A longitudinal study that analyses 17 years of developer mailing list activity in total, over 6 different OS projects is performed, identifying the prevalent information types sought by developers, from a qualitative, grounded analysis of this data. Quantitative analysis of the number-of-responses and response time-lag is also performed. Results The analysis shows that Open Source developers are particularly implementation centric and team focused in their use of mailing lists, mirroring similar findings that have been reported in the literature.
However novel findings include the suggestion that OS developers often require support regarding the technology they use during development, that they refer to documentation fairly frequently and that they seek implementation-oriented specifics based on system design principles that they anticipate in advance. In addition, response analysis suggests a large variability in the response rates for different types of questions, and particularly that participants have difficulty ascertaining information on other developer’s activities. Conclusion The findings provide insights for those interested in supporting the information needs of OS developer communities: They suggest that the tools and techniques developed in support of co-located developers should be largely mirrored for these communities: that they should be implementation centric, and directed at illustrating “how” the system achieves its functional goals and states. Likewise they should be directed at determining the reason for system bugs: a type of question frequently posed by OS developers but less frequently responded to. %B Information and Software Technology %I Elsevier %V 57 %P 77–94 %U http://www.sciencedirect.com/science/article/pii/S095058491400202X %R 10.1016/j.infsof.2014.09.003 %0 Conference Proceedings %B Proceedings of the 2015 ACM CHI %D 2015 %T Gender and Tenure Diversity in GitHub Teams %A Vasilescu, Bogdan %A Posnett, Daryl %A Ray, Baishakhi %A van den Brand, Mark G.J. %A Serebrenik, Alexander %A Devanbu, Premkumar %A Filkov, Vladimir %K gender %K github %K team %X Software development is usually a collaborative venture. Open Source Software (OSS) projects are no exception; indeed, by design, the OSS approach can accommodate teams that are more open, geographically distributed, and dynamic than commercial teams. This, we find, leads to OSS teams that are quite diverse. Team diversity, predominantly in offline groups, is known to correlate with team output, mostly with positive effects.
How about in OSS? Using GITHUB, the largest publicly available collection of OSS projects, we studied how gender and tenure diversity relate to team productivity and turnover. Using regression modeling of GITHUB data and the results of a survey, we show that both gender and tenure diversity are positive and significant predictors of productivity, together explaining a sizable fraction of the data variability. These results can inform decision making on all levels, leading to better outcomes in recruiting and performance. %B Proceedings of the 2015 ACM CHI %U http://bvasiles.github.io/papers/chi15.pdf %> https://flosshub.org/sites/flosshub.org/files/chi15.pdf %0 Book Section %B Open Source Systems: Adoption and Impact %D 2015 %T How Developers Acquire FLOSS Skills %A Barcomb, Ann %A Grottke, Michael %A Stauffert, Jan-Philipp %A Dirk Riehle %A Jahn, Sabrina %E Damiani, Ernesto %E Frati, Fulvio %E Dirk Riehle %E Wasserman, Anthony I. %K competencies %K Informal learning %K Non-formal learning %K open source %K Skills %K Software developer %X With the increasing prominence of open collaboration as found in free/libre/open source software projects and other joint production communities, potential participants need to acquire skills. How these skills are learned has received little research attention. This article presents a large-scale survey (5,309 valid responses) in which users and developers of the beta release of a popular file download application were asked which learning styles were used to acquire technical and social skills. We find that the extent to which a person acquired the relevant skills through informal methods tends to be higher if the person is a free/libre/open source code contributor, while being a professional software developer does not have this effect. Additionally, younger participants proved more likely to make use of formal methods of learning. 
These insights will help individuals, commercial companies, educational institutions, governments and open collaborative projects decide how they promote learning. %B Open Source Systems: Adoption and Impact %S IFIP Advances in Information and Communication Technology %I Springer International Publishing %V 451 %P 23-32 %@ 978-3-319-17836-3 %U http://dx.doi.org/10.1007/978-3-319-17837-0_3 %R 10.1007/978-3-319-17837-0_3 %> https://flosshub.org/sites/flosshub.org/files/oss-2015.pdf %0 Book Section %B Open Source Systems: Adoption and Impact %D 2015 %T Implicit Coordination: A Case Study of the Rails OSS Project %A Blincoe, Kelly %A Damian, Daniela %E Damiani, Ernesto %E Frati, Fulvio %E Dirk Riehle %E Wasserman, Anthony I. %X Previous studies on coordination in OSS projects have studied explicit communication. Research has theorized on the existence of coordination without direct communication or implicit coordination in OSS projects, suggesting that it contributes to their success. However, due to the intangible nature of implicit coordination, no studies have confirmed these theories. We describe how implicit coordination can now be measured in modern collaborative development environments. Through a case study of a popular OSS GitHub-hosted project, we report on how and why features that support implicit coordination are used. %B Open Source Systems: Adoption and Impact %S IFIP Advances in Information and Communication Technology %I Springer International Publishing %V 451 %P 35-44 %@ 978-3-319-17836-3 %U http://dx.doi.org/10.1007/978-3-319-17837-0_4 %R 10.1007/978-3-319-17837-0_4 %0 Journal Article %J Empirical Software Engineering %D 2015 %T An in-depth study of the promises and perils of mining GitHub %A Kalliamvakou, Eirini %A Gousios, Georgios %A Blincoe, Kelly %A Singer, Leif %A Daniel M. German %A Damian, Daniela %K github %X With over 10 million git repositories, GitHub is becoming one of the most important sources of software artifacts on the Internet. 
Researchers mine the information stored in GitHub’s event logs to understand how its users employ the site to collaborate on software, but so far there have been no studies describing the quality and properties of the available GitHub data. We document the results of an empirical study aimed at understanding the characteristics of the repositories and users in GitHub; we see how users take advantage of GitHub’s main features and how their activity is tracked on GitHub and related datasets to point out misalignment between the real and mined data. Our results indicate that while GitHub is a rich source of data on software development, mining GitHub for research purposes should take various potential perils into consideration. For example, we show that the majority of the projects are personal and inactive, and that almost 40% of all pull requests do not appear as merged even though they were. Also, approximately half of GitHub’s registered users do not have public activity, while the activity of GitHub users in repositories is not always easy to pinpoint. We use our identified perils to see if they can pose validity threats; we review selected papers from the MSR 2014 Mining Challenge and see if there are potential impacts to consider. We provide a set of recommendations for software engineering researchers on how to approach the data in GitHub. %B Empirical Software Engineering %I Springer %U http://www.gousios.gr/pub/promises-perils-github-extended.pdf %! 
Empir Software Eng %R 10.1007/s10664-015-9393-5 %> https://flosshub.org/sites/flosshub.org/files/promises-perils-github-extended.pdf %0 Conference Paper %B Proceedings of the 11th International Symposium on Open Collaboration (OpenSym 2015) %D 2015 %T A multiple case study of small free software businesses as social entrepreneurships %A Barcomb, Ann %K free software %K open source software %K public good %K small business %K social entrepreneurship %K social ventures %X Free/libre and open source software are frequently described as a single community or movement. The difference between free software and open source ideology may influence founders, resulting in different types of companies being created. Specifically, the relationship between free/libre software ideology and social entrepreneurships is investigated. This paper presents seven case studies of businesses, five of which were founded by people who identify with the free/libre software movement. The result is a theory that small businesses founded by free/libre software advocates have three characteristics of social entrepreneurships. First, social benefit is prioritized over wealth creation. Second, the business’s social mission is not incidental but is furthered through its for-profit activities, rather than supported by the company’s profits. Third, the company’s success is defined in part by the success of its social mission. Free/libre software entrepreneurs who recognize their activities as social entrepreneurships can benefit from the existing literature on the unique challenges faced by socially-oriented businesses.
%B Proceedings of the 11th International Symposium on Open Collaboration (OpenSym 2015) %U https://opus4.kobv.de/opus4-fau/frontdoor/index/index/docId/6334 %> https://flosshub.org/sites/flosshub.org/files/p100-barcomb.pdf %0 Book Section %B Handbook of Science and Technology Convergence %D 2015 %T Open Source Technology Development %A Kevin Crowston %E Bainbridge, William Sims %E Roco, Mihail C. %K Free/Libre Open Source Software %X In this chapter, we introduce the practices of free/libre open source software (FLOSS) development as an instance of the convergence of technological affordances with novel social practices to create a novel mode of work. We then consider how FLOSS software might be used for various scientific applications, perhaps leading to a convergence of current distinct disciplines. We conclude by considering how the technologies and practices of FLOSS development might be applied to other settings, thus leading to further convergence of those settings. %B Handbook of Science and Technology Convergence %I Springer International Publishing %P 1-9 %U http://dx.doi.org/10.1007/978-3-319-04033-2_29-1 %R 10.1007/978-3-319-04033-2_29-1 %0 Conference Proceedings %B 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering %D 2015 %T Open Source-Style Collaborative Development Practices in Commercial Projects Using GitHub %A Kalliamvakou, E %A Damian, Daniela %A Blincoe, Kelly %A Singer, L. %A German, Daniel %K github %X Researchers are currently drawn to study projects hosted on GitHub due to its popularity, ease of obtaining data, and its distinctive built-in social features. GitHub has been found to create a transparent development environment, which together with a pull request-based workflow, provides a lightweight mechanism for committing, reviewing and managing code changes. These features impact how GitHub is used and the benefits it provides to teams’ development and collaboration. 
While most of the evidence we have is from GitHub’s use in open source software (OSS) projects, GitHub is also used in an increasing number of commercial projects. It is unknown how GitHub supports these projects given that GitHub’s workflow model does not intuitively fit the commercial development way of working. In this paper, we report findings from an online survey and interviews with GitHub users on how GitHub is used for collaboration in commercial projects. We found that many commercial projects adopted practices that are more typical of OSS projects including reduced communication, more independent work, and self-organization. We discuss how GitHub’s transparency and popular workflow can promote open collaboration, allowing organizations to increase code reuse and promote knowledge sharing across their teams. %B 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering %I ACM/IEEE %V 1 %P 574-585 %8 05/2015 %R 10.1109/ICSE.2015.74 %> https://flosshub.org/sites/flosshub.org/files/icse-camera.pdf %0 Book Section %B Open Source Systems: Adoption and Impact %D 2015 %T The RISCOSS Platform for Risk Management in Open Source Software Adoption %A Franch, X. %A Kenett, R. %A Mancinelli, F. %A Susi, A. %A Ameller, D. %A Annosi, M.C. %A Ben-Jacob, R. %A Blumenfeld, Y. %A Franco, O.H. %A Gross, D. %A Lopez, L. %A Morandini, M. %A Oriol, M. %A Siena, A. %E Damiani, Ernesto %E Frati, Fulvio %E Dirk Riehle %E Wasserman, Anthony I. %K Open source adoption %K Open Source Projects %K open source software %K OSS %K Risk Management %K Software platform %X Managing risks related to OSS adoption is a must for organizations that need to smoothly integrate OSS-related practices in their development processes. Adequate tool support may pave the road to effective risk management and ensure the sustainability of such activity. In this paper, we present the RISCOSS platform for managing risks in OSS adoption. 
RISCOSS builds upon a highly configurable data model that allows customization to several types of scopes. It implements two different working modes: exploration, where the impact of decisions may be assessed before making them; and continuous assessment, where risk variables (and their possible consequences on business goals) are continuously monitored and reported to decision-makers. The blackboard-oriented architecture of the platform defines several interfaces for the identified techniques, allowing new techniques to be plugged in. %B Open Source Systems: Adoption and Impact %S IFIP Advances in Information and Communication Technology %I Springer International Publishing %V 451 %P 124-133 %@ 978-3-319-17836-3 %U http://dx.doi.org/10.1007/978-3-319-17837-0_12 %R 10.1007/978-3-319-17837-0_12 %0 Journal Article %J Cognitive Systems Research %D 2015 %T Stigmergic coordination in FLOSS development teams: Integrating explicit and implicit mechanisms %A Bolici, Francesco %A Howison, James %A Kevin Crowston %K Coordination mechanisms %K distributed teams %K FLOSS teams %K Stigmergic coordination %X The vast majority of literature on coordination in team-based projects has drawn on a conceptual separation between explicit (e.g. plans, feedbacks) and implicit coordination mechanisms (e.g. mental maps, shared knowledge). This analytical distinction presents some limitations in explaining how coordination is reached in organizations characterized by distributed teams, scarce face to face meetings and fuzzy and changing lines of authority, as in free/libre open source software (FLOSS) development. Analyzing empirical illustrations from two FLOSS projects, we highlight the existence of a peculiar model, stigmergic coordination, which includes aspects of both implicit and explicit mechanisms. 
The work product itself (implicit) and the characteristics under which it is shared (explicit) play an under-appreciated role in helping software developers manage dependencies as they arise. We develop this argument beyond the existing literature by working with an existing coordination framework, considering the role that the codebase itself might play at each step. We also discuss the features and the practices to support stigmergic coordination in distributed teams, as well as recommendations for future research. “Not everything that implicitly exists needs to be rendered explicit” (Sloterdijk, 2009, p. 3). %B Cognitive Systems Research %8 12/2015 %U http://www.sciencedirect.com/science/article/pii/S1389041715000339 %! Cognitive Systems Research %R 10.1016/j.cogsys.2015.12.003 %> https://flosshub.org/sites/flosshub.org/files/COGSYS-RS-%28HHS%29-%282015%29-%283%29.pdf %0 Conference Proceedings %B 12th Working Conference on Mining Software Repositories (MSR 2015) %D 2015 %T The Uniqueness of Changes: Characteristics and Applications %A Ray, Baishakhi %A Meiyappan Nagappan %A Christian Bird %A Nachiappan Nagappan %A Zimmermann, Thomas %K linux kernel %X Changes in software development come in many forms. Some changes are frequent, idiomatic, or repetitive (e.g. adding checks for nulls or logging important values) while others are unique. We hypothesize that unique changes are different from the more common similar (or non-unique) changes in important ways; they may require more expertise or represent code that is more complex or prone to mistakes. As such, these changes are worthy of study. In this paper, we present a definition of unique changes and provide a method for identifying them in software project history. Based on the results of applying our technique on the Linux kernel and two large projects at Microsoft, we present an empirical study of unique changes. 
We explore how prevalent unique changes are and characterize where they occur along the architecture of the project. We further investigate developers' contribution towards uniqueness of changes. We also describe potential applications of leveraging the uniqueness of change and implement two such applications, evaluating the risk of changes based on uniqueness and providing change recommendations for non-unique changes. %B 12th Working Conference on Mining Software Repositories (MSR 2015) %8 05/2015 %U http://research.microsoft.com/apps/pubs/default.aspx?id=232407 %> https://flosshub.org/sites/flosshub.org/files/MSR-TR-2014-149.pdf %0 Conference Paper %B 12th Working Conference on Mining Software Repositories (MSR 2015) %D 2015 %T Using Developer-Interaction Trails to Triage Change Requests %A Motahareh Bahrami Zanjani %A Kagdi, Huzefa %A Christian Bird %X The paper presents an approach, namely iHDev, to recommend developers who are most likely to implement incoming change requests. The basic premise of iHDev is that the developers who interacted with the source code relevant to a given change request are most likely to best assist with its resolution. A machine-learning technique is first used to locate source-code entities relevant to the textual description of a given change request. iHDev then mines interaction trails (i.e., Mylyn sessions) associated with these source-code entities to recommend a ranked list of developers. iHDev integrates the interaction trails in a unique way to perform its task, which was not investigated previously. An empirical study on open source systems Mylyn and Eclipse Project was conducted to assess the effectiveness of iHDev. A number of change requests were used in the evaluated benchmark. Recall for top one to five recommended developers and Mean Reciprocal Rank (MRR) values are reported. 
Furthermore, a comparative study with two previous approaches that use commit histories and/or the source-code authorship information for developer recommendation was performed. Results show that iHDev could provide a recall gain of up to 127.27% with equivalent or improved MRR values by up to 112.5%. %B 12th Working Conference on Mining Software Repositories (MSR 2015) %I IEEE %8 05/2015 %U http://www.cabird.com/wp/zanjani2015developer/ %> https://flosshub.org/sites/flosshub.org/files/zanjani2015developer.pdf %0 Report %D 2015 %T Volunteer Management in Open Source Communities %A Barcomb, Ann %X Open source community management is largely ad-hoc and relies on practitioner guides. Yet there is a great deal of information about volunteer management in the general volunteering literature, open source literature and general volunteering guides which could be relevant to open source communities if it were categorized and validated. Bringing these different sources of information together also reveals gaps in our understanding of volunteer management in open source which I hope to address. %B OpenSym 2015, the 11th International Symposium on Open Collaboration %8 08/2015 %> https://flosshub.org/sites/flosshub.org/files/c101-barcomb.pdf %0 Conference Proceedings %B 12th Working Conference on Mining Software Repositories (MSR 2015) %D 2015 %T Will they like this? Evaluating Code Contributions With Language Models %A Vincent J. Hellendoorn %A Premkumar T. Devanbu %A Bacchelli, Alberto %X Popular open-source software projects receive and review contributions from a diverse array of developers, many of whom have little to no prior involvement with the project. A recent survey reported that reviewers consider conformance to the project’s code style to be one of the top priorities when evaluating code contributions on Github. We propose to quantitatively evaluate the existence and effects of this phenomenon. 
To this aim we use language models, which were shown to accurately capture stylistic aspects of code. We find that rejected changesets do contain code significantly less similar to the project than accepted ones; furthermore, the less similar changesets are more likely to be subject to thorough review. Armed with these results we further investigate whether new contributors learn to conform to the project style and find that experience is positively correlated with conformance to the project’s code style. %B 12th Working Conference on Mining Software Repositories (MSR 2015) %I IEEE %8 05/2015 %U http://sback.it/publications/msr2015.pdf %> https://flosshub.org/sites/flosshub.org/files/msr2015_0.pdf %0 Conference Proceedings %B 30th IEEE International Conference on Software Maintenance - Early Research Achievements (ICSM 2014 ERA) %D 2014 %T Continuous integration in a social-coding world: Empirical evidence from GitHub %A Vasilescu, Bogdan %A Serebrenik, Alexander %A Schuylenberg, Stef %A Wulms, Jules %A Brand, Mark G.J. %K github %X Continuous integration is a software engineering practice of frequently merging all developer working copies with a shared main branch, e.g., several times a day. With the advent of GITHUB, a platform well known for its “social coding” features that aid collaboration and sharing, and currently the largest code host in the open source world, collaborative software development has never been more prominent. In GITHUB development one can distinguish between two types of developer contributions to a project: direct ones, coming from a typically small group of developers with write access to the main project repository, and indirect ones, coming from developers who fork the main repository, update their copies locally, and submit pull requests for review and merger. 
In this paper we explore how GITHUB developers use continuous integration as well as whether the contribution type (direct versus indirect) and different project characteristics (e.g., main programming language, or project age) are associated with the success of the automatic builds. %B 30th IEEE International Conference on Software Maintenance - Early Research Achievements (ICSM 2014 ERA) %P 5 pages %U http://conferences.computer.org/icsme/2014/papers/6146a401.pdf %> https://flosshub.org/sites/flosshub.org/files/ICSME2014ERA.pdf %0 Book Section %B Open Source Software: Mobile Open Source Technologies %D 2014 %T Crafting a Systematic Literature Review on Open-Source Platforms %A Teixeira, Jose %A Baiyere, Abayomi %E Corral, Luis %E Sillitti, Alberto %E Succi, Giancarlo %E Vlasenko, Jelena %E Wasserman, Anthony I. %K Ecosystems %K FLOSS %K open-source %K Platforms %K R&D Management %X This working paper unveils the crafting of a systematic literature review on open-source platforms. The high-competitive mobile devices market, where several players such as Apple, Google, Nokia and Microsoft run a platforms-war with constant shifts in their technological strategies, is gaining increasing attention from scholars. It matters, then, to review previous literature on past platforms-wars, such as the ones from the PC and game-console industries, and assess its implications to the current mobile devices platforms-war. The paper starts by justifying the purpose and rationale behind this literature review on open-source platforms. The concepts of open-source software and computer-based platforms were then discussed both individually and in unison, in order to clarify the core-concept of “open-source platform” that guides this literature review. The detailed design of the employed methodological strategy is then presented as the central part of this paper.
The paper concludes with preliminary findings organizing previous literature on open-source platforms for the purpose of guiding future research in this area. %B Open Source Software: Mobile Open Source Technologies %S IFIP Advances in Information and Communication Technology %I Springer Berlin Heidelberg %V 427 %P 113-122 %@ 978-3-642-55127-7 %U http://dx.doi.org/10.1007/978-3-642-55128-4_16 %R 10.1007/978-3-642-55128-4_16 %0 Conference Paper %B Proceedings of the 11th Working Conference on Mining Software Repositories %D 2014 %T Do Developers Discuss Design? %A Brunet, João %A Murphy, Gail C. %A Terra, Ricardo %A Figueiredo, Jorge %A Serey, Dalton %K Design Discussions %K empirical study %K machine learning %K mining challenge %K msr challenge %X Design is often raised in the literature as important to attaining various properties and characteristics in a software system. At least for open-source projects, it can be hard to find evidence of ongoing design work in the technical artifacts produced as part of the development. Although developers usually do not produce specific design documents, they do communicate about design in different ways. In this paper, we provide quantitative evidence that developers address design through discussions in commits, issues, and pull requests. To achieve this, we built a discussions' classifier and automatically labeled 102,122 discussions from 77 projects. Based on this data, we make four observations about the projects: i) on average, 25% of the discussions in a project are about design; ii) on average, 26% of developers contribute to at least one design discussion; iii) only 1% of the developers contribute to more than 15% of the discussions in a project; and iv) these few developers who contribute to a broad range of design discussions are also the top committers in a project. 
%B Proceedings of the 11th Working Conference on Mining Software Repositories %S MSR 2014 %I ACM %C New York, NY, USA %P 340–343 %@ 978-1-4503-2863-0 %U http://doi.acm.org/10.1145/2597073.2597115 %R 10.1145/2597073.2597115 %> https://flosshub.org/sites/flosshub.org/files/brunet.pdf %0 Conference Paper %B Proceedings of the 11th Working Conference on Mining Software Repositories %D 2014 %T Gentoo Package Dependencies over Time %A Bloemen, Remco %A Amrit, Chintan %A Kuhlmann, Stefan %A Ordóñez–Matamoros, Gonzalo %K dependencies %K gentoo %K graph %K INNOVATION %X Open source distributions such as Gentoo need to accurately track dependency relations between software packages in order to install working systems. To do this, Gentoo has a carefully authored database containing those relations. In this paper, we extract the Gentoo package dependency graph and its changes over time. The final dependency graph spans 15 thousand open source projects and 80 thousand dependency relations. Furthermore, the development of this graph is tracked over time from the beginning of the Gentoo project in 2000 to the first quarter of 2012, with monthly resolution. The resulting dataset provides many opportunities for research. In this paper we explore cluster analysis to reveal meaningful relations between packages and in a separate paper we analyze changes in the dependencies over time to get insights in the innovation dynamics of open source software. %B Proceedings of the 11th Working Conference on Mining Software Repositories %S MSR 2014 %I ACM %C New York, NY, USA %P 404–407 %@ 978-1-4503-2863-0 %U http://doi.acm.org/10.1145/2597073.2597131 %R 10.1145/2597073.2597131 %0 Book Section %B Open Source Software: Mobile Open Source Technologies %D 2014 %T How Do Social Interaction Networks Influence Peer Impressions Formation? A Case Study %A Bosu, Amiangshu %A Carver, JeffreyC. %E Corral, Luis %E Sillitti, Alberto %E Succi, Giancarlo %E Vlasenko, Jelena %E Wasserman, AnthonyI. 
%K COLLABORATION %K FOSS %K open source %K OSS %K social network analysis %X Due to their lack of physical interaction, Free and Open Source Software (FOSS) participants form impressions of their teammates largely based on sociotechnical mechanisms including: code commits, code reviews, mailing-lists, and bug comments. These mechanisms may have different effects on peer impression formation. This paper describes a social network analysis of the WikiMedia project to determine which type of interaction has the most favorable characteristics for impressions formation. The results suggest that due to lower centralization, high interactivity, and high degree of interactions between participants, the code review interactions have the most favorable characteristics to support impression formation among FOSS participants. %B Open Source Software: Mobile Open Source Technologies %S IFIP Advances in Information and Communication Technology %I Springer Berlin Heidelberg %V 427 %P 31-40 %@ 978-3-642-55127-7 %U http://dx.doi.org/10.1007/978-3-642-55128-4_4 %R 10.1007/978-3-642-55128-4_4 %0 Book Section %B Open Source Software: Mobile Open Source Technologies %D 2014 %T Improving Mozilla’s In-App Payment Platform %A Janczukowicz, Ewa %A Bouabdallah, Ahmed %A Braud, Arnaud %A Fromentoux, Gaël %A Bonnin, Jean-Marie %E Corral, Luis %E Sillitti, Alberto %E Succi, Giancarlo %E Vlasenko, Jelena %E Wasserman, AnthonyI. %X Nowadays, an in-app payment mechanism is offered in most existing mobile payment solutions. However, current solutions are not flexible and impose certain restrictions: users are limited to predefined payment options and merchants need to adapt their payment mechanisms to each payment provider they use. Ideally mobile payments should be as flexible as possible to be able to target various markets together with users’ spending habits. Mozilla wants to promote an open approach in mobile payments by offering a flexible, easily accessible solution. 
This solution is analyzed, its shortcomings and possible improvements are discussed leading to an original proposal. %B Open Source Software: Mobile Open Source Technologies %S IFIP Advances in Information and Communication Technology %I Springer Berlin Heidelberg %V 427 %P 103-106 %@ 978-3-642-55127-7 %U http://dx.doi.org/10.1007/978-3-642-55128-4_13 %R 10.1007/978-3-642-55128-4_13 %0 Conference Paper %B Proceedings of the 11th Working Conference on Mining Software Repositories %D 2014 %T Innovation Diffusion in Open Source Software: Preliminary Analysis of Dependency Changes in the Gentoo Portage Package Database %A Bloemen, Remco %A Amrit, Chintan %A Kuhlmann, Stefan %A Ordóñez–Matamoros, Gonzalo %K dependencies %K gentoo %K graph %K INNOVATION %X In this paper we make the case that software dependencies are a form of innovation adoption. We then test this on the time-evolution of the Gentoo package dependency graph. We find that the Bass model of innovation diffusion fits the growth of the number of packages depending on a given library. Interestingly, we also find that low-level packages have a primarily imitation driven adoption and multimedia libraries have primarily innovation driven growth. %B Proceedings of the 11th Working Conference on Mining Software Repositories %S MSR 2014 %I ACM %C New York, NY, USA %P 316–319 %@ 978-1-4503-2863-0 %U http://doi.acm.org/10.1145/2597073.2597079 %R 10.1145/2597073.2597079 %> https://flosshub.org/sites/flosshub.org/files/bloeman.pdf %0 Book Section %B Open Source Software: Mobile Open Source Technologies %D 2014 %T A Layered Approach to Managing Risks in OSS Projects %A Franch, Xavier %A Kenett, Ron %A Mancinelli, Fabio %A Susi, Angelo %A Ameller, David %A Ben-Jacob, Ron %A Siena, Alberto %E Corral, Luis %E Sillitti, Alberto %E Succi, Giancarlo %E Vlasenko, Jelena %E Wasserman, AnthonyI. 
%K Layered Model %K open source %K OSS %K Risk Management %X In this paper, we propose a layered approach to managing risks in OSS projects. We define three layers: the first one for defining risk drivers by collecting and summarising available data from different data sources, including human-provided contextual information; the second layer, for converting these risk drivers into risk indicators; the third layer for assessing how these indicators impact the business of the adopting organisation. The contributions are: 1) the complexity of gathering data is isolated in one layer using appropriate techniques, 2) the context needed to interpret this data is provided by expert involvement evaluating risk scenarios and answering questionnaires in a second layer, 3) a pattern-based approach and risk reasoning techniques to link risks to business goals is proposed in the third layer. %B Open Source Software: Mobile Open Source Technologies %S IFIP Advances in Information and Communication Technology %I Springer Berlin Heidelberg %V 427 %P 168-171 %@ 978-3-642-55127-7 %U http://dx.doi.org/10.1007/978-3-642-55128-4_23 %R 10.1007/978-3-642-55128-4_23 %0 Conference Paper %B Proceedings of the 5th International Workshop on Emerging Trends in Software Metrics - WETSoM 2014 %D 2014 %T "May the fork be with you": novel metrics to analyze collaboration on GitHub %A Marco Biazzini %A Benoit Baudry %Y Counsell, Steve %Y Marchesi, Michele L. %Y Visaggio, Aaron %Y Zhang, Hongyu %Y Venkatasubramanyam, Radhika %K flossmole %K github %X Multi–repository software projects are becoming more and more popular, thanks to web–based facilities such as GitHub. Code and process metrics generally assume a single repository must be analyzed, in order to measure the characteristics of a codebase. Thus they are not apt to measure how much relevant information is hosted in multiple repositories contributing to the same codebase. 
Nor can they feature the characteristics of such a distributed development process. We present a set of novel metrics, based on an original classification of commits, conceived to capture some interesting aspects of a multi–repository development process. We also describe an efficient way to build a data structure that allows to compute these metrics on a set of Git repositories. Interesting outcomes, obtained by applying our metrics on a large sample of projects hosted on GitHub, show the usefulness of our contribution. %B Proceedings of the 5th International Workshop on Emerging Trends in Software Metrics - WETSoM 2014 %I ACM Press %C New York, New York, USA %P 37 - 43 %@ 9781450328548 %U http://marbiaz.github.io/docs/Biazzini14b.pdf %! WETSoM 2014 %R 10.1145/2593868.2593875 %0 Book Section %B Open Source Software: Mobile Open Source Technologies %D 2014 %T A Methodology for Managing FOSS Migration Projects %A Goñi, Angel %A Boodraj, Maheshwar %A Cabreja, Yordanis %E Corral, Luis %E Sillitti, Alberto %E Succi, Giancarlo %E Vlasenko, Jelena %E Wasserman, AnthonyI. %X Since 2005, the Free Software Center (CESOL) at the University of Information Science (UCI) in Havana, Cuba, has conducted several free and open source software (FOSS) migration projects for various organizations. The experience gained from these projects enabled the creation of a FOSS Migration Methodology which documented how the technical elements of a project of this kind should be executed. Despite the usefulness of this methodology, the projects that have been undertaken experienced difficulties that were, in most cases, directly related to their management. This research aims to improve the methodology and minimize management-related challenges thereby improving the quality of migration projects. The proposed methodology was applied in a project that ran in a higher education organization and the results prove that the methodology enhanced the quality of the migration project. 
%B Open Source Software: Mobile Open Source Technologies %S IFIP Advances in Information and Communication Technology %I Springer Berlin Heidelberg %V 427 %P 172-175 %@ 978-3-642-55127-7 %U http://dx.doi.org/10.1007/978-3-642-55128-4_24 %R 10.1007/978-3-642-55128-4_24 %0 Journal Article %J Advances in Complex Systems %D 2014 %T MODELING DISTRIBUTED COLLABORATION ON GITHUB %A McDONALD, NORA %A Blincoe, Kelly %A PETAKOVIC, EVA %A Goggins, Sean %X In this paper, we apply concepts from Distributed Leadership, a theory suggesting that leadership is shared among members of an organization, to frame models of contribution that we uncover in five relatively successful open source software (OSS) projects hosted on GitHub. In this qualitative, comparative case study, we show how these projects make use of GitHub features such as pull requests (PRs). We find that projects in which member PRs are more frequently merged with the codebase experience more sustained participation. We also find that projects with higher success rates among contributors and higher contributor retention tend to have more distributed (non-centralized) practices for reviewing and processing PRs. The relationships between organizational form and GitHub practices are enabled and made visible as a result of GitHub's novel interface. Our results demonstrate specific dimensions along which these projects differ and explicate a framework that warrants testing in future studies of OSS, particularly GitHub. %B Advances in Complex Systems %I World Scientific %0 Conference Paper %B Proceedings of the 11th Working Conference on Mining Software Repositories %D 2014 %T Modern Code Reviews in Open-source Projects: Which Problems Do They Fix? %A Beller, Moritz %A Bacchelli, Alberto %A Zaidman, Andy %A Juergens, Elmar %K code review %K defects %K open source software %X Code review is the manual assessment of source code by humans, mainly intended to identify defects and quality problems. 
Modern Code Review (MCR), a lightweight variant of the code inspections investigated since the 1970s, prevails today both in industry and open-source software (OSS) systems. The objective of this paper is to increase our understanding of the practical benefits that the MCR process produces on reviewed source code. To that end, we empirically explore the problems fixed through MCR in OSS systems. We manually classified over 1,400 changes taking place in reviewed code from two OSS projects into a validated categorization scheme. Surprisingly, results show that the types of changes due to the MCR process in OSS are strikingly similar to those in the industry and academic systems from literature, featuring the similar 75:25 ratio of maintainability-related to functional problems. We also reveal that 7–35% of review comments are discarded and that 10–22% of the changes are not triggered by an explicit review comment. Patterns emerged in the review data; we investigated them revealing the technical factors that influence the number of changes due to the MCR process. We found that bug-fixing tasks lead to fewer changes and tasks with more altered files and a higher code churn have more changes. Contrary to intuition, the person of the reviewer had no impact on the number of changes. 
%B Proceedings of the 11th Working Conference on Mining Software Repositories %S MSR 2014 %I ACM %C New York, NY, USA %P 202–211 %@ 978-1-4503-2863-0 %U http://doi.acm.org/10.1145/2597073.2597082 %R 10.1145/2597073.2597082 %> https://flosshub.org/sites/flosshub.org/files/beller.pdf %0 Conference Paper %B Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing %D 2014 %T OCData Hackathon @ CSCW 2014 %A Goggins, Sean %A Andrea Wiggins %A Susan Winter %A Brian Butler %Y Fussell, Susan %Y Lutters, Wayne %Y Morris, Meredith Ringel %Y Reddy, Madhu %X Online Communities data is prevalent in CSCW research, but the approaches to collecting, managing, analyzing and visualizing large scale social data vary on a lab by lab basis. The OCData hackathon is aimed at creating a community opportunity to share approaches to online communities research at the level of data. Integrating data, tools and theories to address interesting research questions remains a challenge for the community. %B Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing %I ACM Press %C Baltimore, Maryland, USA; New York, New York, USA %P 317 - 318 %@ 9781450325417 %! CSCW Companion '14 %R 10.1145/2556420.2558865 %0 Book Section %B Open Source Software: Mobile Open Source Technologies %D 2014 %T PROINFODATA: Monitoring a Large Park of Computational Laboratories %A Possamai, CleideL.B. %A Pasqualin, Diego %A Weingaertner, Daniel %A Todt, Eduardo %A Castilho, MarcosA. %A Bona, LuisC.E. %A Almeida, EduardoCunha %E Corral, Luis %E Sillitti, Alberto %E Succi, Giancarlo %E Vlasenko, Jelena %E Wasserman, AnthonyI. %X This paper briefly presents a model for monitoring a large, heterogeneous and geographically scattered computer park. The data collection is performed by a software agent. The collected data are sent to the central server over the Internet, and stored by the storage system. 
An on-line portal makes up the visualization system, featuring charts, reports, and other tools for assessing the state of the park. This system is currently monitoring circa 150,000 machines. %B Open Source Software: Mobile Open Source Technologies %S IFIP Advances in Information and Communication Technology %I Springer Berlin Heidelberg %V 427 %P 226-229 %@ 978-3-642-55127-7 %U http://dx.doi.org/10.1007/978-3-642-55128-4_34 %R 10.1007/978-3-642-55128-4_34 %0 Conference Paper %B Proceedings of the 11th Working Conference on Mining Software Repositories %D 2014 %T The Promises and Perils of Mining GitHub %A Kalliamvakou, Eirini %A Gousios, Georgios %A Blincoe, Kelly %A Singer, Leif %A Daniel M. German %A Damian, Daniela %K bias %K code reviews %K git %K github %K mining software repositories %X With over 10 million git repositories, GitHub is becoming one of the most important sources of software artifacts on the Internet. Researchers are starting to mine the information stored in GitHub's event logs, trying to understand how its users employ the site to collaborate on software. However, so far there have been no studies describing the quality and properties of the data available from GitHub. We document the results of an empirical study aimed at understanding the characteristics of the repositories in GitHub and how users take advantage of GitHub's main features---namely commits, pull requests, and issues. Our results indicate that, while GitHub is a rich source of data on software development, mining GitHub for research purposes should take various potential perils into consideration. We show, for example, that the majority of the projects are personal and inactive; that GitHub is also being used for free storage and as a Web hosting service; and that almost 40% of all pull requests do not appear as merged, even though they were. We provide a set of recommendations for software engineering researchers on how to approach the data in GitHub. 
%B Proceedings of the 11th Working Conference on Mining Software Repositories %S MSR 2014 %I ACM %C New York, NY, USA %P 92–101 %@ 978-1-4503-2863-0 %U http://doi.acm.org/10.1145/2597073.2597074 %R 10.1145/2597073.2597074 %> https://flosshub.org/sites/flosshub.org/files/perils.pdf %0 Journal Article %J Science of Computer Programming %D 2014 %T Sourcerer: An infrastructure for large-scale collection and analysis of open-source code %A Bajracharya, Sushil %A Ossher, Joel %A Lopes, Cristina %X A large amount of open source code is now available online, presenting a great potential resource for software developers. This has motivated software engineering researchers to develop tools and techniques to allow developers to reap the benefits of these billions of lines of source code. However, collecting and analyzing such a large quantity of source code presents a number of challenges. Although the current generation of open source code search engines provides access to the source code in an aggregated repository, they generally fail to take advantage of the rich structural information contained in the code they index. This makes them significantly less useful than Sourcerer for building state-of-the-art software engineering tools, as these tools often require access to both the structural and textual information available in source code. We have developed Sourcerer, an infrastructure for large-scale collection and analysis of open source code. By taking full advantage of the structural information extracted from source code in its repository, Sourcerer provides a foundation upon which state-of-the-art search engines and related tools can easily be built. We describe the Sourcerer infrastructure, present the applications that we have built on top of it, and discuss how existing tools could benefit from using Sourcerer. %B Science of Computer Programming %V 79 %P 241 - 259 %8 1/2014 %! 
Science of Computer Programming %R 10.1016/j.scico.2012.04.008 %0 Journal Article %J Science of Computer Programming %D 2014 %T Studying software evolution using topic models %A Stephen W. Thomas %A Adams, Bram %A Hassan, Ahmed E. %A Blostein, Dorothea %K Latent Dirichlet allocation %K mining software repositories %K software evolution %K topic model %X Topic models are generative probabilistic models which have been applied to information retrieval to automatically organize and provide structure to a text corpus. Topic models discover topics in the corpus, which represent real world concepts by frequently cooccurring words. Recently, researchers found topics to be effective tools for structuring various software artifacts, such as source code, requirements documents, and bug reports. This research also hypothesized that using topics to describe the evolution of software repositories could be useful for maintenance and understanding tasks. However, research has yet to determine whether these automatically discovered topic evolutions describe the evolution of source code in a way that is relevant or meaningful to project stakeholders, and thus it is not clear whether topic models are a suitable tool for this task. In this paper, we take a first step towards evaluating topic models in the analysis of software evolution by performing a detailed manual analysis on the source code histories of two well-known and well-documented systems, JHotDraw and jEdit. We define and compute various metrics on the discovered topic evolutions and manually investigate how and why the metrics evolve over time. We find that the large majority (87%–89%) of topic evolutions correspond well with actual code change activities by developers. We are thus encouraged to use topic models as tools for studying the evolution of a software system. 
%B Science of Computer Programming %I Elsevier %V 80 %P 457–479 %U http://sail.cs.queensu.ca/publications/pubs/Thomas-2012-SCP.pdf %0 Conference Paper %B Proceedings of the 11th Working Conference on Mining Software Repositories %D 2014 %T Understanding "Watchers" on GitHub %A Sheoran, Jyoti %A Blincoe, Kelly %A Kalliamvakou, Eirini %A Damian, Daniela %A Ell, Jordan %K github %K mining challenge %K msr challenge %K repositories %K Software Teams %K Watchers %X Users on GitHub can watch repositories to receive notifications about project activity. This introduces a new type of passive project membership. In this paper, we investigate the behavior of watchers and their contribution to the projects they watch. We find that a subset of project watchers begin contributing to the project and those contributors account for a significant percentage of contributors on the project. As contributors, watchers are more confident and contribute over a longer period of time in a more varied way than other contributors. This is likely attributable to the knowledge gained through project notifications. %B Proceedings of the 11th Working Conference on Mining Software Repositories %S MSR 2014 %I ACM %C New York, NY, USA %P 336–339 %@ 978-1-4503-2863-0 %U http://doi.acm.org/10.1145/2597073.2597114 %R 10.1145/2597073.2597114 %0 Conference Paper %B Proceedings of The International Symposium on Open Collaboration %D 2014 %T Volunteer Attraction and Retention in Open Source Communities %A Barcomb, Ann %K Community Management %K FLOSS %K open source %K Recruitment %K Service Duration %K Volunteer Management %K Volunteer Retention %K Volunteers %X The importance of volunteers in open source has led to the position of community manager becoming more common in foundations and projects. Yet the advice for volunteer management and retention is fragmented, incomplete, contradictory, and has not been empirically examined. 
Our aim is to fill this gap by creating a comprehensive guidebook of best practices drawing from open source practitioner guides and general literature on volunteering, and to subject a subset of practices to empirical study. A method for evaluating volunteer attrition in terms of value to the organization will also be developed. %B Proceedings of The International Symposium on Open Collaboration %S OpenSym '14 %I ACM %C New York, NY, USA %P 40:1–40:2 %@ 978-1-4503-3016-9 %U http://doi.acm.org/10.1145/2641580.2641628 %R 10.1145/2641580.2641628 %0 Book Section %B Open Source Software: Mobile Open Source Technologies %D 2014 %T When Are OSS Developers More Likely to Introduce Vulnerable Code Changes? A Case Study %A Bosu, Amiangshu %A Carver, JeffreyC. %A Hafiz, Munawar %A Hilley, Patrick %A Janni, Derek %E Corral, Luis %E Sillitti, Alberto %E Succi, Giancarlo %E Vlasenko, Jelena %E Wasserman, AnthonyI. %K FOSS %K open source %K OSS %K security %K vulnerability %X We analyzed peer code review data of the Android Open Source Project (AOSP) to understand whether code changes that introduce security vulnerabilities, referred to as vulnerable code changes (VCC), occur at certain intervals. Using a systematic manual analysis process, we identified 60 VCCs. Our results suggest that AOSP developers were more likely to write VCCs prior to AOSP releases, while during the post-release period they wrote fewer VCCs. 
%B Open Source Software: Mobile Open Source Technologies %S IFIP Advances in Information and Communication Technology %I Springer Berlin Heidelberg %V 427 %P 234-236 %@ 978-3-642-55127-7 %U http://dx.doi.org/10.1007/978-3-642-55128-4_37 %R 10.1007/978-3-642-55128-4_37 %0 Book %B IFIP Advances in Information and Communication TechnologyOpen Source Software: Quality Verification %D 2013 %T Authoritative Linked Data Descriptions of Debian Source Packages Using ADMS.SW %A Olivier Berger %A Christian Bac %E Petrinja, Etiel %E Succi, Giancarlo %E Ioini, Nabil %E Sillitti, Alberto %K debian %X The Debian Package Tracking System is a Web dashboard for Debian contributors and advanced users. This central tool publishes the status of subsequent releases of source packages in the Debian distribution. It has been improved to generate RDF meta-data documenting the source packages, their releases and links to other packaging artifacts, using the ADMS.SW 1.0 model. This constitutes an authoritative source of machine-readable Debian “facts” and proposes a reference URI naming scheme for Linked Data resources about Debian packages. This should enable the interlinking of these Debian package descriptions with other ADMS.SW or DOAP descriptions of FLOSS projects available on the Semantic Web also using Linked Data principles. This will be particularly interesting for traceability with upstream projects whose releases are packaged in Debian, derivative distributions reusing Debian source packages, or with other FLOSS distributions. %B IFIP Advances in Information and Communication TechnologyOpen Source Software: Quality Verification %I Springer Berlin Heidelberg %C Berlin, Heidelberg %V 404 %P 168 - 181 %@ 978-3-642-38928-3 %R 10.1007/978-3-642-38928-3_12 %0 Book Section %B Social Informatics %D 2013 %T The Babel of Software Development: Linguistic Diversity in Open Source %A Vasilescu, Bogdan %A Serebrenik, Alexander %A Brand, MarkG.J. 
%E Jatowt, Adam %E Lim, Ee-Peng %E Ding, Ying %E Miura, Asako %E Tezuka, Taro %E Dias, Gaël %E Tanaka, Katsumi %E Flanagin, Andrew %E Dai, BingTian %B Social Informatics %S Lecture Notes in Computer Science %I Springer International Publishing %V 8238 %P 391-404 %@ 978-3-319-03259-7 %U http://dx.doi.org/10.1007/978-3-319-03260-3_34 %R 10.1007/978-3-319-03260-3_34 %> https://flosshub.org/sites/flosshub.org/files/socinfo13.pdf %0 Conference Proceedings %B 10th Working Conference on Mining Software Repositories %D 2013 %T Communication in Open Source Software Development Mailing Lists %A Guzzi, Anja %A Bacchelli, Alberto %A Lanza, Michele %A Pinzger, Martin %A van Deursen, Arie %K email %K lucene %K mailing list %X Open source software (OSS) development teams use electronic means, such as emails, instant messaging, or forums, to conduct open and public discussions. Researchers investigated mailing lists considering them as a hub for project communication. Prior work focused on specific aspects of emails, for example the handling of patches, traceability concerns, or social networks. This led to insights pertaining to the investigated aspects, but not to a comprehensive view of what developers communicate about. Our objective is to increase the understanding of development mailing lists communication. We quantitatively and qualitatively analyzed a sample of 506 email threads from the development mailing list of a major OSS project, Lucene. Our investigation reveals that implementation details are discussed only in about 35% of the threads, and that a range of other topics is discussed. Moreover, core developers participate in less than 75% of the threads. We observed that the development mailing list is not the main player in OSS project communication, as it also includes other channels such as the issue repository. 
%B 10th Working Conference on Mining Software Repositories %P 277-286 %8 05/2013 %U http://www.st.ewi.tudelft.nl/~guzzi/downloads/Guzzi2013msr.pdf %> https://flosshub.org/sites/flosshub.org/files/Guzzi2013msr.pdf %0 Conference Paper %B Proceedings of the 22Nd International Conference on World Wide Web Companion %D 2013 %T Discovery of Technical Expertise from Open Source Code Repositories %A Venkataramani, Rahul %A Gupta, Atul %A Asadullah, Allahbaksh %A Muddu, Basavaraju %A Bhat, Vasudev %K github %K knowledge discovery %K recommendations %K source code repository %K stackoverflow %K technical expertise %X Online Question and Answer websites for developers have emerged as the main forums for interaction during the software development process. The veracity of an answer in such websites is typically verified by the number of 'upvotes' that the answer garners from peer programmers using the same forum. Although this mechanism has proved to be extremely successful in rating the usefulness of the answers, it does not lend itself very elegantly to model the expertise of a user in a particular domain. In this paper, we propose a model to rank the expertise of the developers in a target domain by mining their activity in different opensource projects. To demonstrate the validity of the model, we built a recommendation system for StackOverflow which uses the data mined from GitHub. 
%B Proceedings of the 22Nd International Conference on World Wide Web Companion %S WWW '13 Companion %I International World Wide Web Conferences Steering Committee %C Republic and Canton of Geneva, Switzerland %P 97–98 %@ 978-1-4503-2038-2 %U http://dl.acm.org/citation.cfm?id=2487788.2487832 %0 Book %B IFIP Advances in Information and Communication TechnologyOpen Source Software: Quality Verification %D 2013 %T The Emergence of Quality Assurance Practices in Free/Libre Open Source Software: A Case Study %A Barham, Adina %E Petrinja, Etiel %E Succi, Giancarlo %E Ioini, Nabil %E Sillitti, Alberto %X As the user base of Free/Libre Open Source Software (FLOSS) diversifies, the need for higher quality is becoming more evident. This implies a more complex development model that includes various steps which were previously associated exclusively with proprietary development such as a formal quality assurance step (QA). However, little research has been done on how implementing formal quality assurance impacts the structure of FLOSS communities. This study aims to start filling this gap by analyzing interactions within such a community. Plone is just one among many FLOSS projects that acknowledged the importance of verification by implementing a quality assurance step. %B IFIP Advances in Information and Communication TechnologyOpen Source Software: Quality Verification %I Springer Berlin Heidelberg %C Berlin, Heidelberg %V 404 %P 271 - 276 %@ 978-3-642-38928-3 %R 10.1007/978-3-642-38928-3_21 %0 Book %B Infrastructure for Building Code Search Applications for Developers %D 2013 %T Finding Source Code on the Web for Remix and Reuse %A Bajracharya, Sushil Krishna %E Sim, Susan Elliott %E Gallardo-Valencia, Rosalva E. %K code search %K flossmole cited %X The large availability of open source code on the Web provides great opportunities to build useful code search applications for developers. 
Building such applications requires addressing several challenges inherent in collecting and analyzing code from open source repositories to make them available for search. An infrastructure that supports collection, analysis, and search services for open source code available on the Web can greatly facilitate building effective code search applications. This chapter presents such an infrastructure called Sourcerer that facilitates collection, analysis, and search of source code available in code repositories on the Web. This chapter provides useful information to researchers and implementors of code search applications interested in harnessing the large availability of source code in the repositories on the Web. In particular, this chapter highlights key aspects of Sourcerer that supports combining Software Engineering and Information Retrieval techniques to build effective code search applications. %B Infrastructure for Building Code Search Applications for Developers %I Springer New York %C New York, NY %P 135 - 164 %@ 978-1-4614-6596-6 %U http://www.drsusansim.org/papers/FindingCodeontheWeb-20120822.pdf %R 10.1007/978-1-4614-6596-6_8 %0 Conference Proceedings %B 10th Working Conference on Mining Software Repositories %D 2013 %T INVocD: Identifier Name Vocabulary Dataset %A Simon Butler %A Wermelinger, Michel %A Yu, Yijun %A Helen Sharp %X INVocD is a database of the identifier name declarations and vocabulary found in 60 FLOSS Java projects where the source code structure is recorded and the identifier name vocabulary is made directly available, offering advantages for identifier name research over conventional source code models. The database has been used to support a range of research projects from identifier name analysis to concept location, and provides many opportunities to researchers. 
INVocD may be downloaded from http://oro.open.ac.uk/36992 %B 10th Working Conference on Mining Software Repositories %8 05/2013 %0 Journal Article %J Empirical Software Engineering %D 2013 %T Management of community contributions %A Bettenburg, Nicolas %A Hassan, Ahmed E. %A Adams, Bram %A Daniel M. German %K android %K contribution %K linux %K management %X In recent years, many companies have realized that collaboration with a thriving user or developer community is a major factor in creating innovative technology driven by market demand. As a result, businesses have sought ways to stimulate contributions from developers outside their corporate walls, and integrate external developers into their development process. To support software companies in this process, this paper presents an empirical study on the contribution management processes of two major, successful, open source software ecosystems. We contrast a for-profit (ANDROID) system having a hybrid contribution style, with a not-for-profit (LINUX kernel) system having an open contribution style. To guide our comparisons, we base our analysis on a conceptual model of contribution management that we derived from a total of seven major open-source software systems. A quantitative comparison based on data mined from the ANDROID code review system and the LINUX kernel code review mailing lists shows that both projects have significantly different contribution management styles, suited to their respective market goals, but with individual advantages and disadvantages that are important for practitioners. Contribution management is a real-world problem that has received very little attention from the research community so far. Both studied systems (LINUX and ANDROID) employ different strategies and techniques for managing contributions, and both approaches are valuable examples for practitioners. 
Each approach has specific advantages and disadvantages that need to be carefully evaluated by practitioners when adopting a contribution management process in practice. %B Empirical Software Engineering %I Springer %P 1–38 %U http://link.springer.com/article/10.1007/s10664-013-9284-6 %0 Conference Paper %B 2013 Third International Conference on Intelligent System Design and Engineering Applications (ISDEA) %D 2013 %T Mining Developer Contribution in Open Source Software Using Visualization Techniques %A Ben, Xu %A Beijun, Shen %A Weicheng, Yang %K github %X The research of developers' contribution is an important part of the software evolution area. It allows project owners to find potential long-term contributors earlier and helps the newcomers to improve their behaviors. In this paper, we examined the contribution characteristics of developers in open source environment based on visual analysis, and presented approaches from three aspects: influencing factors, time characteristics and region characteristics. Our analysis used data from GitHub and revealed some regular patterns. We found that the code which newcomers started to contribute with more people engaged in would lead to less contribution in some degree. We also found that there's a relation between developers' early and later period contribution. In addition, developers from different regions were more likely to have dominant relationship. Our findings may provide some support for future research in the area of software evolution. %B 2013 Third International Conference on Intelligent System Design and Engineering Applications (ISDEA) %I IEEE %C China, Hong Kong %P 934 - 937 %@ 978-0-7695-4923-1 %R 10.1109/ISDEA.2012.223 %0 Book %B IFIP Advances in Information and Communication Technology, Open Source Software: Quality Verification %D 2013 %T Misconceptions and Barriers to Adoption of FOSS in the U.S.
Energy Industry %A Kuechler, Victor %A Jensen, Carlos %A Bryant, Deborah %E Petrinja, Etiel %E Succi, Giancarlo %E Ioini, Nabil %E Sillitti, Alberto %X In this exploratory study, we map the use of free and open source software (FOSS) in the United States energy sector, especially as it relates to cyber security. Through two surveys and a set of semi-structured interviews—targeting both developers and policy makers—we identified key stakeholders, organizations, and FOSS projects, be they rooted in industry, academia, or public policy space that influence software and security practices in the energy sector. We explored FOSS tools, common attitudes and concerns, and challenges with regard to FOSS adoption. More than a dozen themes were identified from interviews and surveys. Of these, drivers for adoption and risks associated with FOSS were the most prevalent. More specifically, the misperceptions of FOSS, the new security challenges presented by the smart grid, and the extensive influence of vendors in this space play the largest roles in FOSS adoption in the energy sector. %B IFIP Advances in Information and Communication Technology, Open Source Software: Quality Verification %I Springer Berlin Heidelberg %C Berlin, Heidelberg %V 404 %P 232 - 244 %@ 978-3-642-38928-3 %R 10.1007/978-3-642-38928-3_17 %0 Book %B IFIP Advances in Information and Communication Technology, Open Source Software: Quality Verification %D 2013 %T Modeling Practices in Open Source Software %A Badreddin, Omar %A Lethbridge, Timothy %A Elassar, Maged %E Petrinja, Etiel %E Succi, Giancarlo %E Ioini, Nabil %E Sillitti, Alberto %X It is widely accepted that modeling in software engineering increases productivity and results in better code quality. Yet, modeling adoption remains low. The open source community, in particular, remains almost entirely code centric. In this paper, we explore the reasons behind such limited adoption of modeling practices among open source developers.
We highlight characteristics of modeling tools that would encourage their adoption. We propose Umple as a solution where both modeling and coding elements are treated uniformly. In this approach, models can be manipulated textually and code can be edited visually. We also report on the Umple compiler itself as a case study of an open source project where contributors, using the above approach, have and continue to routinely commit code and model over a number of years. %B IFIP Advances in Information and Communication Technology, Open Source Software: Quality Verification %I Springer Berlin Heidelberg %C Berlin, Heidelberg %V 404 %P 127 - 139 %@ 978-3-642-38928-3 %R 10.1007/978-3-642-38928-3_9 %> https://flosshub.org/sites/flosshub.org/files/Modeling-Practices-in-Open-Source-Software.pdf %0 Conference Paper %B 2013 18th International Conference on Engineering of Complex Computer Systems (ICECCS) %D 2013 %T Orion: A Software Project Search Engine with Integrated Diverse Software Artifacts %A Bissyande, Tegawende F. %A Thung, Ferdian %A Lo, David %A Jiang, Lingxiao %A Reveillere, Laurent %K flossmole cited %X What projects contain more than 10,000 lines of code developed by less than 10 people and are still actively maintained with a high bug-fixing rate? To address the challenges for answering such enquiries, we develop an integrated search engine architecture that combines information from different types of software repositories from multiple sources. Our search engine facilitates the construction and execution of complex search queries using a uniform interface that transparently correlates different artifacts of project development and maintenance, such as source code information, version control systems metadata, bug tracking systems elements, and metadata on developer activities and interactions extracted from hosting platforms.
We have built an extensible system with an initial capability of over 100,000 projects collected from the web, featuring various software development artifacts. Using scenarios, we illustrate the benefits of such a search engine for different kinds of project seekers. %B 2013 18th International Conference on Engineering of Complex Computer Systems (ICECCS) %I IEEE %C Singapore, Singapore %P 242 - 245 %U http://www.mysmu.edu/faculty/davidlo/papers/iceccs13-projectsearch.pdf %R 10.1109/ICECCS.2013.42 %0 Journal Article %J Science of Computer Programming %D 2013 %T Towards base rates in software analytics %A Bruntink, Magiel %K ohloh %X Nowadays a vast and growing body of open source software (OSS) project data is publicly available on the internet. Despite this public body of project data, the field of software analytics has not yet settled on a solid quantitative base for basic properties such as code size, growth, team size, activity, and project failure. What is missing is a quantification of the base rates of such properties, where other fields (such as medicine) commonly rely on base rates for decision making and the evaluation of experimental results. The lack of knowledge in this area impairs both research activities in the field of software analytics and decision making on software projects in general. This paper contributes initial results of our research towards obtaining base rates using the data available at Ohloh (a large-scale index of OSS projects). Zooming in on the venerable ‘lines of code’ metric for code size and growth, we present and discuss summary statistics and identify further research challenges. %B Science of Computer Programming %8 11/2013 %U http://www.sciencedirect.com/science/article/pii/S0167642313003079 %!
Science of Computer Programming %R 10.1016/j.scico.2013.11.023 %0 Journal Article %J Empirical Software Engineering %D 2012 %T Analyzing and mining a code search engine usage log %A Bajracharya, Sushil Krishna %A Lopes, Cristina Videira %K code search %K koders %K search %K search engine %K topics %X This paper presents an analysis of a year long usage log of Koders, the first commercially available Internet-Scale code search engine (http://www.koders.com). The usage log comprises about ten million activities from more than three million users. Analysis of the usage data shows that despite of attracting a large number of visitors, Koders has a very sparse usage and that it lacks regular usage from many of its users. When compared to Web search, search behavior in Koders showed many similar patterns. A topic modeling analysis of the usage data shows what topics users of Koders are looking for. Observations on the prevalence of these topics among the users, and observations on how search and download activities vary across topics, lead to the conclusion that users who find code search engines usable are those who already know to a high level of specificity what to look for. This paper also presents a general categorization of these topics that provides insights on the different ways code search engine users express their queries. It identifies various forms of queries in Koders’s log and the kinds of results addressed by the queries. It also provides several suggestions for improvements in code search engines based on the analysis of usage, topics, and query forms. The work presented in this paper is the first of its kind that reveals several insights on the usage of an Internet-Scale code search engine. %B Empirical Software Engineering %V 17 %P 424 - 466 %8 8/2012 %N 4-5 %! Empir Software Eng %R 10.1007/s10664-010-9144-6 %0 Journal Article %J Empirical Software Engineering %D 2012 %T Clones: what is that smell? 
%A Rahman, Foyzur %A Christian Bird %A Devanbu, Premkumar %X Clones are generally considered bad programming practice in software engineering folklore. They are identified as a bad smell (Fowler et al. 1999) and a major contributor to project maintenance difficulties. Clones inherently cause code bloat, thus increasing project size and maintenance costs. In this work, we try to validate the conventional wisdom empirically to see whether cloning makes code more defect prone. This paper analyses the relationship between cloning and defect proneness. For the four medium to large open source projects that we studied, we find that, first, the great majority of bugs are not significantly associated with clones. Second, we find that clones may be less defect prone than non-cloned code. Third, we find little evidence that clones with more copies are actually more error prone. Fourth, we find little evidence to support the claim that clone groups that span more than one file or directory are more defect prone than collocated clones. Finally, we find that developers do not need to put a disproportionately higher effort to fix clone dense bugs. Our findings do not support the claim that clones are really a “bad smell” (Fowler et al. 1999). Perhaps we can clone, and breathe easily, at the same time. %B Empirical Software Engineering %V 17 %P 503 - 530 %8 8/2012 %N 4-5 %! Empir Software Eng %R 10.1007/s10664-011-9195-3 %0 Conference Paper %B Proceedings of the 34th IEEE/ACM International Conference On Software Engineering (ICSE 2012) %D 2012 %T Content classification of developer emails %A Bacchelli, Alberto %A Dal Sasso, Tommaso %A D'Ambros, Marco %A Lanza, Michele %K email %K Emails %K Empirical software engineering %K mailing list %K natural language %K Unstructured Data Mining %X Emails related to the development of a software system contain information about design choices and issues encountered during the development process. 
Exploiting the knowledge embedded in emails with automatic tools is challenging, due to the unstructured, noisy and mixed language nature of this communication medium. Natural language text is often not well-formed and is interleaved with languages with other syntaxes, such as code or stack traces. We present an approach to classify email content at line level. Our technique classifies email lines in five categories (i.e., text, junk, code, patch, and stack trace) to allow one to subsequently apply ad hoc analysis techniques for each category. We evaluated our approach on a statistically significant set of emails gathered from mailing lists of four unrelated open source systems. %B Proceedings of the 34th IEEE/ACM International Conference On Software Engineering (ICSE 2012) %8 06/2012 %U http://www.inf.usi.ch/phd/bacchelli/publications.php %> https://flosshub.org/sites/flosshub.org/files/icse2012.pdf %0 Conference Paper %B 45th Hawai'i International Conference on System Sciences %D 2012 %T An Empirical Study of Volunteer Members' Perceived Turnover in Open Source Software Projects %A Yu, Yiqing %A Benlian, Alexander %A Hess, Thomas %K developers %K launchpad %K sourceforge %K Survey %X Turnover of volunteer members and the ensuing instability bring about severe problems to open source software (OSS) projects. To better understand it, we based our study on Herzberg's two-factor theory to investigate the influence of hygiene factors on volunteer members' dissatisfaction and perceived turnover. After empirically testing the research model, we found shortcomings in project regulation and administration are the key reason for volunteer members' dissatisfaction, followed by future rewards and personal needs for software functionalities. By contrast, a possible lack of supportive working relationship among OSS developers was not found to be a trigger for developer dissatisfaction. Dissatisfaction was confirmed to be a significant predictor of perceived turnover.
The results demonstrate that generalized hygiene factors cannot unreflectively be transferred into the OSS context because volunteer members' personal expectation has a weaker influence on perceived turnover than objective attributes of OSS project. Our study further makes suggestions for project administrators. %B 45th Hawai'i International Conference on System Sciences %P 3396-3405 %8 01/2012 %0 Conference Proceedings %B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %D 2012 %T Exploring the Role of Commercial Stakeholders in Open Source Software Evolution %A Capiluppi, Andrea %A Stol, Klaas-Jan %A Boldyreff, Cornelia %X It has been lately established that a major success or failure factor of an OSS project is whether or not it involves a commercial company, or more extremely, when a project is managed by a commercial software corporation. As documented recently, the success of the Eclipse project can be largely attributed to IBM’s project management, since the upper part of the developer hierarchy is dominated by its staff. This paper reports on the study of the evolution of three different Open Source (OSS) projects — the Eclipse and jEdit IDEs and the Moodle e-learning system — looking at whether they have benefited from the contribution of commercial companies. With the involvement of commercial companies, it is found that OSS projects achieve sustained productivity, increasing amounts of output produced and intake of new developers. It is also found that individual and commercial contributions show similar stages: developer intake, learning effect, sustained contributions and, finally, abandonment of the project. This preliminary evidence suggests that a major success factor for OSS is the involvement of a commercial company, or more radically, when project management is in hands of a commercial entity.
%B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %I IFIP AICT %V 378 %P 178-200 %8 09/2012 %0 Conference Proceedings %B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %D 2012 %T Free and Open Source Software Adoption in Emerging Markets: An Empirical Study in the Education Sector %A Gangadharan, G.R. %A Butler, Martin %X The adoption of Free and Open Source Software (FOSS) in the education sector in emerging markets holds much promise, but should be accompanied by a well-informed decision to ensure that the potential value is realized. The research conducted provides insight into the pragmatic factors driving the adoption of FOSS in the education environment, as well as those aspects inhibiting adoption. This study indicates an increasing readiness to accept FOSS in the education sector, where the more successful organizations show a readiness to adopt a comprehensive decision model to ensure the installation of appropriate ICT infrastructure, including FOSS, for the future. %B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %I IFIP AICT, Springer %V 378 %P 244-249 %8 09/2012 %0 Conference Proceedings %B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %D 2012 %T How Can Open Standards Be Effectively Implemented in Open Source? Challenges and the ORIOS Project %A Lundell, Björn %A AbduraHmanovic, Admir %A Andersson, Stefan %A Bergström, Erik %A Feist, Jonas %A Gamalielsson, Jonas %A Gustavsson, Tomas %A Kahlbom, Roger %A Papaxanthis, Konstantin %X Many organisations are currently restricted in their choice of software because of restrictions imposed by existing systems. Challenges include a lack of interoperability and a risk of technological lock-in, which many small companies seek to address by utilising Open Standards and Open Source implementations of such standards when developing and deploying systems. 
This paper presents an overview of how the industrial research project ORIOS (Open Source software Reference Implementations of Open Standards) seeks to address identified challenges. An overarching goal of the project is to improve understanding within organisations of Open Standards, Open Source Reference Implementations, and the ecosystems around them. This will be done by developing a reference model of necessary and desirable features of an Open Standard, and how Open Standards and their implementations can be utilised by small companies in different usage contexts. An action case study approach will be used as a core strategy for evolving a reference model together with Swedish companies. %B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %I IFIP AICT, Springer %C Eighth International Conference on Open Source Systems %V 378 %P 383-388 %8 09/2012 %0 Conference Proceedings %B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %D 2012 %T The Impact of Formal QA Practices on FLOSS Communities – The Case of Mozilla %A Barham, Adina %K email %K information flow %K mailing lists %K mozilla %K quality assurance %K social network analysis %K test %X The number of FLOSS projects that include a QA step in the development model is increasing which suggests that a new layer may be emerging in the classic “onion model”. This change might affect the information flow within projects and implicitly their sustainability. Communities, the essential resource of FLOSS projects, have been extensively studied but questions concerning QA remain. This paper takes a step towards answering such questions by analyzing QA mailing lists and issue tracker data for the Mozilla group of projects. Because the Bugzilla data set contains over half a million bugs, data processing and analysis is a considerable challenge for this research. 
The provisional conclusions are that QA activity may not be increasing steadily over time but is dependent on other factors and that the QA team and other groups of contributors form a highly connected network that doesn’t contain isolates. %B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %I IFIP AICT, Springer %V 378 %P 262-267 %8 09/2012 %0 Conference Proceedings %B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %D 2012 %T A Linguistic Analysis on How Contributors Solve Software Problems in a Distributed Context %A Masmoudi, Héla %A Boughzala, Imed %K bug report %K bugzilla %K linguistic %K text mining %X There is little understanding of distributed solving activities in Open Source communities. This study aimed to provide some insights in this way. It was applied to the context of Bugzilla, the bug tracking system of Mozilla community. This study investigated the organizational aspects of this mediated, complex and highly distributed context through a linguistic analysis method. The main finding of this research shows that the organization of distributed problem-solving activities in Bugzilla isn’t based only on the hierarchical distribution of the work between core and periphery participants but on their implication in the interactions. This implication varies according to the status of each participant in the community. That is why we distinguish their roles, as well as, the established modes to manage such activity. %B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %I IFIP AICT, Springer %V 378 %P 322-330 %8 09/2012 %0 Conference Proceedings %B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %D 2012 %T A Model of Open Source Developer Foundations %A Dirk Riehle %A Sebastian Berschneider %X Many community open source projects are of high economic relevance.
As these projects mature, their leaders face a choice of continuing the project as is, making the project join an existing foundation, or creating their own foundation for the project. This article presents a model of open source developer foundations that project leaders can use to compare existing foundations with their needs or to design their own. The model is based on a three-iteration qualitative study involving interviews and supplementary materials review. To demonstrate its usefulness, we apply the model to nine foundations and present their organizational choices in a comparative table format. %B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %I IFIP AICT %V 378 %P 15-28 %8 09/2012 %U http://dirkriehle.com/uploads/2012/05/Riehle-MOSDF-v12-Final-Web.pdf %0 Conference Proceedings %B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %D 2012 %T A Qualitative Method for Mining Open Source Software Repositories %A Noll, John %A Seichter, Dominik %A Beecham, Sarah %K content analysis %K Electronic Medical Record %K Qualitative Research %X The volume of data archived in open source software project repositories makes automated, quantitative techniques attractive for extracting and analyzing information from these archives. However, many kinds of archival data include blocks of natural language text that are difficult to analyze automatically. This paper introduces a qualitative analysis method that is transparent and repeatable, leads to objective findings when dealing with qualitative data, and is efficient enough to be applied to large archives. The method was applied in a case study of developer and user forum discussions of an open source electronic medical record project. The study demonstrates that the qualitative repository mining method can be employed to derive useful results quickly yet accurately. These results would not be possible using a strictly automated approach.
%B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %I IFIP AICT, Springer %V 378 %P 256-261 %8 09/2012 %0 Conference Proceedings %B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %D 2012 %T A Study on OSS Marketing and Communication Strategies %A del Bianco, Vieri %A Lavazza, Luigi %A Lenarduzzi, Valentina %A Morasca, Sandro %A Taibi, Davide %A Tosi, Davide %X The goal of every open source project is to gain as many satisfied users as possible. To this end, open source software producers should focus on both product development and communication. Currently, most open source projects are mainly concerned with developing code using the most appealing technologies and introducing fancy features. On the contrary, open source software producers seem to lack good communication strategies. In this paper we describe the communication strategies adopted by three successful companies that are active in open source software development. The goal of the paper is to provide some hints that could help other open source software producers identify communication strategies that are effective in promoting their products on the market. %B IFIP Advances in Information and Communication Technology 378 (OSS 2012) %I IFIP AICT, Springer %C Eighth International Conference on Open Source Systems %V 378 %P 338-343 %8 09/2012 %0 Journal Article %J Empirical Software Engineering %D 2012 %T Studying the impact of social interactions on software quality %A Bettenburg, Nicolas %A Hassan, Ahmed E. %K bug tracker %K eclipse %K Firefox %K Human Factors %K measurement %K metrics %K software evolution %K Software quality assurance %X Correcting software defects accounts for a significant amount of resources in a software project. To make best use of testing efforts, researchers have studied statistical models to predict in which parts of a software system future defects are likely to occur. 
By studying the mathematical relations between predictor variables used in these models, researchers can form an increased understanding of the important connections between development activities and software quality. Predictor variables used in past top-performing models are largely based on source code-oriented metrics, such as lines of code or number of changes. However, source code is the end product of numerous interlaced and collaborative activities carried out by developers. Traces of such activities can be found in the various repositories used to manage development efforts. In this paper, we develop statistical models to study the impact of social interactions in a software project on software quality. These models use predictor variables based on social information mined from the issue tracking and version control repositories of two large open-source software projects. The results of our case studies demonstrate the impact of metrics from four different dimensions of social interaction on post-release defects. Our findings show that statistical models based on social information have a similar degree of explanatory power as traditional models. Furthermore, our results demonstrate that social information does not substitute, but rather augments traditional source code-based metrics used in defect prediction models. %B Empirical Software Engineering %! Empir Software Eng %R 10.1007/s10664-012-9205-0 %0 Conference Paper %B CSMR '12: Proceedings of the 16th European Conference on Software Maintenance and Reengineering %D 2012 %T Using Code Search to Link Code Fragments in Discussions and Source Code %A Bettenburg, Nicolas %A Stephen W. Thomas %A Hassan, Ahmed E. %X When discussing software, practitioners often reference parts of the project’s source code. 
Such references have different motivations, such as mentoring and guiding less experienced developers, pointing out code that needs changes, or proposing possible strategies for the implementation of future changes. The fact that particular parts of a source code are being discussed makes these parts of the software special. Knowing which code is being talked about the most can not only help practitioners to guide important software engineering and maintenance activities, but also act as a high-level documentation of development activities for managers. In this paper, we use clone-detection as specific instance of a code search based approach for establishing links between code fragments that are discussed by developers and the actual source code of a project. Through a case study on the Eclipse project we explore the traceability links established through this approach, both quantitatively and qualitatively, and compare fuzzy code search based traceability linking to classical approaches, in particular change log analysis and information retrieval. We demonstrate a sample application of code search based traceability links by visualizing those parts of the project that are most discussed in issue reports with a Treemap visualization. The results of our case study show that the traceability links established through fuzzy code search-based traceability linking are conceptually different than classical approaches based on change log analysis or information retrieval. %B CSMR '12: Proceedings of the 16th European Conference on Software Maintenance and Reengineering %I IEEE %P 319-329 %> https://flosshub.org/sites/flosshub.org/files/Bettenburg_2012_CSMR.pdf %0 Conference Paper %B Proceedings of the Working Conference on Mining Software Repositories %D 2012 %T Who? What? Where? 
Examining Distributed Development in Two Large Open Source Projects %A Christian Bird %A Nachiappan Nagappan %K eclipse %K Firefox %X To date, a large body of knowledge has been built up around understanding open source software development. However, there is limited research on examining levels of geographic and organizational distribution within open source software projects, despite many studies examining these same aspects in commercial contexts. We set out to fill this gap in OSS knowledge by manually collecting data for two large, mature, successful projects in an effort to assess how distributed both geographically and organizationally. Both Firefox and Eclipse have been the subject of many studies and are ubiquitous in the areas of software development and internet usage respectively. Further, both receive substantial development contributions from many companies. As such, both are worthy of study in order to understand the development processes that they use, how distributed the projects are, and what, if any, relationship distribution has with quality. To this end, we identified the top contributors that made 95% of the changes over multiple major releases of Firefox and Eclipse and determined their geographic locations and organizational affiliations. We found that Firefox is very geographically distributed with over a third of its components receiving major contributions from developers on different continents, and that components that are highly distributed have no more defects than those that are not. In contrast, Eclipse is directed and developed largely by one company; with IBM making 96% of the total commits (49% coming from one lab in Ottawa, Canada). We further examined the distribution in each project’s constituent subsystems and report the relationship of pre- and post-release defects with geographic and organizational factors. 
%B Proceedings of the Working Conference on Mining Software Repositories %> https://flosshub.org/sites/flosshub.org/files/bird2012www.pdf %0 Conference Proceedings %B Open Source Systems: Grounding Research (OSS 2011) %D 2011 %T Developing Architectural Documentation for the Hadoop Distributed File System %A Bass, Len %A Kazman, Rick %A Ozkaya, Ipek %X Many open source projects are lacking architectural documentation that describes the major pieces of the system, how they are structured, and how they interact. We have produced architectural documentation for the Hadoop Distributed File System (HDFS), a major open source project. This paper describes our process and experiences in developing this documentation. We illustrate the documentation we have produced and how it differs from existing documentation by describing the redundancy mechanisms used in HDFS for reliability. %B Open Source Systems: Grounding Research (OSS 2011) %I Springer %P 50-61 %8 10/2011 %0 Conference Paper %B Companion to the Proceedings of the 33rd International Conference on Software Engineering %D 2011 %T Exploring, exposing, and exploiting emails to include human factors in software engineering %A Bacchelli, Alberto %K email communication %K toolset %K unstructured data %X Researchers mine software repositories to support software maintenance and evolution. The analysis of the structured data, mainly source code and changes, has several benefits and offers precise results. This data, however, leaves communication in the background, and does not permit a deep investigation of the human factor, which is crucial in software engineering. Software repositories also archive documents, such as emails or comments, that are used to exchange knowledge among people - we call it "people-centric information." By covering this data, we include the human factor in our analysis, yet its unstructured nature makes it currently sub-exploited. 
Our work, by focusing on email communication and by implementing the necessary tools, investigates methods for exploring, exposing, and exploiting unstructured data. We believe it is possible to close the gap between development and communication, extract opinions, habits, and views of developers, and link implementation to its rationale; we see in a future where software analysis and development is routinely augmented with people-centric information. %B Companion to the Proceedings of the 33rd International Conference on Software Engineering %S ICSE '11 %I ACM %C New York, NY, USA %P 1074–1077 %@ 978-1-4503-0445-0 %U http://doi.acm.org/10.1145/1985793.1985999 %R 10.1145/1985793.1985999 %0 Conference Paper %B Proceedings of the 8th working conference on Mining software repositories - MSR '11 %D 2011 %T Java generics adoption %A Christian Bird %A Murphy-Hill, Emerson %A Parnin, Chris %Y van Deursen, Arie %Y Xie, Tao %Y Zimmermann, Thomas %K commits %K generics %K java %K source code %K version history %X Support for generic programming was added to the Java language in 2004, representing perhaps the most significant change to one of the most widely used programming languages today. Researchers and language designers anticipated this addition would relieve many long-standing problems plaguing developers, but surprisingly, no one has yet measured whether generics actually provide such relief. In this paper, we report on the first empirical investigation into how Java generics have been integrated into open source software by automatically mining the history of 20 popular open source Java programs, traversing more than 500 million lines of code in the process. We evaluate five hypotheses, each based on assertions made by prior researchers, about how Java developers use generics. 
For example, our results suggest that generics do not significantly reduce the number of type casts and that generics are usually adopted by a single champion in a project, rather than all committers. %B Proceedings of the 8th working conference on Mining software repositories - MSR '11 %I ACM Press %C New York, New York, USA %P 3-12 %8 05/2011 %@ 9781450305747 %! MSR '11 %R 10.1145/1985441.1985446 %0 Conference Paper %B 15th European Conference on Software Maintenance and Reengineering (CSMR 2011) %D 2011 %T Process Mining Software Repositories %A Poncin, Wouter %A Serebrenik, Alexander %A Brand, Mark van den %K amsn %K email %K email archives %K gcc %K mailing list %K Process mining %K software repositories %X Software developers’ activities are in general recorded in software repositories such as version control systems, bug trackers and mail archives. While abundant information is usually present in such repositories, successful information extraction is often challenged by the necessity to simultaneously analyze different repositories and to combine the information obtained. We propose to apply process mining techniques, originally developed for business process analysis, to address this challenge. However, in order for process mining to become applicable, different software repositories should be combined, and “related” software development events should be matched: e.g., mails sent about a file, modifications of the file and bug reports that can be traced back to it. The combination and matching of events has been implemented in FRASR (FRamework for Analyzing Software Repositories), augmenting the process mining framework ProM. FRASR has been successfully applied in a series of case studies addressing such aspects of the development process as roles of different developers and the way bug reports are handled. 
%B 15th European Conference on Software Maintenance and Reengineering (CSMR 2011) %I IEEE %C Oldenburg, Germany %P 5 - 14 %@ 978-1-61284-259-2 %R 10.1109/CSMR.2011.5 %> https://flosshub.org/sites/flosshub.org/files/2011-03_CSMR.pdf %0 Generic %D 2011 %T Sociotechnical Coordination and Collaboration in Open Source Software %A Christian Bird %X Over the past decade, a new style of software development, termed open source software (OSS) has emerged and has originated large, mature, stable, and widely used software projects. As software continues to grow in size and complexity, so do development teams. Consequently, coordination and communication within these teams play larger roles in productivity and software quality. My dissertation focuses on the relationships between developers in large open source projects and how software affects and is affected by these relationships. Fortunately, source code repository histories, mailing list archives, and bug databases from OSS projects contain latent data from which we can reconstruct a rich view of a project over time and analyze these sociotechnical relationships. We present methods of obtaining and analyzing this data as well as the results of empirical studies whose goal is to answer questions that can help stakeholders understand and make decisions about their own teams. We answer questions such as “Do large OSS project really have a disorganized bazaar-like structure?” “What is the relationship between social and development behavior in OSS?” “How does one progress from a project newcomer to a full-fledged, core developer?” and others in an attempt to understand how large, successful OSS projects work and also to contrast them with projects in commercial settings. 
%B Proceedings of the 27th IEEE International Conference on Software Maintenance %I IEEE %> https://flosshub.org/sites/flosshub.org/files/bird2011scc.pdf %0 Conference Proceedings %B Open Source Systems: Grounding Research (OSS 2011) %D 2011 %T Something of a Potemkin Village? Acid2 and Mozilla’s Efforts to Comply with HTML4 %A den Besten, Matthijs %A Jean-Michel Dalle %X The real point here is that the Acid3 test isn’t a broad-spectrum standards-support test. It’s a showpiece, and something of a Potemkin village at that. Which is a shame, because what’s really needed right now is exhaustive test suites for specifications— XHTML, CSS, DOM, SVG. %B Open Source Systems: Grounding Research (OSS 2011) %I Springer %P 320-324 %8 10/2011 %0 Conference Proceedings %B Open Source Systems: Grounding Research (OSS 2011) %D 2011 %T Successful Reuse of Software Components: A Report from the Open Source Perspective %A Capiluppi, Andrea %A Boldyreff, Cornelia %A Stol, Klaas-Jan %K component-based software development %K OSS components %K Software reuse %X A promising way of software reuse is Component-Based Software Development (CBSD). There is an increasing number of OSS products available that can be freely used in product development. However, OSS communities themselves have not yet taken full advantage of the “reuse mechanism”. Many OSS projects duplicate effort and code, even when sharing the same application domain and topic. One successful counter-example is the FFMpeg multimedia project, since several of its components are widely and consistently reused into other OSS projects. This paper documents the history of the libavcodec library of components from the FFMpeg project, which at present is reused in more than 140 OSS projects. Most of the recipients use it as a black-box component, although a number of OSS projects keep a copy of it in their repositories, and modify it as such. 
In both cases, we argue that libavcodec is a successful example of reusable OSS library of components. %B Open Source Systems: Grounding Research (OSS 2011) %I Springer %P 159-176 %8 10/2011 %0 Conference Paper %B Proceedings of the 8th working conference on Mining software repositories - MSR '11 %D 2011 %T System compatibility analysis of Eclipse and Netbeans based on bug data %A Baik, Eilwoo %A Devanbu, Premkumar %A Wang, Xinlei (Oscar) %Y van Deursen, Arie %Y Xie, Tao %Y Zimmermann, Thomas %K bug tracking system %K bugzilla %K eclipse %K msr challenge %K netbeans %K version history %X Eclipse and Netbeans are two top of the line Integrated Development Environments (IDEs) for Java development. Both of them provide support for a wide variety of development tasks and have a large user base. This paper provides an analysis and comparison for the compatibility and stability of Eclipse and Netbeans on the three most commonly used operating systems, Windows, Linux and Mac OS. Both IDEs are programmed in Java and use a Bugzilla issue tracker to track reported bugs and feature requests. We looked into the Bugzilla repository databases of these two IDEs, which contain the bug records and histories of these two IDEs. We used some basic data mining techniques to analyze some historical statistics of the bug data. Based on the analysis, we try to answer certain stability-comparison oriented questions in the paper, so that users can have a better idea which of these two IDEs is designed better to work on different platforms. %B Proceedings of the 8th working conference on Mining software repositories - MSR '11 %I ACM Press %C Waikiki, Honolulu, HI, USA; New York, New York, USA %P 230-233 %8 05/2011 %@ 9781450305747 %! MSR '11 %R 10.1145/1985441.1985479 %0 Conference Paper %B Proceedings of the 8th working conference on Mining software repositories - MSR '11 %D 2011 %T A tale of two browsers %A Davis, Ian %A Godfrey, Michael W.
%A Baysal, Olga %Y van Deursen, Arie %Y Xie, Tao %Y Zimmermann, Thomas %K chrome %K development history %K Firefox %K msr challenge %X We explore the space of open source systems and their user communities by examining the development artifact histories of two popular web browsers -- Firefox and Chrome -- as well as usage data. By examining the data and addressing a number of research questions, two very different profiles emerge: Firefox, as the older and established system, with long product version cycles but short bug fix cycles, and a user base that is slow to adopt newer versions; and Chrome, as the new and fast evolving system, with short version cycles, longer bug fix cycles, and a user base that very quickly adopts new versions as they become available (due largely to Chrome's mandatory automatic updates). %B Proceedings of the 8th working conference on Mining software repositories - MSR '11 %I ACM Press %C New York, New York, USA %P 238-241 %8 05/2011 %@ 9781450305747 %! MSR '11 %R 10.1145/1985441.1985481 %0 Conference Paper %B Proceedings of the 2011 Community Building Workshop on Collaborative Teaching of Globally Distributed Software Development %D 2011 %T Teaching distributed software engineering with UCOSP: the undergraduate capstone open-source project %A Stroulia, Eleni %A Bauer, Ken %A Craig, Michelle %A Reid, Karen %A Wilson, Greg %K distributed %K education %K pedagogical %K project-based courses %K software engineering education %X Software engineering courses in computer-science departments are meant to prepare students for the practice of designing, developing, understanding and maintaining software in the real world. The effectiveness of these courses have potentially a tremendous impact on the software industry, since it is through these courses that students must learn the state-of-the-art process and the tools of their eventual "trade", so that they can bring this knowledge to their job and thus advance the actual state of practice. 
The value of "learning software engineering" through project-based courses has long been recognized by educators and practitioners alike. In this paper, we discuss our experience with a distributed project-based course, which infuses the students' learning experience with an increased degree of realism, which, we believe, further improves the quality of their learning and advances their readiness to join the profession. %B Proceedings of the 2011 Community Building Workshop on Collaborative Teaching of Globally Distributed Software Development %S CTGDSD '11 %I ACM %C New York, NY, USA %P 20–25 %@ 978-1-4503-0590-7 %U http://doi.acm.org/10.1145/1984665.1984670 %R 10.1145/1984665.1984670 %0 Conference Paper %B Proceedings of the 2nd International Workshop on Web 2.0 for Software Engineering %D 2011 %T Towards understanding twitter use in software engineering: preliminary findings, ongoing challenges and future questions %A Bougie, Gargi %A Starke, Jamie %A Storey, Margaret-Anne %A Daniel M. German %K eclipse %K linux %K mxunit %K social media %K software development %K twitter %K web 2.0 %X There has been some research conducted around the motivation for the use of Twitter and the value brought by micro-blogging tools to individuals and business environments. This paper builds on our understanding of how the phenomenon affects the population which birthed the technology: Software Engineers. We find that the Software Engineering community extensively leverages Twitter's capabilities for conversation and information sharing and that use of the tool is notably different between distinct Software Engineering groups. Our work exposes topics for future research and outlines some of the challenges in exploring this type of data. 
%B Proceedings of the 2nd International Workshop on Web 2.0 for Software Engineering %S Web2SE '11 %I ACM %C New York, NY, USA %P 31–36 %@ 978-1-4503-0595-2 %U http://doi.acm.org/10.1145/1984701.1984707 %R 10.1145/1984701.1984707 %> https://flosshub.org/sites/flosshub.org/files/WEB2SE2011.pdf %0 Conference Paper %B Proceedings of the 8th working conference on Mining software repositories - MSR '11 %D 2011 %T What topics do Firefox and Chrome contributors discuss?
%A Zagarese, Quirino %A Distante, Damiano %A Di Penta, Massimiliano %A Bernardi, Mario Luca %A Sementa, Carmine %Y van Deursen, Arie %Y Xie, Tao %Y Zimmermann, Thomas %K bug reports %K chrome %K Firefox %K LDA %K msr challenge %X Firefox and Chrome are two very popular open source Web browsers, implemented in C/C++. This paper analyzes what topics were discussed in Firefox and Chrome bug reports over time. To this aim, we indexed the text contained in bug reports submitted each semester of the project history, and identified topics using Latent Dirichlet Allocation (LDA). Then, we investigated to what extent Firefox and Chrome developers/contributors discussed similar topics, either in different periods, or over the same period. Results indicate a non-negligible overlap of topics, mainly on issues related to page layouting, user interaction, and multimedia contents. %B Proceedings of the 8th working conference on Mining software repositories - MSR '11 %I ACM Press %C New York, New York, USA %P 234-237 %8 05/2011 %@ 9781450305747 %! MSR '11 %R 10.1145/1985441.1985480 %0 Journal Article %J Information and Software Technology %D 2010 %T Analysis of virtual communities supporting OSS projects using social network analysis %A Toral, S.L. %A Martínez-Torres, M.R. %A Barrero, F. %K arm %K email %K Knowledge brokers %K linux %K mailing list %K open source software %K social network analysis %K virtual communities %X This paper analyses the behaviour of virtual communities for Open Source Software (OSS) projects. The development of OSS projects relies on virtual communities, which are built on relationships among members, being their final objective sharing knowledge and improving the underlying project. This study addresses the interactive collaboration in these kinds of communities applying social network analysis (SNA). In particular, SNA techniques will be used to identify those members playing a middle-man role among other community members. 
Results will illustrate the importance of this role to achieve successful virtual communities. %B Information and Software Technology %V 52 %P 296 - 303 %8 3/2010 %U http://www.sciencedirect.com/science/article/pii/S0950584909001888 %N 3 %! Information and Software Technology %R 10.1016/j.infsof.2009.10.007 %0 Conference Paper %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %D 2010 %T Automated dependency resolution for open source software %A Ossher, Joel %A Bajracharya, Sushil %A Lopes, Cristina %K dependencies %K java %K source code %K sourcerer %X Opportunities for software reuse are plentiful, thanks in large part to the widespread adoption of open source processes and the availability of search engines for locating relevant artifacts. One challenge presented by open source software reuse is simply getting a newly downloaded artifact to build/run in the first place. The artifact itself likely reuses other artifacts, and so depends on their being located to function properly. While merely tedious in the individual case, this can cause serious difficulties for those seeking to study open source software. It is simply not feasible to manually resolve dependencies for thousands of projects, and many forms of analysis require declarative completeness. In this paper we present a method for automatically resolving dependencies for open source software. It works by cross-referencing a project's missing type information with a repository of candidate artifacts. We have implemented this method on top of the Sourcerer, an infrastructure for the large-scale indexing and analysis of open source code. The performance of our resolution algorithm was evaluated in two parts. First, for a small number of popular open source projects, we manually examined the artifacts suggested by our system to determine if they were appropriate.
Second, we applied the algorithm to the 13,241 projects in the Sourcerer managed repository to evaluate the rate of resolution success. The results demonstrate the feasibility of this approach, as the algorithm located all of the required artifacts needed by 3,904 additional projects, increasing the percentage of declaratively complete projects in Sourcerer from 39% to 69%. %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %I IEEE %C Cape Town, South Africa %P 130 - 140 %@ 978-1-4244-6802-7 %R 10.1109/MSR.2010.5463346 %0 Conference Paper %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %D 2010 %T Clones: What is that smell? %A Rahman, Foyzur %A Christian Bird %A Devanbu, Premkumar %K apache %K bug fix revisions %K bugs %K clone %K evolution %K gimp %K nautilus %K scm %K source code %X Clones are generally considered bad programming practice in software engineering folklore. They are identified as a bad smell and a major contributor to project maintenance difficulties. Clones inherently cause code bloat, thus increasing project size and maintenance costs. In this work, we try to validate the conventional wisdom empirically to see whether cloning makes code more defect prone. This paper analyses relationship between cloning and defect proneness. We find that, first, the great majority of bugs are not significantly associated with clones. Second, we find that clones may be less defect prone than non-cloned code. Finally, we find little evidence that clones with more copies are actually more error prone. Our findings do not support the claim that clones are really a "bad smell". Perhaps we can clone, and breathe easy, at the same time.
%B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %I IEEE %C Cape Town, South Africa %P 72 - 81 %@ 978-1-4244-6802-7 %R 10.1109/MSR.2010.5463343 %> https://flosshub.org/sites/flosshub.org/files/72rahman2010cws.pdf %0 Conference Paper %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %D 2010 %T Cloning and copying between GNOME projects %A Krinke, Jens %A Gold, Nicolas %A Jia, Yue %A Binkley, David %K clone %K gnome %K msr challenge %K source code %X This paper presents an approach to automatically distinguish the copied clone from the original in a pair of clones. It matches the line-by-line version information of a clone to the pair's other clone. A case study on the GNOME Desktop Suite revealed a complex flow of reused code between the different subprojects. In particular, it showed that the majority of larger clones (with a minimal size of 28 lines or higher) exist between the subprojects and more than 60% of the clone pairs can be automatically separated into original and copy. %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %I IEEE %C Cape Town, South Africa %P 98 - 101 %@ 978-1-4244-6802-7 %R 10.1109/MSR.2010.5463290 %> https://flosshub.org/sites/flosshub.org/files/98Coning.pdf %0 Conference Paper %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %D 2010 %T A comparative exploration of FreeBSD bug lifetimes %A Bougie, Gargi %A Treude, Christoph %A Daniel M.
German %A Storey, Margaret-Anne %K bug reports %K bug tracking %K classification %K eclipse %K msr challenge %K prediction %X In this paper, we explore the viability of mining the basic data provided in bug repositories to predict bug lifetimes. We follow the method of Lucas D. Panjer as described in his paper, Predicting Eclipse Bug Lifetimes. However, in place of Eclipse data, the FreeBSD bug repository is used. We compare the predictive accuracy of five different classification algorithms applied to the two data sets. In addition, we propose future work on whether there is a more informative way of classifying bugs than is considered by current bug tracking systems. %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %I IEEE %C Cape Town, South Africa %P 106 - 109 %@ 978-1-4244-6802-7 %R 10.1109/MSR.2010.5463291 %> https://flosshub.org/sites/flosshub.org/files/106ChallengeGargi.pdf %0 Journal Article %J International Journal of Open Source Software and Processes %D 2010 %T Developing a Dynamic and Responsive Online Learning Environment %A Buchan, Janet %K education %K learning %K sakai %X Charles Sturt University adopted the open source software, Sakai, as the foundation for the university’s new, integrated Online Learning Environment. This study explores whether a pedagogical advantage exists in adopting such an open source learning management system. Research suggests that the community source approach to development of open source software has many inherent pedagogical advantages, but this paper examines whether this is due to the choice of open source software or simply having access to appropriate technology for learning and teaching in the 21st century. The author also addresses the challenges of the project management methodology and processes in the large-scale implementation of an open-source courseware management solution at the institutional level.
Consequently, this study outlines strategies that an institution can use to harness the potential of a community source approach to software development to meet the institutional and individual user needs into the future. %B International Journal of Open Source Software and Processes %V 2 %P 32 - 48 %N 1 %R 10.4018/jossp.2010010103 %0 Conference Paper %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %D 2010 %T Do stack traces help developers fix bugs? %A Schröter, Adrian %A Bettenburg, Nicolas %A Premraj, Rahul %K bug fixing %K bug report %K debugging %K eclipse %K stack trace %X A widely shared belief in the software engineering community is that stack traces are much sought after by developers to support them in debugging. But limited empirical evidence is available to confirm the value of stack traces to developers. In this paper, we seek to provide such evidence by conducting an empirical study on the usage of stack traces by developers from the ECLIPSE project. Our results provide strong evidence to this effect and also throw light on some of the patterns in bug fixing using stack traces. We expect the findings of our study to further emphasize the importance of adding stack traces to bug reports and that in the future, software vendors will provide more support in their products to help general users make such information available when filing bug reports. %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %I IEEE %C Cape Town, South Africa %P 118 - 121 %@ 978-1-4244-6802-7 %R 10.1109/MSR.2010.5463280 %> https://flosshub.org/sites/flosshub.org/files/118-10-msr.pdf %0 Book Section %B Open Source Software: New Horizons %D 2010 %T Download Patterns and Releases in Open Source Software Projects: A Perfect Symbiosis?
%A Rossi, Bruno %A Russo, Barbara %A Succi, Giancarlo %E Ågerfalk, Pär %E Boldyreff, Cornelia %E González-Barahona, Jesús %E Madey, Gregory %E Noll, John %K flossmole %K oss2010 %K sourceforge %X Software usage by end-users is one of the factors used to evaluate the success of software projects. In the context of open source software, there is no single and non-controversial measure of usage, though. Still, one of the most used and readily available measure is data about projects downloads. Nevertheless, download counts and averages do not convey as much information as the patterns in the original downloads time series. In this research, we propose a method to increase the expressiveness of mere download rates by considering download patterns against software releases. We apply experimentally our method to the most downloaded projects of SourceForge's history crawled through the FLOSSMole repository. Findings show that projects with similar usage can have indeed different levels of sensitivity to releases, revealing different behaviors of users. Future research will develop further the pattern recognition approach to automatically categorize open source projects according to their download patterns. %B Open Source Software: New Horizons %S IFIP Advances in Information and Communication Technology %I Springer Boston %V 319 %P 252-267 %U http://dx.doi.org/10.1007/978-3-642-13244-5_20 %0 Conference Paper %B Proceedings of ICPC 2010 (18th IEEE International Conference on Program Comprehension) %D 2010 %T Extracting source code from e-mails %A Bacchelli, Alberto %A D'Ambros, Marco %A Lanza, Michele %K argouml %K email %K freenet %K jmeter %K mailing lists %K mina %K natural language %K openjpa %K source code %X E-mails, used by developers and system users to communicate over a broad range of topics, offer a valuable source of information. 
If archived, e-mails can be mined to support program comprehension activities and to provide views of a software system that are alternative and complementary to those offered by the source code. However, e-mails are written in natural language, and therefore contain noise that makes it difficult to retrieve the important data. Thus, before conducting an effective system analysis and extracting data for program comprehension, it is necessary to select the relevant messages, and to expose only the meaningful information. In this work we focus both on classifying e-mails that hold fragments of the source code of a system, and on extracting the source code pieces inside the e-mail. We devised and analyzed a number of lightweight techniques to accomplish these tasks. To assess the validity of our techniques, we manually inspected and annotated a statistically significant number of e-mails from five unrelated open source software systems written in Java. With such a benchmark in place, we measured the effectiveness of each technique in terms of precision and recall. %B Proceedings of ICPC 2010 (18th IEEE International Conference on Program Comprehension) %P 24-33 %U http://www.inf.usi.ch/phd/bacchelli/publications.php %> https://flosshub.org/sites/flosshub.org/files/icpc2010.pdf %0 Conference Paper %B Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - ICSE '10 %D 2010 %T Linking e-mails and source code artifacts %A Bacchelli, Alberto %A Lanza, Michele %A Robbes, Romain %Y Kramer, Jeff %Y Bishop, Judith %Y Devanbu, Prem %Y Uchitel, Sebastian %X E-mails concerning the development issues of a system constitute an important source of information about high-level design decisions, low-level implementation concerns, and the social structure of developers. Establishing links between e-mails and the software artifacts they discuss is a non-trivial problem, due to the inherently informal nature of human communication. 
Different approaches can be brought into play to tackle this traceability issue, but the question of how they can be evaluated remains unaddressed, as there is no recognized benchmark against which they can be compared. In this article we present such a benchmark, which we created through the manual inspection of a statistically significant number of e-mails pertaining to six unrelated software systems. We then use our benchmark to measure the effectiveness of a number of approaches, ranging from lightweight approaches based on regular expressions to full-fledged information retrieval approaches. %B Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - ICSE '10 %I ACM Press %C Cape Town, South Africa %V 1 %P 375-384 %8 05/2010 %@ 9781605587196 %U http://www.inf.usi.ch/phd/bacchelli/publications.php %! ICSE '10 %R 10.1145/1806799.1806855 %0 Conference Paper %B Demonstration Track, Proceedings of the 17th SIGSOFT Symposium on Foundations of Software Engineering %D 2010 %T Linkster: Enabling Efficient Manual Mining %A Christian Bird %A Adrian Bachman %A Rahman, Foyzur %A Bernstein, Abraham %K artifacts %K bug %K bug tracking %K data mining %K email %K mailing lists %K open source %K source code %X While many uses of mined software engineering data are automatic in nature, some techniques and studies either require, or can be improved, by manual methods. Unfortunately, manually inspecting, analyzing, and annotating mined data can be difficult and tedious, especially when information from multiple sources must be integrated. Oddly, while there are numerous tools and frameworks for automatically mining and analyzing data, there is a dearth of tools which facilitate manual methods. To fill this void, we have developed LINKSTER, a tool which integrates data from bug databases, source code repositories, and mailing list archives to allow manual inspection and annotation. 
LINKSTER has already been used successfully by an OSS project lead to obtain data for one empirical study. %B Demonstration Track, Proceedings of the 17th SIGSOFT Symposium on Foundations of Software Engineering %I ACM %> https://flosshub.org/sites/flosshub.org/files/bird2010lee.pdf %0 Conference Paper %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %D 2010 %T Mining security changes in FreeBSD %A Mauczka, Andreas %A Schanes, Christian %A Fankhauser, Florian %A Bernhart, Mario %A Grechenig, Thomas %K freebsd %K msr challenge %K security %X Current research on historical project data rarely touches on the subject of security-related information. Learning how security is treated in projects and which parts of a software are historically security relevant or prone to security changes can enhance the security strategy of a software project. We present a mining methodology for security-related changes by modifying an existing method of software repository analysis. We use the gathered security changes to find out more about the nature of security in the FreeBSD project and we try to establish a link between the identified security changes and a tracker for security issues (security advisories). We give insights into how security is presented in the FreeBSD project and show how the mined data and known security problems are connected. 
%B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %I IEEE %C Cape Town, South Africa %P 90 - 93 %@ 978-1-4244-6802-7 %R 10.1109/MSR.2010.5463289 %0 Conference Paper %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %D 2010 %T Mining subclassing directives to improve framework reuse %A Bruch, Marcel %A Mezini, Mira %A Monperrus, Martin %K api %K documentation %K eclipse %K frameworks %K jface %K source code %X To help developers in using frameworks, good documentation is crucial. However, it is a challenge to create high quality documentation especially of hotspots in white-box frameworks. This paper presents an approach to documentation of object-oriented white-box frameworks which mines from client code four different kinds of documentation items, which we call subclassing directives. A case study on the Eclipse JFace user-interface framework shows that the approach can improve the state of API documentation w.r.t. subclassing directives. %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %I IEEE %C Cape Town, South Africa %P 141 - 150 %@ 978-1-4244-6802-7 %R 10.1109/MSR.2010.5463347 %> https://flosshub.org/sites/flosshub.org/files/141Mining-Subclassing-Directives-to-Improve-Framework-Reuse.pdf %0 Conference Paper %B Seventh Annual Acquisition Research Symposium, {NPS} Proceedings - %D 2010 %T On Open and Collaborative Software Development in the DoD %A Hissam, S. A. %A Weinstock, C. %A Bass, L. 
%K collaborative development %K open source software %K reuse %K software engineering %X The US Department of Defense (specifically, but not limited to, the DoD CIO's Clarifying Guidance Regarding Open Source Software, DISA's launch of Forge.mil and OSD's Open Technology Development Roadmap Plan) has called for increased use of open source software and the adoption of best practices from the free/open source software (F/OSS) community to foster greater reuse and innovation between programs in the DoD. In our paper, we examine some key aspects of open and collaborative software development inspired by the success of the F/OSS movement as it might manifest itself within the US DoD. This examination is made from two perspectives: the reuse potential among DoD programs sharing software and the incentives, strategies and policies that will be required to foster a culture of collaboration needed to achieve the benefits indicative of F/OSS. Our conclusion is that to achieve predictable and expected reuse, not only are technical infrastructures needed, but also a shift to the business practices in the software development and delivery pattern seen in the traditional acquisition lifecycle is needed. Thus, there is potential to overcome the challenges discussed within this paper to engender a culture of openness and community collaboration to support the DoD mission. %B Seventh Annual Acquisition Research Symposium, {NPS} Proceedings - %I Naval Postgraduate School %C Monterey, California %V 1 %P 219–235 %8 04/2010 %U http://www.acquisitionresearch.net/cms/_files/FY2010/NPS-AM-10-037.pdf %0 Book %B IFIP Advances in Information and Communication Technology Open Source Software: New Horizons %D 2010 %T Open Source Software Developer and Project Networks %A Madey, G. %A van Antwerp, M. %E Ågerfalk, Pär %E Boldyreff, Cornelia %E González-Barahona, Jesús M. %E Madey, Gregory R. 
%E Noll, John %K berlios %K savannah %K sourceforge %X This paper outlines complex network concepts and how social networks are built from Open Source Software (OSS) data. We present an initial study of the social networks of three different OSS forges, BerliOS Developer, GNU Savannah, and SourceForge. Much research has been done on snapshot or conflated views of these networks, especially SourceForge, due to the size of the SourceForge community. The degree distribution, connectedness, centrality, and scale-free nature of SourceForge has been presented for the network at particular points in time. However, very little research has been done on how the network grows, how connections were made, especially during its infancy, and how these metrics evolve over time. %B IFIP Advances in Information and Communication Technology Open Source Software: New Horizons %I Springer Berlin Heidelberg %C Berlin, Heidelberg %V 319 %P 407 - 412 %@ 978-3-642-13244-5 %R 10.1007/978-3-642-13244-5_39 %0 Conference Paper %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %D 2010 %T Should I contribute to this discussion? %A Ibrahim, Walid M %A Bettenburg, Nicolas %A Shihab, Emad %A Adams, Bram %A Hassan, Ahmed E. %K apache %K contributions %K developers %K email %K email archives %K mailing lists %K postgresql %K python %X Development mailing lists play a central role in facilitating communication in open source projects. Since these lists frequently host design and project discussions, knowledgeable contribution to these discussion threads is essential to avoid mis-communication that might slow-down the progress of a project. However, given the sheer volume of emails on these lists, it is easy to miss important discussions. To find out how developers are able to deal with mailing list discussions, we study the main factors that encourage developers to contribute to the development mailing lists. 
We develop personalized models to automatically identify discussion threads that a developer would contribute to based on his previous contribution behavior. Case studies on development mailing lists of three open source projects (Apache, PostgreSQL and Python) show that the average accuracy of our models is 89-85% and that the models vary significantly between different developers. %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %I IEEE %C Cape Town %P 181 - 190 %@ 978-1-4244-6802-7 %R 10.1109/MSR.2010.5463345 %> https://flosshub.org/sites/flosshub.org/files/181ibrahim-msr2010.pdf %0 Thesis %D 2010 %T The Sociability of Free Software: A GNU Look at Free Software Identified Businesses as Social Entrepreneurships %A Barcomb, Ann %K free software %K open source software %K public good %K small business %K social entrepreneurship %K social ventures %X This research strives to address the gap in the literature surrounding companies which identify with the philosophical values associated with the Free Software movement, which have historically been associated with Open Source businesses. It investigates whether ethically-motivated Free Software identified companies resemble social entrepreneurships. This work also examines whether there are significant differences between the business practices of Free Software identified companies, Free Software, and Open Source enterprises in order to assess if it is appropriate to address them as a group. The study is based on seven case studies, and includes one company which is a Free Software business, but does not identify with the Free Software philosophy, as well as one company which is ethically-motivated but identifies with Open Source rather than Free Software. The results indicate that there is good reason to believe that adherence to Free Software philosophy creates socially-aware businesses, which may be social entrepreneurships. 
No problems were discovered with the practice of grouping together Free Software and Open Source companies in the study of business practices, provided that a broad definition of success is used. %I Maastricht University %U http://barcomb.org/cgi/paper.cgi?paper=barcomb:2010:sociability %9 masters %> https://flosshub.org/sites/flosshub.org/files/barcomb-2010-sociability.pdf %0 Conference Paper %B 2010 43rd Hawaii International Conference on System Sciences (HICSS 2010) %D 2010 %T Towards an Openness Rating System for Open Source Software %A Bein, Wolfgang %A Jeffery, Clinton %K alice %K case study %K contribution %K documentation %K freespire %K galib %K latex %K license %K linux %K linux kernel %K mediaportal %K openness %K openoffice %K opensolaris %K rating %K unicon %X Many open source software projects are not very open to third party developers. The point of open source is to enable anyone to fix bugs or add desired capabilities without holding them hostage to the original developers. This principle is important because an open source project's developers may be unresponsive or unable to meet third party needs, even if funding support for requested improvements is offered. This paper presents a simple rating system for evaluating the openness of software distributions. The rating system considers factors such as platform portability, documentation, licensing, and contribution policy. Several popular open source products are rated in order to illustrate the efficacy of the rating system. 
%B 2010 43rd Hawaii International Conference on System Sciences (HICSS 2010) %I IEEE %C Honolulu, Hawaii, USA %P 1 - 8 %@ 978-1-4244-5509-6 %R 10.1109/HICSS.2010.405 %> https://flosshub.org/sites/flosshub.org/files/10-07-04.pdf %0 Conference Paper %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %D 2010 %T Validity of network analyses in Open Source Projects %A Nia, Roozbeh %A Christian Bird %A Devanbu, Premkumar %A Filkov, Vladimir %K apache %K email archives %K mailing lists %K missing data %K mysql %K perl %K social networks %X Social network methods are frequently used to analyze networks derived from Open Source Project communication and collaboration data. Such studies typically discover patterns in the information flow between contributors or contributions in these projects. Social network metrics have also been used to predict defect occurrence. However, such studies often ignore or side-step the issue of whether (and in what way) the metrics and networks of study are influenced by inadequate or missing data. In previous studies email archives of OSS projects have provided a useful trace of the communication and co-ordination activities of the participants. These traces have been used to construct social networks that are then subject to various types of analysis. However, during the construction of these networks, some assumptions are made, that may not always hold; this leads to incomplete, and sometimes incorrect networks. The question then becomes, do these errors affect the validity of the ensuing analysis? In this paper we specifically examine the stability of network metrics in the presence of inadequate and missing data. The issues that we study are: 1) the effect of paths with broken information flow (i.e. 
consecutive edges which are out of temporal order) on measures of centrality of nodes in the network, and 2) the effect of missing links on such measures. We demonstrate on three different OSS projects that while these issues do change network topology, the metrics used in the analysis are stable with respect to such changes. %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %I IEEE %C Cape Town, South Africa %P 201 - 209 %@ 978-1-4244-6802-7 %R 10.1109/MSR.2010.5463342 %> https://flosshub.org/sites/flosshub.org/files/201NetworkAnalysis.pdf %0 Book %B IFIP Advances in Information and Communication Technology Open Source Software: New Horizons (OSS 2010) %D 2010 %T Warehousing and Studying Open Source Versioning Metadata %A van Antwerp, M. %A Madey, G. %E Ågerfalk, Pär %E Boldyreff, Cornelia %E González-Barahona, Jesús M. %E Madey, Gregory R. %E Noll, John %K berlios %K cvs %K savannah %K scm %K sourceforge %K srda %K subversion %K svn %X In this paper, we describe the downloading and warehousing of Open Source Software (OSS) versioning metadata from SourceForge, BerliOS Developer, and GNU Savannah. This data enables and supports research in areas such as software engineering, open source phenomena, social network analysis, data mining, and project management. This newly-formed database containing Concurrent Versions System (CVS) and Subversion (SVN) metadata offers new research opportunities for large-scale OSS development analysis. The CVS and SVN data is juxtaposed with the SourceForge.net Research Data Archive [5] for the purpose of performing more powerful and interesting queries. We also present an initial statistical analysis of some of the most active projects. 
%B IFIP Advances in Information and Communication Technology Open Source Software: New Horizons (OSS 2010) %I Springer Berlin Heidelberg %C Berlin, Heidelberg %V 319 %P 413 - 418 %@ 978-3-642-13244-5 %R 10.1007/978-3-642-13244-5_40 %0 Journal Article %J International Journal of Open Source Software and Processes %D 2010 %T Weaving a Semantic Web Across OSS Repositories %A Olivier Berger %A Valentin Vlasceanu %A Christian Bac %A Quang Vu Dang %A Lauriere, Stéphane %K archive %K bug %K bugtracker %K database %K debian %K forge %K interoperability %K ontology %K OSLC-CM %K RDF %K repository of repositories %K semantic %K semantic Web %X Several public repositories and archives of “facts” about libre software projects, maintained either by open source communities or by research communities, have been flourishing over the Web in recent years. These have enabled new analysis and support for new quality assurance tasks. This paper presents some complementary existing tools, projects and models proposed both by OSS actors or research initiatives that are likely to lead to useful future developments in terms of study of the FLOSS phenomenon, and also to the very practitioners in the FLOSS development projects. A goal of the research conducted within the HELIOS project is to address bugs traceability issues. In this regard, the authors investigate the potential of using Semantic Web technologies in navigating between many different bugtracker systems scattered all over the open source ecosystem. By using Semantic Web techniques, it is possible to interconnect the databases containing data about open-source software projects development, which enables OSS partakers to identify resources, annotate them, and further interlink those using dedicated properties and collectively designing a distributed semantic graph. 
%B International Journal of Open Source Software and Processes %V 2 %P 29 - 40 %8 32/2010 %N 2 %R 10.4018/jossp.2010040103 %> https://flosshub.org/sites/flosshub.org/files/wopdasd2009-olivier-berger.pdf %0 Journal Article %J IEEE Transactions on Software Engineering %D 2010 %T What Makes a Good Bug Report? %A Zimmermann, Thomas %A Premraj, Rahul %A Bettenburg, Nicolas %A Sascha Just %A Schroter, Adrian %A Weiss, Cathrin %K bug report %K Survey %X In software development, bug reports provide crucial information to developers. However, these reports widely differ in their quality. We conducted a survey among developers and users of APACHE, ECLIPSE, and MOZILLA to find out what makes a good bug report. The analysis of the 466 responses revealed an information mismatch between what developers need and what users supply. Most developers consider steps to reproduce, stack traces, and test cases as helpful, which are at the same time most difficult to provide for users. Such insight is helpful to design new bug tracking tools that guide users in collecting and providing more helpful information. Our CUEZILLA prototype is such a tool and measures the quality of new bug reports; it also recommends which elements should be added to improve the quality. We trained CUEZILLA on a sample of 289 bug reports, rated by developers as part of the survey. In our experiments, CUEZILLA was able to predict the quality of 31–48% of bug reports accurately. 
%B IEEE Transactions on Software Engineering %I IEEE Computer Society %C Los Alamitos, CA, USA %V 36 %P 618-643 %U http://dl.acm.org/citation.cfm?id=1453146 %R http://doi.ieeecomputersociety.org/10.1109/TSE.2010.63 %> https://flosshub.org/sites/flosshub.org/files/bettenburg-fse-2008.pdf %0 Conference Paper %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %D 2010 %T When process data quality affects the number of bugs: Correlations in software engineering datasets %A Bachmann, Adrian %A Bernstein, Abraham %K apache %K bug reports %K eclipse %K gnome %K log files %K mozilla %K netbeans %K openoffice.org %K version control %X Software engineering process information extracted from version control systems and bug tracking databases is widely used in empirical software engineering. In prior work, we showed that these data are plagued by quality deficiencies, which vary in their characteristics across projects. In addition, we showed that those deficiencies in the form of bias do impact the results of studies in empirical software engineering. While these findings affect software engineering researchers, the impact on practitioners has not yet been substantiated. In this paper we, therefore, explore (i) if the process data quality and characteristics have an influence on the bug fixing process and (ii) if the process quality as measured by the process data has an influence on the product (i.e., software) quality. Specifically, we analyze six Open Source as well as two Closed Source projects and show that process data quality and characteristics have an impact on the bug fixing process: the high rate of empty commit messages in Eclipse, for example, correlates with the bug report quality. We also show that the product quality - measured by number of bugs reported - is affected by process data quality measures. 
These findings have the potential to prompt practitioners to increase the quality of their software process and its associated data quality. %B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010) %I IEEE %C Cape Town, South Africa %P 62 - 71 %@ 978-1-4244-6802-7 %R 10.1109/MSR.2010.5463286 %> https://flosshub.org/sites/flosshub.org/files/62bachmann-msr10.pdf %0 Conference Paper %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %D 2009 %T Assurance Evaluation for OSS Adoption in a Telco Context %A Ardagna, Claudio %A Banzi, Massimo %A Damiani, Ernesto %A El Ioini, Nabil %A Frati, Fulvio %X Software Assurance (SwA) is a complex concept that involves different stages of a software development process and may be defined differently depending on its focus, as for instance software quality, security, or dependability. In Computer Science, the term assurance refers to all activities necessary to provide enough confidence that a software product will satisfy its users’ functional and non-functional requirements. %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %S IFIP Advances in Information and Communication Technology %I Springer %V 299/2009 %P 363 - 363 %8 2009/// %G eng %& 37 %R http://dx.doi.org/10.1007/978-3-642-02032-2_37 %> https://flosshub.org/sites/flosshub.org/files/Assurance%20Evaluation%20for%20OSS.pdf %0 Conference Paper %B 2009 16th Working Conference on Reverse Engineering %D 2009 %T Benchmarking Lightweight Techniques to Link E-Mails and Source Code %A Bacchelli, Alberto %A D'Ambros, Marco %A Lanza, Michele %A Robbes, Romain %K argouml %K email %K mailing lists %X During the evolution of a software system, a large amount of information, which is not always directly related to the source code, is produced. 
Several researchers have provided evidence that the contents of mailing lists represent a valuable source of information: Through e-mails, developers discuss design decisions, ideas, known problems and bugs, etc. which are otherwise not to be found in the system. A technical challenge in this context is how to establish the missing link between free-form e-mails and the system artifacts they refer to. Although the range of approaches is vast, establishing their accuracy remains a problem, as there is no benchmark against which to compare their performance. To overcome this issue, we manually inspected a statistically significant number of e-mails pertaining to the ArgoUML system. Based on this benchmark, we present a variety of lightweight techniques to assign e-mails to software artifacts and measure their effectiveness in terms of precision and recall. %B 2009 16th Working Conference on Reverse Engineering %I IEEE %C Lille, France %P 205 - 214 %@ 978-0-7695-3867-9 %R 10.1109/WCRE.2009.44 %> https://flosshub.org/sites/flosshub.org/files/wcre2009.pdf %0 Conference Paper %B iConference '09 %D 2009 %T Design Information Sharing Across Multiple Knowledge Systems in a FLOSS Community %A Bach, Paula %K codeplex %K developers %K information sharing %X This paper explores support for design information sharing between the distinct knowledge systems and skill sets of interactive system designers and developers. The paper focuses on the challenges of sharing information among groups of designers, developers, and users with multiple knowledge systems in the context of free/libre/open source software (FLOSS) communities. Bringing design to FLOSS communities introduces new knowledge into a solitary community of practice, and discussion ensues about how exploiting the 'symmetry of ignorance' can enhance information sharing through design in CodePlex, an open source project hosting community website. 
Finally, design mockups illustrate how CodePlex serves as a boundary object supporting design information sharing across distinct knowledge systems. %B iConference '09 %8 02/2009 %> https://flosshub.org/sites/flosshub.org/files/finalDraft41.pdf %0 Conference Paper %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %D 2009 %T FLOSS UX Design: An Analysis of User Experience Design in Firefox and OpenOffice.org %A Bach, Paula %A Carroll, John %X We describe two cases of open user experience (UX) design using the Firefox web browser and OpenOffice.org office suite as case studies. We analyze the social complexity of integrating UX practices into the two open source projects using activity awareness, a framework for understanding team performance in collective endeavors of significant scope, duration, and complexity. The facets of activity awareness are common ground, community of practice, social capital, and human development. We found that differences between the communities include different strategies for community building, UX status in the community, type of open UX design, and different ways to share information. %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %S IFIP Advances in Information and Communication Technology %I Springer %V 299/2009 %P 237 - 250 %8 2009/// %G eng %& 21 %R http://dx.doi.org/10.1007/978-3-642-02032-2_21 %> https://flosshub.org/sites/flosshub.org/files/Floss%20UX%20Desing.pdf %0 Conference Paper %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %D 2009 %T How Open Source Can Still Save the World %A Behlendorf, Brian %X Many of the world’s major problems - economic distress, natural disaster responses, broken health care systems, education crises, and more - are not fundamentally information technology issues. 
However, in every case mentioned and more, there exist opportunities for Open Source software to uniquely change the way we can address these problems. At times this is about addressing a need for which no sufficient commercial market exists. For others, it is in the way Open Source licenses free the recipient from obligations to the creators, creating a relationship of mutual empowerment rather than one of dependency. For yet others, it is in the way the open collaborative processes that form around Open Source software provide a neutral ground for otherwise competitive parties to find a greatest common set of mutual needs to address together rather than in parallel. Several examples of such software exist today and are gaining traction. Governments, NGOs, and businesses are beginning to recognize the potential and are organizing to meet it. How far can this be taken? %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %S IFIP Advances in Information and Communication Technology %V 299/2009 %P 2 %8 2009/// %G eng %& 2 %R http://dx.doi.org/10.1007/978-3-642-02032-2_2 %> https://flosshub.org/sites/flosshub.org/files/How%20Open%20Source%20Can%20Still%20Save%20the%20World.pdf %0 Journal Article %J Journal of Systems and Software %D 2009 %T Identifying exogenous drivers and evolutionary stages in FLOSS projects %A Karl Beecher %A Capiluppi, Andrea %A Boldyreff, Cornelia %K developers %K forge %K forges %K repositories %K repository %K scm %K software repositories %K sourceforge %K success %K users %X The success of a Free/Libre/Open Source Software (FLOSS) project has been evaluated in the past through the number of commits made to its configuration management system, number of developers and number of users. Most studies, based on a popular FLOSS repository (SourceForge), have concluded that the vast majority of projects are failures. This study's empirical results confirm and expand conclusions from an earlier and more limited work. 
Not only do projects from different repositories display different process and product characteristics, but a more general pattern can be observed. Projects may be considered as early inceptors in highly visible repositories, or as established projects within desktop-wide projects, or finally as structured parts of FLOSS distributions. These three possibilities are formalized into a framework of transitions between repositories. The framework developed here provides a wider context in which results from FLOSS repository mining can be more effectively presented. Researchers can draw different conclusions based on the overall characteristics studied about an Open Source software project's potential for success, depending on the repository that they mine. These results also provide guidance to OSS developers when choosing where to host their project and how to distribute it to maximize its evolutionary success. %B Journal of Systems and Software %V 82 %P 739 - 750 %U http://www.sciencedirect.com/science/article/B6V0N-4TVTJFS-1/2/e32ecee1bcb54bd4a5dff6d5e3daca8d %R DOI: 10.1016/j.jss.2008.10.026 %0 Conference Paper %B 2009 6th IEEE International Working Conference on Mining Software Repositories (MSR) %D 2009 %T Mining search topics from a code search engine usage log %A Bajracharya, Sushil %A Lopes, Cristina %K analysis %K black duck %K koders %K log %K logfile %K search %K source code %X We present a topic modeling analysis of a year long usage log of Koders, one of the major commercial code search engines. This analysis contributes to the understanding of what users of code search engines are looking for. Observations on the prevalence of these topics among the users, and on how search and download activities vary across topics, leads to the conclusion that users who find code search engines usable are those who already know to a high level of specificity what to look for. 
This paper presents a general categorization of these topics that provides insights on the different ways code search engine users express their queries. The findings support the conclusion that existing code search engines provide only a subset of the various information needs of the users when compared to the categories of queries they look at. %B 2009 6th IEEE International Working Conference on Mining Software Repositories (MSR) %I IEEE %C Vancouver, BC, Canada %P 111 - 120 %@ 978-1-4244-3493-0 %R 10.1109/MSR.2009.5069489 %0 Conference Paper %B 2009 6th IEEE International Working Conference on Mining Software Repositories (MSR) %D 2009 %T Mining the coherence of GNOME bug reports with statistical topic models %A Linstead, Erik %A Baldi, Pierre %K bug reports %K bugzilla %K gnome %K msr challenge %K quality %K sourcerer %X We adapt latent Dirichlet allocation to the problem of mining bug reports in order to define a new information-theoretic measure of coherence. We then apply our technique to a snapshot of the GNOME Bugzilla database consisting of 431,863 bug reports for multiple software projects. In addition to providing an unsupervised means for modeling report content, our results indicate substantial promise in applying statistical text mining algorithms for estimating bug report quality. Complete results are available from our supplementary materials Web site at http://sourcerer.ics.uci.edu/msr2009/gnome_coherence.html. 
%B 2009 6th IEEE International Working Conference on Mining Software Repositories (MSR) %I IEEE %C Vancouver, BC, Canada %P 99 - 102 %@ 978-1-4244-3493-0 %R 10.1109/MSR.2009.5069486 %0 Conference Paper %B 2009 42nd Hawaii International Conference on System Sciences (HICSS 2009) %D 2009 %T Multiple Social Networks Analysis of FLOSS Projects using Sargas %A de Sousa, S.F. %A Balieiro, M.A. %A dos R. Costa, J.M. %A de Souza, C.R.B. %K case study %K multiple social networks %K ossnetwork %K pmd %K social network analysis %K transflow %X Due to their characteristics and claimed advantages, several researchers have been investigating free and open-source projects. Different aspects are being studied: for instance, what motivates developers to join FLOSS projects, the tools, processes and practices used in FLOSS projects, and the evolution of FLOSS communities, among other things. Researchers have studied collaboration and coordination of open source software developers using an approach known as social network analysis and have gained important insights about these projects. Most researchers, however, have not focused on the integrated study of these networks and, accordingly, on their interrelationships. This paper describes an approach and tool to combine multiple social networks to study the evolution of open-source projects. Our tool, named Sargas, allows comparison and visualization of different social networks at the same time. Initial results of our analysis can be used to extend the "onion-model" of open source participation.
%B 2009 42nd Hawaii International Conference on System Sciences (HICSS 2009) %I IEEE %C Waikoloa, Hawaii, USA %P 1 - 10 %@ 978-0-7695-3450-3 %R 10.1109/HICSS.2009.316 %> https://flosshub.org/sites/flosshub.org/files/07-07-06.pdf %0 Conference Paper %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %D 2009 %T Panel: Governance in Open Source Projects and Communities %A Bolici, Francesco %A de Laat, Paul %A Ljungberg, Jan %A Pontiggia, Andrea %A Rossi Lamastra, Cristina %X “Although considerable research has been devoted to the growth and expansion of open source communities and the comparison between the efficiency of corporate structures and community structures in the field of software development, rather less attention has been paid to their governance structures (control, monitoring, supervision)” (Lattemann and Stieglitz 2005). %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %S IFIP Advances in Information and Communication Technology %I Springer %V 299/2009 %P 370 - 370 %8 2009/// %G eng %& 43 %R http://dx.doi.org/10.1007/978-3-642-02032-2_43 %> https://flosshub.org/sites/flosshub.org/files/Panel%20Governnance.pdf %0 Journal Article %J International Journal of Industrial Ergonomics %D 2009 %T Participation in online interaction spaces: Design-use mediation in an Open Source Software community %A Barcellini, Flore %A Détienne, Françoise %A Burkhardt, Jean-Marie %K Distributed participatory design %X This research aims at characterizing emerging roles fostering design-use mediation during the Open Source Software (OSS) design process through the analysis of participation. 
Studying OSS is of particular interest: (1) to investigate socio-technical settings supporting user participation to the design process, which is considered to be the major strength of OSS design; (2) to gain insights into supporting the changing nature of the software industry, which is becoming more and more distributed and global, and which is thus increasingly making use of OSS design tools and methods. In this research, we characterized effective roles of participants, i.e. participation, on the basis of activities analysis in three online interaction spaces (discussion, documentation and implementation) during a continuous “pushed-by-users” design process of the Python project. Participation is targeted through a methodology articulating: (1) structural analyses (organization of the discussions, regularity and involvement of participants, quotes-based social network) in usage-oriented and development-oriented mailing lists of the projects’ discussion space; (2) actions to the code and documentation made by participants in the implementation and documentation spaces. Besides the importance of the users’ contribution to the process, OSS design is fostered by some key-participants, the cross-participants, who act as boundary spanners between the developers and the users, helping them to go beyond some barriers to participation. These findings can be reinforced developing software to automate the structural analysis of discussions and actions to the code and documentation. 
%B International Journal of Industrial Ergonomics %V 39 %P 533 - 540 %U http://www.sciencedirect.com/science/article/pii/S0169814108001637 %R 10.1016/j.ergon.2008.10.013 %0 Conference Paper %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %D 2009 %T Peeling the Onion %A Masmoudi, Héla %A den Besten, Matthijs %A de Loupy, Claude %A Jean-Michel Dalle %X According to the now widely accepted “onion-model” of the organization of open source software development, an open source project typically relies on a core of developers that is assisted by a larger periphery of users. But what does the role of the periphery consist of? Raymond’s Linus’s Law which states that “given enough eyeballs all bugs are shallow” suggests at least one important function: the detection of defects. Yet, what are the ways through which core and periphery interact with each other? With the help of text-mining methods, we study the treatment of bugs that affected the Firefox Internet browser as reflected in the discussions and actions recorded in Mozilla’s issue tracking system Bugzilla. We find various patterns in the modes of interactions between core and peripheral members of the community. For instance, core members seem to engage more frequently with the periphery when the latter proposes a solution (a patch). This leads us to conclude that Alan Cox’s dictum “show me the code”, perhaps even more than Linus’s law, seems to be the dominant rule that governs the development of software like Firefox. 
%B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %S IFIP Advances in Information and Communication Technology %I Springer %V 299/2009 %P 284 - 297 %8 2009/// %G eng %& 25 %R http://dx.doi.org/10.1007/978-3-642-02032-2_25 %> https://flosshub.org/sites/flosshub.org/files/Peeling%20the%20Onion.pdf %0 Conference Paper %B Proceedings of the 6th International Working Conference on Mining Software Repositories, MSR 2009 %D 2009 %T The promises and perils of mining git %A Christian Bird %A Peter C. Rigby %A Earl T. Barr %A David J. Hamilton %A Daniel M. Germán %A Premkumar T. Devanbu %K dscm %K git %K mining %K scm %K source code %X We are now witnessing the rapid growth of decentralized source code management (DSCM) systems, in which every developer has her own repository. DSCMs facilitate a style of collaboration in which work output can flow sideways (and privately) between collaborators, rather than always up and down (and publicly) via a central repository. Decentralization comes with both the promise of new data and the peril of its misinterpretation. We focus on git, a very popular DSCM used in high-profile projects. Decentralization, and other features of git, such as automatically recorded contributor attribution, lead to richer content histories, giving rise to new questions such as "How do contributions flow between developers to the official project repository?" However, there are pitfalls. Commits may be reordered, deleted, or edited as they move between repositories. The semantics of terms common to SCMs and DSCMs sometimes differ markedly, potentially creating confusion. For example, a commit is immediately visible to all developers in centralized SCMs, but not in DSCMs. Our goal is to help researchers interested in DSCMs avoid these and other perils when mining and analyzing git data. 
%B Proceedings of the 6th International Working Conference on Mining Software Repositories, MSR 2009 %P 1-10 %> https://flosshub.org/sites/flosshub.org/files/1promisePeril.pdf %0 Conference Paper %B Proceedings of the 17th International Symposium on Software Reliability Engineering %D 2009 %T Putting it All Together: Using Socio-Technical Networks to Predict Failures %A Christian Bird %A Nachiappan Nagappan %A Devanbu, Premkumar %A Gall, Harald %A Brendan Murphy %K eclipse %K microsoft %K social network %K vista %K windows %X Studies have shown that social factors in development organizations have a dramatic effect on software quality. Separately, program dependency information has also been used successfully to predict which software components are more fault prone. Interestingly, the influence of these two phenomena has only been studied separately. Intuition and practical experience suggest, however, that task assignment (i.e. who worked on which components and how much) and dependency structure (which components have dependencies on others) together interact to influence the quality of the resulting software. We study the influence of combined socio-technical software networks on the fault-proneness of individual software components within a system. The network properties of a software component in this combined network are able to predict if an entity is failure prone with greater accuracy than prior methods which use dependency or contribution information in isolation. We evaluate our approach in different settings by using it on Windows Vista and across six releases of the Eclipse development environment including using models built from one release to predict failure prone components in the next release. We compare this to previous work. In every case, our method performs as well or better and is able to more accurately identify those software components that have more post-release failures, with precision and recall rates as high as 85%.
%B Proceedings of the 17th International Symposium on Software Reliability Engineering %> https://flosshub.org/sites/flosshub.org/files/bird2009pat.pdf %0 Journal Article %J Electronic Notes in Theoretical Computer Science %D 2009 %T Quality Factors and Coding Standards - a Comparison Between Open Source Forges %A Capiluppi, Andrea %A Boldyreff, Cornelia %A Karl Beecher %A Paul J. Adams %K artefacts %K artifacts %K coding standards %K coding style %K complexity %K forge %K forges %K kde %K metrics %K quality %K source code %K sourceforge %X Enforcing adherence to standards in software development in order to produce high quality software artefacts has long been recognised as best practice in traditional software engineering. In a distributed heterogeneous development environment such as those found within the Open Source paradigm, coding standards are informally shared and adhered to by communities of loosely coupled developers. Following these standards could potentially lead to higher quality software. This paper reports on the empirical analysis of two major forges where OSS projects are hosted. The first one, the KDE forge, provides a set of guidelines and coding standards in the form of a coding style that developers may conform to when producing the source code artefacts. The second studied forge, SourceForge, imposes no formal coding standards on developers. A sample of projects from these two forges has been analysed to detect whether the SourceForge sample, where no coding standards are enforced, has a lower quality than the sample from KDE. Results from this analysis form a complex picture; visually, all the selected metrics show a clear divide between the two forges, but from the statistical standpoint, clear distinctions cannot be drawn amongst these quality related measures in the two forge samples.
%B Electronic Notes in Theoretical Computer Science %V 233 %P 89 - 103 %U http://www.sciencedirect.com/science/article/B75H1-4VXDKRV-7/2/abcc2be2c4c3998e4bc9b53473ca2d81 %R DOI: 10.1016/j.entcs.2009.02.063 %0 Conference Paper %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %D 2009 %T Quality of Open Source Software: The QualiPSo Trustworthiness Model %A del Bianco, Vieri %A Lavazza, Luigi %A Morasca, Sandro %A Taibi, Davide %X Trustworthiness is one of the main issues upon which the decision whether to adopt an Open-Source Software (OSS) product is based. The work described here is part of an activity that has the goals of 1) defining an adequate notion of trustworthiness of software products and artifacts and 2) identifying a number of factors that influence it. Specifically, this paper reports about the identification of the “dimensions” of trustworthiness, i.e., of the high-level qualities that software products and artefacts have to possess in order to be considered trustworthy. These dimensions are described by means of a conceptual model of trustworthiness, which comprises the representation of the factors that affect the user’s perception of trustworthiness, as well as the objective characteristics of the products that contribute to “build” trustworthiness. The aforementioned model is equipped with a measurement plan that describes, at the operational level, how to perform the evaluation of the trustworthiness of OSS products. The proposed model provides the basis to build quantitative models of the trustworthiness of OSS products and artifacts that are able to explain the relationships between the (objectively observable) characteristics of OSS products and the level of trustworthiness perceived by the users of such products.
%B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %S IFIP Advances in Information and Communication Technology %I Springer %V 299/2009 %P 199 - 212 %8 2009/// %G eng %& 18 %R http://dx.doi.org/10.1007/978-3-642-02032-2_18 %> https://flosshub.org/sites/flosshub.org/files/Quality%20of%20Open%20Source%20Software.pdf %0 Conference Paper %B Proceedings of the 28th annual conference on Human Factors in Computing Systems %D 2009 %T Short and Tweet: Experiments on Recommending Content from Information %A Chen, J %A Nairn, R. %A Nelson, L. %A Bernstein, M. %A Chi, E. %B Proceedings of the 28th annual conference on Human Factors in Computing Systems %I ACM Press %C Atlanta, GA %8 04/10/10 %G eng %0 Conference Paper %B 2009 6th IEEE International Working Conference on Mining Software Repositories (MSR) %D 2009 %T SourcererDB: An aggregated repository of statically analyzed and cross-linked open source Java projects %A Ossher, Joel %A Bajracharya, Sushil %A Linstead, Erik %A Baldi, Pierre %A Lopes, Cristina %K apache %K integration %K java %K java.net %K project %K repository %K sourceforge %K SourcererDB %X The open source movement has made vast quantities of source code available online for free, providing an extremely large dataset for empirical study and potential reuse. A major difficulty in exploiting this potential fully is that the data are currently scattered between competing source code repositories, none of which are structured for empirical analysis and cross-project comparison. As a result, software researchers and developers are left to compile their own datasets, resulting in duplicated effort and limited results. To address this challenge, we built SourcererDB, an aggregated repository of statically analyzed and cross-linked open source Java projects.
SourcererDB contains local snapshots of 2,852 Java projects taken from Sourceforge, Apache and Java.net. These projects are statically analyzed to extract rich structural information, which is then stored in a relational database. References to entities in the 16,058 external jars are resolved and grouped, allowing for cross-project usage information to be accessed easily. This paper describes: (a) the mechanism for resolving and grouping these cross-project references, (b) the structure of and the metamodel for the SourcererDB repository, and (c) end-user dataset access mechanisms. Our goal in building SourcererDB is to provide a rich dataset of source code to facilitate the sharing of extracted data and to encourage reuse and repeatability of experiments. %B 2009 6th IEEE International Working Conference on Mining Software Repositories (MSR) %I IEEE %C Vancouver, BC, Canada %P 183 - 186 %@ 978-1-4244-3493-0 %R 10.1109/MSR.2009.5069501 %0 Conference Paper %B 2009 6th IEEE International Working Conference on Mining Software Repositories (MSR) %D 2009 %T SourcererDB: An aggregated repository of statically analyzed and cross-linked open source Java projects %A Ossher, Joel %A Bajracharya, Sushil %A Linstead, Erik %A Baldi, Pierre %A Lopes, Cristina %K apache %K java %K java.net %K source code %K sourceforge %K sourcerer %X The open source movement has made vast quantities of source code available online for free, providing an extremely large dataset for empirical study and potential re-use. A major difficulty in exploiting this potential fully is that the data are currently scattered between competing source code repositories, none of which are structured for empirical analysis and cross-project comparison. As a result, software researchers and developers are left to compile their own datasets, resulting in duplicated effort and limited results.
To address this challenge, we built SourcererDB, an aggregated repository of statically analyzed and cross-linked open source Java projects. SourcererDB contains local snapshots of 2,852 Java projects taken from Sourceforge, Apache and Java.net. These projects are statically analyzed to extract rich structural information, which is then stored in a relational database. References to entities in the 16,058 external jars are resolved and grouped, allowing for cross-project usage information to be accessed easily. This paper describes: (a) the mechanism for resolving and grouping these cross-project references, (b) the structure of and the metamodel for the SourcererDB repository, and (c) end-user dataset access mechanisms. Our goal in building SourcererDB is to provide a rich dataset of source code to facilitate the sharing of extracted data and to encourage reuse and repeatability of experiments. %B 2009 6th IEEE International Working Conference on Mining Software Repositories (MSR) %I IEEE %C Vancouver, BC, Canada %P 183 - 186 %@ 978-1-4244-3493-0 %R 10.1109/MSR.2009.5069501 %0 Conference Paper %B 2009 Fifth International Conference on Signal Image Technology and Internet Based Systems (SITIS 2009) %D 2009 %T Supporting Situation Awareness in FLOSS Projects by Semantical Aggregation of Tools Feeds %A Quang Vu Dang %A Christian Bac %A Olivier Berger %A Valentin Vlasceanu %X It is rather difficult to monitor or visualize what can be the contribution of a member in a collaboration project, especially when the project uses multiple tools to produce its results. This is the case for collaborative development of FLOSS software, which uses Wiki, bug tracker, mailing lists and source code management tools. This paper presents an approach to data collection by using aggregation of feeds published by the different tools of a software forge.
To allow this aggregation, collected data is semantically reformatted into Semantic Web standards: RDF, DC, DOAP, FOAF and EvoOnt. Resulting data can then be processed, re-published or displayed to project members. This approach was used to implement a supervision module that is integrated into the PicoForge platform. This module is able to draw a live graph of the social community out of the different sources of data, and in turn exports semantic feeds for other uses. %B 2009 Fifth International Conference on Signal Image Technology and Internet Based Systems (SITIS 2009) %I IEEE %C Marrakesh, Morocco %P 423 - 429 %@ 978-1-4244-5740-3 %R 10.1109/SITIS.2009.72 %0 Thesis %B PhD Thesis, THE PENNSYLVANIA STATE UNIVERSITY %D 2009 %T Supporting the user experience in free/libre/open source software development %A Bach, Paula %K codeplex %X With the increasing number and awareness of free/libre/open source software (FLOSS) projects, Internet users can download a FLOSS tool that meets just about any need. The user experience of projects, however, varies greatly and identifying FLOSS projects that offer a positive user experience (UX) is challenging. FLOSS projects center on software developer activities with little attention to user-centered design activities that could increase the user experience on the project. The purpose of this dissertation is to understand open source software ecology in order to bring support for user experience design activities on FLOSS projects. CodePlex, an open source project hosting website, serves as the open source software ecology. The research consists of two phases, a descriptive science phase and a design science phase.
In the descriptive phase fieldwork in the form of ethnomethodologically informed ethnography describes the everyday activities of three groups: the team that produces CodePlex, the participants who use CodePlex to produce open source projects, and user experience practitioners who bring their expertise to design software with a positive user experience. The descriptive phase also includes an analysis of activity awareness of the three groups. The design science phase consists of a claims analysis that provides design rationale for a design that proposes to support UX activities on CodePlex. The results show that activity awareness contributes to the socio-technical solution where UX activities can be supported as a new community of practice, with features that support building social capital. The UX support features include a UX workspace where UX contributors recognize their value and other features that support the presence of UX throughout the project site and the CodePlex community. This dissertation contributes empirical materials from the descriptions of everyday activities of the three groups and analytic materials generated from the activity awareness and claims analyses that are translated into design representations. Specifically the contributions include (1) mechanisms of articulation work of the three groups and how the mechanisms contribute to the design representation; (2) the demonstration of a translation science in computer-supported cooperative work (CSCW) and human-computer interaction (HCI); and (3) an understanding of how UX activities and software engineering activities integrate.
%B PhD Thesis, THE PENNSYLVANIA STATE UNIVERSITY %0 Conference Paper %B 2009 6th IEEE International Working Conference on Mining Software Repositories (MSR) %D 2009 %T Tracking concept drift of software projects using defect prediction quality %A Ekanayake, Jayalath %A Tappolet, Jonas %A Gall, Harald C. %A Bernstein, Abraham %K bugzilla %K cvs %K defect prediction %K eclipse %K mozilla %K netbeans %K openoffice %X Defect prediction is an important task in the mining of software repositories, but the quality of predictions varies strongly within and across software projects. In this paper we investigate the reasons why the prediction quality is so fluctuating due to the altering nature of the bug (or defect) fixing process. Therefore, we adopt the notion of a concept drift, which denotes that the defect prediction model has become unsuitable as the set of influencing features has changed - usually due to a change in the underlying bug generation process (i.e., the concept). We explore four open source projects (Eclipse, OpenOffice, Netbeans and Mozilla) and construct file-level and project-level features for each of them from their respective CVS and Bugzilla repositories. We then use this data to build defect prediction models and visualize the prediction quality along the time axis. These visualizations allow us to identify concept drifts and - as a consequence - phases of stability and instability expressed in the level of defect prediction quality. Further, we identify those project features, which are influencing the defect prediction quality using both a tree induction-algorithm and a linear regression model. Our experiments uncover that software systems are subject to considerable concept drifts in their evolution history.
Specifically, we observe that the change in number of authors editing a file and the number of defects fixed by them contribute to a project's concept drift and therefore influence the defect prediction quality. Our findings suggest that project managers using defect prediction models for decision making should be aware of the actual phase of stability or instability due to a potential concept drift. %B 2009 6th IEEE International Working Conference on Mining Software Repositories (MSR) %I IEEE %C Vancouver, BC, Canada %P 51 - 60 %@ 978-1-4244-3493-0 %R 10.1109/MSR.2009.5069480 %> https://flosshub.org/sites/flosshub.org/files/51MSR2009_0111_Ekanayake_Jayalath.pdf %0 Conference Paper %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %D 2009 %T Undergraduate Research Opportunities in OSS %A Boldyreff, Cornelia %A Capiluppi, Andrea %A Knowles, Thomas %A Munro, James %X Using Open Source Software (OSS) in undergraduate teaching in universities is now commonplace. Students use OSS applications and systems in their courses on programming, operating systems, DBMS, web development to name but a few. Studying OSS projects from both a product and a process view also forms part of the software engineering curriculum at various universities. Many students have taken part in OSS projects as well as developers. At the University of Lincoln, under the Undergraduate Research Opportunities Scheme (UROS), undergraduate student researchers have the chance to work over the summer embedded within an existing research centre on a UROS project. Here two such projects within the Centre for Research in Open Source Software (CROSS) are described: Collaborative Development for the XO Laptop (CODEX) and Software Modularity in Open Source Software (SoMOSS).
The CODEX project focused on creating resources to support students undertaking software application development for the XO laptop, and the SoMOSS project focused on architectural studies of OSS instant messaging software. Both projects achieved successful research outcomes; more importantly, both student researchers benefited directly from the encouragement and concrete assistance that they received through interaction with the wider OSS research community. Both projects are ongoing and present further research opportunities for students. %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %S IFIP Advances in Information and Communication Technology %I Springer %V 299/2009 %P 340 - 350 %8 2009/// %G eng %& 30 %R http://dx.doi.org/10.1007/978-3-642-02032-2_30 %> https://flosshub.org/sites/flosshub.org/files/Undergraduate%20Research%20Opportunities.pdf %0 Conference Paper %B 4th Workshop on Public Data about Software Development (WoPDaSD 2009) %D 2009 %T Weaving a Semantic Web across OSS repositories: a spotlight on bts-link, UDD, SWIM %A Olivier Berger %A Valentin Vlasceanu %A Christian Bac %A Laurière, Stéphane %K bts-link %K bug tracker %K bugzilla %K debian %K ecosystem %K helios %K mandriva %K semantic Web %K swim %K udd %X Several public repositories and archives of facts about libre software projects, developed either by open source communities or by research communities, have been flourishing over the Web in the recent years. These enable new analysis and support new quality assurance tasks. By using Semantic Web techniques, the databases containing data about open-source software projects development can be interconnected, hence letting OSS partakers identify resources, annotate them and further interlink them using dedicated properties, collectively designing a distributed semantic graph. Such links expressed with standard Semantic techniques are paving the way to new applications (including ones meant for “end-users”).
For instance this may have an impact on the way research efforts are conducted (less fragmented), and could also be used by development communities to improve Quality Assurance tasks. A goal of the research conducted within the HELIOS project, is to address bugtracker synchronization issues. For that, the potential of using Semantic Web technologies in navigating between many different bugtracker systems scattered all over the open source ecosystem is being investigated. This position paper presents some existing tools, projects and models proposed by OSS actors that are complementary to research initiatives, and that are likely to lead to useful future developments: UDD (Ultimate Debian Database) and bts-link, developed by the Debian community, and SWIM (Semantic Web enabled Issue Manager) developed by Mandriva. The HELIOS team welcomes comments on the future paths that can be considered in using the Semantic Web approach for improving these projects. %B 4th Workshop on Public Data about Software Development (WoPDaSD 2009) %> https://flosshub.org/sites/flosshub.org/files/HELIOS-WOPDASD-improved-Olivier.pdf %0 Conference Paper %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %D 2009 %T Workshop – Serious Games and Open Source: Practice and Futures %A Backlund, Per %A Lundell, Björn %A Walt Scacchi %X Computer games are increasingly used throughout our society with people playing on the bus, at home and at work. Computer games thus affect larger and larger number of people and areas in the society of today. There are even scholars who advocate that games create better environments for learning than traditional classrooms. This situation motivates the use of games and game technology for additional purposes, e.g. education, training, health care or marketing. 
%B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %S IFIP Advances in Information and Communication Technology %I Springer %V 299/2009 %P 361 - 362 %8 2009/// %G eng %& 36 %R http://dx.doi.org/10.1007/978-3-642-02032-2_36 %> https://flosshub.org/sites/flosshub.org/files/Workshop%20Serious%20Games%20and%20Open%20Source.pdf %0 Journal Article %J Information Economics and Policy %D 2008 %T The allocation of collaborative efforts in open-source software %A den Besten, Matthijs %A Jean-Michel Dalle %A Galia, Fabrice %K age %K apache %K complexity %K cvs %K division of labor %K functions %K gaim %K gcc %K ghostscript %K lines of code %K loc %K log files %K mozilla %K netbsd %K openssh %K postgresql %K python %K revision control %K scm %K size %K source code %K Stigmergy %K version control %X The article investigates the allocation of collaborative efforts among core developers (maintainers) of open-source software by analyzing on-line development traces (logs) for a set of 10 large projects. Specifically, we investigate whether the division of labor within open-source projects is influenced by characteristics of software code. We suggest that the collaboration among maintainers tends to be influenced by different measures of code complexity. We interpret these findings by providing preliminary evidence that the organization of open-source software development would self-adapt to characteristics of the code base, in a 'stigmergic' manner. %B Information Economics and Policy %V 20 %P 316 - 322 %U http://www.sciencedirect.com/science/article/B6V8J-4SSG4PN-1/2/88b3824c30a31c18929d8a5ca6d64f62 %R DOI: 10.1016/j.infoecopol.2008.06.003 %0 Conference Paper %B Proceedings of the 2008 international working conference on Mining software repositories %D 2008 %T AMAP: automatically mining abbreviation expansions in programs to enhance software maintenance tools %A Hill, Emily %A Fry, Zachary P.
%A Boyd, Haley %A Sridhara, Giriprasad %A Novikova, Yana %A Pollock, Lori %A Vijay-Shanker, K. %K automatic abbreviation expansion %K azureus %K itext.net %K liferay %K maintenance %K natural language %K openoffice.org %K program comprehension %K source code %K tiger envelopes %K tools %X When writing software, developers often employ abbreviations in identifier names. In fact, some abbreviations may never occur with the expanded word, or occur more often in the code. However, most existing program comprehension and search tools do little to address the problem of abbreviations, and therefore may miss meaningful pieces of code or relationships between software artifacts. In this paper, we present an automated approach to mining abbreviation expansions from source code to enhance software maintenance tools that utilize natural language information. Our scoped approach uses contextual information at the method, program, and general software level to automatically select the most appropriate expansion for a given abbreviation. We evaluated our approach on a set of 250 potential abbreviations and found that our scoped approach provides a 57% improvement in accuracy over the current state of the art. %B Proceedings of the 2008 international working conference on Mining software repositories %S MSR '08 %I ACM %C New York, NY, USA %P 79–88 %8 05/2008 %@ 978-1-60558-024-1 %U http://doi.acm.org/10.1145/1370750.1370771 %R http://doi.acm.org/10.1145/1370750.1370771 %> https://flosshub.org/sites/flosshub.org/files/p79-hill.pdf %0 Journal Article %J Communications of the ACM %D 2008 %T The business of open source %A Watson, R. T. %A Boudreau, M. C. %A York, P. T. %A Greiner, M. E. %A Wynn, D. 
%B Communications of the ACM %V 51 %P 41-46 %G eng %M WOS:000254780700009 %0 Conference Paper %B OSS2008: Open Source Development, Communities and Quality (IFIP 2.13) %D 2008 %T Channeling Firefox Developers: Mom and Dad Aren’t Happy Yet %A Jean-Michel Dalle %A den Besten, Matthijs %A Masmoudi, Héla %X Firefox, a browser targeted at mainstream users, has been one of the big successes of open source development in recent years. That Firefox succeeded where earlier attempts failed is undoubtedly due to the particular choices that were made in the process of development. In this paper, we look at this process in more detail. Mining bug reports and feature requests related to Firefox in Mozilla’s Bugzilla bug tracker system, we find that the attention developers devoted to reports and requests was influenced by several factors. Most importantly, other things being equal, reports and requests from outsiders increasingly tend to be ignored. While such behavior may have helped to shield Firefox from the “alpha-geek power user” in the early stages of development, it also makes it difficult for “mom and dad” to let their voice be heard even after they have adopted Firefox. %B OSS2008: Open Source Development, Communities and Quality (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 275/2008 %P 265 - 271 %8 2008/// %G eng %& 22 %R http://dx.doi.org/10.1007/978-0-387-09684-1_22 %> https://flosshub.org/sites/flosshub.org/files/Channeling%20Firefox%20Developers.pdf %0 Conference Paper %B Proceedings of the 2008 international workshop on Mining software repositories - MSR '08 %D 2008 %T Extracting structural information from bug reports %A Premraj, Rahul %A Zimmermann, Thomas %A Kim, Sunghun %A Bettenburg, Nicolas %Y Hassan, Ahmed E. %Y Lanza, Michele %Y Godfrey, Michael W. 
%K bug reports %K eclipse %K enumerations %K infozilla %K natural language %K patches %K source code %K stack trace %X In software engineering experiments, the description of bug reports is typically treated as natural language text, although it often contains stack traces, source code, and patches. Neglecting such structural elements is a loss of valuable information; structure usually leads to a better performance of machine learning approaches. In this paper, we present a tool called infoZilla that detects structural elements from bug reports with near perfect accuracy and allows us to extract them. We anticipate that infoZilla can be used to leverage data from bug reports at a different granularity level that can facilitate interesting research in the future. %B Proceedings of the 2008 international workshop on Mining software repositories - MSR '08 %I ACM Press %C New York, New York, USA %P 27-30 %8 05/2008 %@ 9781605580241 %! MSR '08 %R 10.1145/1370750.1370757 %> https://flosshub.org/sites/flosshub.org/files/p27-bettenburg.pdf %0 Conference Paper %B OSS2008: Open Source Development, Communities and Quality (IFIP 2.13) %D 2008 %T Facilitating Social Network Studies of FLOSS using the OSSNetwork Environment %A Balieiro, Marco %A de Júnior, Samuel %A De Souza, Cleidson %X Open source projects are typical examples of successful distributed software development projects. Understanding how coordination in these projects takes place can provide important lessons to Software Engineering researchers and practitioners. This understanding has been achieved using different research methods, including surveys, case studies and social network analysis. However, to conduct these studies each researcher needs to build his own infrastructure from scratch, a time-consuming and error-prone task. This paper aims to alleviate this problem by describing an environment, the OSSNetwork, which allows the automatic data collection of open source repositories.
Data collected by the OSSNetwork is aimed to support the construction, visualization, and analysis of social networks. This environment is extensible, therefore facilitating empirical studies of open source projects. %B OSS2008: Open Source Development, Communities and Quality (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 275/2008 %P 343 - 350 %8 2008/// %G eng %& 31 %R http://dx.doi.org/10.1007/978-0-387-09684-1_31 %> https://flosshub.org/sites/flosshub.org/files/Facilitating%20Social%20Network%20Studies.pdf %0 Manuscript %D 2008 %T I’m not chatting, i’m innovating! locating lead users in open source software communities %A Breach, Geoff %K irc %X The Lead User Method recognizes that certain end users of products are a valuable source – in some industries the only source – of new product innovations. Organizations that seek out these lead users and invite them to assist with the new product development process will produce products that perform far better in the marketplace than those developed by traditional methods based on research and development and market research. The individuals who author Open Source Software (OSS) – complex and high-quality computer software that is typically distributed free of charge and without restrictions on use – appear to share many common characteristics with lead users. These skilled computer programmers develop their software collaboratively, and often form large communities that congregate in Internet 'chat rooms' to discuss and manage their software development projects. 
In response to a call for further research to improve the performance of the lead user method, this paper integrates concepts from the fields of innovation management, data visualization and open source software into a proposal to improve the process of identifying lead users in OSS communities. %B University of Technology, Sydney School of Management Working Paper Series %U http://www.geoffbreach.com/files/9512/6939/7089/Breach%202008%20Im%20Not%20Chatting%20Im%20Innovating.pdf %6 2008-7 %> https://flosshub.org/sites/flosshub.org/files/Breach%202008%20Im%20Not%20Chatting%20Im%20Innovating.pdf %0 Conference Paper %B 3rd International Workshop on Public Data about Software Development (WoPDaSD 2008), Milano, Italy, September 2008 %D 2008 %T Improving community awareness in software forges by semantical aggregation of tools feeds %A Quang Vu Dang %A Christian Bac %A Olivier Berger %A Xuan Sang Dao %K community of practice %K DOAP %K FOAF %K free and open source software development %K public data %K RDF %K semantic Web %K social filtering %K social network analysis %X It is rather difficult to monitor or visualize what can be the contribution of a member in a project, especially when the project uses multiple tools to produce its results. This is the case for collaborative development of FLOSS software, which uses Wiki, bug tracker, mailing lists and source code management tools. This paper presents an approach to data collection by using aggregation of feeds published by the different tools of a software forge. To allow this aggregation, collected data is semantically reformatted into Semantic Web standards: RDF, DC, DOAP, and FOAF. Resulting data can then be processed, republished or displayed to project members. We implemented this approach in a supervision module that has been integrated into the PicoForge platform. This module is able to draw a live graph of the social community out of the different sources of data, and in turn export semantic feeds for other uses.
%B 3rd International Workshop on Public Data about Software Development (WoPDaSD 2008), Milano, Italy, September 2008 %G eng %> https://flosshub.org/sites/flosshub.org/files/Paper4.pdf %0 Journal Article %J Information and Software Technology %D 2008 %T JADE: A software framework for developing multi-agent applications. Lessons learned %A Bellifemine, F. %A Caire, G. %A Poggi, A. %A Rimassa, G. %X For a number of years, agent technology has been considered one of the most innovative technologies for the development of distributed software systems. While not yet a mainstream approach in software engineering at large, a lot of work on agent technology has been done, many research results and applications have been presented, and some software products exist which have moved from the research community to the industrial community. One of these is JADE, a software framework that facilitates development of interoperable intelligent multi-agent systems and that is distributed under an Open Source License. JADE is a very mature product, used by a heterogeneous community of users both in research activities and in industrial applications. This paper presents JADE and its technological components together with a discussion of the possible reasons for its success and lessons learned from the somewhat detached perspective possible nine years after its inception. (c) 2007 Elsevier B.V. All rights reserved. %B Information and Software Technology %V 50 %P 10-21 %G eng %M WOS:000252196700004 %0 Conference Paper %B SIGSOFT '08/FSE-16: Proceedings of the 16th ACM SIGSOFT Symposium on Foundations of Software Engineering %D 2008 %T Latent Social Structure in Open Source Projects %A Christian Bird %A David Pattison %A Raissa D'Souza %A Filkov, Vladimir %A Devanbu, Premkumar %X Commercial software project managers design project organizational structure carefully, mindful of available skills, division of labour, geographical boundaries, etc.
These organizational “cathedrals” are to be contrasted with the "bazaar-like" nature of Open Source Software (OSS) Projects, which have no pre-designed organizational structure. Any structure that exists is dynamic, self-organizing, latent, and usually not explicitly stated. Still, in large, complex, successful, OSS projects, we do expect that subcommunities will form spontaneously within the developer teams. Studying these subcommunities, and their behavior can shed light on how successful OSS projects self-organize. This phenomenon could well hold important lessons for how commercial software teams might be organized. Building on well-established techniques for detecting community structure in complex networks, we extract and study latent subcommunities from the email social network of several projects: Apache HTTPD, Python, PostgreSQL, Perl, and Apache ANT. We then validate them with software development activity history. Our results show that subcommunities do indeed spontaneously arise within these projects as the projects evolve. These subcommunities manifest most strongly in technical discussions, and are significantly connected with collaboration behaviour. %B SIGSOFT '08/FSE-16: Proceedings of the 16th ACM SIGSOFT Symposium on Foundations of Software Engineering %I ACM %P 24–35 %> https://flosshub.org/sites/flosshub.org/files/bird2008lss.pdf %0 Conference Paper %B OSS2008: Open Source Development, Communities and Quality (IFIP 2.13) %D 2008 %T Mining for Practices in Community Collections: Finds From Simple Wikipedia %A den Besten, Matthijs %A Rossi, Alessandro %A Gaio, Loris %A Loubser, Max %A Jean-Michel Dalle %X The challenges of commons based peer production are usually associated with the development of complex software projects such as Linux and Apache. But the case of open content production should not be treated as a trivial one.
For instance, while the task of maintaining a collection of encyclopedic articles might seem negligible compared to the one of keeping together a software system with its many modules and interdependencies, it still poses quite demanding problems. In this paper, we describe the methods and practices adopted by Simple Wikipedia to keep its articles easy to read. Based on measurements of article readability and similarity, we conclude that while the mechanisms adopted by the community had some effect, in the long run more efforts and new practices might be necessary in order to maintain an acceptable level of readability in the Simple Wikipedia collection. %B OSS2008: Open Source Development, Communities and Quality (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 275/2008 %P 105 - 120 %8 2008/// %G eng %& 9 %R http://dx.doi.org/10.1007/978-0-387-09684-1_9 %> https://flosshub.org/sites/flosshub.org/files/Mining%20for%20Practices.pdf %0 Conference Paper %B OSS2008: Open Source Development, Communities and Quality (IFIP 2.13) %D 2008 %T Open Source Environments for Collaborative Experiments in e-Science %A Bosin, Andrea %A Dessí, Nicoletta %A Fugini, Maria %A Liberati, Diego %A Pes, Barbara %X Open Source Software (OSS) for e-Science should make reference to the paradigm of a distributed surrounding over a multi system mix of Web Services and Grid technologies, allowing data exchanging through services, according to standards in the area of the Grid and of Service Oriented Computing (SOC). In fact, biologists, medical doctors, and scientists are often involved in time consuming experiments and are aware of the degree of difficulty in validating or rejecting a given hypothesis by lab experiments. 
The benefits of OSS for e-Science consider that as many operating nodes as possible can work cooperatively sharing data, resources, and software, thus avoiding the bottleneck of licenses for distributed use of tools needed to perform cooperative scientific experiments. In particular, this chapter presents an architecture based on nodes equipped with a Grid and with Web Services in order to access OSS, showing how scientific experiments can be enacted through the use of a cooperation among OSS sites. Such a choice, besides reducing the cost of the experiments, would support distributed introduction of OSS among other actors of the dynamical networks, thus supporting the awareness about OSS and their diffusion. An OSS environment for cooperative scientific experiments (e-experiments) can effectively support the distributed execution of different classes of experiments, from visualization to model identification through clustering and rules generation, in various application fields, such as bioinformatics, neuro-informatics, tele-monitoring, or drug discovery. By applying Web Services and Grid computing, an experiment or a simulation can be executed in a cooperative way on various computation nodes of a network equipped with OSS, allowing data exchange among researchers. Our environment formalizes experiments as cooperative services on various computational nodes of a grid network. Basic elements are models, languages, and support tools creating a virtual network of organizational responsibility of the global experiments, according to rules under which each node can execute local services to be accessed by other nodes in order to achieve the whole experiment’s results.
%B OSS2008: Open Source Development, Communities and Quality (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 275/2008 %P 415 - 416 %8 2008/// %G eng %& 41 %R http://dx.doi.org/10.1007/978-0-387-09684-1_41 %> https://flosshub.org/sites/flosshub.org/files/Open%20Source%20Environments.pdf %0 Journal Article %D 2008 %T Open Source Software: Free Provision of a Complex Public Good %A Jim Bessen %X Open source software, developed by volunteers, appears counter to conventional wisdom about private provision of public goods. Standard arguments suggest that proprietary provision should be more efficient. But complex open source products challenge commercially-developed software in quality and market share. I argue that the complexity of software changes the results. With complex software, standard products cannot address all consumer needs and proprietary custom solutions are not always offered. Open source allows consumers to create their own customizations. When such user-customizations are then shared, open source products grow in quality and features. Open source extends the market for complex products. %8 April %G eng %> https://flosshub.org/sites/flosshub.org/files/lakhanivonhippelusersupport.pdf %> https://flosshub.org/sites/flosshub.org/files/opensrc.pdf
%0 Journal Article %J Interacting with Computers %D 2008 %T A socio-cognitive analysis of online design discussions in an Open Source Software community %A Barcellini, Flore %A Détienne, Françoise %A Burkhardt, Jean-Marie %A Warren Sack %K Role %X This paper is an analysis of online discussions in an Open Source Software (OSS) design community, the Python project. Developers of Python are geographically distributed and work online asynchronously. The objective of our study is to understand and to model the dynamics of the OSS design process that takes place in mailing list exchanges. We develop a method to study distant and asynchronous collaborative design activity based on an analysis of quoting practices. We analyze and visualize three aspects of the online dynamics: social, thematic temporal, and design. We show that roles emerge during discussions according to the involvement and the position of the participants in the discussions and how they influence participation in the design discussions. In our analysis of the thematic temporal dynamics of discussion, we examine how themes of discussion emerge, diverge, and are refined over time. To understand the design dynamics, we perform a content analysis of messages exchanged between developers to reveal how the online discussions reflect the “work flow” of the project: it provides us with a picture of the collaborative design process in the OSS community. These combined results clarify how knowledge and artefacts are elaborated in this epistemic, exploration-oriented, OSS community. Finally, we outline the need to automate our method to extend our results. The proposed automation could have implications for both researchers and participants in OSS communities.
%B Interacting with Computers %V 20 %P 141 - 165 %U http://www.sciencedirect.com/science/article/pii/S0953543807000793 %R 10.1016/j.intcom.2007.10.004 %0 Conference Paper %B Proceedings of the 2008 international working conference on Mining software repositories %D 2008 %T Talk and work: a preliminary report %A Pattison, David S. %A Bird, Christian A. %A Premkumar T. Devanbu %K ant %K apache %K email %K mailing lists %K postgresql %K python %K scm %K source code %X Developers in Open Source Software (OSS) projects communicate using mailing lists. By convention, the mailing lists are used only for task-related discussions, so they are primarily concerned with the software under development, and software process issues (releases, etc.). We focus on the discussions concerning the software, and study the frequency with which software entities (functions, methods, classes, etc) are mentioned in the mail. We find a strong, striking, cumulative relationship between this mention count in the email, and the number of times these entities are included in changes to the software. When we study the same phenomena over a series of time-intervals, the relationship is much less strong. This suggests some interesting avenues for future research. %B Proceedings of the 2008 international working conference on Mining software repositories %S MSR '08 %I ACM %C New York, NY, USA %P 113–116 %8 05/2008 %@ 978-1-60558-024-1 %U http://doi.acm.org/10.1145/1370750.1370776 %R http://doi.acm.org/10.1145/1370750.1370776 %> https://flosshub.org/sites/flosshub.org/files/p113-pattison.pdf %0 Conference Paper %B OSS2008: Open Source Development, Communities and Quality (IFIP 2.13) %D 2008 %T To What Extent Does It Pay to Approach Open Source Software for a Big Telco Player? %A Banzi, Massimo %A Bruno, Guido %A Caire, Giovanni %X In this paper we describe the strategy under adoption in Telecom Italia (TI) Technology Department toward open source software.
This ranges from trying to create synergy among big Telco players, to increasing knowledge of and influence over strategic communities, to evaluating the creation of new communities around internally developed applications. In particular, the approach and the expectations in starting the community on WADE (Workflow and Agent Development Environment) are described here. This is a platform used to develop mission critical applications and is the main evolution of JADE, a popular Open Source framework for the development of interoperable intelligent multi-agent systems. It adds to JADE the support for the execution of tasks defined according to the workflow metaphor as well as a number of mechanisms that help manage the complexity of the distribution both in terms of administration and fault tolerance. The idea is to use WADE as a means to gather critical information on the opportunity of approaching OS as a strategic means toward the development of increasingly important applications in Operating Support Systems for TI, possibly also involving other major Telco players. For this reason great care is being paid in setting up the Community environment and in deciding which metrics are to be extracted from it, since the result will be the input for a strategic decision in TI.
%B OSS2008: Open Source Development, Communities and Quality (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 275/2008 %P 307 - 315 %8 2008/// %G eng %& 27 %R http://dx.doi.org/10.1007/978-0-387-09684-1_27 %> https://flosshub.org/sites/flosshub.org/files/To%20What%20Extent%20Does%20it%20Pay%20to%20Approach.pdf %0 Conference Paper %B OSS2008: Open Source Development, Communities and Quality (IFIP 2.13) %D 2008 %T Towards The Evaluation of OSS Trustworthiness: Lessons Learned From The Observation of Relevant OSS Projects %A Taibi, Davide %A del Bianco, Vieri %A Carbonare, Davide %A Lavazza, Luigi %A Morasca, Sandro %X To facilitate the adoption of open-source software (OSS) in industry, it is important to provide potential users (i.e., those who could decide to adopt OSS) with the means for evaluating the trustworthiness of OS products. This paper presents part of the work done in the QualiPSo project for this purpose. A set of factors that are believed to affect the perception of trustworthiness are introduced. In order to test the feasibility of deriving a correct, complete and reliable evaluation of trustworthiness on the basis of these factors, a set of well-known OSS projects have been chosen. Then, the possibility to assess the proposed factors on each project was verified: not all the factors appear to be observable or measurable. The paper reports what information is available to support the evaluation and what is not. This knowledge is considered to be useful to users, who are warned that there are still dark areas in the characterization of OSS products, and to developers, who should provide more data and characteristics on their products in order to support their adoption. 
%B OSS2008: Open Source Development, Communities and Quality (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 275/2008 %P 389 - 395 %8 2008/// %G eng %& 37 %R http://dx.doi.org/10.1007/978-0-387-09684-1_37 %> https://flosshub.org/sites/flosshub.org/files/Toward%20the%20Evaluation%20of%20OSS.pdf %0 Journal Article %J Int. J. Hum.-Comput. Stud. %D 2008 %T User and developer mediation in an Open Source Software community: Boundary spanning through cross participation in online discussions %A Barcellini, Flore %A Détienne, Françoise %A Burkhardt, Jean-Marie %K Boundary spanners %K Cross-participants %K Distributed design %K Open Source Software Community %K Role emerging design %X The aim of this research is to analyse how design and use are mediated in Open Source Software (OSS) design. Focusing on the Python community, our study examines a ''pushed-by-users'' design proposal through the discussions occurring in two mailing-lists: one, user-oriented and the other, developer-oriented. To characterize the links between users and developers, we investigate the activities and references (knowledge sharing) performed by the contributors to these two mailing-lists. We found that the participation of users remains local to their community. However, several key participants act as boundary spanners between the user and the developer communities. This emerging role is characterized by cross-participation in parallel same-topic discussions in both mailing-lists, cohesion between cross-participants, the occupation of a central position in the social network linking users and developers, as well as active, distinctive and adapted contributions. The user championing the proposal acts as a key boundary spanner coordinating the process and using explicit linking strategies. We argue that OSS design may be considered as a form of ''role emerging design'', i.e. 
design organized and pushed through emerging roles and through a balance between these roles. The OSS communities seem to provide a suitable socio-technical environment to enable such role emergence. %B Int. J. Hum.-Comput. Stud. %I Academic Press, Inc. %C Duluth, MN, USA %V 66 %P 558–570 %U http://dx.doi.org/10.1016/j.ijhcs.2007.10.008 %R 10.1016/j.ijhcs.2007.10.008 %0 Conference Paper %B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %D 2007 %T Authenticating from multiple authentication sources in a collaborative platform %A Quang Vu Dang %A Olivier Berger %A Christian Bac %A Hamet, Benoît %X This paper presents a proposal to address the need for multiple authentication sources for users of collaborative work platforms. The proposed approach, developed for the needs of GET and Picolibre, relies on a generic solution that integrates groupware servers in a Shibboleth infrastructure. We have developed adapters for this integration, which we contributed to the phpGroupware project. This document should serve as a basis for discussion in order to validate the level of generality of the proposed approach. We hope that this approach can also help maintainers of other collaboration platforms, who want to integrate a park of deployed platforms with external user identification and authentication services, get a better view of solutions available with Shibboleth. %B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 234/2007 %P 229 - 234 %8 2007/// %G eng %& 20 %R http://dx.doi.org/10.1007/978-0-387-72486-7_20 %> https://flosshub.org/sites/flosshub.org/files/Authenticating%20from%20multiple.pdf %0 Conference Paper %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %D 2007 %T Correlating Social Interactions to Release History during Software Evolution %A Baysal, Olga %A Malton, Andrew J.
%K ant %K apache %K change management %K developers %K discussion %K effort estimation %K lsedit %K mailing lists %K scm %K source code %X In this paper, we propose a method to reason about the nature of software changes by mining and correlating discussion archives. We employ an information retrieval approach to find correlation between source code change history and history of social interactions surrounding these changes. We apply our correlation method on two software systems, LSEdit and Apache Ant. The results of these exploratory case studies demonstrate the evidence of similarity between the content of free-form text emails among developers and the actual modifications in the code. We identify a set of correlation patterns between discussion and changed code vocabularies and discover that some releases referred to as minor should instead fall under the major category. These patterns can be used to give estimations about the type of a change and time needed to implement it. %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %I IEEE %C Minneapolis, MN, USA %P 7 - 7 %@ 0-7695-2950-X %R 10.1109/MSR.2007.4 %> https://flosshub.org/sites/flosshub.org/files/28300007.pdf %0 Conference Paper %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %D 2007 %T Detecting Patch Submission and Acceptance in OSS Projects %A Christian Bird %A Gourley, Alex %A Devanbu, Prem %K apache %K contributions %K mysql %K patches %K postgresql %K python %K scm %K source code %X The success of open source software (OSS) is completely dependent on the work of volunteers who contribute their time and talents. The submission of patches is the major way that participants outside of the core group of developers make contributions. 
We argue that the process of patch submission and acceptance into the codebase is an important piece of the open source puzzle and that the use of patch-related data can be helpful in understanding how OSS projects work. We present our methods in identifying the submission and acceptance of patches and give results and evaluation in applying these methods to the Apache webserver, Python interpreter, Postgres SQL database, and (with limitations) MySQL database projects. In addition, we present valuable ways in which this data has been and can be used. %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %I IEEE %C Minneapolis, MN, USA %P 26 - 26 %@ 0-7695-2950-X %R 10.1109/MSR.2007.6 %> https://flosshub.org/sites/flosshub.org/files/28300026.pdf %0 Conference Paper %B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %D 2007 %T Different Bug Fixing Regimes? A Preliminary Case for Superbugs %A Jean-Michel Dalle %A den Besten, Matthijs %X The paper investigates the processes by which bugs are fixed in open-source software projects. Focusing on Mozilla and combining data from both its bug tracker (Bugzilla) and from its CVS, we suggest that: a) Some bugs resist beyond the first patch applied to the main branch of the source code in relation to them, which we denote as superbugs; b) There might exist different bug fixing regimes; c) priority and severity flags as defined in bug repositories are not optimized for superbugs and might lead to involuntary side effects; d) The survival time of superbugs is influenced by the nature of the discussions within Bugzilla, by bug dependencies and by the provision of contextual elements.
%B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 234/2007 %P 247 - 252 %8 2007/// %G eng %& 23 %R http://dx.doi.org/10.1007/978-0-387-72486-7_23 %> https://flosshub.org/sites/flosshub.org/files/Different%20Bug%20Fixing%20Regimes.pdf %0 Journal Article %D 2007 %T Do firms take part in the projects of the OS community. Some preliminary evidence and a research agenda %A Andrea Bonaccorsi %A Dario Lorenzi %A Monica Merito %A Cristina Rossi %X The Open Source (OS) software has progressively gained economic importance in recent years, and more and more commercial firms are getting involved, to various extents, in the OS movement. While a number of studies have investigated motivations and business models of OS-based software companies, very few works have examined whether and how firms actively participate in open projects. This paper contributes to the literature by providing empirical evidence on the role and the activities of software houses in community developed projects. The research also proposes an original methodology of large-scale primary data collection from OS project repositories and linked Web sites. The findings show how different today's OS movement is from its origins and how important firm involvement has become, not only numerically but also for the depth of its impact on community projects. Finally, further research developments are suggested. 
%8 January %G eng %> https://flosshub.org/sites/flosshub.org/files/paper_firm_involvement_MIT.pdf %0 Conference Paper %B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %D 2007 %T FOSLET 07 — Workshop on Free and Open Source Learning Environments and Tools %A Botturi, Luca %A Mazza, Riccardo %A Tardini, Stefano %X Web-based Learning Environments supported by Course Management Systems (also known as Learning Management Systems) are nowadays a valid solution for institutions, companies, schools and universities that deliver eLearning or support blended-learning activities. Learning Environments are used to distribute information and content material to learners, prepare and deliver assignments and tests, engage in discussions, and manage distance classes without time and space restrictions. %B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 234/2007 %P 385 - 387 %8 2007/// %G eng %& 52 %R http://dx.doi.org/10.1007/978-0-387-72486-7_52 %> https://flosshub.org/sites/flosshub.org/files/FOSLET%2007%20workshop.pdf %0 Conference Paper %B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %D 2007 %T Learning and the imperative of production in Free/Open Source development %A Evangelia Berdou %X This paper examines the role of learning in structuring access and participation in F/OS communities. In particular it highlights the challenges and barriers to access faced by new developers and the expectations of senior developers regarding the mindsets and capabilities of new contributors. It is argued that learning in F/OS is inextricably connected with the demand for continuous production. The evidence presented is drawn from interviews conducted with inexperienced and experienced contributors from the GNOME and KDE projects. 
The author challenges the view of learning as an enculturation process and the paper contributes to the understanding of power relations among established and peripheral members in communities of practice. %B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 234/2007 %P 235 - 240 %8 2007/// %G eng %& 21 %R http://dx.doi.org/10.1007/978-0-387-72486-7_21 %> https://flosshub.org/sites/flosshub.org/files/Learning%20and%20the%20imprative%20of%20production.pdf %0 Conference Paper %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %D 2007 %T Local and Global Recency Weighting Approach to Bug Prediction %A Joshi, Hemant %A Zhang, Chuanlei %A Ramaswamy, S. %A Bayrak, Coskun %K bug fixing %K bug reports %K eclipse %K maintenance %K prediction %X Finding and fixing software bugs is a challenging maintenance task, and a significant amount of effort is invested by software development companies on this issue. In this paper, we use the Eclipse project's recorded software bug history to predict occurrence of future bugs. The history contains information on when bugs have been reported and subsequently fixed. %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %I IEEE %C Minneapolis, MN, USA %P 33 - 33 %@ 0-7695-2950-X %R 10.1109/MSR.2007.17 %> https://flosshub.org/sites/flosshub.org/files/28300033.pdf %0 Thesis %D 2007 %T Managing the Bazaar: Commercialization and peripheral participation in mature, community-led Free/Open source software projects %A Evangelia Berdou %X The thesis investigates two fundamental dynamics of participation and collaboration in mature, community-led Free/Open Source (F/OS) software projects - commercialization and peripheral participation. 
The two primary case studies of the research are the GNOME and KDE communities. The thesis contributes insights into how the gift economy is embedded in the exchange economy and the role of peripheral contributors. The analysis indicates that community-integrated paid developers have a key role in project development, maintaining the infrastructure aspects of the code base. The analysis suggests that programming and non-programming contributors are distinct in their make-up, priorities and rhythms of participation, and that learning plays an important role in controlling access. The results show that volunteers are important drivers of peripheral activities, such as translation and documentation. The term "autonomous peripherality" is used to capture the unique characteristics of these activities. These findings support the argument that centrality and peripherality are associated with the division of labour, which, in turn, is associated with employment relations and frameworks of institutional support. The thesis shows how the tensions produced by commercialization and peripheral participation are interwoven with values of meritocracy, ritual and strategic enactment of the idea of community as well as with tools and techniques developed to address the emergence of a set of problems specific to management and governance. These are characterized as "technologies of communities." 
%8 Nov %G eng %> https://flosshub.org/sites/flosshub.org/files/PhD_Berdou.pdf %0 Conference Paper %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %D 2007 %T Mining Eclipse Developer Contributions via Author-Topic Models %A Linstead, Erik %A Rigor, Paul %A Bajracharya, Sushil %A Lopes, Cristina %A Baldi, Pierre %K contributions %K developers %K eclipse %K expertise %K mining challenge %K msr challenge %K source code %K topics %X We present the results of applying statistical author-topic models to a subset of the Eclipse 3.0 source code consisting of 2,119 source files and 700,000 lines of code from 59 developers. This technique provides an intuitive and automated framework with which to mine developer contributions and competencies from a given code base while simultaneously extracting software function in the form of topics. In addition to serving as a convenient summary for program function and developer activities, our study shows that topic models provide a meaningful, effective, and statistical basis for developer similarity analysis. %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %I IEEE %C Minneapolis, MN, USA %P 30 - 30 %@ 0-7695-2950-X %R 10.1109/MSR.2007.20 %> https://flosshub.org/sites/flosshub.org/files/28300030.pdf %0 Conference Paper %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %D 2007 %T Mining Software Repositories with iSPARQL and a Software Evolution Ontology %A Kiefer, Christoph %A Bernstein, Abraham %A Tappolet, Jonas %K database %K eclipse %K evoont %K java %K owl %K semantic %K sparql %X One of the most important decisions researchers face when analyzing the evolution of software systems is the choice of a proper data analysis/exchange format. 
Most existing formats have to be processed with special programs written specifically for that purpose and are not easily extendible. Most scientists, therefore, use their own database(s) requiring each of them to repeat the work of writing the import/export programs to their format. We present EvoOnt, a software repository data exchange format based on the Web Ontology Language (OWL). EvoOnt includes software, release, and bug-related information. Since OWL describes the semantics of the data, EvoOnt (1) is easily extendible, (2) comes with many existing tools, and (3) allows assertions to be derived through its inherent Description Logic reasoning capabilities. The paper also shows iSPARQL -- our SPARQL-based Semantic Web query engine containing similarity joins. Together with EvoOnt, iSPARQL can accomplish a sizable number of tasks sought in software repository mining projects, such as an assessment of the amount of change between versions or the detection of bad code smells. To illustrate the usefulness of EvoOnt (and iSPARQL), we perform a series of experiments with a real-world Java project. These show that a number of software analyses can be reduced to simple iSPARQL queries on an EvoOnt dataset. %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %I IEEE %C Minneapolis, MN, USA %P 10 - 10 %@ 0-7695-2950-X %R 10.1109/MSR.2007.21 %> https://flosshub.org/sites/flosshub.org/files/28300010.pdf %0 Conference Paper %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %D 2007 %T Open Borders? Immigration in Open Source Projects %A Christian Bird %A Gourley, Alex %A Devanbu, Prem %A Swaminathan, Anand %A Hsu, Greta %K apache %K core %K joining %K postgresql %K python %K team %X Open source software is built by teams of volunteers. 
Each project has a core team of developers, who have the authority to commit changes to the repository; this team is the elite, committed foundation of the project, selected through a meritocratic process from a larger number of people who participate on the mailing list. Most projects carefully regulate admission of outsiders to full developer privileges; some projects even have formal descriptions of this process. Understanding the factors that influence the "who, how and when" of this process is critical, both for the sustainability of FLOSS projects, and for outside stakeholders who want to gain entry and succeed. In this paper we mount a quantitative case study of the process by which people join FLOSS projects, using data mined from the Apache web server, Postgres, and Python. We develop a theory of open source project joining, and evaluate this theory based on our data. %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %I IEEE %C Minneapolis, MN, USA %P 6 - 6 %@ 0-7695-2950-X %R 10.1109/MSR.2007.23 %> https://flosshub.org/sites/flosshub.org/files/28300006.pdf %0 Journal Article %D 2007 %T Open Source and the software industry. How firms do business out of an open innovation paradigm %A Andrea Bonaccorsi %A Monica Merito %A Rossi Cristina %A Lucia Piscitello %X Open Source Software (OSS) represents an “open innovation” paradigm based on knowledge produced and shared by developers and users. The paper inquires how OSS challenges Teece's three building blocks. 
New findings from a large survey of European software companies show that within the OSS paradigm: (i) OSS can be a sustainable business model even in the absence of any appropriability; (ii) complementary assets are distributed collectively and made widely available without the need for dedicated contractual arrangements; (iii) a de facto dominant design may stem from a community of users/producers even independently of the presence of powerful large companies. %8 January %G eng %> https://flosshub.org/sites/flosshub.org/files/paper_euram_2007.pdf %0 Conference Paper %B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %D 2007 %T Open source technologies for visually impaired people %A Boccacci, Patrizia %A Carrega, Veronica %A Dodero, Gabriella %X We describe two open source applications which we have experienced as very useful aids for the integration of people suffering from visual impairments, from hypovision to actual blindness. The first application is based on speech synthesis and has been experienced by disabled university students. The second experience is oriented to schoolchildren with low residual vision, and it provides their educators and parents with easy to use tools for image manipulation, especially designed for exploiting residual visual abilities. %B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 234/2007 %P 241 - 246 %8 2007/// %G eng %& 22 %R http://dx.doi.org/10.1007/978-0-387-72486-7_22 %> https://flosshub.org/sites/flosshub.org/files/Open%20Source%20Technologies%20for%20Visually%20.pdf %0 Conference Paper %B 2nd Workshop on Public Data about Software Development (WoPDaSD 2007) %D 2007 %T A Preliminary Analysis of Publicly Available FLOSS Measurements: Towards Discovering Maintainability Trends %A Samoladas, Ioannis %A Bibi, Stamatia %A Ioannis Stamelos %A Sowe, Sulayman K. 
%A Deligiannis, Ignatios %K decision tree %K flossmole %K java %K machine learning %K metrics %K sourcekibitzer %X The spread of free/libre/open source software (FLOSS) and the openness of its development model offer researchers a valuable source of information regarding software data. The creation of large portals, which host a vast amount of FLOSS projects, makes it easy to create large datasets with valuable information regarding the FLOSS development process. In addition, initiatives such as FLOSSMole provide researchers with a single point and continuing access to those data. Up to now the majority of datasets from FLOSSMole offered data regarding the development process and not the code itself. From February 2007 FLOSSMole offers data donated by SourceKibitzer, which contain source code metrics for FLOSS projects written in Java. In this paper we provide a preliminary analysis on those data using machine learning techniques, such as classification rules and decision trees. Using the first available data from February 2007, we tried to build rules that can be used in order to estimate the future values of metrics offered for March. Here we present some preliminary results that are encouraging and deserve to be further analyzed in future releases of SourceKibitzer datasets. 
%B 2nd Workshop on Public Data about Software Development (WoPDaSD 2007) %8 2007 %> https://flosshub.org/sites/flosshub.org/files/Samolades2007.pdf %0 Conference Proceedings %B Twenty Eighth International Conference on Information Systems %D 2007 %T Productivity effects of information diffusion in email networks %A Sinan Aral %A Erik Brynjolfsson %A Marshall Van Alstyne %B Twenty Eighth International Conference on Information Systems %C Montréal, PQ, Canada %G eng %0 Conference Paper %B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %D 2007 %T Visual Data Mining in Software Archives to Detect How Developers Work Together %A Weissgerber, Peter %A Pohl, Mathias %A Burch, Michael %K change %K coordination %K cvs %K developers %K junit %K modules %K scm %K source code %K svn %K teams %K tomcat %K visualization %X Analyzing the check-in information of open source software projects which use a version control system such as CVS or SUBVERSION can yield interesting and important insights into the programming behavior of developers. As in every major project tasks are assigned to many developers, the development must be coordinated between these programmers. This paper describes three visualization techniques that help to examine how programmers work together, e.g. if they work as a team or if they develop their part of the software separate from each other. Furthermore, phases of stagnation in the lifetime of a project can be uncovered and thus, possible problems are revealed. To demonstrate the usefulness of these visualization techniques we performed case studies on two open source projects. In these studies interesting patterns of developers' behavior, e.g. the specialization on a certain module can be observed. Moreover, modules that have been changed by many developers can be identified as well as such ones that have been altered by only one programmer. 
%B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007) %I IEEE %C Minneapolis, MN, USA %P 9 - 9 %@ 0-7695-2950-X %R 10.1109/MSR.2007.34 %> https://flosshub.org/sites/flosshub.org/files/28300009.pdf %0 Conference Paper %B 2nd Workshop on Public Data about Software Development (WoPDaSD 2007) %D 2007 %T Working with Open Source Development Data: Considerations triggered by a study of bug scenarios %A den Besten, Matthijs %A Masmoudi, Héla %A Jean-Michel Dalle %K bug reports %K bug scenarios %K Data Collection %X The retrieval and preparation of public data on software development calls for more than just technical skills. In addition, care and judgement are needed to avoid disproportionate costs to the providers of data or unnecessary embarrassment to the participants tracked in the data. Taking the extraction of bug scenarios as a use case, we illustrate these concerns and discuss how they could be translated into social requirements that would help to make retrieval and preparation a sustainable exercise. In particular, we call for more efforts to establish institutional repositories of public data on software development and, besides, we suggest that reviewers could play a role in making sure that empirical research is performed in a way that does not bring the long-term relationship between software developers and researchers in jeopardy. 
%B 2nd Workshop on Public Data about Software Development (WoPDaSD 2007) %8 2007 %> https://flosshub.org/sites/flosshub.org/files/denBesten-wopdasd.pdf %0 Conference Paper %B Proceedings of the 2006 international workshop on Mining software repositories %D 2006 %T Co-change visualization applied to PostgreSQL and ArgoUML: (MSR challenge report) %A Beyer, Dirk %K argouml %K ccvisu %K cvs %K force-directed graph layout %K graph %K mining challenge %K msr challenge %K postgresql %K software clustering %K software structure analysis %K software visualization %K version control %K visualization %X Co-change visualization is a method to recover the subsystem structure of a software system from the version history, based on common changes and visual clustering. This paper presents the results of applying the tool CCVisu, which implements co-change visualization, to the two open-source software systems PostgreSQL and ArgoUML. The input of the method is the co-change graph, which can be easily extracted by CCVisu from a CVS version repository. The output is a graph layout that places software artifacts that were often commonly changed at close positions, and artifacts that were rarely co-changed at distant positions. This property of the layout is due to the clustering property of the underlying energy model, which evaluates the quality of a produced layout. The layout can be displayed on the screen, or saved to a file in SVG or VRML format. 
%B Proceedings of the 2006 international workshop on Mining software repositories %S MSR '06 %I ACM %C New York, NY, USA %P 165–166 %@ 1-59593-397-2 %U http://doi.acm.org/10.1145/1137983.1138023 %R http://doi.acm.org/10.1145/1137983.1138023 %> https://flosshub.org/sites/flosshub.org/files/165Co-Change.pdf %0 Conference Paper %B OSS2006: Open Source Systems (IFIP 2.13) %D 2006 %T Collaborative Maintenance in Large Open-Source Projects %A den Besten, Matthijs %A Jean-Michel Dalle %A Galia, Fabrice %K apache %K COLLABORATION %K complexity %K cvs %K gaim %K gcc %K ghostscript %K halstead %K lines of code %K loc %K mccabe %K mozilla %K netbsd %K openssh %K postgresql %K python %K sloc %X The paper investigates collaborative work among maintainers of open source software by analyzing the logs of a set of 10 large projects. We inquire whether teamwork can be influenced by several characteristics of code. Preliminary results suggest that collaboration among maintainers in most large open-source projects seems to be positively influenced by file vintage and by Halstead volume of files, and negatively by McCabe complexity and size measured in SLOCs. These results could be consistent with an increased attractivity of files created early in the history of a project, and with maintainers being less attracted by more verbose code and by more complex code, although in this last case it might also reflect the fact that more complex files would be de facto more exclusive in terms of maintenance. 
%B OSS2006: Open Source Systems (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %P 233 - 244 %G eng %R http://dx.doi.org/10.1007/0-387-34226-5_23 %> https://flosshub.org/sites/flosshub.org/files/Collaborative%20Maintenance.pdf %0 Conference Paper %B OSS2006: Open Source Systems (IFIP 2.13) %D 2006 %T Critical Success Factors for Migrating to OSS-on-the-Desktop: Common Themes across Three South African Case Studies %A Brink, Daniel %A Roos, Llewelyn %A Weller, James %A Van Belle, Jean-Paul %X This paper investigates the critical success factors associated with the migration from proprietary desktop software to an open source software (OSS) desktop environment in a South African context. A comparative case study analysis approach was adopted whereby three organisations that have migrated to desktop OSS were analysed. For diversity, one case study each was drawn from government, private industry and the educational sector. Most of the findings agree with those in the available literature though there are notable differences in the relative importance of certain factors. %B OSS2006: Open Source Systems (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %P 287 - 293 %G eng %R http://dx.doi.org/10.1007/0-387-34226-5_29 %> https://flosshub.org/sites/flosshub.org/files/Critical%20Success%20Factors%20for%20Migrating.pdf %0 Conference Paper %B Proceedings of the 2006 international workshop on Mining software repositories %D 2006 %T Detecting similar Java classes using tree algorithms %A Sager, Tobias %A Bernstein, Abraham %A Pinzger, Martin %A Kiefer, Christoph %K change analysis %K clones %K coogle %K eclipse %K famix %K java %K similarity %K software evolution %K software repositories %K source code %K tree similarity measures %X Similarity analysis of source code is helpful during development to provide, for instance, better support for code reuse. 
Consider a development environment that analyzes code while typing and that suggests similar code examples or existing implementations from a source code repository. Mining software repositories by means of similarity measures enables and enforces reusing existing code and reduces the developing effort needed by creating a shared knowledge base of code fragments. In information retrieval similarity measures are often used to find documents similar to a given query document. This paper extends this idea to source code repositories. It introduces our approach to detect similar Java classes in software projects using tree similarity algorithms. We show how our approach finds similar Java classes based on an evaluation of three tree-based similarity measures in the context of five user-defined test cases as well as a preliminary software evolution analysis of a medium-sized Java project. Initial results of our technique indicate that it (1) is indeed useful to identify similar Java classes, (2) successfully identifies the ex ante and ex post versions of refactored classes, and (3) provides some interesting insights into within-version and between-version dependencies of classes within a Java project. 
%B Proceedings of the 2006 international workshop on Mining software repositories %S MSR '06 %I ACM %C New York, NY, USA %P 65–71 %@ 1-59593-397-2 %U http://doi.acm.org/10.1145/1137983.1138000 %R http://doi.acm.org/10.1145/1137983.1138000 %> https://flosshub.org/sites/flosshub.org/files/65Detecting.pdf %0 Conference Paper %B OSS2006: Open Source Systems (IFIP 2.13) %D 2006 %T Development Platforms as a Niche for Software Companies in Open Source Software %A Savonnet, Marinette %A Leclercq, Eric %A Terrasse, Marie-Noëlle %A Grison, Thierry %A Becker, George %A Farizy, Anne %A Denoyelle, Ludovic %X Without Abstract %B OSS2006: Open Source Systems (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 203/2006 %P 341 - 342 %8 2006/// %G eng %R http://dx.doi.org/10.1007/0-387-34226-5_37 %> https://flosshub.org/sites/flosshub.org/files/Development%20Platforms%20as%20a%20Niche.pdf %0 Thesis %B Master thesis, under direction of dr. Konrad Wozniacki %D 2006 %T The Economical Aspects of Free Software and Open Source Software Solutions in Modern Business %A Borucki, Blazej %X In the study economical aspects of Free / Open Source Software applications in commercial environment have been examined including main differences of commercial software development projects versus FOSS. Courtesy of Computer Science & Engineering department of University of Notre Dame statistical data from sourceforge.net projects have been analysed to show advantages and features of FOSS development. Empirical study also supports possible business models and strategies based on FOSS usage. %B Master thesis, under direction of dr. Konrad Wozniacki %9 Master Thesis %> https://flosshub.org/sites/flosshub.org/files/theeconomical_v2.pdf %0 Journal Article %D 2006 %T The evolution of free/libre open source software %A Lorenzo Benussi %X The Free Libre Open Source Software represents an outstanding example of “open development model of technological knowledge”. 
It has been studied in several researches that produced valuable illustrations of the way it works. Our understanding of its principal features is growing exponentially and an entire new literature on open source has been created. However there appears to be an important gap in the literature: the origin of the phenomenon. The following chapter attempts to tackle this issue by analyzing the long-term technological history of the Free Open Source Software; the main research questions at stake are: “Is the phenomenon completely new? and if it is not totally new, where does it come from?” and, more generally, “how did open source software develop over time?”. As a consequence the present work focuses primarily on the analysis of the free/open source software history of technological change over a period of almost sixty years. I adopted a multidisciplinary approach to analyse the network of relations emerging between inventions and technological innovations, as well as economic determinants and intellectual property rights regimes throughout the period considered. Therefore, I attempted to investigate the origins of the phenomenon as a way of understanding its evolution. %8 December %G eng %> https://flosshub.org/sites/flosshub.org/files/Benussi%282006%29_The_evolution_of_FLOSS_1.pdf %0 Conference Paper %B OSS2006: Open Source Systems (IFIP 2.13) %D 2006 %T Exploring the potential of OSS in Air Traffic Management %A Hardy, Jean-Luc %A Bourgois, Marc %X This paper introduces a project that aims at defining an Open Source Software (OSS) policy in the field of Air Traffic Management (ATM). In order to develop such a policy, we chose to investigate first a set of predictive hypotheses. Our four initial hypotheses were presented, refined and discussed in bi-lateral meetings with experts in the ATM field and in several conferences and workshops with OSS experts. 
At a roundtable, jointly organized by CALIBRE and EUROCONTROL, we confronted early open source experiences and insights in the ATM domain with experiences and knowledge from a panel of OSS experts and practitioners from academia and industry. The revised initial hypotheses are presented using a fixed format that should facilitate further evolution of these hypotheses. %B OSS2006: Open Source Systems (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %P 173 - 179 %G eng %R http://dx.doi.org/10.1007/0-387-34226-5_17 %> https://flosshub.org/sites/flosshub.org/files/Exploring%20the%20potential%20of%20OSS.pdf %0 Conference Paper %B OSS2006: Open Source Systems (IFIP 2.13) %D 2006 %T How is it possible to profit from innovation in the absence of any appropriability? %A Andrea Bonaccorsi %A Lucia Piscitello %A Monica Merito %A Cristina Rossi %X Open Source Software (OSS) represents an “open innovation” paradigm based on knowledge produced and shared by developers and users. New findings from a large survey of European software companies show that: (i) the OSS business model is currently involving almost one third of the industry, although with different intensity; (ii) compared with pure proprietary software producers, OSS firms have a broader product portfolio and are more diversified; moreover, (iii) OSS firms provide more complementary services to their customers; (iv) over time OSS firms increase the share of OS turnover out of the total turnover, becoming more and more OSS oriented; (v) both NOSS and OSS firms do not consider appropriability as a crucial requirement for innovation and do not consider the lack of appropriability as an obstacle to profitability. 
%B OSS2006: Open Source Systems (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 203/2006 %P 333 - 334 %8 2006/// %G eng %R http://dx.doi.org/10.1007/0-387-34226-5_33 %> https://flosshub.org/sites/flosshub.org/files/How%20is%20it%20possible%20to%20profit.pdf %0 Conference Paper %B OSS2006: Open Source Systems (IFIP 2.13) %D 2006 %T Insiders and outsiders: paid contributors and the dynamics of cooperation in community led F/OS projects %A Evangelia Berdou %K gnome %K interviews %K kde %X This paper examines the role of paid developers in mature free/open source (F/OS) communities. In particular it provides a typology for their involvement based on their employment and sponsorship arrangements and elaborates a framework for understanding the dynamics of cooperation developing between them and the volunteers based on their community ties. The evidence presented is drawn from individual interviews conducted with volunteer and paid contributors from the GNOME and KDE projects within the context of a PhD research focusing on commercialization and peripheral participation in F/OS communities. The paper highlights the various interdependencies that form between communities and companies and adds to our understanding of the dynamics of commercialization in F/OS projects. 
%B OSS2006: Open Source Systems (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %P 201 - 208 %G eng %R http://dx.doi.org/10.1007/0-387-34226-5_20 %> https://flosshub.org/sites/flosshub.org/files/Insiders%20and%20outsiders.pdf %0 Journal Article %J International Journal of Information Technology and Web Engineering %D 2006 %T Integration of libre software applications to create a collaborative work platform for researchers at GET %A Olivier Berger %A Christian Bac %A Benoit Hamet %K collaborative work environment %K contribution %K free software %K groupware %K in-house applications %K libre software %K open source software %K OpenLDAP %K phpGroupware %K PicoLibre %K ProGET %K Sympa %K TWiki %K WebDAV %K wiki %X Libre software provides powerful applications ready to be integrated for the build-up of platforms for internal use in organizations. We describe the architecture of the collaborative work platform which we have integrated, designed for researchers at GET. We present the elements we have learned during this project in particular with respect to contribution to external libre projects, in order to better ensure the maintainability of the internal applications, and to phpGroupware as a framework for specific applications development. %B International Journal of Information Technology and Web Engineering %I IGI Global %V 1 %P 1-16 %8 07/2006 %0 Unpublished Work %D 2006 %T Learning and knowledge in FLOSS - Situated learning and organizational knowledge-conversion in community-based free/libre open source software development %A Sverre Helge Bolstad %X In free/libre open source software development (FLOSS), groups of developers and users working in geographically dispersed settings are supported by a dense network of interactions. 
The participants are highly skilled in the use of information- and communication technologies, and build the software by relying on extensive peer production and through skillful use of communication tools available on the Internet. In building the software, explicit, formal and structured knowledge in the form of documents, objects, machines and external sources is communicated and stored in ways that make it available for others in the present and future. This knowledge makes up an important resource for the members and developers of the community. Another kind, or aspect, of knowledge, often called tacit or soft knowledge, is informal, unstructured, resides in people, and is difficult, or maybe impossible, to articulate. The questions guiding this research are how knowledge, both explicit and tacit, is shared, and how a new member is able to take part in the practice and knowledge of the community. The theory of legitimate peripheral participation in communities of practice describes an environment for people to develop knowledge through interaction with others in an environment where knowledge is created, nurtured and sustained. By taking part in the practice as a participant observer, through virtual ethnography, the author describes the practice and communication in this decentralized and knowledge-intensive process. Taking it a step further, the knowledge of the community, and how it is shared within the “organization”, is explored with a model for managing dynamic aspects of organizational knowledge-creation. The central theme here is that knowledge is created through a continuous dialogue between tacit and explicit knowledge. Logs from Internet Relay Chat (IRC) and interviews with core developers are analyzed, and the author argues that the Plone community is able to share both kinds of knowledge in a complex web of resources and interaction. 
The analysis further suggests that the FLOSS development-model facilitates access, transparency and participation on premises that are important for learning. %8 November %G eng %> https://flosshub.org/sites/flosshub.org/files/Learning-and-knowledge-in-FLOSS.pdf %0 Conference Paper %B Proceedings of the 2006 international workshop on Mining software repositories %D 2006 %T Mining additions of method calls in ArgoUML %A Zimmermann, Thomas %A Breu, Silvia %A Lindig, Christian %A Livshits, Benjamin %K argouml %K change analysis %K eclipse %K function calls %K mining challenge %K msr challenge %K pattern %K source code %K xelopes %X In this paper we refine the classical co-change to the addition of method calls. We use this concept to find usage patterns and to identify cross-cutting concerns for ArgoUML. %B Proceedings of the 2006 international workshop on Mining software repositories %S MSR '06 %I ACM %C New York, NY, USA %P 169–170 %@ 1-59593-397-2 %U http://doi.acm.org/10.1145/1137983.1138025 %R http://doi.acm.org/10.1145/1137983.1138025 %> https://flosshub.org/sites/flosshub.org/files/169MiningAdditions.pdf %0 Conference Paper %B 1st Workshop on Public Data about Software Development (WoPDaSD 2006) %D 2006 %T Mining CVS Signals %A Jean-Michel Dalle %A L. Daudet %A den Besten, Matthijs %B 1st Workshop on Public Data about Software Development (WoPDaSD 2006) %P 10-19 %0 Conference Paper %B Proceedings of the 2006 international workshop on Mining software repositories %D 2006 %T Mining eclipse for cross-cutting concerns %A Breu, Silvia %A Zimmermann, Thomas %A Lindig, Christian %K aspects %K concept analysis %K cvs %K eclipse %K source code %X Software may contain functionality that does not align with its architecture. Such cross-cutting concerns do not exist from the beginning but emerge over time. By analysing where developers add code to a program, our history-based mining identifies cross-cutting concerns in a two-step process. 
First, we mine CVS archives for sets of methods where a call to a specific single method was added. In a second step, such simple cross-cutting concerns are combined into complex cross-cutting concerns. To compute these efficiently, we apply formal concept analysis, an algebraic theory. History-based mining scales well: we are the first to report aspects mined from an industrial-sized project like Eclipse. For example, we identified a locking concern that crosscuts 1284 methods. %B Proceedings of the 2006 international workshop on Mining software repositories %S MSR '06 %I ACM %C New York, NY, USA %P 94–97 %@ 1-59593-397-2 %U http://doi.acm.org/10.1145/1137983.1138006 %R http://doi.acm.org/10.1145/1137983.1138006 %> https://flosshub.org/sites/flosshub.org/files/94MiningEclipse.pdf %0 Conference Paper %B Proceedings of the 2006 international workshop on Mining software repositories %D 2006 %T Mining email social networks %A Christian Bird %A Gourley, Alex %A Devanbu, Prem %A Gertz, Michael %A Swaminathan, Anand %K communication %K contributions %K developers %K email %K email archives %K mailing lists %K open source %K social networks %X Communication & Co-ordination activities are central to large software projects, but are difficult to observe and study in traditional (closed-source, commercial) settings because of the prevalence of informal, direct communication modes. OSS projects, on the other hand, use the internet as the communication medium, and typically conduct discussions in an open, public manner. As a result, the email archives of OSS projects provide a useful trace of the communication and co-ordination activities of the participants. However, there are various challenges that must be addressed before this data can be effectively mined. Once this is done, we can construct social networks of email correspondents, and begin to address some interesting questions. 
These include questions relating to participation in the email; the social status of different types of OSS participants; the relationship of email activity and commit activity (in the CVS repositories) and the relationship of social status with commit activity. In this paper, we begin with a discussion of our infrastructure (including a novel use of Scientific Workflow software) and then discuss our approach to mining the email archives; and finally we present some preliminary results from our data analysis. %B Proceedings of the 2006 international workshop on Mining software repositories %S MSR '06 %I ACM %C New York, NY, USA %P 137–143 %@ 1-59593-397-2 %U http://doi.acm.org/10.1145/1137983.1138016 %R http://doi.acm.org/10.1145/1137983.1138016 %> https://flosshub.org/sites/flosshub.org/files/137MiningEmail.pdf %0 Conference Paper %B Proceedings of the 2006 international workshop on Mining software repositories %D 2006 %T Mining email social networks in Postgres %A Christian Bird %A Gourley, Alex %A Devanbu, Prem %A Gertz, Michael %A Swaminathan, Anand %K developers %K email %K email archives %K open source %K postgresql %K scm %K social network analysis %K social networks %K source code %K status %X Open Source Software (OSS) projects provide a unique opportunity to gather and analyze publicly available historical data. The Postgres SQL server, for example, has over seven years of recorded development and communication activity. We mined data from both the source code repository and the mailing list archives to examine the relationship between communication and development in Postgres. Along the way, we had to deal with the difficult challenge of resolving email aliases. We used a number of social network analysis measures and statistical techniques to analyze this data. We present our findings in this paper. 
%B Proceedings of the 2006 international workshop on Mining software repositories %S MSR '06 %I ACM %C New York, NY, USA %P 185–186 %@ 1-59593-397-2 %U http://doi.acm.org/10.1145/1137983.1138033 %R http://doi.acm.org/10.1145/1137983.1138033 %> https://flosshub.org/sites/flosshub.org/files/185MiningEmail.pdf %0 Conference Paper %B OSS2006: Open Source Systems (IFIP 2.13) %D 2006 %T Open Source in Web-based Periodicals %A Baravalle, Andres %A Chambers, Sarah %X In this paper we aim to investigate the role of the media in the diffusion of Open Source, analysing three web-based periodicals from Italy, United Kingdom and USA. The influence of the media in our society is wide and we have to look to that direction if we want to seriously investigate the in-depth causes of the different trends. Nevertheless, our results show a picture that may not be familiar to many researchers of the field. %B OSS2006: Open Source Systems (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 203/2006 %P 347 - 348 %8 2006/// %G eng %R http://dx.doi.org/10.1007/0-387-34226-5_40 %> https://flosshub.org/sites/flosshub.org/files/Open%20Source%20in%20Web-based%20Periodicals.pdf %0 Conference Paper %B Proceedings of the 2006 international workshop on Mining software repositories %D 2006 %T Predicting defect densities in source code files with decision tree learners %A Knab, Patrick %A Pinzger, Martin %A Bernstein, Abraham %K change analysis %K data mining %K decision tree learner %K defect density %K defect prediction %K mozilla %K prediction %K release history %K scm %K source code %K version control %X With the advent of open source software repositories the data available for defect prediction in source files increased tremendously. 
Although traditional statistics turned out to derive reasonable results, the sheer amount of data and the problem context of defect prediction demand sophisticated analysis such as provided by current data mining and machine learning techniques. In this work we focus on defect density prediction and present an approach that applies a decision tree learner on evolution data extracted from the Mozilla open source web browser project. The evolution data includes different source code, modification, and defect measures computed from seven recent Mozilla releases. Among the modification measures we also take into account the change coupling, a measure for the number of change-dependencies between source files. The main reason for choosing decision tree learners, instead of for example neural nets, was the goal of finding underlying rules which can be easily interpreted by humans. To find these rules, we set up a number of experiments to test common hypotheses regarding defects in software entities. Our experiments showed that a simple tree learner can produce good results with various sets of input data. %B Proceedings of the 2006 international workshop on Mining software repositories %S MSR '06 %I ACM %C New York, NY, USA %P 119–125 %@ 1-59593-397-2 %U http://doi.acm.org/10.1145/1137983.1138012 %R http://doi.acm.org/10.1145/1137983.1138012 %> https://flosshub.org/sites/flosshub.org/files/119Predicting.pdf %0 Conference Paper %B OSS2006: Open Source Systems (IFIP 2.13) %D 2006 %T A Robust Open Source Exchange for Open Source Software Development %A Basu, Amit %X This paper addresses the development of mechanisms for the creation of OSSD exchanges that could be used by developers across any geographical range, as long as all the developers can interact via some open network infrastructure such as the Internet. 
The structure of these exchanges can range from public repositories such as Sourceforge.net to intra-organizational forums for software development within an enterprise. We examine in particular the structure of an exchange model based on protocols for a robust online marketplace. %B OSS2006: Open Source Systems (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %P 99 - 108 %G eng %R http://dx.doi.org/10.1007/0-387-34226-5_10 %> https://flosshub.org/sites/flosshub.org/files/A%20Robust%20Open%20Source%20Exchange.pdf %0 Journal Article %D 2005 %T Analysing the technological history of the Open Source Phenomenon: Stories from the Free Software Evolution %A Lorenzo Benussi %X The Free Libre Open Source Software represents an outstanding example of an open development model of technological knowledge. It has been studied in several research efforts that produced valuable illustrations of the way it works. Our understanding of its principal features is growing exponentially and an entire new literature on open source has been created. However, there appears to be an important gap in the literature: the origin of the phenomenon. The paper attempts to tackle this issue by analyzing the long-term technological history of Free Open Source Software; the main research questions at stake are: is the phenomenon completely new? and if it is not totally new, where does it come from? and, more generally, how has open source software developed over time? As a consequence the present work focuses primarily on the analysis of the free/open source software history of technological change over a period of almost sixty years. I adopted a multidisciplinary approach to analyse the network of relations emerging between inventions and technological innovations, as well as economic determinants and intellectual property rights regimes throughout the period considered. Thus, I attempt to investigate the origins of the phenomenon as a way of understanding its evolution. 
%8 September %G eng %> https://flosshub.org/sites/flosshub.org/files/benussi.pdf %0 Conference Paper %B Proceedings of the 2005 international workshop on Mining software repositories %D 2005 %T Analysis of signature change patterns %A Kim, Sunghun %A Whitehead, Jr., E. James %A Bevan, Jennifer %K apache %K gcc %K kernel %K linux %K signature change %K signature change patterns %K software evolution %K software evolution path %K source code %X Software continually changes due to performance improvements, new requirements, bug fixes, and adaptation to a changing operational environment. Common changes include modifications to data definitions, control flow, method/function signatures, and class/file relationships. Signature changes are notable because they require changes at all sites calling the modified function, and hence as a class they have more impact than other change kinds. We performed signature change analysis over software project histories to reveal multiple properties of signature changes, including their kind, frequency, and evolution patterns. These signature properties can be used to alleviate the impact of signature changes. In this paper we introduce a taxonomy of signature change kinds to categorize observed changes. We report multiple properties of signature changes based on an analysis of eight prominent open source projects including the Apache HTTP server, GCC, and Linux 2.5 kernel. %B Proceedings of the 2005 international workshop on Mining software repositories %S MSR '05 %I ACM %C New York, NY, USA %P 1–5 %@ 1-59593-123-6 %U http://doi.acm.org/10.1145/1082983.1083154 %R http://doi.acm.org/10.1145/1082983.1083154 %> https://flosshub.org/sites/flosshub.org/files/64AnalysisOfSignature.pdf %0 Book Section %B The Standards Edge: Open Season %D 2005 %T A Conceptual Model for Enterprise Adoption of Open Source Software %A Kwan, Stephen K. %A Joel West %E Bolin, Sherrie %B The Standards Edge: Open Season %I Sheridan Books %C Ann Arbor, Mich. 
%P 51-62 %G eng %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Defining the Total Cost of Ownership for the Transition to Open Source Systems %A Russo, Barbara %A Braghin, Chiara %A Gasperi, Paolo %A Sillitti, Alberto %A Succi, Giancarlo %X This paper provides a framework to evaluate the transition to an OSS software solution in terms of returns and losses in the context of Public Administrations. The ultimate goal of the framework is to identify costs that are not easy to trace or that are not usually collected, like user acceptance. The framework has been developed using a Total Cost of Ownership approach, which is the most frequently used model to conduct cost comparisons between two or more IT systems. The study further implements the Goal Question Metric paradigm to identify the cost metrics. The framework relies on various methods to collect the data, including questionnaires with end-users, qualitative interviews with IT-managers and company balance sheets. An example of the framework's use is provided. %B OSS2005: Open Source Systems %P 108-112 %U http://pascal.case.unibz.it/handle/2038/774 %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Development-oriented Open Source eLearning Tool Evaluation: the Edukalibre Approach %A Botturi, Luca %A Chris Tebb %A Vania Dimitrova %A Drew Withworth %A Julika Matravers %A Jutta Geldermann %A Isabelle Hubert %B OSS2005: Open Source Systems %P 341-344 %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Distributed Software Platforms for Rehabilitating Obsolete Hardware %A Russo, Ruggero %A Lamanna, Davide %A Baldoni, Roberto %X The diffusion of ICTs created the issue of a huge quantity of old computers to be discarded (E-waste). Sustainable dismantling is becoming a global environmental emergency. The Trashware movement is spreading worldwide, aiming to profitably reuse discarded computers as an alternative to dismantling them. 
Trashware is deeply related to the Open Source and Free Software movements. The aim of this piece of research is to combine Trashware with clustering, in order to verify if further optimisations are possible. Experiments were conducted on clusters of old machines and results are hereby presented. %B OSS2005: Open Source Systems %P 220-223 %U http://pascal.case.unibz.it/handle/2038/782 %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T EDOS: Environment for the Development and Distribution of Open Source Software %A Abiteboul, Serge %A Leroy, Xavier %A Vrdoljak, Boris %A Di Cosmo, Roberto %A Fermigier, Stéfane %A Laurière, Stéphane %A Lepied, Frédéric %A Pop, Radu %A Villard, Florent %A Smets, Jean-Paul %A Bryce, Ciarán %A Dittrich, Klaus R. %A Milo, Tova %A Sagi, Assaf %A Shtossel, Yotam %A Panto, Eleonora %X The open-source software community is now comprised of a very large and growing number of contributors and users. The GNU/Linux operating system for instance has an estimated 18 million users worldwide and its contributing developers can be counted by thousands. The critical mass of contributors taking part in various open-source projects has helped to ensure high quality for open source software. However, despite the achievements of the open-source software industry, there are issues in the production of large scale open-source software (OSS) such as the GNU/Linux operating system that have to be addressed as the numbers of users, of contributors, and of available applications grow. EDOS is a European project supported by IST started October 2004 and ending in 2007, whose objective is to provide a new generation of methodologies, theoretical models, technical tools and quality models specifically tailored to OSS engineering and to software distribution over the Interne... 
%B OSS2005: Open Source Systems %P 66-70 %U http://pascal.case.unibz.it/handle/2038/737 %0 Conference Proceedings %B Software Process Workshop (SPW05): Unifying the Software Process Spectrum %D 2005 %T Experiences in Discovering, Modeling, and Reenacting Open Source Software Development Processes %A Chris Jensen %A Walt Scacchi %E Li, Mingshu %E Boehm, Barry %E Osterweil, Leon J. %B Software Process Workshop (SPW05): Unifying the Software Process Spectrum %I Springer-Verlag %C Beijing, China %8 May %G eng %0 Journal Article %D 2005 %T Exploring the Structure of Complex Software Designs: An Empirical Study of Open Source and Proprietary Code (updated) %A Alan MacCormack %A John Rusnak %A Carliss Baldwin %K complexity %K cost %K dependencies %K functions %K lines of code %K linux %K loc %K mozilla %K source code %X This paper reports data from a study that seeks to characterize the differences in design structure between complex software products. In particular, we use Design Structure Matrices (DSMs) to map the dependencies between the elements of a design and define metrics that allow us to compare the structures of different designs. We first use these metrics to compare the architectures of two software products - the Linux operating system and the Mozilla web browser - that were developed via contrasting modes of organization: specifically, open source versus proprietary development. We then track the evolution of Mozilla, paying particular attention to a purposeful "re-design" effort that was undertaken with the intention of making the product more "modular." We find significant differences in structure between Linux and the first version of Mozilla, suggesting that Linux had a more modular architecture. We also find that the redesign of Mozilla resulted in an architecture that was significantly more modular than that of its predecessor, and indeed, than that of Linux. 
Our results, while exploratory, are consistent with a view that different modes of organization are associated with designs that possess different structures. However, we also illustrate that purposeful managerial actions can have a large impact on structure. This latter result is important given recent moves to release proprietary software into the public domain. These moves are likely to fail unless the product possesses an architecture that facilitates participation. Our paper provides evidence that a tightly-coupled design can be adapted to meet this objective. %8 June %G eng %> https://flosshub.org/sites/flosshub.org/files/maccormackrusnakbaldwin2.pdf %0 Journal Article %J Ieee Transactions on Professional Communication %D 2005 %T A framework for analyzing levels of analysis issues in studies of e-collaboration %A Gallivan, M. J. %A Benbunan-Fich, R. %X There has been a proliferation of competing explanations regarding the inconsistent results reported by the e-collaboration literature since its inception. This study advances another possible explanation by investigating the range of multilevel issues that can be encountered in research on the use of synchronous or asynchronous group support systems. We introduce concepts of levels of analysis from the management literature and then examine all empirical studies of e-collaboration from seven information systems journals for the period 1999-2003. We identified a total of 54 studies of e-collaboration in these journals, and after excluding 18 nonconforming studies-those that were primarily conceptual, qualitative, or exploratory only-we analyzed the levels of analysis issues in the remaining 36 empirical studies. Based on our analysis and classification of these studies into six different clusters according to their levels of analysis, we found that a majority of these studies contain one or more problems of levels incongruence that cast doubts on the validity of their results. 
It is indeed possible that these methodological problems are in part responsible for the inconsistent results reported in this literature, especially since researchers' frequent decisions to analyze data at the individual level-even when the theory was formulated at the group level and when the research setting featured individuals working in groups-may very well have artificially inflated the authors' chances of finding statistically significant results. Based on our discussion of levels of analysis concepts, we hope to provide guidance to empirical researchers who study e-collaboration. %B Ieee Transactions on Professional Communication %V 48 %P 87-104 %G eng %M WOS:000227260800008 %0 Journal Article %J IEEE Trans. Software Eng. %D 2005 %T The FreeBSD Project: A Replication Case Study of Open Source Development %A Trung T. Dinh-Trong %A James M. Bieman %K apache %K bug reports %K contributors %K core %K cvs %K defect density %K developers %K email %K email archive %K freebsd %K mailing list %K scm %K source code %K users %X Case studies can help to validate claims that open source software development produces higher quality software at lower cost than traditional commercial development. One problem inherent in case studies is external validity: we do not know whether or not results from one case study apply to another development project. We gain or lose confidence in case study results when similar case studies are conducted on other projects. This case study of the FreeBSD project, a long-lived open source project, provides further understanding of open source development. The paper details a method for mining repositories and querying project participants to retrieve key process information. The FreeBSD development process is fairly well-defined with prescribed methods for determining developer responsibilities, dealing with enhancements and defects, and managing releases. 
Compared to the Apache project, FreeBSD uses 1) a smaller set of core developers—developers who control the code base—that implement a smaller percentage of the system, 2) a larger set of top developers to implement 80 percent of the system, and 3) a more well-defined testing process. FreeBSD and Apache have a similar ratio of core developers to people involved in adapting and debugging the system and people who report problems. Both systems have similar defect densities and the developers are also users in both systems. %B IEEE Trans. Software Eng. %V 31 %P 481-494 %R 10.1109/TSE.2005.73 %> https://flosshub.org/sites/flosshub.org/files/DinhTrungBieman.pdf %0 Journal Article %J IEEE Transactions on Software Engineering %D 2005 %T Hipikat: a project memory for software development %A Cubranic, D. %A Murphy, G. C. %A Singer, J. %A Booth, K.S. %X Sociological and technical difficulties, such as a lack of informal encounters, can make it difficult for new members of noncollocated software development teams to learn from their more experienced colleagues. To address this situation, we have developed a tool, named Hipikat, that provides developers with efficient and effective access to the group memory for a software development project that is implicitly formed by all of the artifacts produced during the development. This project memory is built automatically with little or no change to existing work practices. After describing the Hipikat tool, we present two studies investigating Hipikat's usefulness in software modification tasks. One study evaluated the usefulness of Hipikat's recommendations on a sample of 20 modification tasks performed on the Eclipse Java IDE during the development of release 2.1 of the Eclipse software. We describe the study, present quantitative measures of Hipikat's performance, and describe in detail three cases that illustrate a range of issues that we have identified in the results. 
In the other study, we evaluated whether software developers who are new to a project can benefit from the artifacts that Hipikat recommends from the project memory. We describe the study, present qualitative observations, and suggest implications of using project memory as a learning aid for project newcomers. %B IEEE Transactions on Software Engineering %V 31 %P 446 - 465 %8 06/2005 %N 6 %! IEEE Trans. Software Eng. %R 10.1109/TSE.2005.71 %0 Journal Article %J Ibm Systems Journal %D 2005 %T Improving Web accessibility through an enhanced open-source browser %A Hanson, V. L. %A Brezin, J. P. %A Crayne, S. %A Keates, S. %A Kjeldsen, R. %A Richards, J. T. %A Swart, C. %A Trewin, S. %X The accessibilityWorks project provides software enhancements to the Mozilla (TM) Web browser and allows users to control their browsing environment. Although Web accessibility standards specify markup that must be incorporated for Web pages to be accessible, these standards do not ensure a good experience for all Web users. This paper discusses user controls that facilitate a number of adaptations that can greatly increase the usability of Web pages for a diverse population of users. In addition to transformations that change page presentation, innovations are discussed that enable mouse and keyboard input correction as well as vision-based control for users unable to use their hands for computer input. %B Ibm Systems Journal %V 44 %P 573-588 %G eng %M WOS:000231303100010 %0 Journal Article %J Ibm Systems Journal %D 2005 %T The Jikes research virtual machine project: Building an open-source research community %A Alpern, B. %A Augart, S. %A Blackburn, S. M. %A Butrico, M. %A Cocchi, A. %A Cheng, P. %A Dolby, J. %A Fink, S. %A Grove, D. %A Hind, M. %A McKinley, K. S. %A Mergen, M. %A Moss, J. E. B. %A Ngo, T. %A Sarkar, V. %A Trapp, M. 
%X This paper describes the evolution of the Jikes (TM) Research Virtual Machine project from an IBM internal research project, called Jalapeno, into an open-source project. After summarizing the original goals of the project, we discuss the motivation for releasing it as an open-source project and the activities performed to ensure the success of the project. Throughout, we highlight the unique challenges of developing and maintaining an open-source project designed specifically to support a research community. %B Ibm Systems Journal %V 44 %P 399-417 %G eng %M WOS:000229333800018 %1 information systems %2 case study %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Knowledge, Communication and Innovation: the case of Open Source Software as Open Media %A Lorenzo Benussi %K attributive %K Creative Commons %K non-commercial %X The understanding of the major characteristics of the Linux Operating System and, more in general, the analysis of the so-called Open Source Phenomenon, is nowadays a central issue in order to appreciate the ongoing evolution of the software industry. The Free/Open Source Software model may appear to be a “revolution” in the way of thinking about software development, distribution and use. But, at a closer glance, it reveals itself more as an “evolution” along the path of “tinkering with software” typical of the Hacker Tech-Culture since the beginning of the sixties. In fact the “open source way” of developing software results from the hackers' habit of sharing technological knowledge and it represents a perfection of this, due to the availability of new communication technologies. The aim of this research is to explain this evolution, to point out its historical, economic and technological determinants and to link it with the evolution of the “communicational medium” used by the open source ... %B OSS2005: Open Source Systems %P 314-316 %U http://pascal.case.unibz.it/handle/2038/972 %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Live! 
I-Learn @ Home %A Baldoni, Matteo %A Baroglio, Cristina %A Roversi, Luca %A Grandi, Claudio %K e-learning %K environment %K GNU/Linux %K java %K open source %K technology %X In this paper we present a live CD based on GNU/Linux (Knoppix), customized in order to supply a complete working and studying environment for the students of the Corso di Studi in Informatica. In particular, it supplies the Moodle course management system for e-learning, complete with the courses of the first year. The aim is to enable the use of such resources off-line and without requiring special skills that will be acquired by studying but that newbies do not have. %B OSS2005: Open Source Systems %P 294-295 %U http://pascal.case.unibz.it/handle/2038/977 %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Open Source and IMS Learning Design: Building the Infrastructure for eLearning %A Griffiths, David %A Blat, Josep %A Elferink, Ray %A Zondergeld, Sara %K eLearning %K eLearning specification %K FOSS %K infrastructure %K Learning Design %X The development of open, flexible eLearning specifications has significant implications for and interactions with the FOSS movement. A short overview of eLearning specifications is provided, focusing on the difference between SCORM and Learning Design (LD). The significance of LD for FOSS is examined, and common values identified. The particular contribution made by FOSS to LD infrastructure is discussed, and the importance of reference applications described. An overview is given of the FOSS applications available, divided into design time and run time, with particular reference to LD editors and the CopperCore Learning Design engine. 
%B OSS2005: Open Source Systems %P 329-333 %U http://pascal.case.unibz.it/handle/2038/1264 %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Open Source firms: from community to business %A Andrea Bonaccorsi %A Cristina Rossi %K business %K COMMUNITY %K firms %K Open Source firms %X A large body of literature is now addressing the Open Source (OS) phenomenon. Economic scholars have deeply investigated the incentives of people working within OS community projects; the software production models in absence of explicit hierarchical structures; the successful dissemination of OS programs in environments dominated by proprietary standards; the peculiarities in the management of intellectual property within the OS framework. Theoretical contributions have coupled with the collection of extensive empirical evidence mainly through surveys taken on individual developers. Nowadays a new trend is shaping the OS movement: more and more firms are entering the market by offering software solutions based on the new paradigm (Open Source firms). %B OSS2005: Open Source Systems %P 362-363 %U http://pascal.case.unibz.it/handle/2038/1265 %0 Journal Article %D 2005 %T Open Source Patenting %A Sara Boettiger %X The open source and free software movements have used self-perpetuating copyright licenses to maintain open access to publicly distributed software. This model of licensing has now migrated to the field of biotechnology, where patents rather than copyrights dominate proprietary rights. Consequently, a model for open source patenting or free biotechnology presents a constellation of legal issues not typically found in previous open source licensing. This paper discusses several of these issues, including the nature of the rights transferred, the activities that may trigger the terms of the license, and the legal prohibitions on certain forms of licensing. 
%8 April %G eng %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Open Source software, intrinsic motivations and profit-oriented firms. Do not firms practise what they preach? %A Andrea Bonaccorsi %A Cristina Rossi %X This paper contributes to the literature by providing empirical evidence on the incentives of firms that engage in Open Source activities. Data collected by a survey on 146 Italian companies supplying OS solutions (Open Source firms) show that (surprisingly) intrinsic, community-based incentives do play a role but are not, in general, put into practice. The discrepancy between attitudes and behaviours is investigated and firms adopting more consistent behaviours are singled out. Our results are in line with the literature on business models of firms entering the Open Source field. %B OSS2005: Open Source Systems %P 241-245 %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T OTRS: un sistema a ticket per la gestione dell’help desk %A Bencetti, Stefano %A Verduci, Gianni %K help desk %K open source %K ticketing system %K trouble ticket %X We describe the experience of using an open-source software package for ticket-based management of the IT help desk of the D.I.S.I. (Dipartimento di Informatica e Scienze dell’Informazione) of the Università degli Studi di Genova. %B OSS2005: Open Source Systems %P 281 %U http://pascal.case.unibz.it/handle/2038/1423 %0 Journal Article %D 2005 %T The Provision of a Public Good with a direct Provision Technology and Large Number of Agents %A Stefan Behringer %X This paper provides a limit result for the provision of a public good in a mechanism design framework as the number of agents gets large. A canonical example of a public good that is produced with a direct provision technology is Open Source Software. 
%8 January %G eng %> https://flosshub.org/sites/flosshub.org/files/behringer.pdf %0 Journal Article %J Group '05 Conference Proceedings %D 2005 %T Thematic Coherence and Quotation Practices in Open Source Software design-oriented online discussions %A F. Barcellini %A F. Detienne %A J. M. Burkhardt %A W. Sack %X This paper presents an analysis of online discussions in Open Source Software (OSS) design. The objective of our work is to understand and model the dynamics of OSS design that take place in mailing list exchanges. We show how quotation practices can be used to locate design-relevant data in discussion archives. OSS developers use quotation as a mechanism to maintain the discursive context. To retrace thematic coherence in the online discussions of a major OSS project, Python, we follow how messages are linked through quotation practices. We compare our quotation-based analysis with a more conventional one: a thread-based analysis of the reply-to links between messages. The advantages of a quotation-based analysis over a thread-based analysis are outlined. Our analysis also reveals the links between the social structure and elements in the discussion space, and how these links shape influence in the design process. %B Group '05 Conference Proceedings %I ACM %8 November %G eng %> https://flosshub.org/sites/flosshub.org/files/Barcellinietal.pdf %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Towards an Open Source Development Process - Evaluating the Migration to an Open Source Project by Means of the Capability Maturity Model %A Bleek, Wolf-Gideon %A Finck, Matthias %A Pape, Bernd %X In this paper we review the ongoing development of a Web-based community system that has been migrated from a closed software development to an open source project. We identify three different phases in the migration process where the development process changed significantly. We analyse these phases by means of the Capability Maturity Model (CMM). 
The insights gained show the implications of such a migration process towards open source concerning the process quality of a development process. They also show underlying assumptions of the CMM that do not totally match with developments in this specific case study. As a helpful outcome, our reflection about the ongoing software development process helped identify two crucial factors: reflection about the process is possible even at lower levels and how to handle people's fluctuation to sustain a development project. %B OSS2005: Open Source Systems %P 37-43 %U http://pascal.case.unibz.it/handle/2038/1543 %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Towards Supporting Agile Practice Within The Libre Software Paradigm %A Adams, Paul %A Boldyreff, Cornelia %K agile methods %K agile practice %K extreme programming %K libre software %K open source %K XP %X Individual agile methods have never been practiced as defined, in the same way that Royce's waterfall [1] model never reflected actual practice. Instead, practitioners adapted the core principles of these processes in order to suit their needs. Understanding this is key to appreciating the agile mindset. What does exist is a set of principles which, when followed loosely, form the agile practices. It is an important part of the agile mentality that the individuals within a project are more important than the process they follow. However, the individual methods do have their own identifying features that make them unique; for example testing must be performed before coding within eXtreme Programming (XP) [2]. However, if practitioners were to apply XP, exactly as Beck describes it, then they are probably not “doing agile” as they may not be following the process that suits their needs best. One of the interesting features of the XP method is its requirement of a collocated team. Th... 
%B OSS2005: Open Source Systems %P 303-304 %U http://pascal.case.unibz.it/handle/2038/1546 %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Using Open Source Tools to Support Collaboration Within CALIBRE %A Adams, Paul %A Nutter, David %A Rank, Stephen %A Boldyreff, Cornelia %X This paper describes the deployment of Plone, an Open-Source content management system, to support the activities of CALIBRE, an EU-funded coordination action integrating research into Libre software. The criteria by which Plone was selected are described, and the goodness of fit to these criteria is analysed. As a coordination action, CALIBRE involves 12 partners with different requirements and characteristics. The CALIBRE Working Environment (CWE) must therefore support a variety of users with different levels of technical expertise and expectations. Implementation of the support infrastructure for CALIBRE is ongoing, and has provided some interesting insights into the benefits of the use of libre software. Although Plone has not been explicitly developed as a collaboration infrastructure, with its wealth of plugins, it has proven highly adaptable for this purpose. %B OSS2005: Open Source Systems %P 61-65 %U http://pascal.case.unibz.it/handle/2038/1555 %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Using Plone To Support Collaborative Research %A Adams, Paul %A Nutter, David %A Rank, Stephen %A Boldyreff, Cornelia %K collaboration environment %K collaborative research %K content management system %K open source %K plugin %B OSS2005: Open Source Systems %P 296-297 %U http://pascal.case.unibz.it/handle/2038/1558 %0 Journal Article %D 2005 %T Webservice Protocol Design for Economic Liberty and Observability %A Norbert Bollow %X One big potential benefit of the webservices paradigm is in reducing the costs of inter-firm business transactions. That should allow small and medium-sized enterprises to compete successfully with big firms. 
This paper considers specifically the economic needs of peer-to-peer business alliances, defined as multiparty business alliances which are not under the control of any single firm or any small group of alliance members, so that each participating firm has full economic liberty. This organisational form is appropriate, for example, for Free Software businesses. The main conclusions are that achieving economic observability of business transactions is of great importance, and that this is difficult to achieve with the Remote Procedure Calls paradigm of JINI or XML / HTTP / SOAP based webservices. The problem can be overcome by using the SXDF / QQP / QRPC suite of webservice protocols, %8 March %G eng %> https://flosshub.org/sites/flosshub.org/files/bollow.pdf %0 Journal Article %J Proceedings of the 1st International Conference on Open Source Systems %D 2005 %T Why and how-to contribute to libre software when you integrate them into an in-house application ? %A Christian Bac %A Olivier Berger %A Véronique Deborde %A Benoit Hamet %X Free and open source software packages are common tools that anyone can use and customise at their convenience to create in-house applications. Using and customising free software is not sufficient to ensure that such an in-house application will be maintainable in the medium or long term. This paper draws lessons from our in-house project, the development of a groupware Web platform for researchers, to help define a policy through which efficient contributions can be made to open source software so that the in-house projects may remain viable. 
%B Proceedings of the 1st International Conference on Open Source Systems %P 113–118 %8 June %G eng %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T Workshop on "Open Source and Multimedia" %A Julien Bourgeois %A François Spies %A Dodero, Gabriella %A Vittoria Gianuzzi %B OSS2005: Open Source Systems %P 360 %0 Journal Article %D 2004 %T The Architecture of Cooperation: How Code Architecture Mitigates Free Riding in the Open Source Development Model %A Carliss Baldwin %X We argue that the architecture of a codebase is a critical factor that lies at the heart of the open source development process. To support this argument, we define two observable properties of an architecture: (1) its modularity and (2) its option values. Developers can make informed judgments about modularity and option value from early code releases. Their judgments in turn will influence their decisions to work and to contribute their code back to the community. We go on to suggest that the core of the open source development process can be thought of as two linked games played within a codebase architecture. The first game involves the implicit exchange of effort directed at the modules and option values of a codebase; the second is a Prisoners' Dilemma game triggered by the irreducible costs of communicating. The implicit exchange of effort among developers is made possible by the non-rivalrous nature of the codebase and by the modularity and option values of the codebase's architecture. This exchange creates value for all participants, both workers and free-riders. In contrast, the Prisoners' Dilemma is a problem that must be surmounted if the exchanges are to take place. It can be addressed through a combination of reducing the costs of communication, providing rewards, and encouraging repeated interactions. 
Finally, the initial design and "opening up" of a codebase can be seen as a rational move by an architect who is seeking to test the environment in hopes of initiating exchanges of effort with other developers. %8 January %G eng %> https://flosshub.org/sites/flosshub.org/files/baldwinclark.pdf %0 Conference Proceedings %B Proceedings of the 4th ICSE Workshop on Open Source %D 2004 %T Communication and Conflict Issues in Collaborative Software Research Projects %A Boldyreff, Cornelia %A Nutter, David %A Rank, Stephen %K artefact %K cvs %K genesis %K oscar %X The Open Source Component Artefact Repository (OSCAR) was developed under the auspices of the GENESIS project to store data produced during the software development process. Significant problems were encountered during the course of the project in both the development itself and management of the project. The reasons for and potential solutions to these problems are examined with the intention of developing a set of guidelines to enable participants in other collaborative projects to avoid these pitfalls. We wish to make it clear that we attach no opprobrium to any of the participants in the GENESIS project as many of the issues we outline below have solutions only visible with hindsight. Instead, we seek to provide a fair-minded critique of our role and the mistakes we made in a fairly typical two-year EU research project, and to provide a set of recommendations for other similar projects, in order that they can (attempt to) avoid suffering similarly. %B Proceedings of the 4th ICSE Workshop on Open Source %P 14-17 %> https://flosshub.org/sites/flosshub.org/files/boldyreff15-18.pdf %0 Journal Article %D 2004 %T The contestation of code: A preliminary investigation into the discourse of the free/libre and open source movements %A David M. Berry %X This paper uses discourse analysis to examine the free/libre and open source movements. 
It analyses how they fix elements within the order of discourse of computer code production. It attempts to uncover the key signifiers in their discourses and trace linkages between the sedimented discourses of wider society. Using discourse theory and critical discourse analysis, the theoretical foundations underpinning each of the movements are critically examined and the effect on the wider developer and Internet community is discussed. Additionally, this paper seeks to recommend discursive strategies that could be employed to avoid the threat of colonization by neoliberal discourse and the consequent challenge this has for the ideas of freedom, liberty and community within the developer communities' own discourses. %8 April %G eng %> https://flosshub.org/sites/flosshub.org/files/berry1.pdf %0 Conference Proceedings %B Proceedings of the 4th ICSE Workshop on Open Source %D 2004 %T Contributing to OS Projects. A Comparison between Individual and Firms %A Andrea Bonaccorsi %A Cristina Rossi %K Survey %X This paper studies the contributions software firms make to Open Source (OS) projects. Our goal is to ascertain whether they follow the same regularity of pattern seen for individual programmers. An exhaustive empirical analysis was carried out using data on project membership, project coordination and the contributions made by 146 Italian firms that do business with OS software. We compare our findings with the results of the surveys taken on OS programmers. The availability of the data gathered by Hertel et al. ([10]) on 141 developers of the Linux kernel allowed a direct comparison to be carried out between the two sets. %B Proceedings of the 4th ICSE Workshop on Open Source %P 18-22 %> https://flosshub.org/sites/flosshub.org/files/19-23.pdf %0 Unpublished Work %D 2004 %T Implicit theories of "good leadership" in the open-source community %A Gianluca Bosco %X The goal of this paper is to uncover the implicit theories (a.k.a. 
personal beliefs) of open-source developers concerning the characteristics and behaviors of a "good project leader". Three main behavioral factors are discovered to describe such implicit theories: competence, task orientation and person consideration. The conclusions of this study have been drawn from an analysis conducted on data gathered from 138 respondents. %8 April %G eng %> https://flosshub.org/sites/flosshub.org/files/bosco.pdf %0 Journal Article %D 2004 %T Internet Research: Privacy, Ethics and Alienation - An Open Source Approach %A David M. Berry %X This paper examines some of the ethical problems involved in undertaking Internet research and draws on historical accounts as well as contemporary studies to offer an analysis of the issues raised. It argues that privacy is a misleading and confusing concept to apply to the Internet, and that the concept of non-alienation is more resourceful in addressing the many ethical issues surrounding Internet research. Using this as a basis, the paper then investigates the Free/Libre and Open Source research model and argues for the principles of "open source ethics" in researching the online world, which includes a participatory and democratic research method. %8 December %G eng %> https://flosshub.org/sites/flosshub.org/files/berry2.pdf %0 Journal Article %D 2004 %T Intrinsic Motivation in Open Source Software Development %A Jurgen Bitzer %X This paper sheds light on the puzzling evidence that even though open source software (OSS) is a public good, it is developed for free by highly qualified, young and motivated individuals, and evolves at a rapid pace. We show that once OSS development is understood as the private provision of a public good, these features emerge quite naturally. We adapt a dynamic private-provision-of-public-goods model to reflect key aspects of the OSS phenomenon. In particular, instead of relying on extrinsic motives for programmers (e.g. 
signaling) the present model is driven by intrinsic motives of OSS programmers, such as user-programmers, play value or homo ludens payoff, and gift culture benefits. Such intrinsic motives feature extensively in the wider OSS literature and contribute new insights to the economic analysis. %8 September %G eng %> https://flosshub.org/sites/flosshub.org/files/bitzerschrettlschroder.pdf %0 Conference Proceedings %B Proceedings of the 4th ICSE Workshop on Open Source %D 2004 %T Migrating a Development Project to Open Source Software Development %A Bleek, W-G. %A Finck, M. %X The CommSy-system is a web-based community system, which has been in development since 1999 at the University of Hamburg. It has initially been developed by students and researchers in their spare time. Its last organizational setting was a publicly funded research project, which allowed for full-time and part-time developers. As that project has come to an end, we are aiming at an open source project to ensure continuity by providing a frame for people from different organizations. In this paper we discuss the characteristics of this specific project and of other open source projects to identify a strategy for migrating that particular project to open source. We outline the actions taken to migrate the existing project to open source software development and raise questions concerning the necessary characteristics of an open source project as well as whether the actions will suffice or not. %B Proceedings of the 4th ICSE Workshop on Open Source %P 9-13 %> https://flosshub.org/sites/flosshub.org/files/bleek10-14.pdf %0 Journal Article %D 2004 %T Open Source en el e-learning: ¿Una cuestión de mente? - Análisis del fenómeno del Open source en el e-learning, situación actual y tendencias %A Carlos Biscay %X First, we look at what Open Source is and what its objectives and fundamental characteristics are. 
Second, we trace the history of recent years, identifying the main Open Source developments and trends, especially in higher education. Third, we present the Sakai Project, in which a group of leading universities are joining forces to integrate and synchronize their educational software into a collection of Open Source tools for e-learning. We then mention the main OS applications in general. Next, we highlight the technical and psychological aspects present in Open Source and its protagonists, and the economic, political and cultural reasons that are linked to, or create a favourable context for, the growth of Open Source solutions in the market, especially the e-learning market. Finally, as a synthesis of these questions, we describe the specific case of the University of Wisconsin, which, together with other educational institutions, has joined the Sakai project. %8 January %G eng %> https://flosshub.org/sites/flosshub.org/files/biscay.pdf %0 Unpublished Work %D 2004 %T Open source software development put in an impure public goods context %A Federico Bertelli %X Open source software development appears to be a problem of pure public goods contribution, but a closer look raises the classic question posed by Lerner and Tirole: "Why should thousands of top-notch programmers contribute freely to the provision of a public good?". The aim of this research is therefore to elaborate a model able to cope with the low level of free riding. %8 June %G eng %> https://flosshub.org/sites/flosshub.org/files/bertelli.pdf %0 Journal Article %D 2004 %T SOS-ware DEVILS[Strategic Open Software DEVelopment ILlnesseS] %A George Blanas %X Certain categories of software play a strategic role in contemporary public and private organizations. 
While software use is accelerated and diffused to more and more people and organisations, software development follows a reverse trend where fewer players form oligopolies, with some of them having almost reached a state of monopoly in certain areas. The evil consequences of such an evolution can be numerous; some of them relate to economic and security dependence and others to phenomena of knowledge dependence and hysteresis. Within the current paper, we formulate a general framework that categorises the types of illnesses in open strategic software development from a number of viewpoints and the types of damage that could be inflicted on organizations and states as a result of false expectations if these illnesses persist. Finally, we identify the areas where research is considered to be urgently needed. %8 March %G eng %> https://flosshub.org/sites/flosshub.org/files/blanas1.pdf %0 Journal Article %D 2004 %T SOS-ware [Strategic Open Software] Perspectives %A George Blanas %X Certain types of software play a strategic role in the development of the various aspects of organizational life. One of these roles is knowledge development that can act as a facilitator of economic diamonds. We review the characteristics of strategic software and we try to answer the question whether there can exist open software development that would be able to incorporate these characteristics. Based on this review, and on certain case studies, we present a theory on how open software might be able to close the gaps in knowledge creation and usage - or the reverse, i.e. to become a vehicle for an acceleration of this hysteresis. %8 March %G eng %> https://flosshub.org/sites/flosshub.org/files/blanas2.pdf %0 Journal Article %J Electronic Markets %D 2004 %T Will the Open Source Movement Survive a Litigious Society? %A Vijay K. 
Vemuri %A Vince Bertone %K courts %K INNOVATION %K lawsuit %K litigation %K patents %K software patents %X Since no one is willing to undertake costly research and development to create innovation, incentives in the form of patents were instituted to motivate R&D. In software development, contrary to economic intuition, open source software has emerged as a viable alternative source of innovation. The patenting system has performed reasonably well in enhancing many other technologies. Since the mid-1990s patenting of software and business methods is increasingly accepted in the United States. The legitimacy of many of these new patents is subject to controversy and debate. In this paper we examine the trend, rate of litigation and disposition of US patents in the US Federal Courts. We find that litigation rates of software and business method patents are four times those of all other patents and are increasing. A majority of patent litigations are not won by the perpetrator of the lawsuits. The open source software community is not immune to heightened patent litigations. Since software development is incremental, the paths of OSS and commercial development are entwined. The spillover of patent litigation into OSS may have disastrous consequences: It may increase the 'cost' of OSS, dissuade volunteer developers and make OSS less attractive to users. %B Electronic Markets %V 14 %P 114-123 %0 Journal Article %D 2003 %T Adaptive entry strategies under dominant standards: Hybrid business models in the Open Source software industry %A Andrea Bonaccorsi %X Although a growing body of literature is analysing Open Source software (OSS) issues, there is still a lack of empirical data on the phenomenon and little is known about firms that enter the software industry by producing under the Open Source license scheme (Open Source firms). This paper is a contribution to fill this gap and focuses on the business models of these firms. 
We find significant heterogeneity among them; in particular, many agents supply both proprietary and Open Source software. We present a model of adoption that studies the intra-firm diffusion of the new paradigm. Explanatory hypotheses are discussed analysing how the characteristics of the Open Source production mode and of network externalities in software demand shape the strategies of firms that entered the OSS field. %8 March %G eng %> https://flosshub.org/sites/flosshub.org/files/bonnacorsirossigiannangeli.pdf %0 Journal Article %D 2003 %T Altruistic individuals, selfish firms? The structure of motivation in Open Source software %A Andrea Bonaccorsi %X A growing body of economic literature is addressing the incentives of the individuals that take part in the Open Source movement. However, empirical analyses focus on individual developers and neglect firms that do business with Open Source software (OSS). During 2002, we conducted a large-scale survey on 146 Italian firms supplying Open Source solutions in Italy. In this paper our data on firms' motivations are compared with data collected by the surveys made on individual programmers. We aim at analysing the role played by different classes of motivations (social, economic and technological) in determining the involvement of different groups of agents in Open Source %8 August %G eng %> https://flosshub.org/sites/flosshub.org/files/bnaccorsirossimotivationshort.pdf %0 Journal Article %D 2003 %T An analysis of Open Source production in Italy %A Andrea Bonaccorsi %X Final report of a survey on Italian firms that do business with Open Source software %8 August %G eng %> https://flosshub.org/sites/flosshub.org/files/bonaccorsirossiccatenieliss.pdf %0 Journal Article %D 2003 %T Comparing motivations of individual programmers and firms to take part in the Open Source movement. 
From community to business %A Andrea Bonaccorsi %X A growing body of economic literature is addressing the incentives of the individuals that take part in the Open Source movement. However, empirical analyses focus on individual developers and neglect firms that do business with Open Source software (OSS). During 2002, we conducted a large-scale survey on 146 Italian firms supplying Open Source solutions in Italy. In this paper our data on firms' motivations are compared with data collected by the surveys made on individual programmers. We aim at analysing the role played by different classes of motivations (social, economic and technological) in determining the involvement of different groups of agents in Open Source activities. %8 October %G eng %> https://flosshub.org/sites/flosshub.org/files/bnaccorsirossimotivationlong.pdf %0 Journal Article %D 2003 %T Contributing to the common pool resources in Open Source software. A comparison between individuals and firms %A Andrea Bonaccorsi %K developers %K linux %K linux kernel %K Survey %X This paper studies the contributions to Open Source projects of software firms. Our goal is to analyse whether they follow the same regularities that characterize the behaviour of individual programmers. An exhaustive empirical analysis is carried out using data on project membership, project coordination and contribution efforts of 146 Italian firms that do business with Open Source software. We follow a meta-analytic approach comparing our findings with the results of the surveys conducted on Free Software programmers. Moreover, the availability of the data gathered by Hertel et al. (2003) on 141 developers of the Linux kernel will allow direct comparisons between the two sets. 
%8 August %G eng %> https://flosshub.org/sites/flosshub.org/files/bnaccorsirossidevelopers.pdf %0 Journal Article %D 2003 %T The Contribution of Free Software to Software Evolution %A Andreas Bauer %X It is remarkable to think that even without any interest in finding suitable methods and concepts that would allow complex software systems to evolve and remain manageable, the ever growing open source movement has silently managed to establish highly successful evolution techniques over the last two decades. These concepts represent best practices that could be applied equally to a number of today's most crucial problems concerning the evolution of complex commercial software systems. In this paper, the authors state and explain some of these principles from the perspective of experienced open source developers, and give the rationale as to why the highly dynamic free software development process, as a whole, is entangled with constantly growing code bases and changing project sizes, and how it deals with these successfully. %8 September %G eng %> https://flosshub.org/sites/flosshub.org/files/bauerpizka.pdf %0 Journal Article %J MIS Quarterly %D 2003 %T The identity crisis within the IS discipline: Defining and communicating the discipline’s core properties %A Izak Benbasat %A Robert W. Zmud %B MIS Quarterly %V 27 %P 183–194 %G eng %> https://flosshub.org/sites/flosshub.org/files/westdedrick.pdf %0 Journal Article %D 2003 %T Licensing schemes in the production and distribution of Open Source software. An empirical investigation %A Andrea Bonaccorsi %X Contrary to what most people assume, Open Source doesn't just mean access to the source code. Software is considered Open Source if and only if its distribution terms [i.e. the license] comply with the set of criteria defined by the Open Source Definition (OSD). That is, to say that code is Open Source is to say that it is subject to a member of a particular category of licenses (McGowan, 2000). 
Like many other topics in the Open Source field, research on Open Source licenses suffers from a lack of empirical data. Although in the literature there are empirical studies that explore the relationships between license choice and project characteristics (Lerner and Tirole, 2002a), at present we are not aware of surveys that collect data on licensors, that is, on firms producing and distributing software on an Open Source basis. This study addresses this shortcoming. We examine the license choice of the firms that supply Open Source products and services and relate it to their structural characteristics, business models and attitudes towards the movement and its community. Between September 2002 and March 2003 we conducted a survey of Italian firms that do business with Open Source software. We asked them to indicate the Open Source licenses with which they work, for the distribution of their software as well as the production process. We made reference to the distinction between copyleft and non-copyleft distribution schemes. Using these data, this paper aims at testing several theoretical hypotheses advanced by the literature on Open Source licenses. In order to make the discussion more lively, for each issue we present the hypothesis and our findings in sequence. %8 August %G eng %> https://flosshub.org/sites/flosshub.org/files/bnaccorsirossilicense.pdf %0 Journal Article %D 2003 %T Open Source Software as an organisational Technology %A Jonathan Barnes %X This paper is still relatively preliminary, yet it provides a decent introduction to open source, as well as including discussion on various economic issues, contained in the following sections: The benefits of Open Source, Possible incentives that encourage contribution, Barriers to widespread implementation of Open Source.
%8 July %G eng %> https://flosshub.org/sites/flosshub.org/files/barnes.pdf %0 Conference Proceedings %B Proceedings of the 3rd ICSE Workshop on Open Source %D 2003 %T Open-Source Development Processes and Tools %A Boldyreff, Cornelia %A Lavery, J. %A Nutter, David %A Rank, Stephen %B Proceedings of the 3rd ICSE Workshop on Open Source %P 15-18 %> https://flosshub.org/sites/flosshub.org/files/15-18.pdf %0 Journal Article %J RP Special Issue %D 2003 %T Why open source software can succeed %A Andrea Bonaccorsi %X The paper discusses three key economic problems raised by the emergence and diffusion of open source software: motivation, coordination, and diffusion under a dominant standard. First, the movement took off through the activity of a software development community that deliberately did not follow profit motivations. Second, a hierarchical coordination emerged without the support of an organization with proprietary rights. Third, Linux and other open source systems diffused in an environment dominated by established proprietary standards, which benefited from significant increasing returns. The paper shows that recent developments in the theory of critical mass in the diffusion of technologies with network externality may help to explain these phenomena. %B RP Special Issue %8 February %G eng %> https://flosshub.org/sites/flosshub.org/files/rp-bonaccorsirossi.pdf %0 Journal Article %J Yale Law Journal %D 2002 %T Coase's penguin, or, Linux and The Nature of the Firm %A Benkler, Y. %X For decades our common understanding of the organization of economic production has been that individuals order their productive activities in one of two ways: either as employees in firms, following the directions of managers, or as individuals in markets, following price signals. This dichotomy was first identified in the early work of Ronald Coase and was developed most explicitly in the work of institutional economist Oliver Williamson.
Recently, public attention has focused on a fifteen-year-old phenomenon called free software or open source software. This phenomenon involves thousands, or even tens of thousands, of computer programmers who collaborate on large- and small-scale projects without traditional firm-based or market-based ownership of the resulting product. This Article explains why free software is only one example of a much broader social-economic phenomenon emerging in the digitally networked environment, a third mode of production that the author calls "commons-based peer production." The Article begins by demonstrating the widespread use of commons-based peer production on the Internet through a number of detailed examples, such as Wikipedia, Slashdot, the Open Directory Project, and Google. The Article uses these examples to reveal fundamental characteristics of commons-based peer production that distinguish it from the property- and contract-based modes of firms and markets. The central distinguishing characteristic is that groups of individuals successfully collaborate on large-scale projects following a diverse cluster of motivational drives and social signals rather than market prices or managerial commands. The Article then explains why this mode has systematic advantages over markets and managerial hierarchies in the digitally networked environment when the object of production is information or culture. First, peer production has an advantage in what the author calls "information opportunity cost," because it loses less information about who might be the best person for a given job. Second, there are substantial increasing allocation gains to be captured from allowing large clusters of potential contributors to interact with large clusters of information resources in search of new projects and opportunities for collaboration.
The Article concludes with an overview of how these models use a variety of technological, social, and formal strategies to overcome the collective action problems usually solved in managerial and market-based systems by property, contract, and managerial commands. %B Yale Law Journal %V 112 %P 369-+ %8 Dec %@ 0044-0094 %G eng %M ISI:000180062600001 %1 economics %2 case study %0 Journal Article %J Information Systems Journal %D 2002 %T Code quality analysis in open source software development %A Ioannis Stamelos %A Lefteris Angelis %A Apostolos Oikonomou %A Georgios L. Bleris %K C %K Code quality characteristics %K functions %K linux %K metrics %K open source development %K software measurement %K structural code analysis %K Suse %K user satisfaction %X Proponents of open source style software development claim that better software is produced using this model compared with the traditional closed model. However, there is little empirical evidence in support of these claims. In this paper, we present the results of a pilot case study aiming: (a) to understand the implications of structural quality; and (b) to figure out the benefits of structural quality analysis of the code delivered by open source style development. To this end, we have measured quality characteristics of 100 applications written for Linux, using a software measurement tool, and compared the results with the industrial standard that is proposed by the tool. Another target of this case study was to investigate the issue of modularity in open source as this characteristic is being considered crucial by the proponents of open source for this type of software development. We have empirically assessed the relationship between the size of the application components and the delivered quality measured through user satisfaction. We have determined that, up to a certain extent, the average component size of an application is negatively related to the user satisfaction for this application. 
%B Information Systems Journal %V 12 %P 43–60 %0 Journal Article %D 2002 %T Community Effort in Online Groups? Who Does the Work and Why? %A Brian Butler %X In this paper, the authors consider how the formal leadership role, personal and community benefits, and community characteristics influence the effort members put into helping their online groups. Results from a survey of Internet listserv owners and other members suggest that though owners, who have a formal leadership role, do more of the effortful community building work than do regular members, other members also take on some of the work. Moreover, members who value different benefits are likely to contribute to the development of an online community in different ways. %8 February %G eng %> https://flosshub.org/sites/flosshub.org/files/butler.pdf %0 Conference Paper %B 2nd Workshop on Open Source Software Engineering at ICSE 2002 %D 2002 %T Open-Source Artefact Management %A Boldyreff, Cornelia %A Nutter, David %A Rank, Stephen %K artefacts %K artifacts %K genesis %K oscar %K workflow %X This paper presents the GENESIS project, which aims to develop an open-source, lightweight, process-aware (and process-neutral) workflow management system. In particular OSCAR, the artefact repository, is discussed. The requirements of a system for artefact management and storage are described, and the concept of active artefacts is explained. The software engineering methods which will be used in the project are described, and some examples of the open-source tools which may be used are described. %B 2nd Workshop on Open Source Software Engineering at ICSE 2002 %> https://flosshub.org/sites/flosshub.org/files/BoldyreffNutterRank.pdf %0 Conference Paper %B 1st Workshop on Open Source Software Engineering at ICSE 2001 %D 2001 %T Configuration Management for Open Source Software %A Asklund, U. %A Bendix, L.
%K configuration management %K interviews %K project success %X Open Source Software (OSS) projects have a seemingly anarchistic way of organising projects and a set-up (many, distributed developers) that is usually considered difficult to handle within the field of configuration management. Still they manage to produce software that is of at least as high a quality as that produced by Conventional Software Development (CSD) projects. We have investigated more closely what they actually do, and why they are so successful. The goal of the study was to describe their underlying configuration management process, thereby making it explicit, so it can be followed in case others (like commercial companies) want to start an OSS project or a project having similar characteristics. We also analysed to what extent their success is due to a good process, good tools or simply to outstanding people participating in OSS projects. Based on this, lessons could be learned from OSS and possibly transferred to conventional ways of developing software. We interviewed key people from three OSS projects (KDE, Mozilla and Linux) to obtain data for our study. %B 1st Workshop on Open Source Software Engineering at ICSE 2001 %8 05/2001 %> https://flosshub.org/sites/flosshub.org/files/asklundbendix.pdf %0 Conference Paper %B 1st Workshop on Open Source Software Engineering at ICSE 2001 %D 2001 %T Creating a Free, Dependable Software Engineering Environment for Building Java Applications %A Bittman, M. %A Roos, R. %A Kapfhammer, G.M. %K applications %K cvs %K Doc++ %K GNU Make %K GVim %K Jakarta Ant %K java %K Javadoc %K jrefactory %K junit %K tools %X As open source software engineering becomes more prevalent, employing sound software engineering practices and the tools used to implement these practices becomes more important. This paper examines the current status of free software engineering tools.
For each set of tools, we determined the important attributes that would best assist a developer in each stage of the waterfall model. We rated each tool based on predetermined attributes. We used the creation of a graphical user interface based email client in Java to assist in evaluating each tool. Our findings show that there is still a need for free tools to extract UML diagrams, test graphical user interfaces, make configuring Emacs easier, and profile Java applications. In other areas there are free tools that provide satisfactory functionality such as Concurrent Versions System (CVS), GVim, JUnit, JRefactory, GNU Make, Jakarta Ant, Javadoc, and Doc++. %B 1st Workshop on Open Source Software Engineering at ICSE 2001 %> https://flosshub.org/sites/flosshub.org/files/bittman.pdf %0 Journal Article %J Communications of the Association for Information Systems %D 1999 %T Dimensions of information systems success %A Seddon, PB %A Staples, S %A Patnayakuni, R %A Bowtell, M %B Communications of the Association for Information Systems %V 20 %P 61 %G eng