%0 Journal Article
%J Information Economics and Policy
%D 2008
%T The allocation of collaborative efforts in open-source software
%A den Besten, Matthijs
%A Jean-Michel Dalle
%A Galia, Fabrice
%K age
%K apache
%K complexity
%K cvs
%K division of labor
%K functions
%K gaim
%K gcc
%K ghostscript
%K lines of code
%K loc
%K log files
%K mozilla
%K netbsd
%K openssh
%K postgresql
%K python
%K revision control
%K scm
%K size
%K source code
%K Stigmergy
%K version control
%X The article investigates the allocation of collaborative efforts among core developers (maintainers) of open-source software by analyzing on-line development traces (logs) for a set of 10 large projects. Specifically, we investigate whether the division of labor within open-source projects is influenced by characteristics of software code. We suggest that the collaboration among maintainers tends to be influenced by different measures of code complexity. We interpret these findings by providing preliminary evidence that the organization of open-source software development would self-adapt to characteristics of the code base, in a 'stigmergic' manner.
%B Information Economics and Policy
%V 20
%P 316 - 322
%U http://www.sciencedirect.com/science/article/B6V8J-4SSG4PN-1/2/88b3824c30a31c18929d8a5ca6d64f62
%R DOI: 10.1016/j.infoecopol.2008.06.003

%0 Conference Paper
%B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007)
%D 2007
%T Comparing Approaches to Mining Source Code for Call-Usage Patterns
%A Kagdi, Huzefa
%A Collard, Michael L.
%A Maletic, Jonathan I.
%K function calls
%K functions
%K kernel
%K linux
%K sequence
%K sequencing
%K sequential-pattern mining
%X Two approaches for mining function-call usage patterns from source code are compared. The first approach, itemset mining, has recently been applied to this problem. The other approach, sequential-pattern mining, has not been previously applied to this problem.
Here, a call-usage pattern is a composition of function calls that occur in a function definition. Both approaches look for frequently occurring patterns that represent standard usage of functions and identify possible errors. Itemset mining produces unordered patterns, i.e., sets of function calls, whereas sequential-pattern mining produces partially ordered patterns, i.e., sequences of function calls. The trade-off between the additional ordering context given by sequential-pattern mining and the efficiency of itemset mining is investigated. The two approaches are applied to the Linux kernel v2.6.14 and results show that mining ordered patterns is worth the additional cost.
%B Fourth International Workshop on Mining Software Repositories (MSR'07:ICSE Workshops 2007)
%I IEEE
%C Minneapolis, MN, USA
%P 20 - 20
%@ 0-7695-2950-X
%R 10.1109/MSR.2007.3
%> https://flosshub.org/sites/flosshub.org/files/28300020.pdf

%0 Conference Paper
%B Proceedings of the 2006 international workshop on Mining software repositories
%D 2006
%T Examining the evolution of code comments in PostgreSQL
%A Zhen Ming Jiang
%A Hassan, Ahmed E.
%K code comments
%K comments
%K cvs
%K evolution
%K functions
%K maintenance
%K mining challenge
%K msr challenge
%K postgresql
%K software evolution
%K software maintenance
%K source code
%X It is common, especially in large software systems, for developers to change code without updating its associated comments due to their unfamiliarity with the code or due to time constraints. This is a potential problem since outdated comments may confuse or mislead developers who perform future development. Using data recovered from CVS, we study the evolution of code comments in the PostgreSQL project. Our study reveals that over time the percentage of commented functions remains constant except for early fluctuation due to the commenting style of a particular active developer.
%B Proceedings of the 2006 international workshop on Mining software repositories
%S MSR '06
%I ACM
%C New York, NY, USA
%P 179–180
%@ 1-59593-397-2
%U http://doi.acm.org/10.1145/1137983.1138030
%R http://doi.acm.org/10.1145/1137983.1138030
%> https://flosshub.org/sites/flosshub.org/files/179ExaminingTheEvolution.pdf

%0 Journal Article
%D 2005
%T Exploring the Structure of Complex Software Designs: An Empirical Study of Open Source and Proprietary Code (updated)
%A Alan MacCormack
%A John Rusnak
%A Carliss Baldwin
%K complexity
%K cost
%K dependencies
%K functions
%K lines of code
%K linux
%K loc
%K mozilla
%K source code
%X This paper reports data from a study that seeks to characterize the differences in design structure between complex software products. In particular, we use Design Structure Matrices (DSMs) to map the dependencies between the elements of a design and define metrics that allow us to compare the structures of different designs. We first use these metrics to compare the architectures of two software products - the Linux operating system and the Mozilla web browser - that were developed via contrasting modes of organization: specifically, open source versus proprietary development. We then track the evolution of Mozilla, paying particular attention to a purposeful "re-design" effort that was undertaken with the intention of making the product more "modular." We find significant differences in structure between Linux and the first version of Mozilla, suggesting that Linux had a more modular architecture. We also find that the redesign of Mozilla resulted in an architecture that was significantly more modular than that of its predecessor, and indeed, than that of Linux. Our results, while exploratory, are consistent with a view that different modes of organization are associated with designs that possess different structures. However, we also illustrate that purposeful managerial actions can have a large impact on structure.
This latter result is important given recent moves to release proprietary software into the public domain. These moves are likely to fail unless the product possesses an architecture that facilitates participation. Our paper provides evidence that a tightly-coupled design can be adapted to meet this objective.
%8 June
%G eng
%> https://flosshub.org/sites/flosshub.org/files/maccormackrusnakbaldwin2.pdf

%0 Conference Paper
%B Proceedings of the 2005 international workshop on Mining software repositories
%D 2005
%T Recovering system specific rules from software repositories
%A Williams, Chadd C.
%A Hollingsworth, Jeffrey K.
%K function usage patterns
%K functions
%K source code
%K wine
%X One of the most successful applications of static analysis based bug finding tools is to search the source code for violations of system-specific rules. These rules may describe how functions interact in the code, how data is to be validated or how an API is to be used. To apply these tools, the developer must encode a rule that must be followed in the source code. The difficulty is that many of these system-specific rules are undocumented and "grow" over time as the source code changes. Most research in this area relies on expert programmers to document these little-known rules. In this paper we discuss a method to automatically recover a subset of these rules, function usage patterns, by mining the software repository. We present a preliminary study that applies our work to a large open source software project.
%B Proceedings of the 2005 international workshop on Mining software repositories
%S MSR '05
%I ACM
%C New York, NY, USA
%P 7-11
%@ 1-59593-123-6
%U http://doi.acm.org/10.1145/1082983.1083144
%R http://doi.acm.org/10.1145/1082983.1083144
%> https://flosshub.org/sites/flosshub.org/files/7Recovering.pdf

%0 Conference Paper
%B International Workshop on Mining Software Repositories (MSR 2004)
%D 2004
%T LASER: a lexical approach to analogy in software reuse
%A Amin, R.
%A Mel O Cinneide
%A Veale, Tony
%K class
%K developers
%K functions
%K jrefactory
%K method
%K naming
%K natural language
%K reuse
%K source code
%K wordnet
%X Software reuse is the process of creating a software system from existing software components, rather than creating it from scratch. With the increase in size and complexity of existing software repositories, the need to provide intelligent support to the programmer becomes more pressing. An analogy is a comparison of certain similarities between things which are otherwise unlike. This concept has been shown to be valuable in developing UML-level reuse techniques. In the LASER project we apply lexically-driven Analogy at the code level, rather than at the UML-level, in order to retrieve matching components from a repository of existing components. Using the lexical ontology WordNet, we have conducted a case study to assess if class and method names in open source applications are used in a semantically meaningful way. Our results demonstrate that both hierarchical reuse and parallel reuse can be enhanced through the use of lexically-driven Analogy.
%B International Workshop on Mining Software Repositories (MSR 2004)
%I IEE
%C Edinburgh, Scotland, UK
%V 2004
%P 112 - 116
%R 10.1049/ic:20040487
%> https://flosshub.org/sites/flosshub.org/files/112LASER.pdf

%0 Journal Article
%J Information Systems Journal
%D 2002
%T Code quality analysis in open source software development
%A Ioannis Stamelos
%A Lefteris Angelis
%A Apostolos Oikonomou
%A Georgios L. Bleris
%K C
%K Code quality characteristics
%K functions
%K linux
%K metrics
%K open source development
%K software measurement
%K structural code analysis
%K Suse
%K user satisfaction
%X Proponents of open source style software development claim that better software is produced using this model compared with the traditional closed model. However, there is little empirical evidence in support of these claims.
In this paper, we present the results of a pilot case study aiming: (a) to understand the implications of structural quality; and (b) to figure out the benefits of structural quality analysis of the code delivered by open source style development. To this end, we have measured quality characteristics of 100 applications written for Linux, using a software measurement tool, and compared the results with the industrial standard that is proposed by the tool. Another target of this case study was to investigate the issue of modularity in open source as this characteristic is being considered crucial by the proponents of open source for this type of software development. We have empirically assessed the relationship between the size of the application components and the delivered quality measured through user satisfaction. We have determined that, up to a certain extent, the average component size of an application is negatively related to the user satisfaction for this application.
%B Information Systems Journal
%V 12
%P 43–60

%0 Conference Paper
%B Proceedings of the International Conference on Software Maintenance (ICSM'00)
%D 2000
%T Evolution in Open Source Software: A Case Study
%A Godfrey, Michael W.
%A Tu, Qiang
%K evolution
%K functions
%K growth
%K lines of code
%K linux
%K linux kernel
%K loc
%K source code
%X Most studies of software evolution have been performed on systems developed within a single company using traditional management techniques. With the widespread availability of several large software systems that have been developed using an 'open source' development approach, we now have a chance to examine these systems in detail, and see if their evolutionary narratives are significantly different from commercially developed systems. This paper summarizes our preliminary investigations into the evolution of the best known open source system: the Linux operating system kernel.
Because Linux is large (over two million lines of code in the most recent version) and because its development model is not as tightly planned and managed as most industrial software processes, we had expected to find that Linux was growing more slowly as it got bigger and more complex. Instead, we have found that Linux has been growing at a super-linear rate for several years. In this paper, we explore the evolution of the Linux kernel both at the system level and within the major subsystems, and we discuss why we think Linux continues to exhibit such strong growth.
%B Proceedings of the International Conference on Software Maintenance (ICSM'00)
%S ICSM '00
%I IEEE Computer Society
%C Washington, DC, USA
%P 131–
%@ 0-7695-0753-0
%U http://portal.acm.org/citation.cfm?id=850948.853411
%> https://flosshub.org/sites/flosshub.org/files/godfrey00.pdf