%0 Conference Proceedings
%B 12th Working Conference on Mining Software Repositories (MSR 2015)
%D 2015
%T Investigating Code Review Practices in Defective Files: An Empirical Study of the Qt System
%A Thongtanunam, Patanamon
%A McIntosh, Shane
%A Hassan, Ahmed E.
%A Iida, Hajimu
%K code review
%K software quality
%X Software code review is a well-established software quality practice. Recently, Modern Code Review (MCR) has been widely adopted in both open source and proprietary projects. To evaluate the impact that characteristics of MCR practices have on software quality, this paper comparatively studies MCR practices in defective and clean source code files. We investigate defective files along two perspectives: 1) files that will eventually have defects (i.e., future-defective files) and 2) files that have historically been defective (i.e., risky files). Through an empirical study of 11,736 reviews of changes to 24,486 files from the Qt open source project, we find that both future-defective files and risky files tend to be reviewed less rigorously than their clean counterparts. We also find that the concerns addressed during the code reviews of both defective and clean files tend to enhance evolvability, i.e., ease future maintenance (like documentation), rather than focus on functional issues (like incorrect program logic). Our findings suggest that although functionality concerns are rarely addressed during code review, the rigor of the reviewing process that is applied to a source code file throughout a development cycle shares a link with its defect proneness.
%I IEEE
%8 05/2015
%U http://sail.cs.queensu.ca/publications/pubs/msr2015-thongtanunam.pdf
%> https://flosshub.org/sites/flosshub.org/files/msr2015-thongtanunam.pdf

%0 Conference Paper
%B Proceedings of the 29th IEEE International Conference on Software Maintenance
%D 2013
%T How does Context affect the Distribution of Software Maintainability Metrics?
%A Zhang, Feng
%A Mockus, Audris
%A Zou, Ying
%A Khomh, Foutse
%A Hassan, Ahmed E.
%K benchmark
%K context
%K contextual factor
%K flossmole
%K large scale
%K metrics
%K mining software repositories
%K sampling
%K software maintainability
%K sourceforge
%K static metrics
%X Software metrics have many uses, e.g., defect prediction, effort estimation, and benchmarking an organization against peers and industry standards. In all these cases, metrics may depend on the context, such as the programming language. Here we aim to investigate if the distributions of commonly used metrics do, in fact, vary with six context factors: application domain, programming language, age, lifespan, the number of changes, and the number of downloads. For this preliminary study we select 320 nontrivial software systems from SourceForge. These software systems are randomly sampled from nine popular application domains of SourceForge. We calculate 39 metrics commonly used to assess software maintainability for each software system and use the Kruskal-Wallis test and the Mann-Whitney U test to determine if there are significant differences among the distributions with respect to each of the six context factors. We use Cliff’s delta to measure the magnitude of the differences and find that all six context factors affect the distribution of 20 metrics and the programming language factor affects 35 metrics.
We also briefly discuss how each context factor may affect the distribution of metric values. We expect our results to help software benchmarking and other software engineering methods that rely on these commonly used metrics to be tailored to a particular context.
%S ICSM '13
%> https://flosshub.org/sites/flosshub.org/files/icsm2013_contextstudy.pdf

%0 Journal Article
%J Empirical Software Engineering
%D 2013
%T Management of community contributions
%A Bettenburg, Nicolas
%A Hassan, Ahmed E.
%A Adams, Bram
%A German, Daniel M.
%K android
%K contribution
%K linux
%K management
%X In recent years, many companies have realized that collaboration with a thriving user or developer community is a major factor in creating innovative technology driven by market demand. As a result, businesses have sought ways to stimulate contributions from developers outside their corporate walls, and integrate external developers into their development process. To support software companies in this process, this paper presents an empirical study on the contribution management processes of two major, successful, open source software ecosystems. We contrast a for-profit (ANDROID) system having a hybrid contribution style, with a not-for-profit (LINUX kernel) system having an open contribution style. To guide our comparisons, we base our analysis on a conceptual model of contribution management that we derived from a total of seven major open-source software systems. A quantitative comparison based on data mined from the ANDROID code review system and the LINUX kernel code review mailing lists shows that both projects have significantly different contribution management styles, suited to their respective market goals, but with individual advantages and disadvantages that are important for practitioners.
Contribution management is a real-world problem that has received very little attention from the research community so far. Both studied systems (LINUX and ANDROID) employ different strategies and techniques for managing contributions, and both approaches are valuable examples for practitioners. Each approach has specific advantages and disadvantages that need to be carefully evaluated by practitioners when adopting a contribution management process in practice.
%B Empirical Software Engineering
%I Springer
%P 1–38
%U http://link.springer.com/article/10.1007/s10664-013-9284-6

%0 Conference Paper
%B 2010 7th IEEE Working Conference on Mining Software Repositories (MSR 2010)
%D 2010
%T Should I contribute to this discussion?
%A Ibrahim, Walid M
%A Bettenburg, Nicolas
%A Shihab, Emad
%A Adams, Bram
%A Hassan, Ahmed E.
%K apache
%K contributions
%K developers
%K email
%K email archives
%K mailing lists
%K postgresql
%K python
%X Development mailing lists play a central role in facilitating communication in open source projects. Since these lists frequently host design and project discussions, knowledgeable contribution to these discussion threads is essential to avoid miscommunication that might slow down the progress of a project. However, given the sheer volume of emails on these lists, it is easy to miss important discussions. To find out how developers are able to deal with mailing list discussions, we study the main factors that encourage developers to contribute to the development mailing lists. We develop personalized models to automatically identify discussion threads that a developer would contribute to based on his previous contribution behavior. Case studies on development mailing lists of three open source projects (Apache, PostgreSQL and Python) show that the average accuracy of our models is 89-85% and that the models vary significantly between different developers.
%I IEEE
%C Cape Town
%P 181-190
%@ 978-1-4244-6802-7
%R 10.1109/MSR.2010.5463345
%> https://flosshub.org/sites/flosshub.org/files/181ibrahim-msr2010.pdf

%0 Conference Paper
%B Proceedings of the 2008 international workshop on Mining software repositories - MSR '08
%D 2008
%T Branching and merging in the repository
%A Spacco, Jamie
%A Williams, Chadd C.
%Y Hassan, Ahmed E.
%Y Lanza, Michele
%Y Godfrey, Michael W.
%K argouml
%K changes
%K cvs2svn
%K diffj
%K revision
%K scm
%K source code
%K version control
%X Two of the most complex operations version control software allows a user to perform are branching and merging. Branching provides the user the ability to create a copy of the source code to allow changes to be stored in version control but outside of the trunk. Merging provides the user the ability to copy changes from a branch to the trunk. Performing a merge can be a tedious operation and one that may be error prone. In this paper, we compare file revisions found on branches with those found on the trunk to determine when a change that is applied to a branch is moved to the trunk. This will allow us to study how developers use merges and to determine if merges are in fact more error prone than other commits.
%I ACM Press
%C New York, New York, USA
%P 19-22
%8 05/2008
%@ 9781605580241
%! MSR '08
%R 10.1145/1370750.1370754
%> https://flosshub.org/sites/flosshub.org/files/p19-williams.pdf

%0 Conference Paper
%B Proceedings of the 2008 international workshop on Mining software repositories - MSR '08
%D 2008
%T Determinism and evolution
%A González-Barahona, Jesús M.
%A Robles, Gregorio
%A Herraiz, Israel
%Y Hassan, Ahmed E.
%Y Lanza, Michele
%Y Godfrey, Michael W.
%K changes
%K evolution
%K source code
%K sourceforge
%X It has been proposed that software evolution follows Self-Organized Criticality (SOC) dynamics. This claim is supported by the presence of long range correlations in the time series of the number of changes made to the source code over time. Those long range correlations imply that the current state of the project was determined long ago. In other words, the evolution of the software project is governed by a sort of determinism. But this idea seems to contradict intuition. To explore this apparent contradiction, we have performed an empirical study on a sample of 3,821 libre (free, open source) software projects, finding that their evolution is short range correlated. This suggests that the dynamics of software evolution may not be SOC, and therefore that the past of a project does not determine its future except for relatively short periods of time, at least for libre software.
%I ACM Press
%C New York, New York, USA
%P 1-9
%8 05/2008
%@ 9781605580241
%! MSR '08
%R 10.1145/1370750.1370752
%> https://flosshub.org/sites/flosshub.org/files/p1-herraiz.pdf

%0 Conference Paper
%B Proceedings of the 2006 international workshop on Mining software repositories
%D 2006
%T Examining the evolution of code comments in PostgreSQL
%A Jiang, Zhen Ming
%A Hassan, Ahmed E.
%K code comments
%K comments
%K cvs
%K evolution
%K functions
%K maintenance
%K mining challenge
%K msr challenge
%K postgresql
%K software evolution
%K software maintenance
%K source code
%X It is common, especially in large software systems, for developers to change code without updating its associated comments due to their unfamiliarity with the code or due to time constraints. This is a potential problem since outdated comments may confuse or mislead developers who perform future development.
Using data recovered from CVS, we study the evolution of code comments in the PostgreSQL project. Our study reveals that over time the percentage of commented functions remains constant except for early fluctuation due to the commenting style of a particular active developer.
%S MSR '06
%I ACM
%C New York, NY, USA
%P 179–180
%@ 1-59593-397-2
%U http://doi.acm.org/10.1145/1137983.1138030
%R 10.1145/1137983.1138030
%> https://flosshub.org/sites/flosshub.org/files/179ExaminingTheEvolution.pdf