%0 Conference Proceedings %B Open Source Systems: Towards Robust Practices 13th International Conference on Open Source Systems %D 2017 %T Considering the use of walled gardens for FLOSS project communication %A Squire, Megan %K apache %K chat %K communication %K email %K free software %K irc %K mailing list %K open source %K Slack %K Stack Overflow %K teams %K Wordpress %X At its core, free, libre, and open source software (FLOSS) is defined by its adherence to a set of licenses that give various freedoms to the users of the software, for example the ability to use the software, to read or modify its source code, and to distribute the software to others. In addition, many FLOSS projects and developers also champion other values related to "freedom" and "openness", such as transparency, for example in communication and decision-making, or community-orientedness, for example in broadening access, collaboration, and participation. This paper explores how one increasingly common software development practice - communicating inside non-archived, third-party "walled gardens" - puts these FLOSS values into conflict. If communities choose to use non-archived walled gardens for communication, they may be prioritizing one type of openness (broad participation) over another (transparency). We use 18 FLOSS projects as a sample to describe how walled gardens are currently being used for intra-project communication, as well as to determine whether or not these projects provide archives of these communications. Findings will be useful to the FLOSS community as a whole as it seeks to understand the evolution and impact of its communication choices. 
%B Open Source Systems: Towards Robust Practices 13th International Conference on Open Source Systems %S IFIP Advances in Information and Communication Technology %8 05/2017 %U https://link.springer.com/content/pdf/10.1007%2F978-3-319-57735-7_1.pdf %R 10.1007/978-3-319-57735-7_1 %> https://flosshub.org/sites/flosshub.org/files/preprint_0.pdf %0 Conference Proceedings %B 13th International Conference on Mining Software Repositories (MSR '16) %D 2016 %T Data Sets: The Circle of Life in Ruby Hosting, 2003-2015 %A Squire, Megan %X Studying software repositories and hosting services can provide valuable insights into the behaviors of large groups of software developers and their projects. Traditionally, most analysis of metadata collected from hosting services has been conducted by specifying some short window of time, typically just a few years. To date, few - if any - studies have been built from data comprising the entirety of a repository's lifespan: from its birth to its death, and rebirth. Thus, the first contribution of this data set is to support the historical analysis of over ten years of collected metadata from the now-defunct RubyForge project hosting site, as well as the follow-on successor to RubyForge, the RubyGems hosting facility. The data sets and sample analyses in this paper will be relevant to researchers studying both software evolution and the distributed software development process. 
%B 13th International Conference on Mining Software Repositories (MSR '16) %I IEEE %P 452-455 %8 05/2016 %U https://docs.google.com/presentation/d/1rtdrwkfxt-p5gQBwMNT1WDlLMg0pJ4gxfM5jSAH-UUA/edit?usp=sharing %> https://flosshub.org/sites/flosshub.org/files/preprint.pdf %0 Conference Proceedings %B 12th International Symposium on Open Collaboration (OpenSym 2016) %D 2016 %T Differentiating Communication Styles of Leaders on the Linux Kernel Mailing List %A Schneider, Daniel %A Spurlock, Scott %A Squire, Megan %K email %K flossmole %K linus torvalds %K linux %K lkml %X Much communication between developers of free, libre, and open source software (FLOSS) projects happens on email mailing lists. Geographically and temporally dispersed development teams use email as an asynchronous, centralized, persistently stored institutional memory for sharing code samples, discussing bugs, and making decisions. Email is especially important to large, mature projects, such as the Linux kernel, which has thousands of developers and a multi-layered leadership structure. In this paper, we collect and analyze data to understand the communication patterns in such a community. How do the leaders of the Linux Kernel project write in email? What are the salient features of their writing, and can we discern one leader from another? We find that there are clear written markers for two leaders who have been particularly important to recent discussions of leadership style on the Linux Kernel Mailing List (LKML): Linus Torvalds and Greg Kroah-Hartman. Furthermore, we show that it is straightforward to use a machine learning strategy to automatically differentiate these two leaders based on their writing. Our findings will help researchers understand how this community works, and why there is occasional controversy regarding differences in communication styles on the LKML. 
%B 12th International Symposium on Open Collaboration (OpenSym 2016) %I ACM %8 08/2016 %> https://flosshub.org/sites/flosshub.org/files/v3_0.pdf %0 Conference Proceedings %B 11th International Conference on Open Source Systems %D 2015 %T The diffusion of pastebin tools to enhance communication in FLOSS mailing lists %A Squire, Megan %A Smith, Amber %X This paper describes how software developers who use mailing lists to communicate reacted and adjusted to a new supplementary collaboration tool, called a pastebin service. Using publicly-available archives of 8800 mailing lists, we examine the adoption of the pastebin tool by software developers and compare it to the model presented in Diffusion of Innovation (DoI) theory. We then compare the rate at which software developers decided whether to accept or reject the new pastebin tools. We find that the overall rate of pastebin adoption follows the S-curve predicted by classic DoI theory. We then compare the individual pastebin services and their rates of adoption, as well as the reaction of different communities to the new tools and the various rationales for accepting or rejecting them. %B 11th International Conference on Open Source Systems %P 45-57 %8 05/2015 %R http://dx.doi.org/10.1007/978-3-319-17837-0_5 %> https://flosshub.org/sites/flosshub.org/files/pastebinOSS2015Preprint.pdf %0 Conference Proceedings %B 48th Hawaii International Conference on System Sciences %D 2015 %T FLOSS as a source for profanity and insults: Collecting the data %A Squire, Megan %A Gazda, Rebecca %X An important task in machine learning and natural language processing is to learn to recognize different types of human speech, including humor, sarcasm, insults, and profanity. In this paper we describe our method to produce test and training data sets to assist in this task. Our test data sets are taken from the domain of free, libre, and open source software (FLOSS) development communities. 
We describe our process in constructing helper sets of relevant data, such as profanity lists, lists of insults, and lists of projects with their codes of conduct. Contributions of this paper are to describe the background literature on computer-aided methods of recognizing insulting or profane speech, to describe the parameters of data sets that are useful in this work, and to outline how FLOSS communities are such a rich source of insulting or profane speech data. We then describe our data sets in detail, including how we created these data sets, and provide some initial guidelines for usage. %B 48th Hawaii International Conference on System Sciences %I IEEE %8 1/2015 %U https://docs.google.com/presentation/d/1DIkv_Qrq0mPtbkS3eCH2w-Ly4nvz0h5qy8y8NjZjhMU/edit?usp=sharing %R 10.1109/HICSS.2015.623 %> https://flosshub.org/sites/flosshub.org/files/hicssInsultsv2.pdf %0 Conference Proceedings %B 37th International Conference on Software Engineering %D 2015 %T "Should we move to Stack Overflow?" Measuring the utility of social media for developer support %A Squire, Megan %K developer support %K forums %K mailing list %K metrics %K quality %K social media %K Stack Overflow %K technical support %X Stack Overflow is an enormously popular question-and-answer web site intended for software developers to help each other with programming issues. Some software projects aimed at developers (for example, application programming interfaces, application engines, cloud services, development frameworks, and the like) are closing their self-supported developer discussion forums and mailing lists and instead directing developers to use special-purpose tags on Stack Overflow. The goals of this paper are to document the main reasons given for moving developer support to Stack Overflow, and then to collect and analyze data from a group of software projects that have done this, in order to show whether the expected quality of support was actually achieved. 
The analysis shows that for all four software projects in this study, two of the desired quality indicators, developer participation and response time, did show improvements on Stack Overflow as compared to mailing lists and forums. However, we also found several projects that moved back from Stack Overflow, despite achieving these desired improvements. The results of this study are applicable to a wide variety of software projects that provide developer support using social media. %B 37th International Conference on Software Engineering %I IEEE %P 10pp %8 05/2015 %> https://flosshub.org/sites/flosshub.org/files/SEIP2015stackv2.pdf %0 Conference Proceedings %B 47th International Hawai'i Conference on System Sciences (HICSS-47) %D 2014 %T "A bit of code": How the Stack Overflow Community Creates Quality Postings %A Squire, Megan %A Funkhouser, Christian %K COLLABORATION %K collaborative development %K data mining %K developer network %K knowledge collaboration %K open content %K text mining %X The Stack Overflow web site is an online community where programmers can ask and answer one another's questions, earning points and badges. The site offers guidance in the form of a Frequently Asked Questions (FAQ), beginning with "What kind of questions can I ask here?" The answer explains that "the best Stack Overflow questions have a bit of source code in them". This paper explores the role of source code and non-source code text on Stack Overflow in both questions and answers. The primary contribution of this paper is to provide a more detailed understanding of whether the presence of source code (and how much) actually will produce the "best" Stack Overflow questions or answers. A second contribution of this paper is to determine how the non-code portions of the text might also contribute to the "best" Stack Overflow postings. 
%B 47th International Hawai'i Conference on System Sciences (HICSS-47) %I IEEE Computer Society %P 1425-1434 %8 01/2014 %R http://dx.doi.org/10.1109/HICSS.2014.185 %> https://flosshub.org/sites/flosshub.org/files/hicssSMFinalWatermark.pdf %0 Conference Proceedings %B 47th International Hawai'i Conference on System Sciences (HICSS-47) %D 2014 %T Forge++: The changing landscape of FLOSS development %A Squire, Megan %X Software forges are centralized online systems that provide useful tools to help distributed development teams work together, especially in free, libre, and open source software (FLOSS). Forge-provided tools may include web space, version control systems, mailing lists and communication forums, bug tracking systems, file downloads, wikis, and the like. Empirical software engineering researchers can mine the artifacts from these tools to better understand how FLOSS is made. As the landscape of distributed software development has grown and changed, the tools needed to make FLOSS have changed as well. There are three newer tools at the center of FLOSS development today: distributed version control based forges (like Github), programmer question-and-answer communities (like Stack Overflow), and pastebin tools (like Gist or Pastebin.com). These tools are extending and changing the toolset used for FLOSS development, and redefining what a software forge looks like. The main contributions of this paper are to describe each of these tools, to identify the data and artifacts available for mining from these tools, and to outline some of the ways researchers can use these artifacts to continue to understand how FLOSS is made. 
%B 47th International Hawai'i Conference on System Sciences (HICSS-47) %I IEEE Computer Society %P 3266-3275 %8 01/2014 %R 10.1109/HICSS.2014.405 %> https://flosshub.org/sites/flosshub.org/files/hicssFLOSSfinalWatermark_0.pdf %0 Conference Proceedings %B 10th Working Conference on Mining Software Repositories (MSR2013) %D 2013 %T Apache-Affiliated Twitter Screen Names: A Dataset %A Squire, Megan %K apache %K dataset %K twitter %X This paper describes a new dataset containing Twitter screen names for members of the projects affiliated with the Apache Software Foundation (ASF). The dataset includes the confirmed Twitter screen names, as well as the real name as listed on Twitter, and the user identification as used within the Apache organization. The paper also describes the process used to collect and clean this data, and shows some sample queries for learning how to use the data. The dataset has been donated to the FLOSSmole project and is available for download (https://code.google.com/p/flossmole/downloads/detail?name=apacheTwitter2013-Jan.zip) or direct querying via a database client. %B 10th Working Conference on Mining Software Repositories (MSR2013) %8 05/2013 %> https://flosshub.org/sites/flosshub.org/files/apacheTwitterPREPRINT.pdf %> https://flosshub.org/sites/flosshub.org/files/MSR%20presentation.pdf %0 Conference Proceedings %D 2013 %T Project Roles in the Apache Software Foundation: A Dataset %A Squire, Megan %K apache %K dataset %K roles %X This paper outlines the steps in the creation and maintenance of a new dataset listing leaders of the various projects of the Apache Software Foundation (ASF). Included in this dataset are different levels of committers to the various ASF project code bases, as well as regular and emeritus members of the ASF, and directors and officers of the ASF. 
The dataset has been donated to the FLOSSmole project under an open source license, and is available for download (https://code.google.com/p/flossmole/downloads/detail?name=apachePeople2013-Jan.zip), or for direct querying via a database client. %8 05/2013 %> https://flosshub.org/sites/flosshub.org/files/apacheRolesPREPRINT.pdf %> https://flosshub.org/sites/flosshub.org/files/MSR%20presentation_0.pdf %0 Conference Proceedings %B 3rd International Workshop on Replication in Empirical Software Engineering Research (RESER2013) %D 2013 %T A Replicable Infrastructure for Empirical Studies of Email Archives %A Squire, Megan %K apache %K cleaning %K collection %K couchdb %K database %K document-oriented database %K email %K lucene %K mailing lists %K nosql %K replication %K storage %X This paper describes a replicable infrastructure solution for conducting empirical software engineering studies based on email mailing list archives. Mailing list emails, such as those affiliated with free, libre, and open source software (FLOSS) projects, are currently archived in several places online, but each research team that wishes to study these email artifacts closely must design their own solution for collection, storage and cleaning of the data. Consequently, research results will be difficult to replicate, especially as the email archive for any living project will still be continually growing. This paper describes a simple, replicable infrastructure for the collection, storage, and cleaning of project email data and analyses. 
%B 3rd International Workshop on Replication in Empirical Software Engineering Research (RESER2013) %I IEEE %C Baltimore, MD, USA %P 43-50 %8 10/2013 %@ 978-0-7695-5121-0 %> https://flosshub.org/sites/flosshub.org/files/RESERv2.pdf %0 Conference Paper %B 45th Hawai'i International Conference on System Sciences %D 2012 %T Describing the Software Forge Ecosystem %A Squire, Megan %A Williams, David %K features %K FLOSS %K forge %K hosting %K metrics %X Code forges are online software systems that are designed to support teams doing software development work. There have been few if any attempts in the research literature to describe the web of people, projects, and tools that make up the free, libre, and open source (FLOSS) forge ecosystem. The main contributions of this paper are (1) to introduce a classification of FLOSS-oriented forges according to their characteristics; (2) to describe the forge-level and project-level data and artifacts currently available at each FLOSS forge; (3) to show various patterns already discovered in the FLOSS forge ecosystem, such as timelines of creation or arrangements by size or feature; (4) to make some recommendations to forge providers and data collectors about how to expose the structure and information in the forges; and (5) to describe the effort needed to extend our publicly-available information about the FLOSS forge ecosystem into the future. %B 45th Hawai'i International Conference on System Sciences %P 3416-3425 %8 01/2012 %> https://flosshub.org/sites/flosshub.org/files/SquireWilliamsHICSS2012.pdf %0 Journal Article %J International Journal of Open Source Software and Processes %D 2012 %T How the FLOSS Research Community Uses Email Archives %A Squire, Megan %K email %K email archives %K literature %K mailing lists %K review %K Survey %X Artifacts of the software development process, such as source code or emails between developers, are a frequent object of study in empirical software engineering literature. 
One of the hallmarks of free, libre, and open source software (FLOSS) projects is that the artifacts of the development process are publicly-accessible and therefore easily collected and studied. Thus, there is a long history in the FLOSS research community of using these artifacts to gain understanding about the phenomenon of open source software, which could then be compared to studies of software engineering more generally. This paper looks specifically at how the FLOSS research community has used email artifacts from free and open source projects. It provides a classification of the relevant literature using a publicly-available online repository of papers about FLOSS development using email. The outcome of this paper is to provide a broad overview for the software engineering and FLOSS research communities of how other researchers have used FLOSS email message artifacts in their work. %B International Journal of Open Source Software and Processes %V 4 %P 37 - 59 %8 12/2012 %N 1 %R 10.4018/jossp.2012010103 %> https://flosshub.org/sites/flosshub.org/files/ijossp_v3_PREPRINT.pdf %0 Conference Paper %B 2011 Second International Workshop on Replication in Empirical Software Engineering Research (RESER) %D 2011 %T A Secondary Data Archive for Code-Level Debian Metrics %A Kozak, Carter %A Squire, Megan %X In this paper, we describe a new process to collect, calculate, archive, and distribute interesting metrics for all the packages in the standard Debian GNU/Linux installation. Our method replicates and extends previous work done by other groups studying free and open source software systems (FLOSS) in three important ways. First, although there have been other previous studies that attempted to collect a large set of code-level metrics for a small set of projects, and there have been studies that generated a small set of metrics for the large Debian codebase, our project does both: we generate a larger set of metrics for the entire set of Debian packages. 
Second, our integration of new Debian metadata and additional code-level metrics not gathered before adds several additional layers for exploration. Finally, and most importantly, because we integrate our collection and analysis process into the automated FLOSSmole data store, we ensure timely, repeatable, and very easy comparison, replication and analysis by other groups. Thus our collection activity will continue in an automated fashion even after this paper is published, providing the foundation for additional studies to be conducted later, all freely accessible to any interested research group. After outlining our process, we discuss a few observations about the data, we outline some implications for the research community, and we present opportunities for further research. %B 2011 Second International Workshop on Replication in Empirical Software Engineering Research (RESER) %I IEEE %C Banff, Alberta, Canada %P 43 - 51 %8 09/2011 %@ 978-1-4673-0972-1 %R 10.1109/RESER.2011.9 %0 Journal Article %J International Journal of Open Source Software and Processes %D 2010 %T Repositories with Public Data about Software Development %A Jesus M. Gonzalez-Barahona %A Izquierdo-Cortazar, Daniel %A Squire, Megan %X Empirical research on software development based on data obtained from project repositories and code forges is increasingly gaining attention in the software engineering research community. The studies in this area typically start by retrieving or monitoring some subset of data found in the repository or forge, and this data is later analyzed to find interesting patterns. However, retrieving information from these locations can be a challenging task. Meta-repositories providing public information about software development are useful tools that can simplify and streamline the research process. Public data repositories that collect and clean the data from other project repositories or code forges can help ensure that research studies are based on good quality data. 
This paper provides some insight as to how these meta-repositories (sometimes called a “repository of repositories”, RoR) of data about open source projects should be used to help researchers. This paper describes in detail two of the most widely used collections of data about software development: FLOSSmole and FLOSSMetrics. %B International Journal of Open Source Software and Processes %V 2 %P 1 - 13 %8 04/2010 %N 2 %R 10.4018/jossp.2010040101 %> https://flosshub.org/sites/flosshub.org/files/ijossp2010.pdf %0 Conference Paper %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %D 2009 %T 4th International Workshop on Public Data about Software Development %A González-Barahona, Jesús %A Squire, Megan %A Izquierdo-Cortázar, Daniel %X Libre (free, open source) projects offer publicly available data sources. The research community is starting to produce, use and exchange large data sets of information. These data sets have to be retrieved, purged, described, and can be published for public consumption by other groups. Their availability allows for the decoupling of research activities, the reproducibility of research results, and even the collaboration (and competition) in the analysis of data. This activity is frequently presented at workshops and conferences, but since the focus of these conferences is not specific to the use of public data, discussions of techniques and experiences are not as deep and fruitful as they could be. This workshop is once again (for the fourth year in a row) such a place. We will host discussions specifically about these sorts of public data sets about software development, how they are retrieved, how they can be analyzed and mined, how they can be exchanged and extended. 
%B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %S IFIP Advances in Information and Communication Technology %I Springer %V 299/2009 %P 351 - 352 %8 2009/// %G eng %& 31 %R http://dx.doi.org/10.1007/978-3-642-02032-2_31 %> https://flosshub.org/sites/flosshub.org/files/4the%20International%20Workshop%20on%20Public%20Data.pdf %0 Report %D 2009 %T Envisioning National and International Research on the Multidisciplinary Empirical Science of Free/Open Source Software %A Walt Scacchi %A Kevin Crowston %A Madey, Greg %A Squire, Megan %8 Spring 2009 %G eng %0 Journal Article %J International Journal of Open Source Software and Processes %D 2009 %T Integrating Projects from Multiple Open Source Code Forges %A Squire, Megan %K data integration %K forges %X Much of the data about free, libre, and open source (FLOSS) software development comes from studies of code forges or code repositories used for managing projects. This paper presents a method for integrating data about open source projects by way of matching projects (entities) across multiple code forges. After a review of the relevant literature, a few of the methods are chosen and applied to the FLOSS domain, including a comparison of some simple scoring systems for pairwise project matches. Finally, the paper describes limitations of this approach and recommendations for future work. 
%B International Journal of Open Source Software and Processes %V 1 %P 46 - 57 %8 31/2009 %N 1 %R 10.4018/jossp.2009010103 %0 Conference Paper %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %D 2009 %T Using FLOSS Project Metadata in the Undergraduate Classroom %A Squire, Megan %A Duvall, Shannon %K artificial intelligence %K database %K education %K teaching %K undergraduate %K undergraduate research %X This paper describes our efforts to use the large amounts of data available from public repositories of free, libre, and open source software (FLOSS) in our undergraduate classrooms to teach concepts that would have previously been taught using other types of data from other sources. %B OSS2009: Open Source Ecosystems: Diverse Communities Interacting (IFIP 2.13) %S IFIP Advances in Information and Communication Technology %I Springer %V 299/2009 %P 330 - 339 %8 2009/// %G eng %& 29 %R http://dx.doi.org/10.1007/978-3-642-02032-2_29 %> https://flosshub.org/sites/flosshub.org/files/Using%20FLOSS%20Project%20Metadata.pdf %0 Conference Paper %B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %D 2007 %T 2nd International Workshop on Public Data about Software Development (WoPDaSD 2007) %A Gonzalez-Barahona, Jesus %A Conklin, Megan %A Gregorio Robles %X Exchange of detailed data about software development between research teams, and specifically about data available from public repositories of libre (free, open source) software projects is becoming more and more common. This workshop will explore the benefits and problems of such exchange, and the steps needed to foster it. As a case example of data exchange, the workshop organizers suggest two large datasets to be analyzed by participants. 
%B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 234/2007 %P 381 - 383 %8 2007/// %G eng %& 51 %R http://dx.doi.org/10.1007/978-0-387-72486-7_51 %> https://flosshub.org/sites/flosshub.org/files/2nd%20Intl%20Workshop%20on%20Public%20Data.pdf %0 Conference Paper %B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %D 2007 %T How to Gather FLOSS Metrics %A Conklin, Megan %A Gonzalez-Barahona, Jesus %A Gregorio Robles %X In this half-day tutorial, participants will gain hands-on exposure to key technologies for data collection about open source projects. %B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 234/2007 %P 361 - 362 %8 2007/// %G eng %& 44 %R http://dx.doi.org/10.1007/978-0-387-72486-7_44 %> https://flosshub.org/sites/flosshub.org/files/How%20to%20gather%20Floss%20Metrics.pdf %0 Conference Paper %B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %D 2007 %T Project Entity Matching across FLOSS Repositories %A Conklin, Megan %X Much of the data about free, libre, and open source (FLOSS) software development comes from studies of code repositories used for managing projects. This paper presents a method for integrating data about open source projects by way of matching projects (entities) and deleting duplicates across multiple code repositories. After a review of the relevant literature, a few of the methods are chosen and applied to the FLOSS domain, including a simple scoring system for confidence in pairwise project matches. Finally, the paper describes limitations of this approach and recommendations for future work. 
%B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %V 234/2007 %P 45 - 57 %8 2007/// %G eng %& 4 %R http://dx.doi.org/10.1007/978-0-387-72486-7_4 %> https://flosshub.org/sites/flosshub.org/files/Project%20Entity%20Matching.pdf %0 Conference Paper %B OSS2006: Open Source Systems (IFIP 2.13) %D 2006 %T Beyond Low-Hanging Fruit: Seeking the Next Generation in FLOSS Data Mining %A Conklin, Megan %X This paper will discuss the motivations and methods for collecting quantitative data about free, libre and open source (FLOSS) software projects. The paper also describes the current state of the art in collecting this data, and some of the problems with this process. Finally, the paper outlines the challenges data miners should look forward to when trying to improve the usefulness of their quantitative data streams. %B OSS2006: Open Source Systems (IFIP 2.13) %S IFIP International Federation for Information Processing %I Springer %P 47 - 56 %G eng %R http://dx.doi.org/10.1007/0-387-34226-5_5 %> https://flosshub.org/sites/flosshub.org/files/Beyond%20Low-Hanging%20Fruit.pdf %0 Journal Article %J International Journal of Information Technology and Web Engineering %D 2006 %T FLOSSmole: A Collaborative Repository for FLOSS Research Data and Analyses %A James Howison %A Conklin, Megan %A Kevin Crowston %B International Journal of Information Technology and Web Engineering %V 1 %P 17-26 %G eng %1 information systems %2 computational %0 Conference Paper %B Symposium on Mining Software Repositories %D 2005 %T Collaboration Using OSSmole: A repository of FLOSS data and analyses %A Conklin, Megan %A James Howison %A Kevin Crowston %B Symposium on Mining Software Repositories %C St. 
Louis %8 17 May %G eng %0 Conference Paper %B OSS2005: Open Source Systems %D 2005 %T OSSmole: A collaborative repository for FLOSS research data and analyses %A Howison, James %A Conklin, Megan %A Kevin Crowston %X This paper introduces a collaborative project, “OSSmole”, designed to gather, share and store comparable data and analyses of free and open source software development for academic research. The project draws on the ongoing collection and analysis efforts of many research groups, reducing duplication, and promoting compatibility both across sources of FLOSS data and across research groups and analyses. The paper outlines difficulties with the current typical quantitative FLOSS research process, uses these to develop requirements, and presents the design of the system. %B OSS2005: Open Source Systems %P 54-60 %U http://pascal.case.unibz.it/handle/2038/1422 %0 Conference Paper %B 2004 Open Source Conference (OSCON) %D 2004 %T Do the Rich Get Richer? The Impact of Power Laws on Open Source Development Projects %A Conklin, Megan %K open source %K power law %K social network analysis %K sourceforge %B 2004 Open Source Conference (OSCON) %C Portland, OR, USA %G eng %9 conference presentation