Research Infrastructure for Empirical Science of F/OSS

Submitted by msquire on Thu, 2011-04-14 16:08

Title	Research Infrastructure for Empirical Science of F/OSS
Publication Type	Conference Paper
Year of Publication	2004
Authors	Gasser, L, Ripoche, G, Sandusky, RJ
Secondary Title	Proc. Intern. Workshop on Mining Software Repositories
Pagination	12-16
Keywords	data, Data Collection, empirical, infrastructure
Abstract	F/OSS research faces a new and unusual situation: the traditional difficulties of gathering enough empirical data have been replaced by issues of dealing with enormous amounts of freely available data from many disparate sources (forums, code, bug reports, etc.). At present no means exist for assembling these data under common access points and frameworks for comparative, longitudinal, and collaborative research. Gathering and maintaining large F/OSS data collections reliably and making them usable present several research challenges. For example, current projects usually rely on “web scraping” or on direct access to raw data from groups that generate it, and both of these methods require unique effort for each new corpus, or even for updating existing corpora. In this paper we identify several common needs and critical factors in F/OSS empirical research, and suggest orientations and recommendations for the design of a shared research infrastructure.