Trends That Affect Temporal Analysis Using SourceForge Data
|Title||Trends That Affect Temporal Analysis Using SourceForge Data|
|Publication Type||Conference Paper|
|Year of Publication||2010|
|Authors||MacLean, Alexander C., Pratt, Landon J., Krein, Jonathan L., and Knutson, Charles D.|
|Secondary Title||5th Workshop on Public Data about Software Development (WoPDaSD 2010)|
|Keywords||cliff walls, committers, cvs, evolution, growth, source code, sourceforge, time, time series|
SourceForge is a valuable source of software artifact data for researchers who study project evolution and developer behavior. However, the data exhibit patterns that may bias temporal analyses. Most notable are cliff walls in project source code repository timelines, which indicate large commits that are out of character for the given project. These cliff walls often hide significant periods of development and developer collaboration—a threat to studies that rely on SourceForge repository data. We demonstrate how to identify these cliff walls, discuss reasons for their appearance, and propose preliminary measures for mitigating their effects in evolution-oriented studies.
"In this paper we examine some of the limitations of artifact data by specifically addressing the applicability of SourceForge data to the study of project evolution."
"For our analysis we examine 9,997 Production/Stable or Maintenance phase projects stored in CVS on SourceForge and extracted in October of 2006"
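The abstract describes cliff walls as large commits that are out of character for the given project. As a rough illustration of that idea only, and not the authors' detection method, the sketch below flags commits whose size far exceeds the project's median commit size. The function name `find_cliff_walls`, the `factor` threshold, the `min_history` cutoff, and the sample data are all hypothetical choices made for this example.

```python
from statistics import median


def find_cliff_walls(commit_sizes, factor=10.0, min_history=3):
    """Return indices of commits that are far larger than the project's median.

    commit_sizes: lines added per commit, in chronological order.
    factor: hypothetical multiplier over the median (an assumption for this
        sketch, not a value taken from the paper).
    min_history: hypothetical minimum number of commits needed before a
        "typical" commit size is considered meaningful.
    """
    if len(commit_sizes) < min_history:
        return []
    typical = median(commit_sizes)
    # Guard against a zero median so the threshold stays meaningful.
    threshold = factor * max(typical, 1)
    return [i for i, size in enumerate(commit_sizes) if size > threshold]


# Made-up commit sizes: one very large import among small commits.
sizes = [120, 80, 95, 48000, 110, 60, 130]
print(find_cliff_walls(sizes))  # prints [3]
```

The median is used here instead of the mean because a single very large commit would inflate the mean and could mask itself. A real study would likely also need commit timestamps and per-project tuning of the threshold.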