Experiences Mining Open Source Release Histories
Title | Experiences Mining Open Source Release Histories |
Publication Type | Conference Paper |
Year of Publication | 2011 |
Authors | Tsay, J, Wright, H, Perry, D |
Secondary Title | International Conference on Software and Systems Process (ICSSP 2011) |
Date Published | 05/2011 |
Keywords | doap, flossmole cited, life cycle, release engineering, release history, release management, releases |
Abstract | Software releases form a critical part of the life cycle of a software project. Typically, each project produces releases in its own way, using various methods of versioning, archiving, announcing and publishing the release. Understanding the release history of a software project can shed light on the project history, as well as the release process used by that project, and how those processes change. However, many factors make automating the retrieval of release history information difficult, such as the many sources of data, a lack of relevant standards and a disparity of tools used to create releases. In spite of the large amount of raw data available, no attempt has been made to create a release history database of a large number of projects in the open source ecosystem. This paper presents our experiences, including the tools, techniques and pitfalls, in our early work to create a software release history database which will be of use to future researchers who want to study and model the release engineering process in greater depth. |
Notes | "First, we selected the projects to initially target, using several criteria to get a broad picture of the open source landscape. Second, we collected the actual data, using a framework of parsers and some manual inspection. Third, we standardized and inserted the data into a database for later use." "but we plan to eventually cross reference our list of projects with existing open source project information (such as FLOSSmole) to take advantage of the work already done by other researchers." "For each release, we collected the following data: the project it belonged to, the date the release was published, the type of release, the release label (version number) and the source of the data" (The paper also includes a discussion of the authors' difficulties.) "We conclude that programmatically creating a release history database from existing open source data is not trivial," "We have currently collected 1579 distinct releases from 22 different open source projects" |
Full Text |
Attachment | Size |
---|---|
icssp11short-p034-tsay.pdf | 181.45 KB |