A Replicable Infrastructure for Empirical Studies of Email Archives
Title | A Replicable Infrastructure for Empirical Studies of Email Archives |
Publication Type | Conference Proceedings |
Year of Publication | 2013 |
Authors | Squire, M |
Refereed Designation | Refereed |
Secondary Title | 3rd International Workshop on Replication in Empirical Software Engineering Research (RESER2013) |
Pagination | 43-50 |
Date Published | 10/2013 |
Publisher | IEEE |
Place Published | Baltimore, MD, USA |
ISBN Number | 978-0-7695-5121-0 |
Keywords | apache, cleaning, collection, couchdb, database, document-oriented database, email, lucene, mailing lists, nosql, replication, storage |
Abstract | This paper describes a replicable infrastructure solution for conducting empirical software engineering studies based on email mailing list archives. Mailing list emails, such as those affiliated with free, libre, and open source software (FLOSS) projects, are currently archived in several places online, but each research team that wishes to study these email artifacts closely must design its own solution for collection, storage, and cleaning of the data. Consequently, research results are difficult to replicate, especially because the email archive of any living project continues to grow. This paper describes a simple, replicable infrastructure for the collection, storage, and cleaning of project email data and analyses. |
Full Text |
Attachment | Size |
---|---|
RESERv2.pdf | 628.97 KB |