Developer Learning Dynamics in Open Source Software Projects: A Hidden Markov Model Analysis

Title: Developer Learning Dynamics in Open Source Software Projects: A Hidden Markov Model Analysis
Publication Type: Journal Article
Year of Publication: 2006
Authors: Singh, PV, Youn, N, Tan, Y
Date Published: December
Abstract

This work proposes a dynamic model of developer learning in open source software (OSS) projects. A Hidden Markov Model (HMM) is proposed to explain how the code contribution behaviors of OSS developers change as their levels of knowledge of their projects increase. In this model, discrete hidden states represent the unobserved knowledge levels of developers, and their observed code contribution behaviors are modeled as state dependent. Developers' knowledge levels evolve as they learn about the projects over time. Two modes of learning are considered: learning-by-doing (code development) and learning through interactions with peers. The model is calibrated using data spanning six years for 25 OSS projects and 251 developers hosted at Sourceforge. The proposed model identifies three knowledge states (high, medium, and low) and estimates the impact of the two modes of learning on the transition of developers between the three knowledge states. The model results suggest that developers in the low knowledge state exhibit the greatest inertia, followed by those in the medium and high states. Both modes of learning are found to have varying impact across the three knowledge states. Interactions with peers appear to be an important source of learning for developers in all states. A developer in the low state learns only through participation in threads started by others. Prior code contributions and starting discussions by initiating threads do not impact the knowledge level of a developer in the low state. Initiating threads, participating in threads started by others, and prior code contributions have positive impacts on the knowledge level of a developer in the medium or high state and, hence, influence his long-term code contribution behavior. Explanations for these varying impacts of learning activities on the transitions of developers between the three states are provided. We also find a lack of persistence of knowledge in all states. The HMM describes the data better than a latent class model, which suggests that the learning activities have a long-term, dynamic impact, rather than an immediate, static impact, on the code contribution behavior of a developer.
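
The model structure described in the abstract can be illustrated with a minimal sketch. It assumes three hidden knowledge states, transition probabilities given by a multinomial logit in three learning covariates (threads started, replies in threads started by others, prior code contributions), and a Poisson distribution for the number of code contributions per period. The Poisson choice, the logit form, every function name, and every number are assumptions made for illustration; they are not the authors' specification or estimates.

```python
import math

STATES = ("low", "medium", "high")  # hidden knowledge states named in the abstract


def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def transition_matrix(covariates, intercepts, slopes):
    """Row i holds P(next state | current state i, covariates).

    Assumed form: multinomial logit, so learning activities in one period
    shift the chance of moving between knowledge states in the next.
    """
    rows = []
    for i in range(len(STATES)):
        scores = [
            intercepts[i][j] + sum(b * x for b, x in zip(slopes[i][j], covariates))
            for j in range(len(STATES))
        ]
        rows.append(softmax(scores))
    return rows


def poisson_pmf(k, rate):
    """Assumed state-dependent emission: contribution count ~ Poisson(rate)."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def log_likelihood(counts, covariates_by_period, initial, rates, intercepts, slopes):
    """Scaled forward algorithm for one developer's contribution history."""
    n = len(STATES)
    alpha = [initial[i] * poisson_pmf(counts[0], rates[i]) for i in range(n)]
    scale = sum(alpha)
    loglik = math.log(scale)
    alpha = [a / scale for a in alpha]
    for t in range(1, len(counts)):
        # Covariates observed in period t-1 drive the transition into period t.
        A = transition_matrix(covariates_by_period[t - 1], intercepts, slopes)
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * poisson_pmf(counts[t], rates[j])
            for j in range(n)
        ]
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik


# Illustrative (made-up) parameters. Covariate order:
# (threads started, replies in others' threads, prior code contributions).
ZERO = [0.0, 0.0, 0.0]
INTERCEPTS = [[2.0, 0.0, -2.0], [0.0, 2.0, 0.0], [-2.0, 0.0, 2.0]]
SLOPES = [
    [ZERO, [0.0, 0.5, 0.0], ZERO],  # low state: only replies matter here
    [ZERO, ZERO, [0.3, 0.3, 0.1]],  # medium state: all three activities matter
    [ZERO, ZERO, [0.3, 0.3, 0.1]],  # high state: activities help retain knowledge
]
RATES = [0.5, 3.0, 8.0]             # mean contributions per period, by state
INITIAL = [0.8, 0.15, 0.05]         # developers assumed to start mostly in "low"

counts = [0, 1, 4, 6]                            # hypothetical contributions
covariates = [(0, 2, 0), (1, 3, 1), (2, 1, 5)]   # hypothetical learning activity

ll = log_likelihood(counts, covariates, INITIAL, RATES, INTERCEPTS, SLOPES)
print(f"log-likelihood: {ll:.3f}")
```

In a setup like this, a latent class model corresponds to fixing each developer in one state for all periods, which is why comparing the two fits speaks to whether learning effects are dynamic or static.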

Full Text
Attachment: singh-youn-tan.pdf (225.83 KB)