Abstract | Sustaining large open source development efforts requires recruiting new participants; however, a lack of
architectural documentation might inhibit new participants since large amounts of project knowledge are unavailable to
newcomers. We present the results of a multitrait, multimethod analysis of the effects of introducing architectural documentation
into a substantial open source project—the Hadoop Distributed File System (HDFS). HDFS had only minimal architectural
documentation, and we wanted to discover whether the putative benefits of architectural documentation could be observed over
time. To do this, we created and publicized an architecture document and then monitored its usage and effects on the project.
The results were somewhat ambiguous: by some measures the architecture documentation appeared to affect the project, but
by others it did not. Perhaps of equal importance is our discovery that the project maintained, in its web-accessible JIRA archive of
software issues and fixes, enough architectural discussion to support architectural thinking and reasoning. This “emergent”
architecture documentation served an important purpose in recording core project members’ architectural concerns and
resolutions. However, this emergent architecture documentation did not serve all project members equally well; it appears that
those on the periphery of the project—newcomers and adopters—still require explicit architecture documentation, as we will
show.
|