Investigating the Geography of Open Source Software through Github

Title: Investigating the Geography of Open Source Software through Github
Publication Type: Unpublished
Year of Publication: 2010
Authors: Takhteyev, Y; Hilts, A
Abstract

The paper presents an empirical study of the geography of open source software development that looks at Github, a popular project hosting website. We show that developers are highly clustered and concentrated primarily in North America and Western and Northern Europe, though a substantial minority is present in other regions. Code contributions and attention show a strong local bias. Users in North America account for a larger share of received contributions than of contributions made. They also receive a disproportionate amount of attention.

Notes

"We collected our data through Github’s public API, which offers the same data as available on the Github’s website but presents it in a structured format for simpler processing. The data were collected from May to July of 2010"

"The data collection followed a recursive procedure. We started with a single account, belonging to one of Github’s founders. We then identified accounts connected to this user, then looked for accounts connected to the newly found ones, repeating this procedure until we achieved closure. New accounts were identified through the four kinds of connections mentioned in the previous section: (1) those that follow accounts collected earlier, (2) those followed by the accounts collected earlier, (3) those whose repositories were being “watched” by accounts collected earlier, (4) those who had made code contributions to the repositories watched by accounts collected earlier."
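The recursive procedure quoted above amounts to a breadth-first expansion from a seed account that stops once no new accounts turn up. The sketch below illustrates that idea only; it is not the authors' code and makes no calls to Github's API. The function names (`crawl_to_closure`, `toy_neighbors`) and the small in-memory dictionaries are hypothetical stand-ins for the four kinds of connections listed in the quote.

```python
from collections import deque

# Hypothetical toy data standing in for what an API would return.
FOLLOWING = {"founder": {"alice"}, "bob": {"founder"}, "erin": {"zed"}}  # account -> accounts it follows
WATCHING = {"alice": {"carol/widgets"}}                                  # account -> repositories it watches
CONTRIBUTORS = {"carol/widgets": {"dave"}}                               # repository -> code contributors


def toy_neighbors(account):
    """Return accounts reachable from `account` via the four connection types."""
    # (1) accounts that follow this account
    followers = {a for a, followed in FOLLOWING.items() if account in followed}
    # (2) accounts this account follows
    following = FOLLOWING.get(account, set())
    # (3) owners of repositories this account watches
    watched = WATCHING.get(account, set())
    owners = {repo.split("/")[0] for repo in watched}
    # (4) contributors to repositories this account watches
    contributors = set().union(*(CONTRIBUTORS.get(repo, set()) for repo in watched))
    return followers | following | owners | contributors


def crawl_to_closure(seed, neighbors):
    """Expand outward from `seed` until no previously unseen account appears."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        account = queue.popleft()
        for other in neighbors(account):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen


accounts = crawl_to_closure("founder", toy_neighbors)
print(sorted(accounts))
```

In this toy graph, "erin" and "zed" are never collected because no chain of connections links them to the seed, which shows that a crawl of this kind covers only the accounts reachable from its starting point.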

Full Text
Attachment: Takhteyev-Hilts-2010.pdf (237.76 KB)