Visualizing collaboration and influence in the open-source software community
Title | Visualizing collaboration and influence in the open-source software community |
Publication Type | Conference Paper |
Year of Publication | 2011 |
Authors | Marschner, E, Rosenfeld, E, Heer, J, Heller, B |
Tertiary Authors | van Deursen, A, Xie, T, Zimmermann, T |
Secondary Title | Proceedings of the 8th working conference on Mining software repositories - MSR '11 |
Pagination | 223-226 |
Date Published | 05/2011 |
Publisher | ACM Press |
Place Published | New York, New York, USA |
ISBN Number | 9781450305747 |
Keywords | collaboration, data exploration, geography, geoscatter, github, graph, mapping, metadata, open source, social graph, user profiles, visualization |
Abstract | We apply visualization techniques to user profiles and repository metadata from the GitHub source code hosting service. Our motivation is to identify patterns within this development community that might otherwise remain obscured. Such patterns include the effect of geographic distance on developer relationships, social connectivity and influence among cities, and variation in project-specific contribution styles (e.g., centralized vs. distributed). Our analysis examines directed graphs in which nodes represent users' geographic locations and edges represent (a) follower relationships, (b) successive commits, or (c) contributions to the same project. We inspect this data using a set of visualization techniques: geo-scatter maps, small multiple displays, and matrix diagrams. Using these representations, and tools based on them, we develop hypotheses about the larger GitHub community that would be difficult to discern using traditional lists, tables, or descriptive statistics. These methods are not intended to provide conclusive answers; instead, they provide a way for researchers to explore the question space and communicate initial insights. |
Notes | "This data set includes the complete social graph of 500,000 follow links as well as over 1,000,000 commits and 50,000 users." "...a large fraction of [GitHub] users provide a location in their profile, which we can turn into geographic coordinates using a geocoding API like PlaceFinder..." "For each repository, we extract the owner, collaborator, and contributor usernames, plus branch names. New usernames help to find new repositories, while branch names are used to fetch commit metadata. Using this method, the crawler uncovered 40,860 code repositories, representing 33,388 unique project names and 1,219,872 individual commits." "In addition to crawled data, we use the complete GitHub user follower graph from Jan 19, 2011. This graph includes 452,248 links connecting 106,247 unique users, 47% (49,500) of which could be geocoded with the PlaceFinder API" |
URL | http://vis.stanford.edu/files/2011-GotHub-MSR.pdf |
DOI | 10.1145/1985441.1985476 |