%0 Journal Article
%J ISRN Software Engineering
%D 2013
%T Interlinking Developer Identities within and across Open Source Projects: The Linked Data Approach
%A Iqbal, Aftab
%A Hausenblas, Michael
%K developer
%K identity
%K linked data
%X Software developers use various software repositories in order to interact with each other or to solve related problems. These repositories provide a rich source of information for a wide range of tasks. However, one issue to overcome in order to make this information useful is the identification and interlinking of multiple identities of developers. In this paper, we propose a Linked Data-based methodology to interlink and integrate multiple identities of a developer found in different software repositories of a project as well as across repositories of multiple projects. Providing such interlinking will enable us to keep track of a developer’s activity not only within a single project but also across multiple projects. The methodology will be presented in general and applied to 5 Apache projects as a case study. Further, we show that the few methods suggested so far are not always appropriate to overcome the developer identification problem.
%V 2013
%P 1-12
%8 2013
%R 10.1155/2013/584731
%> https://flosshub.org/sites/flosshub.org/files/584731.pdf

%0 Conference Paper
%B Proceedings of the 2005 international workshop on Mining software repositories
%D 2005
%T Developer identification methods for integrated data from various sources
%A Gregorio Robles
%A Jesus M. Gonzalez-Barahona
%K anonymization
%K bug tracker
%K developers
%K email
%K email address
%K gnome
%K identity
%K mailing list
%K privacy
%K source code
%K version control
%X Studying a software project by mining data from a single repository has been a very active research field in software engineering during the last years. However, few efforts have been devoted to perform studies by integrating data from various repositories, with different kinds of information, which would, for instance, track the different activities of developers. One of the main problems of these multi-repository studies is the different identities that developers use when they interact with different tools in different contexts. This makes them appear as different entities when data is mined from different repositories (and in some cases, even from a single one). In this paper we propose an approach, based on the application of heuristics, to identify the many identities of developers in such cases, and a data structure for allowing both the anonymized distribution of information, and the tracking of identities for verification purposes. The methodology will be presented in general, and applied to the GNOME project as a case example. Privacy issues and partial merging with new data sources will also be considered and discussed.
%S MSR '05
%I ACM
%C New York, NY, USA
%P 106-110
%@ 1-59593-123-6
%U http://doi.acm.org/10.1145/1082983.1083162
%R 10.1145/1082983.1083162
%> https://flosshub.org/sites/flosshub.org/files/106DeveloperIdentification.pdf