%0 Journal Article
%J Information and Software Technology
%D 2010
%T Survival analysis on the duration of open source projects
%A Samoladas, Ioannis
%A Lefteris Angelis
%A Ioannis Stamelos
%K flossmetrics
%K prediction
%K source code
%K survival analysis
%X Context: Open source (FLOSS) project survivability is an important piece of information for many open source stakeholders. Coordinators of open source projects would like to know the chances for the survival of the projects they coordinate. Companies are also interested in knowing how viable a project is in order to either participate or invest in it, and volunteers want to contribute to vivid projects. Objective: The purpose of this article is the application of survival analysis techniques for estimating the future development of a FLOSS project. Method: In order to apply such an approach, duration data regarding FLOSS projects from the FLOSSMETRICS (This work was partially supported by the European Community’s Sixth Framework Program under the Contract FP6-033982) database were collected. This database contains metadata for thousands of FLOSS projects, derived from various forges. Subsequently, survival analysis methods were employed to predict the survivability of the projects, i.e. their probability of continuation in the future, by examining their duration, combined with other project characteristics such as their application domain and number of committers. Results: It was shown how the probability of termination or continuation may be calculated and how a prediction model may be built to upraise project future. In addition, the benefit of adding more committers to FLOSS projects was quantified. Conclusion: Analysis results demonstrate the usefulness of the proposed framework for assessing the survival probability of a FLOSS project.
%B Information and Software Technology
%V 52
%P 902 - 922
%8 9/2010
%N 9
%! Information and Software Technology
%R 10.1016/j.infsof.2010.05.001

%0 Conference Paper
%B 3rd Workshop on Public Data about Software Development (WoPDaSD 2008)
%D 2008
%T Are FLOSS developers committing to CVS/SVN as much as they are talking in mailing lists? Challenges for Integrating data from Multiple Repositories
%A Sowe, Sulayman K.
%A Samoladas, Ioannis
%A Ioannis Stamelos
%A Lefteris Angelis
%K cvs
%K cvsanaly
%K developers
%K email
%K email archives
%K flossmetrics
%K mailing list
%K mlstats
%K source code
%X This paper puts forward a framework for investigating Free and Open Source Software (F/OSS) developers' activities in both source code and mailing list repositories. We used data dumps of fourteen projects from the FLOSSMetrics (FM) retrieval system. Our intentions are (i) to present a possible methodology, with its advantages and disadvantages, which can benefit future researchers using some aspects of the FM retrieval system’s data dumps, and (ii) to discuss our initial research results on the contributions developers make to both coding and list activities.
%B 3rd Workshop on Public Data about Software Development (WoPDaSD 2008)
%P 49-54
%8 09/2008
%> https://flosshub.org/sites/flosshub.org/files/49-542008.pdf

%0 Journal Article
%J Empirical Software Engineering
%D 2008
%T A statistical framework for analyzing the duration of software projects
%A Sentas, P.
%A Lefteris Angelis
%A Ioannis Stamelos
%X The duration of a software project is a very important feature, closely related to its cost. Various methods and models have been proposed in order to predict not only the cost of a software project but also its duration. Since duration is essentially the random length of a time interval from a starting to a terminating event, in this paper we present a framework of statistical tools, appropriate for studying and modeling the distribution of the duration.
The idea for our approach comes from the parallelism of duration to the life of an entity, which is frequently studied in biostatistics by a certain statistical methodology known as survival analysis. This type of analysis offers great flexibility in modeling the duration and in computing various statistics useful for inference and estimation. As in any other statistical methodology, the approach is based on datasets of measurements on projects. However, one of the most important advantages is that we can use in our data information not only from completed projects, but also from ongoing projects. In this paper we present the general principles of the methodology for a comprehensive duration analysis and we also illustrate it with applications to known data sets. The analysis showed that duration is affected by various factors such as customer participation, use of tools, software logical complexity, user requirements volatility and staff tool skills.
%B Empirical Software Engineering
%V 13
%P 147-184
%G eng
%M WOS:000254743000003
%1 software engineering
%2 statistical modeling

%0 Journal Article
%J Journal of Systems and Software
%D 2008
%T Understanding knowledge sharing activities in free/open source software projects: An empirical study
%A Sowe, Sulayman K.
%A Ioannis Stamelos
%A Lefteris Angelis
%K debian
%K email
%K email archives
%K mailing list
%X Free/Open Source Software (F/OSS) projects are people-oriented and knowledge intensive software development environments. Many researchers focused on mailing lists to study coding activities of software developers. How expert software developers interact with each other and with non-developers in the use of community products has received little attention. This paper discusses the altruistic sharing of knowledge between knowledge providers and knowledge seekers in the Developer and User mailing lists of the Debian project.
We analyze the posting and replying activities of the participants by counting the number of email messages they posted to the lists and the number of replies they made to questions others posted. We found out that participants interact and share their knowledge a lot, their posting activity is fairly highly correlated with their replying activity, the characteristics of posting and replying activities are different for different kinds of lists, and the knowledge sharing activity of self-organizing Free/Open Source communities could best be explained in terms of what we called "Fractal Cubic Distribution" rather than the power-law distribution mostly reported in the literature. The paper also proposes what could be researched in knowledge sharing activities in F/OSS project mailing lists and for what purpose. The research findings add to our understanding of knowledge sharing activities in F/OSS projects. (C) 2007 Elsevier Inc. All rights reserved.
%B Journal of Systems and Software
%V 81
%P 431-446
%G eng
%M WOS:000254709200010
%1 information systems
%2 computational?
%> https://flosshub.org/sites/flosshub.org/files/JSS_0.pdf

%0 Conference Paper
%B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13)
%D 2007
%T Using Repository of Repositories (RoRs) to Study the Growth of F/OSS Projects: A Meta-Analysis Research Approach
%A Sowe, Sulayman
%A Lefteris Angelis
%A Stamelos, I.
%A Y. Manolopoulos
%X Free/Open Source Software (F/OSS) repositories contain valuable data and their usefulness in studying software development and community activities continues to attract a lot of research attention. A trend in F/OSS studies is the use of metadata stored in a repository of repositories or RoRs. This paper utilizes data obtained from one such RoR, FLOSSmole, to study the types of projects being developed by the F/OSS community.
We downloaded projects by topics data in five areas (Database, Internet, Software Development, Communications, and Games/Entertainment) from FLOSSmole’s raw and summary data of the SourceForge repository. Time series analysis shows the numbers of projects in the five topics are growing linearly. Further analysis supports our hypothesis that F/OSS development is moving “up the stack” from developer tools and infrastructure support to end-user applications such as Databases. The findings have implications for the interpretation of the F/OSS landscape, the utilization and adoption of open source databases, and problems researchers might face in obtaining and using data from RoRs.
%B OSS2007: Open Source Development, Adoption and Innovation (IFIP 2.13)
%S IFIP International Federation for Information Processing
%I Springer
%V 234/2007
%P 147 - 160
%8 2007
%G eng
%& 12
%R http://dx.doi.org/10.1007/978-0-387-72486-7_12
%> https://flosshub.org/sites/flosshub.org/files/Using%20Repository%20of%20Repositories.pdf

%0 Journal Article
%J Information and Software Technology
%D 2006
%T Identifying Knowledge Brokers that Yield Software Engineering Knowledge in OSS Projects
%A Sowe, Sulayman K.
%A Ioannis Stamelos
%A Lefteris Angelis
%K debian
%K email
%K email archives
%K expertise
%K knowledge sharing
%K mailing list
%K project success
%K social network analysis
%X Much research on open source software development concentrates on developer lists and other software repositories to investigate what motivates professional software developers to participate in open source software projects. Little attention has been paid to individuals who spend valuable time in lists helping participants on some mundane yet vital project activities. Using three Debian lists as a case study we investigate the impact of knowledge brokers and their associated activities in open source projects. Social network analysis was used to visualize how participants are affiliated with the lists.
The network topology reveals substantial community participation. The consequence of collaborating in mundane activities for the success of open source software projects is discussed. This research directly benefits the identification of knowledge experts in open source software projects.
%B Information and Software Technology
%V 48
%P 1025-1033
%8 11/2006
%G eng
%R 10.1016/j.infsof.2005.12.019
%> https://flosshub.org/sites/flosshub.org/files/IST-Vol-48-11-2006.pdf

%0 Journal Article
%J Information Systems Journal
%D 2002
%T Code quality analysis in open source software development
%A Ioannis Stamelos
%A Lefteris Angelis
%A Apostolos Oikonomou
%A Georgios L. Bleris
%K C
%K Code quality characteristics
%K functions
%K linux
%K metrics
%K open source development
%K software measurement
%K structural code analysis
%K Suse
%K user satisfaction
%X Proponents of open source style software development claim that better software is produced using this model compared with the traditional closed model. However, there is little empirical evidence in support of these claims. In this paper, we present the results of a pilot case study aiming: (a) to understand the implications of structural quality; and (b) to figure out the benefits of structural quality analysis of the code delivered by open source style development. To this end, we have measured quality characteristics of 100 applications written for Linux, using a software measurement tool, and compared the results with the industrial standard that is proposed by the tool. Another target of this case study was to investigate the issue of modularity in open source as this characteristic is being considered crucial by the proponents of open source for this type of software development. We have empirically assessed the relationship between the size of the application components and the delivered quality measured through user satisfaction.
We have determined that, up to a certain extent, the average component size of an application is negatively related to the user satisfaction for this application.
%B Information Systems Journal
%V 12
%P 43–60