Abstract | The development of Open Source Software (OSS) projects is a process of collective innovation within online communities. This paper addresses the challenge of efficiently mining data from OSS web repositories and building models to study the features of OSS communities. Collecting data for OSS community studies is nontrivial because most OSS projects are developed by distributed developers using web tools. We design a mining process that combines web mining and database mining to identify, extract, filter, and analyze data, and we analyze the difficulties of mining OSS community data. Our work provides a general solution for researchers who want to apply advanced techniques, such as web mining, data mining, statistics, and algorithms, to collect and analyze online community data.
|