Analyzing cloning evolution in the Linux kernel

Title: Analyzing cloning evolution in the Linux kernel
Publication Type: Journal Article
Year of Publication: 2002
Authors: Antoniol, G, Villano, U, Merlo, E, Di Penta, M
Secondary Title: Information and Software Technology
Accession Number: WOS:000178367900005
Keywords: cvs, kernel, lines of code, linux, loc, project success, source code

Identifying code duplication in large multi-platform software systems is a challenging problem. This is due to a variety of reasons including the presence of high-level programming languages and structures interleaved with hardware-dependent low-level resources and assembler code, the use of GUI-based configuration scripts generating commands to compile the system, and the extremely high number of possible different configurations. This paper studies the extent and the evolution of code duplications in the Linux kernel. Linux is a large, multi-platform software system; it is based on the Open Source concept, and so there are no obstacles in discussing its implementation. In addition, it is decidedly too large to be examined manually: the current Linux kernel release (2.4.18) is about three million LOCs. Nineteen releases, from 2.4.0 to 2.4.18, were processed and analyzed, identifying code duplication among Linux subsystems by means of a metric-based approach. The obtained results support the hypothesis that the Linux system does not contain a relevant fraction of code duplication. Furthermore, code duplication tends to remain stable across releases, thus suggesting a fairly stable structure, evolving smoothly without any evidence of degradation. © 2002 Elsevier Science B.V. All rights reserved.
Full Text: infsoft2002.pdf (280.22 KB)