Analyzing and mining a code search engine usage log

Publication Type: Journal Article
Year of Publication: 2012
Authors: Bajracharya, SK, Lopes, CV
Secondary Title: Empirical Software Engineering
Pagination: 424-466
Date Published: 8/2012
ISSN Number: 1573-7616
Keywords: code search, koders, search, search engine, topics

This paper presents an analysis of a year-long usage log of Koders, the first commercially available Internet-scale code search engine. The usage log comprises about ten million activities from more than three million users. Analysis of the usage data shows that, despite attracting a large number of visitors, Koders sees very sparse usage and lacks regular usage from many of its users. When compared to Web search, search behavior in Koders showed many similar patterns. A topic modeling analysis of the usage data shows what topics users of Koders are looking for. Observations on the prevalence of these topics among the users, and on how search and download activities vary across topics, lead to the conclusion that users who find code search engines usable are those who already know, to a high level of specificity, what to look for. This paper also presents a general categorization of these topics that provides insights into the different ways code search engine users express their queries. It identifies various forms of queries in the Koders log and the kinds of results addressed by the queries. It also provides several suggestions for improvements in code search engines based on the analysis of usage, topics, and query forms. The work presented in this paper is the first of its kind to reveal several insights into the usage of an Internet-scale code search engine.

