Finding Source Code on the Web for Remix and Reuse

Publication Type: Book
Year of Publication: 2013
Authors: Bajracharya, SK
Secondary Authors: Sim, SE, Gallardo-Valencia, RE
Secondary Title: Infrastructure for Building Code Search Applications for Developers
Pagination: 135 - 164
Publisher: Springer New York
Place Published: New York, NY
ISBN Number: 978-1-4614-6596-6
Keywords: code search, flossmole cited

The large availability of open source code on the Web provides great opportunities to build useful code search applications for developers. Building such applications requires addressing several challenges inherent in collecting and analyzing code from open source repositories to make it available for search. An infrastructure that supports collection, analysis, and search services for open source code available on the Web can greatly facilitate building effective code search applications. This chapter presents such an infrastructure, called Sourcerer, that facilitates collection, analysis, and search of source code available in code repositories on the Web. This chapter provides useful information to researchers and implementors of code search applications interested in harnessing the large availability of source code in the repositories on the Web. In particular, this chapter highlights key aspects of Sourcerer that support combining Software Engineering and Information Retrieval techniques to build effective code search applications.


In "further reading": "Although not a code search infrastructure, FLOSSmole [13] is another major undertaking in building large collection of metadata about open source projects on the Web. Currently, FLOSSmole reports a massive data collection of more than 500,000 open source projects in its web site [32]. For code search infrastructure builders, now it is possible to leverage FLOSSmole’s project metadata to build code repositories instead of spending an effort in implementing custom spiders and crawlers for

