Raghavan Muthuregunathan

BIO

Raghavan Muthuregunathan is a search engine expert who leads the LinkedIn Search AI team. Apart from his day job, he volunteers for the UN ITU and genaicommons.org (part of the Linux Foundation’s LF AI & Data). He is also an avid participant in lablab.ai hackathons.

Title

Locally deployable Semantic Code Search - Increase productivity of new contributors

ABSTRACT

Imagine a new contributor who is trying to solve a simple beginner task but is overwhelmed by the complexity of the repository. Working out which file to change and where to make the change can be time-consuming.

Popular code search tools rely on keyword matching rather than natural language, so they do not semantically understand the codebase. This talk will explore building a locally deployable semantic code search tool that simplifies navigating complex codebases, which is particularly beneficial for new contributors onboarding to open-source GitHub projects.

We show how to build a locally deployable, natural-language code search tool using open-source LLMs that helps beginners understand any repository and start contributing.

Leveraging the RAG (Retrieval Augmented Generation) paradigm, we show a two-step process: first, retrieving the files relevant to the task using vector search; second, using an open-source LLM to generate answers to natural language questions about the task at hand.
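
The two steps can be illustrated with a minimal sketch. It is not the implementation presented in the talk: toy word-count vectors stand in for a real embedding model and vector index, the call to the LLM is left out, and every name here (`embed`, `retrieve`, `build_prompt`, the sample files) is hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model (assumption): word-count vector.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, files, k=2):
    # Step 1: rank files by similarity to the question and keep the top k.
    q = embed(question)
    ranked = sorted(files, key=lambda p: cosine(q, embed(files[p])), reverse=True)
    return ranked[:k]

def build_prompt(question, files, paths):
    # Step 2 (input side): pack the retrieved files into a prompt for an LLM.
    context = "\n\n".join(f"# File: {p}\n{files[p]}" for p in paths)
    return (
        "Use the files below to answer the question.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )

# Hypothetical repository contents, keyed by file path.
files = {
    "auth/login.py": "def login(user, password): check password and create session token",
    "search/index.py": "def build_index(documents): tokenize documents and store postings",
    "README.md": "project overview and setup instructions",
}

question = "Where is the password checked during login?"
top = retrieve(question, files, k=1)
prompt = build_prompt(question, files, top)
print(top)
```

In a real deployment, the prompt would be passed to a locally hosted open-source LLM, and `embed` would be replaced by a sentence or code embedding model backed by a vector store.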

The result is a solution that anyone can build, test, and deploy locally against their own repositories, using only open-source technologies.