code-rag

Semantic code search for your entire codebase. Ask questions in plain English and get relevant code snippets with source locations. Instead of grepping for function names, ask “authentication logic” and find all related auth code across your project.

Why Use Code-RAG?

  • Understand unfamiliar codebases: Ask questions instead of reading everything
  • Find examples: “error handling with retries” finds all relevant patterns
  • Refactoring aid: Locate all code related to a feature you’re changing
  • Documentation: Extract context for writing docs or onboarding

Key Features

  • MCP Server Integration: Works seamlessly with Claude Code and other MCP-compatible coding assistants
  • Semantic Search: AI-powered understanding beyond keyword matching
  • Syntax-Aware Chunking: Intelligent code chunking for Python, JavaScript, TypeScript, Go, Rust, Java, C, and C++
  • Multiple Database Backends: Supports ChromaDB and Qdrant
  • Flexible Embedding Models:
    • Local: nomic-ai/CodeRankEmbed (code-optimized, requires GPU)
    • Cloud: OpenAI, Azure, Google Vertex AI, Cohere, AWS Bedrock, and more via LiteLLM
  • Reranking Support: Optional result reranking for improved accuracy
  • Shared Embedding Server: Reduces memory footprint when running multiple instances

Quick Start

Installation

# Using uv (recommended)
uvx --from code-rag-mcp code-rag-setup --install

# Using pip
pip install code-rag-mcp && code-rag-setup

# One-command installer
curl -sSL https://raw.githubusercontent.com/qduc/code-rag/main/scripts/install.sh | bash

Add to Claude Code

claude mcp add code-rag -- code-rag-mcp

That’s it! Claude can now search your codebase semantically.

CLI Usage

code-rag-cli --path /path/to/your/project  # Index and search a project
code-rag-cli --reindex  # Force reindex
code-rag-cli --results 10  # More results
code-rag-cli --model text-embedding-3-small  # OpenAI embeddings
code-rag-cli --database qdrant  # Use Qdrant
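
These flags can be combined in a single command - a hypothetical example, assuming the flags compose as shown (the path is a placeholder):

# Index a project with OpenAI embeddings, store vectors in Qdrant, and return 10 results
code-rag-cli --path /path/to/your/project --model text-embedding-3-small --database qdrant --results 10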

How It Works

  1. Scans your codebase (respects .gitignore)
  2. Chunks code intelligently using syntax-aware parsing
  3. Embeds chunks as vectors using the configured embedding model (local or cloud)
  4. Stores in vector database (ChromaDB or Qdrant)
  5. Searches semantically when you query

The architecture is pluggable - swap databases and embedding models, or add new ones.
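
In day-to-day use, the pipeline maps onto plain CLI runs - a sketch assuming the index persists between runs (which the --reindex flag implies):

# First run: scan, chunk, embed, and store the project in the vector database
code-rag-cli --path /path/to/your/project

# Later runs reuse the stored index; force a rebuild after large changes
code-rag-cli --path /path/to/your/project --reindex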

Configuration

Configure via environment variables or config files:

  • CODE_RAG_EMBEDDING_MODEL: Embedding model (default: nomic-ai/CodeRankEmbed)
  • CODE_RAG_DATABASE_TYPE: Database backend - chroma or qdrant (default: chroma)
  • CODE_RAG_CHUNK_SIZE: Chunk size in characters (default: 1024)
  • CODE_RAG_RERANKER_ENABLED: Enable result reranking (default: false)
  • CODE_RAG_SHARED_SERVER: Share embedding server across instances (default: true)
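
For example, to run against Qdrant with OpenAI embeddings and reranking enabled - a sketch using only the variables above, with illustrative values:

# Illustrative environment setup; adjust values to your deployment
export CODE_RAG_DATABASE_TYPE=qdrant
export CODE_RAG_EMBEDDING_MODEL=text-embedding-3-small
export CODE_RAG_RERANKER_ENABLED=true

code-rag-cli --path /path/to/your/project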

Tech Stack

Python, with ChromaDB or Qdrant for vector storage, LiteLLM for cloud embedding providers, and nomic-ai/CodeRankEmbed for local, code-optimized embeddings.

Check out the GitHub repository for detailed documentation, API usage, and troubleshooting.