# code-rag

Semantic code search for your entire codebase. Ask questions in plain English and get relevant code snippets with source locations. Instead of grepping for function names, ask "authentication logic" and find all related auth code across your project.
## Why Use Code-RAG?
- Understand unfamiliar codebases: Ask questions instead of reading everything
- Find examples: “error handling with retries” finds all relevant patterns
- Refactoring aid: Locate all code related to a feature you’re changing
- Documentation: Extract context for writing docs or onboarding
## Key Features
- MCP Server Integration: Works seamlessly with Claude Code and other MCP-compatible coding assistants
- Semantic Search: AI-powered understanding beyond keyword matching
- Syntax-Aware Chunking: Intelligent code chunking for Python, JavaScript, TypeScript, Go, Rust, Java, C, and C++
- Multiple Database Backends: Supports ChromaDB and Qdrant
- Flexible Embedding Models:
  - Local: `nomic-ai/CodeRankEmbed` (code-optimized, requires GPU)
  - Cloud: OpenAI, Azure, Google Vertex AI, Cohere, AWS Bedrock, and more via LiteLLM
- Reranking Support: Optional result reranking for improved accuracy
- Shared Embedding Server: Reduces memory footprint when running multiple instances
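Syntax-aware chunking is easiest to see with a toy sketch. code-rag itself uses Tree-sitter across many languages; the simplified version below uses Python's built-in `ast` module to split a source file at top-level functions and classes, capturing the core idea that chunk boundaries follow syntax rather than fixed character counts. The function name and behavior here are illustrative, not the tool's actual API.

```python
import ast

def chunk_python_source(source: str) -> list[str]:
    """Split Python source into chunks at top-level defs/classes.

    Toy stand-in for syntax-aware chunking; code-rag itself uses
    Tree-sitter and supports many languages, not just Python.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno/end_lineno are 1-based and inclusive
            chunks.append("\n".join(lines[node.lineno - 1 : node.end_lineno]))
    return chunks

example = '''
def login(user):
    return check_password(user)

class Session:
    def refresh(self):
        pass
'''
```

Each chunk stays a complete, syntactically meaningful unit, so the embedding for "login" covers the whole function body rather than an arbitrary slice of the file.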
## Quick Start

### Installation

```bash
# Using uv (recommended)
uvx --from code-rag-mcp code-rag-setup --install

# Using pip
pip install code-rag-mcp && code-rag-setup

# One-command installer
curl -sSL https://raw.githubusercontent.com/qduc/code-rag/main/scripts/install.sh | bash
```
### Add to Claude Code

```bash
claude mcp add code-rag -- code-rag-mcp
```

That's it! Claude can now search your codebase semantically.
## CLI Usage

```bash
code-rag-cli --path /path/to/your/project
code-rag-cli --reindex                       # Force reindex
code-rag-cli --results 10                    # More results
code-rag-cli --model text-embedding-3-small  # OpenAI embeddings
code-rag-cli --database qdrant               # Use Qdrant
```
## How It Works

1. Scans your codebase (respects `.gitignore`)
2. Chunks code intelligently using syntax-aware parsing
3. Embeds chunks as vectors using ML models
4. Stores them in a vector database (ChromaDB or Qdrant)
5. Searches semantically when you query

Pluggable architecture: swap databases or embedding models, or add new ones.
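The embed-store-search part of this pipeline can be sketched in a few lines of Python. This is a toy illustration, not code-rag's implementation: it fakes embeddings with bag-of-words counts and stores vectors in a plain list instead of ChromaDB or Qdrant, but the index-then-query flow is the same shape.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words counts (code-rag uses real ML models)
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Steps 1-2: scan and chunk (here the chunks are given directly)
chunks = [
    "def login(user, password): check credentials and issue token",
    "def render_chart(data): draw bars on canvas",
]

# Steps 3-4: embed each chunk and store it in a (toy) vector database
index = [(chunk, embed(chunk)) for chunk in chunks]

# Step 5: search semantically at query time
def search(query: str) -> str:
    return max(index, key=lambda item: cosine(embed(query), item[1]))[0]
```

Swapping the list for ChromaDB or Qdrant, and the bag-of-words for a learned embedding model, is exactly the kind of substitution the pluggable architecture allows.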
## Configuration

Configure via environment variables or config files:

- `CODE_RAG_EMBEDDING_MODEL`: Embedding model (default: `nomic-ai/CodeRankEmbed`)
- `CODE_RAG_DATABASE_TYPE`: Database backend, `chroma` or `qdrant` (default: `chroma`)
- `CODE_RAG_CHUNK_SIZE`: Chunk size in characters (default: `1024`)
- `CODE_RAG_RERANKER_ENABLED`: Enable result reranking (default: `false`)
- `CODE_RAG_SHARED_SERVER`: Share embedding server across instances (default: `true`)
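The variables above follow the usual env-with-default pattern. As a sketch of how such settings might resolve (this is a generic illustration using the documented names and defaults, not code-rag's actual config loader):

```python
import os

def get_config() -> dict:
    """Resolve settings from environment variables, falling back to
    the documented defaults. Illustrative only; key names on the left
    are made up for this sketch."""
    return {
        "embedding_model": os.getenv("CODE_RAG_EMBEDDING_MODEL", "nomic-ai/CodeRankEmbed"),
        "database_type": os.getenv("CODE_RAG_DATABASE_TYPE", "chroma"),
        "chunk_size": int(os.getenv("CODE_RAG_CHUNK_SIZE", "1024")),
        "reranker_enabled": os.getenv("CODE_RAG_RERANKER_ENABLED", "false").lower() == "true",
        "shared_server": os.getenv("CODE_RAG_SHARED_SERVER", "true").lower() == "true",
    }
```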
## Tech Stack
- Python 3.10+
- ChromaDB / Qdrant for vector storage
- sentence-transformers for embeddings
- Tree-sitter for syntax-aware parsing
- LiteLLM for cloud embedding providers
Check out the [GitHub repository](https://github.com/qduc/code-rag) for detailed documentation, API usage, and troubleshooting.