
CodeAncestry
Ask your Git history anything. Why did this function change? Who refactored this module? CodeAncestry answers in seconds — semantically.
The Invisible History Problem
Every large repository carries thousands of commits, but engineers spend hours running git log --grep and reading through diffs to understand why a decision was made. CodeAncestry was built to transform that entire history into a queryable semantic knowledge base — connect your GitHub repo, and ask questions as if you were talking to the engineer who wrote every line of code.
Hybrid Query Modes
Intelligently classifies queries as Temporal ("what changed last month"), Semantic ("why was authentication refactored"), or Hybrid — routing each to the right retrieval strategy automatically.
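The page does not say how queries are classified, so the following is only a minimal sketch of the idea, assuming a simple keyword heuristic. The cue lists, the `QueryMode` enum, and `classify_query` are hypothetical names for illustration; a production classifier might well use an LLM instead.

```python
import re
from enum import Enum


class QueryMode(Enum):
    TEMPORAL = "temporal"
    SEMANTIC = "semantic"
    HYBRID = "hybrid"


# Hypothetical cue lists (assumptions, not taken from CodeAncestry itself).
TEMPORAL_CUES = re.compile(
    r"\b(last|past|since|yesterday|today|ago|weeks?|months?|years?|"
    r"recent|recently|latest)\b",
    re.IGNORECASE,
)
SEMANTIC_CUES = re.compile(
    r"\b(why|how|reason|refactor\w*|purpose|explain|motivation)\b",
    re.IGNORECASE,
)


def classify_query(question: str) -> QueryMode:
    """Route a natural language question to a retrieval strategy."""
    has_time = bool(TEMPORAL_CUES.search(question))
    has_meaning = bool(SEMANTIC_CUES.search(question))
    if has_time and has_meaning:
        return QueryMode.HYBRID
    if has_time:
        return QueryMode.TEMPORAL
    # With no time cue present, fall back to semantic search.
    return QueryMode.SEMANTIC
```

Under these assumptions, a Temporal query would filter commits by date, a Semantic query would go to vector search, and a Hybrid query would apply both.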
Contributor Analytics
Visual commit timelines with relevance scoring, contributor dashboards, and Recharts-powered analytics — understand not just what changed, but who shaped the codebase and when.
Snowflake-Powered RAG
Devntra designed a full RAG pipeline: Gemini AI analyzes and summarizes every commit, Snowflake Cortex generates 768-dimensional vector embeddings, and VECTOR_COSINE_SIMILARITY retrieves the most relevant commits for any natural language question. Mistral-7B then generates the answer from those commits.
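To make the retrieval step concrete, here is a small sketch in plain Python of what cosine-similarity ranking computes. It is illustrative only: in the pipeline described above, the comparison runs inside Snowflake. The function names and the idea of keying vectors by commit SHA are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as unrelated to everything.
    return dot / (norm_a * norm_b)


def top_k_commits(query_vec, commit_vecs, k=3):
    """Return the k (sha, score) pairs most similar to the query vector.

    commit_vecs is a hypothetical mapping of commit SHA -> embedding.
    """
    scored = [
        (sha, cosine_similarity(query_vec, vec))
        for sha, vec in commit_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The top-scoring commits would then be passed to the language model as context, which is what lets the answer cite specific commits.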
Snowflake Cortex Vector DB
All commits are embedded using EMBED_TEXT_768 and stored as vectors in Snowflake. Queries run VECTOR_COSINE_SIMILARITY at warehouse scale — fast, accurate, and citation-aware.
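A retrieval query along these lines might look like the sketch below. The table and column names (`commits`, `sha`, `summary`, `embedding`) and the embedding model name are assumptions, since the page does not state them. The `%(name)s` placeholders assume the pyformat binding style of the Snowflake Python connector.

```python
# Hypothetical schema: commits(sha, summary, embedding VECTOR(FLOAT, 768)).
# The model name 'snowflake-arctic-embed-m' is an assumption; the source only
# says that EMBED_TEXT_768 is used.
RETRIEVAL_SQL = """
SELECT
    sha,
    summary,
    VECTOR_COSINE_SIMILARITY(
        embedding,
        SNOWFLAKE.CORTEX.EMBED_TEXT_768('snowflake-arctic-embed-m', %(question)s)
    ) AS relevance
FROM commits
ORDER BY relevance DESC
LIMIT %(top_k)s
"""


def build_retrieval_query(question: str, top_k: int = 5):
    """Return the SQL text and bind parameters for a top-k commit search."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    return RETRIEVAL_SQL, {"question": question, "top_k": top_k}
```

With an open connector cursor, the pair returned by `build_retrieval_query` could be passed straight to `cursor.execute(sql, params)`.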
Voice-First Interface
Ask questions using your microphone and receive spoken AI answers — making repository exploration as natural as a conversation with a colleague who knows every commit by heart.
GitHub OAuth Integration
Seamless GitHub OAuth connects directly to any repository. All secrets are managed via 1Password, and JWTs secure every API route.
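For readers unfamiliar with how a JWT protects a route, here is a minimal sketch of signing and verifying an HS256 token using only the Python standard library. The algorithm choice, function names, and claims are assumptions, since the page does not describe the actual implementation. A real service would more likely rely on a vetted library such as PyJWT.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_token(payload: dict, secret: bytes) -> str:
    """Build a compact header.payload.signature token (HS256 assumed)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return f"{signing_input}.{_b64url(_sign(signing_input, secret))}"


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Return the payload if the signature and expiry check out."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    expected = _sign(f"{header_b64}.{payload_b64}", secret)
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    payload = json.loads(_b64url_decode(payload_b64))
    exp = payload.get("exp")
    current = time.time() if now is None else now
    if exp is not None and current >= exp:
        raise ValueError("token expired")
    return payload
```

An API route would call `verify_token` on the bearer token of each request and reject the request if a `ValueError` is raised.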
Query classification → embedding → vector similarity → LLM response, all powered by Snowflake Cortex.
Temporal, Semantic, and Hybrid search modes intelligently classified from natural language questions.
Ask questions using voice commands and receive AI-generated answers read back aloud.
“I asked CodeAncestry why a critical API endpoint was refactored six months ago. It pulled up three relevant commits, explained the reasoning, and cited the exact engineers involved — in under three seconds. That used to take me an hour of digging through git blame.”
Ready to build your AI search product?
Let's discuss how our RAG pipeline expertise can make any dataset semantically queryable.