I will audit and optimize your RAG vector search performance

valhallasoft
Martin Poli

About this gig

Your RAG is in production but returning bad results. Latency is high. Costs are climbing. Hallucinations slip through. Sound familiar?


I audit and fix RAG pipelines that look right on paper but fail in the real world. I bring 10+ years of production backend work and currently lead the AI search migration for one of Latin America's largest retailers (50K+ products, 1M+ daily users).


What I audit:

  • Embedding model fit for your domain
  • Chunking strategy and overlap
  • Retrieval recall and precision (with eval set)
  • Reranking effectiveness
  • Hybrid search weights (keyword vs semantic)
  • Latency per stage and cost per query
  • Hallucination patterns

What you get:

  • Written diagnostic with prioritized fixes
  • Code changes for top issues (Standard / Premium)
  • Eval set so you can measure progress
  • Monitoring setup (Premium)

Stack: Python, OpenAI, Anthropic, Pinecone, Weaviate, Qdrant, pgvector, LangChain.


Send me your stack and one example query that fails. I will tell you what is likely broken before you pay.

Get to know Martin Poli

Martin Poli

Senior RAG and AI Search Engineer for Backend at Scale

  • From: Uruguay
  • Member since: Mar 2020
  • Languages: English
Senior Platform Engineer with 10+ years building production systems at scale. Currently leading platform infra and AI search for Argentina's largest retail chain (200+ stores, 1M+ users/day), replacing Google Search API with RAG-based semantic search across 50K+ products.

What I do best:

  • RAG, embeddings, OpenAI/Anthropic/Bedrock
  • Vector DBs: Pinecone, Weaviate, Qdrant, pgvector
  • Backend at scale: Python, Go, Node.js, PHP 8
  • AWS EKS, Karpenter, Terraform, multi-account IaC

Have a search problem or an LLM pipeline that won't ship? Send me your stack.
