
John M.

@john_whmatrix

Semantic Indexing Engineer | RAG Pipelines | FAISS and E5 Large V2

United States
English
About me
I design and deliver production-ready semantic indexing systems for RAG, semantic search, and document retrieval. I transform raw text into structured vector datasets using semantic chunking, dense embeddings, FAISS indexing, and metadata alignment, with validation so that retrieval stays reliable over time. Clients use my indexes to power document Q&A, compliance search, knowledge base retrieval, and research discovery. I have applied this approach across multiple research organizations and more than 100 datasets. The deliverables are compatible with LangChain, LlamaIndex, Haystack, pgvector, and Pinecone.

Skills


See my services

AI Technology Consulting
I will run a RAG readiness audit with indexing risk assessment
AI Technology Consulting
I will build a production-ready FAISS index for your RAG pipeline

Portfolio

Work experience

Independent

Freelance • 2 yrs 10 mos

Semantic Indexing Engineer

Mar 2025 - Present • 1 yr 2 mos

Build production-grade semantic indexing pipelines for RAG and search systems. Deliver validated FAISS indexes, chunked corpora, and metadata manifests ready for LangChain/LlamaIndex integration.
▪ Indexed 100+ datasets across legal, regulatory, scientific, and general knowledge domains
▪ Applied methodology across research institutions spanning economics, migration, water policy, and labor
▪ Developed standardized deliverable spec with quality gates, validation thresholds, and audit contracts

ML Infrastructure Engineer

Mar 2025 - Present • 1 yr 2 mos

Configure and maintain local LLM inference environments on A6000 GPU hardware for development and evaluation workflows.
▪ Quantized and deployed Llama 70B, Mixtral 8x7B, DeepSeek-R1, and CodeLlama models
▪ Managed multiple GGUF and embedding models for retrieval and code generation tasks
▪ Built dataset preparation and telemetry pipelines supporting live model evaluation

Research Corpus Analyst

Jun 2025 - Dec 2025 • 6 mos

Built and deployed semantic search indexes for policy research organizations, enabling cross-document discovery across large institutional archives.
▪ Indexed corpora from research organizations working in economics, migration, water policy, and labor
▪ Designed persona-driven query evaluation (journalist, analyst, and researcher use cases)
▪ Achieved consistently strong retrieval quality across all tested corpora