I will test your llm chatbot for jailbreaks, data leaks and unsafe behavior

V
vladislav_boev
V
vladislav_boev
Vladislav Boev

About this gig

LLM Behavioral & Safety Testing by a QA Lead

I'm a QA Lead (6+ yrs) applying systematic test design to AI. I build test sets that surface where your LLM-powered bot behaves unsafely or breaks its own rules jailbreaks, prompt injection, prompt leaks, hallucinations, refusal failures, and data-access risks.

How it works:

  1. You share your system prompt + how the bot is used
  2. I map the risk zones specific to your use-case
  3. I build the test cases (input expected behavior + severity + rationale)
  4. You get JSONL + CSV + a readable report ready for your eval harness

Premium: I also run the tests against your model and deliver a findings report each failure with input, expected vs actual, and severity.

What I don't do: I don't judge factual or domain accuracy (legal, medical, etc.) that needs a subject-matter expert. I test behavior, safety & instruction-following.

Need a large or ongoing set? Message me for a custom quote. Written-first, GMT+7. Message me before ordering.

Get to know Vladislav Boev

Vladislav Boev

Senior QA Lead and Test Architect

  • FromVietnam
  • Member sinceJun 2026
  • Avg. response time1 hour
  • Languages

    Russian, English
QA Lead with 6+ yrs. Test at architecture level: data flows, integrations, system design, risks. Services: QA Audit: process + test code review. Top risks + roadmap. Test Strategy: levels, tools, effort estimates. Auto-tests: Python + Playwright + Pytest (UI/API). Code Review for test automation. Requirements analysis: find contradictions, gaps, risks. I don't: CI/CD setup (only requirements), performance testing. Written-first. Clear reports. GMT+7 (Asia). Message me before ordering.

Related tags