Mithra is now liveRead the docs

Find Security Risks in LLM APIs

Mithra is an LLM security scanner that attacks REST-accessible models for risks like prompt injection, data leakage, jailbreaks, toxicity, misinformation, and more.

Mithra Logo
What Mithra Does — Key Features

A Penetration-Testing Toolkit for LLMs

Probes language models with a curated library of adversarial prompts to surface weaknesses: hallucination, data leakage, prompt injection, misinformation, toxicity, jailbreaks, and more.

Prompt Injection

Tests for input-driven behavior changes, context bleeding, and access control bypass via crafted prompts.

Leakage & Misinformation

Evaluates data exfiltration, hallucinations, and harmful content including toxicity and policy bypass.

Curated Attack Library

Research-backed prompts (e.g., DAN jailbreaks, encoding bypasses, exfiltration) mapped to known categories.

Coverage of Known Vulnerability Categories

Backed by Research. Focused on Real Exploits.

Pre-defined attacks cover documented weaknesses including jailbreaks, encoding bypasses, and exfiltration techniques. Scenarios are curated rather than dynamically generated.

REST API Support

Works with any LLM accessible via HTTP REST calls—OpenAI, Anthropic, or self-hosted models.

Pre-defined Attacks

Curated, research-backed adversarial prompts (e.g., jailbreaks, encoding bypasses) mapped to vulnerability categories.

How the REST Scanning Works in Practice

Programmatic, Research-Grounded Testing

Target any LLM with an HTTP endpoint. Mithra generates REST requests, applies curated attacks, detects exploit signals, and produces human- and machine-readable reports.

1) Send REST Requests

Uses a REST generator to articulate prompts and parameters over HTTP to your target model.

2) Apply Attacks

Runs a variety of adversarial prompts to test leakage, jailbreaks, policy bypass, and more.

3) Detect Signals

Detectors analyze responses to assess whether an exploit succeeded or behavior is undesirable.

4) Report Results

Get concise summaries and machine-readable outputs for automation and auditing.

TL;DR

Fast, Research-Grounded LLM API Scanning

  • A library of known attack prompts
  • Programmatic testing over HTTP for any REST-exposed model
  • Automatic detectors and clear reports