| title | Quasar - Adversarial Defense Engine | ||||
|---|---|---|---|---|---|
| emoji | 🌌 | ||||
| colorFrom | indigo | ||||
| colorTo | purple | ||||
| sdk | docker | ||||
| pinned | false | ||||
| tags |
|
- Live Deployment (Hugging Face): https://huggingface.co/spaces/kalpeshparashar/quasar
- Source Code (GitHub): https://github.com/Kalpesh-ops/quasar
As enterprise systems increasingly integrate Large Language Models (LLMs) and Continuous Learning pipelines, they become vulnerable to a silent threat: Adversarial Data Poisoning and Prompt Injection.
Malicious actors inject syntactically valid but semantically manipulative payloads into continuous data streams (like user feedback logs or API requests). The goal is to silently corrupt the AI's training weights, trigger prompt injections, or bypass alignment safeguards. Traditional Web Application Firewalls (WAFs) and signature-based firewalls fail here because the payloads look like standard JSON text.
Quasar is an advanced OpenEnv simulation that trains autonomous AI security agents to act as next-generation SOC analysts. The agent must monitor a continuous enterprise data pipeline, maintain backend database integrity, and accurately isolate adversarial anomalies without disrupting legitimate business data.
To ensure maximum real-world utility, Quasar does not rely solely on toy strings. It actively fetches real adversarial prompt injections from live Hugging Face datasets (deepset/prompt-injections) to generate its network traffic.
The environment provides a strictly typed QuasarObservation Pydantic model containing:
recent_traffic: A batch of the 5 most recent JSON API requests hitting the pipeline.database_integrity_score: The current health of the backend model (0.0 - 100.0). Drops heavily if poisoned data is allowed through.active_firewall_rules: A list of currently isolated IPs.
The agent responds with a strictly typed QuasarAction:
command: Must be one ofallow_packet,flag_packet,isolate_ip, orpass.target_id: The specificpacket_idto flag orsource_ipto isolate.
- Traffic Generation:
traffic_gen.pypulls real adversarial data from Hugging Face and wraps it in simulated enterprise JSON payloads. - State Management:
env.pymanages the step logic. It processes the agent's action, calculates the damage of missed packets, drops the Database Integrity score if necessary, and calculates the reward based on catch-rate vs. false-positive rate. - API Serving:
app.pyexposes the environment via the standard OpenEnv FastAPI endpoints (/step,/reset,/state) alongside a human-readable UI at the root (/).
Quasar tests agents across three escalating threat levels:
-
Task 1: Volumetric Flood (Easy)
- Scenario: A single noisy IP is flooding the pipeline with identical prompt-injection payloads.
- Objective: The agent must identify the repetitive source and use
isolate_ipto block it at the firewall level.
-
Task 2: Contextual Prompt Injection (Medium)
- Scenario: Real-world adversarial instructions (sourced from HF datasets) are hidden within normal-looking user queries from varying IPs.
- Objective: The agent must inspect the payload bodies and use
flag_packetto drop specific malicious requests while allowing benign traffic through.
-
Task 3: Stealth Data Poisoning (Hard)
- Scenario: Payloads contain subtle statistical anomalies and deeply nested JSON structures designed to slowly drift and corrupt the backend model's weights over time.
- Objective: The agent must detect these deep semantic anomalies and quarantine them, balancing a high catch rate with a zero false-positive rate.
QUASAR/
│
├── server/ # API Server configuration
│ ├── __init__.py
│ └── app.py # FastAPI application & UI wrapper for OpenEnv
│
├── src/ # Core Environment Engine
│ ├── __init__.py
│ ├── env.py # The OpenEnv Step/Reset/State logic and Grader
│ ├── models.py # Strict Pydantic Action and Observation schemas
│ └── traffic_gen.py # HF Dataset fetching and mock enterprise log generation
│
├── .gitattributes
├── .gitignore
├── Dockerfile # Hugging Face deployment configuration (Python 3.10)
├── inference.py # Baseline LLM inference script for validation
├── openenv.yaml # OpenEnv task definitions and metadata
├── pyproject.toml # Build system and dependencies
├── README.md # You are here
└── uv.lock # Fast dependency resolution lockfile
- OpenEnv Core: Meta's standard API specification for RL environments.
- Hugging Face Datasets: Used to stream real-world adversarial prompt injections (
deepset/prompt-injections) directly into the simulation. - FastAPI & Uvicorn: Powers the asynchronous HTTP server for agent interaction.
- Pydantic: Ensures strict, deterministic typing for LLM inputs and outputs to prevent hallucinated actions.
- Docker & uv: Used for lightning-fast containerized deployment on Hugging Face Spaces.
- OpenAI Python SDK: Used in the baseline script to test the environment using frontier models.
Ensure you have uv installed, then run the OpenEnv validator to confirm schema compliance:
openenv validateThe inference.py script proves the environment is solvable. It uses an LLM to navigate the environment, strictly adhering to the [START], [STEP], and [END] logging formats.
Configure your environment variables:
export API_BASE_URL="[https://api.groq.com/openai/v1](https://api.groq.com/openai/v1)" # Or OpenAI/vLLM equivalent
export MODEL_NAME="llama-3.3-70b-versatile"
export HF_TOKEN="your_api_key_here"Run the baseline:
python inference.pyNote: In baseline testing with llama-3.3-70b-versatile, the model achieved a perfect 1.0 score on Task 1, but dropped to ~0.6 on Tasks 2 and 3, mathematically proving that Quasar provides a genuine, difficult challenge for frontier LLMs rather than being a solved toy environment.
(This section provides static verification of all mandatory judging criteria)
- Automated Validation: Passes
openenv validatewith[OK] Ready for multi-mode deployment. - Container Limits:
Dockerfileoptimized withuvto build and execute well within the strict 2 vCPU / 8GB RAM constraint. - Inference Runtime: The baseline script executes 45 sequential API steps (3 tasks * 15 steps) comfortably under the 20-minute execution limit.
- HF Space Connectivity: Server binds to
0.0.0.0:7860and returnsHTTP 200 OKon both the root/UI and the OpenEnv/resetand/stependpoints.
- Configuration:
openenv.yamlcontains all mandatory metadata, entrypoints, and exact task definitions. - Strict Typing: Utilizes Pydantic for
QuasarObservation(State),QuasarAction(Action), andQuasarReward(Reward) models to enforce deterministic LLM output parsing. - Interface: Fully implements the
Environmentbase class withstep(),reset(), andstate()asynchronous methods returning typedStepResultobjects.
Contains exactly 3 graded tasks mapping to a strict Easy → Medium → Hard progression:
task_1_volumetric_flood(Easy)task_2_contextual_injection(Medium)task_3_stealth_poisoning(Hard)
- Grader Determinism: The reward function mathematically calculates a deterministic float between
0.0and1.0based on backend integrity, catch-rate, and false-positive penalties.
- Location: Deployed at the exact root directory.
- Client Restriction: Strictly utilizes the mandatory
OpenAIclient (viaAsyncOpenAI). - Environment Variables: Dynamically loads
API_BASE_URL,MODEL_NAME, andHF_TOKENprior to client initialization. - Strict stdout Logging: Implements the exact mandatory regex-parsable stdout logs:
[START] task={task} env={env} model={model}[STEP] step={step} action={action} reward={reward} done={done} error={error}[END] success={success} steps={steps} score={score} rewards={rewards}