# behavioral-testing

Here are 7 public repositories matching this topic...

Basaltlabs-app / Gauntlet

Community-driven behavioral reliability benchmark for LLMs. 88 probes across 24 categories, deterministic TrustScore, hardware-stratified community rankings, performance prediction. Every test contributes to the community dataset.

benchmark mcp community-driven model-evaluation ai-evaluation llm ollama sycophancy hallucination-detection llm-testing hardware-benchmark ai-trust trust-scoring behavioral-testing llm-benchmark deterministic-scoring

Updated Apr 13, 2026
Python

senaayy / Computational-Cognitive-Lab

python machine-learning neuroscience computational-neuroscience cognitive-science mne-python biomedical-engineering eeg-analysis stroop-test neurotechnology behavioral-testing erp-analysis

Updated Dec 12, 2025
Python

stef41 / modeldiff

Behavioral regression testing for LLMs — diff, drift, fingerprint. Zero deps.

python nlp machine-learning evaluation regression-testing fingerprinting model-comparison drift-detection llm behavioral-testing

Updated Apr 10, 2026
Python

Swanand33 / llm-behave

Behavioral testing for LLM applications. pytest plugin with semantic assertions, multi-turn conversation testing, and drift detection. No LLM judge needed.

python testing ai pytest openai pytest-plugin llm langchain ai-testing llm-testing behavioral-testing

Updated Mar 14, 2026
Python

stef41 / modeldiffx

Model behavioral diffing - compare LLM outputs across versions, detect regressions.

python testing regression-testing model-evaluation llm behavioral-testing

Updated Apr 11, 2026
Python

SyncTek-LLC / specterqa

AI persona-based behavioral testing for web apps. No test scripts. YAML-configured. Vision-powered.

python testing cli qa ai vision developer-tools code-of-conduct software-quality persona playwright behavioral-testing trust-index

Updated Mar 21, 2026
Python

GenesisClawbot / llm-drift

LLM drift detector — know within 5 min when GPT-4o, Claude, or Gemini silently changes behaviour. Open source, self-hostable.

saas gemini openai regression-testing gpt claude mlops drift-detection production-ml model-testing ai-monitoring llm llmops prompt-testing llm-monitoring llm-observability behavioral-testing

Updated Apr 15, 2026
Python
