
SentinelML

PyPI version | Python 3.8+ | License: MIT

Unified Reliability Engine for AI/ML Systems

SentinelML is a comprehensive framework for monitoring, evaluating, and ensuring the reliability of machine learning systems across traditional ML, deep learning, generative AI, RAG pipelines, and agentic systems.


πŸš€ Features

Multi-Domain Support

  • Traditional ML: Drift detection, anomaly detection, out-of-distribution detection
  • Deep Learning: Uncertainty quantification, adversarial detection, feature drift monitoring
  • Generative AI: Input/output guardrails, hallucination detection, bias detection
  • RAG Systems: Retrieval relevance, faithfulness checking, end-to-end evaluation (RAGAS, ARES)
  • Agent Systems: Trajectory validation, tool monitoring, reasoning consistency

Core Capabilities

  • πŸ” Drift Detection: KS-test, PSI, MMD, adversarial drift detectors
  • πŸ›‘οΈ Trust Scoring: Mahalanobis distance, Isolation Forest, VAE-based anomaly detection
  • 🎯 Uncertainty Quantification: MC Dropout, Deep Ensembles, Evidential Networks, Temperature Scaling
  • πŸ”’ Guardrails: prompt injection detection, PII filtering, toxicity detection, schema validation
  • πŸ“Š Visualization: trust dashboards, drift plots, interactive Plotly dashboards
  • πŸ–₯️ Serving: FastAPI and gRPC servers for production monitoring
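To illustrate one of the drift statistics listed above, here is a minimal, self-contained sketch of the Population Stability Index (PSI) in plain NumPy. This is not SentinelML's implementation, just the standard quantile-binned formulation; the `psi` helper and its thresholds are illustrative.

```python
import numpy as np

def psi(reference, current, bins=10):
    """Population Stability Index between two 1-D samples.

    Bin edges come from the reference quantiles; PSI near 0 means no
    shift, while values above ~0.2 are commonly treated as drift.
    """
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
    # Clip new data into the reference range so every point lands in a bin
    current = np.clip(current, edges[0], edges[-1])
    ref_frac = np.histogram(reference, bins=edges)[0] / len(reference)
    cur_frac = np.histogram(current, bins=edges)[0] / len(current)
    eps = 1e-6  # avoid log(0) on empty bins
    ref_frac = np.clip(ref_frac, eps, None)
    cur_frac = np.clip(cur_frac, eps, None)
    return float(np.sum((cur_frac - ref_frac) * np.log(cur_frac / ref_frac)))

rng = np.random.default_rng(0)
reference = rng.normal(0, 1, 5000)
same = psi(reference, rng.normal(0, 1, 5000))      # same distribution: near 0
shifted = psi(reference, rng.normal(1.0, 1, 5000)) # mean-shifted: well above 0.2
print(f"same: {same:.4f}, shifted: {shifted:.4f}")
```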

πŸ“¦ Installation

# Basic installation (Traditional ML only)
pip install sentinelml

# With PyTorch support
pip install sentinelml[torch]

# With TensorFlow support
pip install sentinelml[tensorflow]

# For Generative AI / LLM applications
pip install sentinelml[genai]

# For RAG applications
pip install sentinelml[rag]

# For production serving
pip install sentinelml[serving]

# Complete installation
pip install sentinelml[all]

# Development installation
pip install sentinelml[dev]

πŸƒ Quick Start

Traditional ML Monitoring

import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sentinelml import Sentinel, KSDriftDetector, MahalanobisTrust

# Load data
X, y = load_iris(return_X_y=True)
X_train, X_test = X[:100], X[100:]

# Train your model
model = RandomForestClassifier().fit(X_train, y[:100])

# Initialize Sentinel with drift and trust monitoring
sentinel = Sentinel(
    drift_detector=KSDriftDetector(threshold=0.05),
    trust_model=MahalanobisTrust(),
    verbose=True
)

# Fit on reference (training) data
sentinel.fit(X_train)

# Assess new samples
results = []
for x in X_test:
    result = sentinel.assess(x)
    results.append(result)
    print(f"Trust: {result.trust_score:.3f}, Drift: {result.has_drift}")

# Visualize
from sentinelml.viz import plot_trust
trust_scores = [r.trust_score for r in results]
plot_trust(trust_scores, title="Trust Scores on Test Data")
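MahalanobisTrust's internals are not shown here, but the idea behind Mahalanobis-based trust scoring is simple enough to sketch by hand. The `mahalanobis_scores` helper below is illustrative, not the library's API: it measures how far each new sample sits from the training data's mean, weighted by the training covariance.

```python
import numpy as np

def mahalanobis_scores(X_ref, X_new, ridge=1e-6):
    """Squared Mahalanobis distance of each new sample from the
    reference mean under the reference covariance; larger values
    mean the sample looks less like the training data."""
    mu = X_ref.mean(axis=0)
    cov = np.cov(X_ref, rowvar=False) + ridge * np.eye(X_ref.shape[1])
    cov_inv = np.linalg.inv(cov)
    diff = X_new - mu
    # diff[i] @ cov_inv @ diff[i] for every row i
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

rng = np.random.default_rng(0)
X_ref = rng.normal(0, 1, (500, 4))
near_score = mahalanobis_scores(X_ref, np.zeros((1, 4)))[0]     # close to the mean
far_score = mahalanobis_scores(X_ref, np.full((1, 4), 6.0))[0]  # far from training data
print(f"near: {near_score:.2f}, far: {far_score:.2f}")
```

A trust score is then typically some decreasing function of this distance, so far-away samples earn low trust.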

Detecting Drift

import numpy as np

# Simulate drifted data
drift_data = X_test + np.random.normal(0, 2, X_test.shape)

# Assess drifted samples
for x in drift_data[:5]:
    result = sentinel.assess(x)
    print(f"Trust: {result.trust_score:.3f}, "
          f"Drift p-value: {result.drift_pvalue:.4f}, "
          f"Is Trustworthy: {result.is_trustworthy}")

GenAI Guardrails

from sentinelml import PromptInjectionDetector, HallucinationDetector

# Input validation
injection_detector = PromptInjectionDetector(threshold=0.7)
result = injection_detector.detect("Ignore previous instructions and...")
print(f"Injection detected: {result.is_violation}, Score: {result.score}")

# Output validation (RAG context)
hallucination_detector = HallucinationDetector(method="self_consistency")
context = ["Paris is the capital of France.", "France is in Europe."]
generated = "Paris is the capital of Germany."
result = hallucination_detector.verify(context, generated)
print(f"Hallucination detected: {result.is_hallucination}")
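The `self_consistency` method's internals are not documented here, but self-consistency checks generally resample several generations for the same prompt and measure how much they agree. A toy agreement scorer (no LLM calls; `agreement_score` is a hypothetical helper, not part of SentinelML) might look like this:

```python
from difflib import SequenceMatcher

def agreement_score(samples):
    """Mean pairwise string similarity across sampled generations.

    Under self-consistency, low agreement between resampled answers
    is a warning sign that the model may be hallucinating.
    """
    pairs = [(a, b) for i, a in enumerate(samples) for b in samples[i + 1:]]
    if not pairs:
        return 1.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

consistent = ["Paris is the capital of France."] * 3
divergent = ["Paris is the capital.", "Berlin, I think.", "It might be Lyon."]
print(agreement_score(consistent))  # identical samples agree fully
print(agreement_score(divergent))   # divergent samples score lower
```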

RAG Evaluation

from sentinelml import RAGASEvaluator, FaithfulnessChecker

# End-to-end RAG evaluation
evaluator = RAGASEvaluator(metrics=["faithfulness", "answer_relevancy", "context_recall"])
results = evaluator.evaluate(
    questions=["What is the capital of France?"],
    answers=["Paris is the capital of France."],
    contexts=[["Paris is the capital of France."]],
    ground_truths=["Paris"]
)

# Component-level checking
faithfulness = FaithfulnessChecker()
score = faithfulness.check(answer="Paris is the capital.", context="Paris is France's capital city.")
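How FaithfulnessChecker scores answers is not shown above. As a rough intuition, the crudest possible proxy is lexical: what fraction of the answer's tokens are grounded in the context? The sketch below is only that crude proxy (real checkers use NLI models or LLM judges), and `token_overlap_faithfulness` is a hypothetical helper.

```python
import string

def token_overlap_faithfulness(answer, context):
    """Fraction of answer tokens that also appear in the context.

    A crude lexical proxy for faithfulness; production checkers use
    NLI models or LLM judges rather than bare token overlap.
    """
    strip = str.maketrans("", "", string.punctuation)
    answer_tokens = set(answer.lower().translate(strip).split())
    context_tokens = set(context.lower().translate(strip).split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

score = token_overlap_faithfulness(
    answer="Paris is the capital.",
    context="Paris is France's capital city.",
)
print(f"{score:.2f}")  # 3 of 4 answer tokens are grounded: 0.75
```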

Agent Monitoring

from sentinelml import StepValidator, LoopDetector, BudgetManager

# Monitor agent execution
validator = StepValidator()
loop_detector = LoopDetector(window_size=5)
budget = BudgetManager(max_steps=50, max_tokens=10000)

# Validate each step (agent_steps: your agent's history of
# (thought, action, observation) triples)
for step_num, (thought, action, observation) in enumerate(agent_steps):
    validation = validator.validate_step(thought, action, observation)
    if loop_detector.detect_loop(agent_steps[:step_num+1]):
        print("Loop detected! Breaking...")
        break
    if not budget.consume_step(tokens_used=len(thought.split())):
        print("Budget exceeded!")
        break
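LoopDetector's detection rule is not documented above; one common approach is to flag a loop when the same action/observation pair keeps recurring inside a sliding window. The `has_loop` function below is an illustrative sketch of that idea, not SentinelML's implementation.

```python
from collections import Counter

def has_loop(steps, window_size=5, min_repeats=3):
    """Flag a loop when one (action, observation) pair recurs at
    least min_repeats times within the last window_size steps.

    `steps` is a list of (thought, action, observation) triples,
    matching the agent-step shape used above.
    """
    recent = [(action, obs) for _, action, obs in steps[-window_size:]]
    counts = Counter(recent)
    return any(n >= min_repeats for n in counts.values())

stuck = [("retry", "search('foo')", "no results")] * 4
progressing = [("think", f"step{i}", f"obs{i}") for i in range(6)]
print(has_loop(stuck), has_loop(progressing))  # True False
```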

πŸ–₯️ Command Line Interface

# Scan dataset for drift and anomalies
sentinelml scan data.csv --drift-detector mmd --trust-model mahalanobis --output report.json

# Evaluate model reliability
sentinelml evaluate model.pkl test.csv --labels target --output evaluation.json

# Start monitoring server
sentinelml serve --port 8000 --config sentinel.yaml

# Generate configuration template
sentinelml config --type genai --output sentinel.yaml

πŸ“ Project Structure (v2.0)

sentinelml/
β”œβ”€β”€ core/                    # Core engine and orchestration
β”‚   β”œβ”€β”€ sentinel.py         # Main Sentinel orchestrator
β”‚   β”œβ”€β”€ pipeline.py         # Processing pipelines
β”‚   β”œβ”€β”€ ensemble.py         # Adaptive trust ensembles
β”‚   └── report.py           # Reporting infrastructure
β”œβ”€β”€ traditional/            # Traditional ML monitoring
β”‚   β”œβ”€β”€ drift/             # Drift detection methods
β”‚   β”œβ”€β”€ trust/             # Anomaly/trust scoring
β”‚   └── familiarity/       # OOD detection
β”œβ”€β”€ deep_learning/         # Deep learning specific
β”‚   β”œβ”€β”€ uncertainty/       # UQ methods (MC Dropout, Ensembles, etc.)
β”‚   β”œβ”€β”€ feature_drift/     # Activation/embedding monitoring
β”‚   └── adversarial/       # Adversarial attack detection
β”œβ”€β”€ genai/                 # Generative AI guardrails
β”‚   β”œβ”€β”€ guardrails/        # Input/output validation
β”‚   β”œβ”€β”€ alignment/         # Bias and toxicity detection
β”‚   └── uncertainty/       # LLM uncertainty estimation
β”œβ”€β”€ rag/                   # RAG pipeline evaluation
β”‚   β”œβ”€β”€ retrieval/         # Retrieval metrics
β”‚   β”œβ”€β”€ generation/        # Generation quality
β”‚   β”œβ”€β”€ advanced/          # Claim verification, contradiction detection
β”‚   └── end_to_end/        # RAGAS, ARES evaluators
β”œβ”€β”€ agents/                # Agent system monitoring
β”‚   β”œβ”€β”€ trajectory/        # Step validation, loop detection
β”‚   β”œβ”€β”€ reasoning/         # Logic checking, consistency
β”‚   └── state/             # Budget and checkpoint management
β”œβ”€β”€ adapters/              # Framework integrations
β”‚   β”œβ”€β”€ sklearn_adapter.py
β”‚   β”œβ”€β”€ torch_adapter.py
β”‚   β”œβ”€β”€ tensorflow_adapter.py
β”‚   β”œβ”€β”€ openai_adapter.py
β”‚   β”œβ”€β”€ langchain_adapter.py
β”‚   └── ...
β”œβ”€β”€ infrastructure/        # Production infrastructure
β”‚   β”œβ”€β”€ serving/           # FastAPI/gRPC servers
β”‚   β”œβ”€β”€ storage/           # Vector store integration
β”‚   └── streaming/         # Kafka consumers
└── viz.py                # Visualization utilities

πŸ”§ Configuration

Create a configuration file for different deployment scenarios:

# sentinel.yaml - Traditional ML
sentinel:
  drift_detector:
    type: mmd
    threshold: 0.05
  trust_model:
    type: mahalanobis
    calibration: isotonic

monitoring:
  batch_size: 1000
  check_interval: 3600
# sentinel.yaml - GenAI
sentinel:
  guardrails:
    input:
      - type: prompt_injection
        threshold: 0.7
      - type: pii_detection
        entities: [email, phone, ssn]
    output:
      - type: hallucination_detection
        method: self_consistency

llm:
  model: gpt-4
  temperature: 0.7

πŸ“Š Benchmarking & Comparison

SentinelML includes comprehensive benchmarking tools to compare against baseline methods:

from sentinelml.benchmarks import BenchmarkComparison

# Compare Sentinel against baseline scorers
# (entropy, Isolation Forest, and LOF are run internally)
benchmark = BenchmarkComparison(sentinel=sentinel, model=model)
results = benchmark.evaluate(X_test, y_test)

# Returns comparison of:
# - sentinel: Trust scores from SentinelML
# - entropy: Prediction entropy (uncertainty)
# - isolation_forest: Isolation Forest anomaly scores
# - lof: Local Outlier Factor scores
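The entropy baseline in that comparison is easy to reproduce by hand from any classifier's `predict_proba` output. This is a generic sketch of prediction entropy, not SentinelML's code:

```python
import numpy as np

def prediction_entropy(proba):
    """Shannon entropy of each row of class probabilities; higher
    entropy means a less confident prediction."""
    proba = np.clip(proba, 1e-12, 1.0)  # guard against log(0)
    return -np.sum(proba * np.log(proba), axis=1)

confident = np.array([[0.98, 0.01, 0.01]])
uniform = np.array([[1 / 3, 1 / 3, 1 / 3]])
print(prediction_entropy(confident))  # low entropy: confident
print(prediction_entropy(uniform))    # maximal entropy for 3 classes: ln(3)
```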

πŸ›£οΈ Roadmap

Version 2.1 (Current)

  • βœ… Modular architecture rewrite
  • βœ… GenAI guardrails (input/output)
  • βœ… RAG evaluation framework
  • βœ… Agent monitoring tools
  • βœ… FastAPI/gRPC serving

Version 2.2 (Upcoming)

  • Streaming drift detection (Kafka integration)
  • Distributed monitoring (Ray/Spark)
  • Advanced attribution methods
  • Automated threshold tuning

Version 3.0 (Future)

  • Multi-modal support (vision, audio)
  • Real-time adversarial defense
  • LLM-powered root cause analysis
  • Enterprise dashboard

🀝 Contributing

Contributions are welcome! Please see our Contributing Guide.

# Development setup
git clone https://github.com/sentinelml/sentinelml.git
cd sentinelml
pip install -e ".[dev]"

# Run tests
pytest tests/ --cov=sentinelml

# Code quality
black sentinelml/ tests/
isort sentinelml/ tests/
flake8 sentinelml/ tests/

πŸ“š Research Background

SentinelML integrates research from:

  • Out-of-Distribution Detection: Hendrycks & Gimpel, Liu et al.
  • Drift Detection: Rabanser et al. (MMD), dos Reis et al. (PSI)
  • Uncertainty Quantification: Gal & Ghahramani (MC Dropout), Lakshminarayanan et al. (Deep Ensembles)
  • LLM Safety: Perez & Ribeiro (red teaming), Minding the Gap (hallucination detection)
  • RAG Evaluation: Es et al. (RAGAS), Saad-Falcon et al. (ARES)

πŸ“„ Citation

If you use SentinelML in your research:

@software{sentinelml2024,
  title={SentinelML: Unified Reliability Engine for AI/ML Systems},
  author={SentinelML Team},
  year={2024},
  version={2.0.0},
  url={https://github.com/sentinelml/sentinelml}
}

πŸ“œ License

MIT License - see LICENSE file.


πŸ’‘ Support

For questions and support:

  • πŸ“§ Email: team@sentinelml.ai
  • πŸ’¬ Discussions: GitHub Discussions
  • πŸ› Issues: GitHub Issues

SentinelML: Trustworthy AI through continuous monitoring
