Delphictunic/mv_prototype

Surveillance Anomaly Detection System

Production-ready research pipeline for video anomaly detection with:

  • baseline temporal-visual model
  • text-prior semantic guidance
  • heatmap-prior spatial guidance
  • dual-prior fusion with spike-robust event logic

Project Layout

  • single_video_anomaly_system.py
    Main entry point: runs all four detectors on a single MP4 and writes four overlay videos plus metrics.

  • dataset_loader.py
    Dataset loading utilities for custom, Avenue, UCSD, and ShanghaiTech data.

  • baseline_model.py
    Baseline detector logic (frozen vision features + anomaly scoring).

  • text_prior_model.py
    Text-prior detector logic (CLIP/fallback text semantics + anomaly ranking).

  • spatial_prior_models.py
    Heatmap-prior and dual-prior detector logic.

  • benchmark_runner.py
    Multi-model benchmark runner for dataset-level comparisons.

  • ARCHITECTURE.md
    Detailed architecture notes and scoring logic.

  • DATASET_GUIDE.py
    Additional setup examples for real dataset layouts.

  • outputs/theft_run_mv_final_guarded_f1/
    Overlays and metrics from the most recent final run.


Architecture Summary

1) Baseline

  • Extract temporal/appearance cues per frame.
  • Compute a per-frame anomaly score from a calibrated learned classifier.
  • Apply spike filtering (minimum ON/OFF persistence) to reduce false bursts.
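
The minimum ON/OFF persistence step can be sketched as a small debounce over thresholded frame scores. This is an illustrative sketch only: the function name persistence_filter, the parameters min_on and min_off, and the 0.5 threshold are assumptions, not the code in baseline_model.py.

```python
import numpy as np

def persistence_filter(raw_flags, min_on=3, min_off=3):
    """Debounce a per-frame boolean sequence (hypothetical helper).

    The state turns ON only after `min_on` consecutive raw ON frames and
    turns OFF only after `min_off` consecutive raw OFF frames.
    """
    state = False
    run = 0  # consecutive frames that disagree with the current state
    out = np.zeros(len(raw_flags), dtype=bool)
    for i, flag in enumerate(raw_flags):
        if bool(flag) != state:
            run += 1
            needed = min_off if state else min_on
            if run >= needed:
                state = not state
                run = 0
        else:
            run = 0
        out[i] = state
    return out

# Toy scores: one isolated spike at index 1, then a sustained burst.
scores = np.array([0.1, 0.9, 0.1, 0.8, 0.9, 0.9, 0.9, 0.2, 0.9, 0.1, 0.1, 0.1])
events = persistence_filter(scores > 0.5, min_on=3, min_off=3)
```

In this toy run the single-frame spike at index 1 and the single-frame dip at index 7 are both ignored. The filter is causal, so the event turns on a couple of frames after the burst begins.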

2) Text Prior (Semantic Prior)

  • Use detailed theft-focused prompts (forceful snatch/robbery language).
  • Compute frame-text similarity (full frame + center crop).
  • Blend semantic score with baseline score.

Potential: improves contextual understanding when visual motion alone is ambiguous (e.g., normal close interaction vs forceful theft).
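
A minimal sketch of the similarity and blending steps, assuming image and prompt embeddings have already been computed by CLIP or the fallback encoder. The function names, the equal weighting of full frame and center crop, the max over prompts, and the blend weight of 0.3 are all assumptions for illustration; text_prior_model.py may do this differently.

```python
import numpy as np

def cosine_sim(a, b):
    """Pairwise cosine similarity between rows of a (T, D) and b (P, D)."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def text_prior_score(full_emb, crop_emb, prompt_embs):
    """Per-frame semantic score in [0, 1] (hypothetical helper).

    full_emb, crop_emb: (T, D) embeddings of the full frame and center crop.
    prompt_embs: (P, D) embeddings of the theft-focused prompts.
    """
    sim_full = cosine_sim(full_emb, prompt_embs).max(axis=1)  # best prompt per frame
    sim_crop = cosine_sim(crop_emb, prompt_embs).max(axis=1)
    sim = 0.5 * (sim_full + sim_crop)          # assumed equal weighting
    return np.clip((sim + 1.0) / 2.0, 0.0, 1.0)  # map cosine [-1, 1] to [0, 1]

def blend_scores(baseline, prior, weight=0.3):
    """Convex blend of baseline and prior scores; `weight` is an assumed value."""
    return (1.0 - weight) * baseline + weight * prior

# Random stand-ins for real embeddings and baseline scores.
rng = np.random.default_rng(0)
full_emb = rng.normal(size=(5, 16))
crop_emb = rng.normal(size=(5, 16))
prompt_embs = rng.normal(size=(3, 16))
baseline = rng.uniform(size=5)

semantic = text_prior_score(full_emb, crop_emb, prompt_embs)
blended = blend_scores(baseline, semantic, weight=0.3)
```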

3) Heatmap Prior (Spatial Prior)

  • Build center-focused spatial saliency for likely theft interaction area.
  • Generate spatial anomaly score from ROI dynamics.
  • Blend with baseline score.

Potential: suppresses irrelevant background activity and emphasizes suspicious motion inside important regions.
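
One way to picture the spatial prior is a normalized Gaussian centered on the frame, used to weight frame-to-frame motion. The sketch below is an assumption-laden illustration: the Gaussian shape, sigma_frac, the use of absolute frame differences as "ROI dynamics", and both function names are hypothetical rather than taken from spatial_prior_models.py.

```python
import numpy as np

def center_gaussian(height, width, sigma_frac=0.25):
    """Center-focused saliency map that sums to 1 (hypothetical helper)."""
    ys = np.arange(height) - (height - 1) / 2.0
    xs = np.arange(width) - (width - 1) / 2.0
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    sigma_y, sigma_x = sigma_frac * height, sigma_frac * width
    heat = np.exp(-0.5 * ((yy / sigma_y) ** 2 + (xx / sigma_x) ** 2))
    return heat / heat.sum()

def spatial_scores(frames, heatmap):
    """Saliency-weighted motion per frame.

    frames: (T, H, W) grayscale floats in [0, 1]. The first frame has no
    predecessor, so its score is set to 0.
    """
    motion = np.abs(np.diff(frames, axis=0))            # (T-1, H, W)
    weighted = (motion * heatmap).sum(axis=(1, 2))      # (T-1,)
    return np.concatenate([[0.0], weighted])

# Toy clip: a patch appears in the center at t=1 and in a corner at t=2.
frames = np.zeros((4, 32, 32))
frames[1, 12:20, 12:20] = 1.0
frames[2, 12:20, 12:20] = 1.0
frames[2, 0:8, 0:8] = 1.0

heatmap = center_gaussian(32, 32)
scores = spatial_scores(frames, heatmap)
```

Both patches are the same size, but the center change at t=1 scores higher than the corner change at t=2, which is the background-suppression effect described above.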

4) Dual Prior (Text + Heatmap)

  • Combine baseline, text prior, and heatmap prior scores.
  • Apply event-level spike suppression/hysteresis.
  • Use guarded blending to avoid AUROC degradation relative to baseline.

Potential: best practical signal when event meaning (text) and location relevance (heatmap) are both needed.
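
Guarded blending can be read as "only accept a prior weight if AUROC does not drop below the baseline". The sketch below shows that idea under stated assumptions: it needs labeled frames to evaluate the guard, it searches a small fixed grid of weights, and the names auroc and guarded_blend are hypothetical. The repository's actual guard may use a different criterion.

```python
import numpy as np

def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    Uses O(n_pos * n_neg) memory, which is fine for a sketch.
    """
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

def guarded_blend(labels, baseline, prior, weights=(0.1, 0.2, 0.3, 0.5)):
    """Pick a blend weight only if it strictly improves AUROC over baseline.

    Returns (fused_scores, chosen_weight, auroc). A weight of 0.0 means the
    prior was rejected and the baseline is returned unchanged.
    """
    best_w, best_auc = 0.0, auroc(labels, baseline)
    for w in weights:
        auc = auroc(labels, (1.0 - w) * baseline + w * prior)
        if auc > best_auc:
            best_w, best_auc = w, auc
    fused = (1.0 - best_w) * baseline + best_w * prior
    return fused, best_w, best_auc

labels = np.array([0, 0, 0, 1, 1, 0, 0, 1])
baseline = np.array([0.1, 0.2, 0.6, 0.7, 0.8, 0.3, 0.2, 0.5])
auc_base = auroc(labels, baseline)

# A misleading prior (anti-correlated with the labels) is rejected.
_, w_bad, auc_bad = guarded_blend(labels, baseline, 1.0 - labels)

# A helpful prior is accepted with a nonzero weight.
_, w_good, auc_good = guarded_blend(labels, baseline, labels.astype(float))
```

Choosing the weight on the same labels used for reporting gives an optimistic estimate, so a held-out split would be the safer place to run the guard.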


Run the System (Single MP4)

.venv/bin/python single_video_anomaly_system.py \
  --video-path "/Users/abhinavs/Desktop/mv.mp4" \
  --anomaly-ranges "0.21.2-0.22.4,1.04.6-1.05.5,1.41.5-1.44.7,2.24.9-2.28.6,3.35.4-3.36.7,6.23.9-6.26.2,7.02.4-7.03.3,7.51.0-7.54."
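
The range tokens in the command above look like minute.second.tenth pairs (for example, 1.04.6 would be 1 minute 4.6 seconds). That reading is a guess from the example values, not something the source confirms. Under that assumption, and with an assumed frame rate of 30 fps, the ranges could be turned into per-frame labels roughly like this; the function names are hypothetical and the parser in single_video_anomaly_system.py may differ.

```python
import numpy as np

def parse_timestamp(text):
    """Parse 'M.SS.T' into seconds (assumed format: minute.second.tenth)."""
    minutes, seconds, tenths = (int(part) for part in text.split("."))
    return (minutes * 600 + seconds * 10 + tenths) / 10.0

def parse_ranges(spec):
    """Parse 'start-end,start-end,...' into a list of (start_s, end_s) tuples."""
    ranges = []
    for token in spec.split(","):
        start, end = token.split("-")
        ranges.append((parse_timestamp(start), parse_timestamp(end)))
    return ranges

def frame_labels(ranges, num_frames, fps=30.0):
    """Build a 0/1 label per frame; frames inside any range are marked 1."""
    labels = np.zeros(num_frames, dtype=np.int64)
    for start_s, end_s in ranges:
        first = int(round(start_s * fps))
        last = min(int(round(end_s * fps)), num_frames - 1)
        labels[first:last + 1] = 1
    return labels

# First two ranges from the command above.
ranges = parse_ranges("0.21.2-0.22.4,1.04.6-1.05.5")
labels = frame_labels(ranges, num_frames=3000, fps=30.0)
```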

Outputs are written under:

  • outputs/theft_run_mv_final_guarded_f1/ (or the --work-dir you provide)

Includes:

  • test_predictions_overlay_baseline.mp4
  • test_predictions_overlay_text_prior.mp4
  • test_predictions_overlay_heatmap_prior.mp4
  • test_predictions_overlay_both_priors.mp4
  • metrics.json

Design Notes

  • Event decisions use spike filtering to ignore very short score spikes.
  • Threshold/window selection is tuned for event-level robustness, not frame-level jitter.
  • Guarded blending keeps AUROC stable (or improved) vs baseline when priors are weak.

Dependencies

Install once:

.venv/bin/pip install -r requirements.txt

Optional for stronger text priors:

.venv/bin/pip install git+https://github.com/openai/CLIP.git

About

A study of how text and heatmap priors influence video anomaly detection.
