
Project: Reasoning Agents (Azure AI Foundry) - SentinelSage Incident Triage Copilot #126


Description


Track

Creative Apps (GitHub Copilot)

Project Name

SentinelSage

GitHub Username

MinzLong

Repository URL

https://github.com/MinzLong/sentinelsage-incident-triage-copilot

Project Description

SentinelSage is an explainable incident-triage copilot for SOC/IT security teams. It ingests suspicious authentication alerts, correlates multi-signal evidence (failed logins, MFA failures, impossible travel, privileged role targeting, admin operations, threat-intel context, and user incident history), and produces a transparent triage report with case correlation, risk score, severity, ranked hypotheses, and recommended actions.

The project addresses a practical SOC pain point: analysts spend too much time triaging noisy alerts without enough decision context. SentinelSage compresses this process into a policy-driven, auditable workflow. Each decision includes an evidence trace, a timeline, uncertainty notes, follow-up questions, and a counterfactual analysis to improve trust and reviewer confidence.

Key capabilities:

  • Policy-driven scoring via configurable JSON policy (no code changes required for tuning; see the sketch below)
  • Deterministic safety guardrails with human-in-the-loop enforcement for high-risk actions
  • Optional OpenAI GPT / Azure AI Foundry enrichment with automatic fallback to deterministic mode
  • Multi-source ingestion (JSON, CSV, API URL)
  • Streamlit Command Center UI + FastAPI endpoints
  • SQLite audit trail and analyst feedback loop for continuous improvement

The implementation is reproducible, safety-first, and integration-ready for enterprise workflows.
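
As a rough illustration of the policy-driven scoring listed above, the sketch below loads a JSON policy and combines signal weights into a risk score and severity band. The key names (`signal_weights`, `severity_thresholds`) and the weighting scheme are assumptions for illustration, not the actual schema of config/policy.json.

```python
# Minimal sketch of policy-driven scoring (hypothetical policy schema;
# the repository's config/policy.json may be structured differently).
import json

def load_policy(path="config/policy.json"):
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def score_alert(alert, policy):
    """Sum the weights of signals present on the alert, cap at 100,
    and map the result to a severity band."""
    weights = policy["signal_weights"]          # e.g. {"impossible_travel": 35, "mfa_failure": 20}
    thresholds = policy["severity_thresholds"]  # e.g. {"critical": 80, "high": 60, "medium": 35}

    risk = min(sum(w for signal, w in weights.items() if alert.get(signal)), 100)

    severity = "low"
    for level in ("critical", "high", "medium"):
        if risk >= thresholds[level]:
            severity = level
            break
    return risk, severity
```

Because weights and thresholds live in the policy file, tuning the engine for a different risk tolerance only requires editing JSON, not code.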

Demo Video or Screenshots

Text Demo (Step-by-step):

Environment:

  • OS: Windows
  • Python: 3.14
  • Run location: local machine, repo root
  1. Install dependencies
    Command:
    py -3 -m pip install -r requirements.txt

Expected:

  • Dependencies install successfully
  • No provider key is required for deterministic mode
  2. Run deterministic batch triage (extended context)
    Command:
    py -3 -m src.main --input data/extended_alerts.json --output data/reports_extended --summary-output data/triage_summary_extended.md --provider none

Expected artifacts:

  • data/reports_extended/*.md (per-alert reports)
  • data/triage_summary_extended.md (batch summary)
  • data/audit.db (audit log)

Observed behavior:

  • Alerts are triaged with case correlation (case_id); see the grouping sketch after this step
  • Severity and risk differ by context strength
  • High-risk cases require human approval
  • Reports include executive summary, timeline, evidence trace, hypotheses, actions, and counterfactual analysis
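
The case correlation behavior can be pictured as grouping related alerts under a shared case_id. The sketch below shows one possible approach, bucketing alerts by user and a fixed time window; the field names, grouping key, and ID format are assumptions, not the repository's actual implementation.

```python
# Hypothetical case correlation: alerts from the same user inside the same
# time window share a case_id. "timestamp"/"user" fields and the CASE-xxxx
# format are illustrative.
import hashlib
from datetime import datetime

def assign_case_ids(alerts, window_hours=6):
    cases = {}
    for alert in sorted(alerts, key=lambda a: a["timestamp"]):
        ts = datetime.fromisoformat(alert["timestamp"])
        bucket = ts.replace(minute=0, second=0, microsecond=0)
        bucket = bucket.replace(hour=(bucket.hour // window_hours) * window_hours)
        key = (alert["user"], bucket.isoformat())
        if key not in cases:
            digest = hashlib.sha1("|".join(key).encode()).hexdigest()[:8].upper()
            cases[key] = f"CASE-{digest}"
        alert["case_id"] = cases[key]
    return alerts
```
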
  3. Open generated summary
    File:
    data/triage_summary_extended.md

Observed:

  • Total alerts count
  • Correlated cases count
  • Severity distribution (critical/high/medium/low)
  • Ranked table with: alert_id, case_id, risk, confidence, human approval, top hypothesis
  4. Inspect a detailed report
    File:
    data/reports_extended/ALERT-2026-2001.md

Observed:

  • Case ID generated for grouping
  • Severity and risk score
  • Executive Summary (manager-friendly)
  • Masked entities (PII-aware output; see the masking sketch after this step)
  • Timeline reconstruction
  • Ranked hypotheses with confidence and rationale
  • Recommended actions with safety guardrails
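
Entity masking can be illustrated with a small helper like the one below; the regular expressions and mask format are assumptions, not the project's exact implementation.

```python
# Illustrative PII masking helper for report text.
import re

EMAIL_RE = re.compile(r"\b([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})\b")
IPV4_RE = re.compile(r"\b(\d{1,3})\.(\d{1,3})\.\d{1,3}\.\d{1,3}\b")

def mask_entities(text):
    text = EMAIL_RE.sub(lambda m: m.group(1)[0] + "***@" + m.group(2), text)
    text = IPV4_RE.sub(lambda m: f"{m.group(1)}.{m.group(2)}.x.x", text)
    return text

# mask_entities("j.doe@example.com logged in from 203.0.113.57")
# -> "j***@example.com logged in from 203.0.x.x"
```
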
  5. Launch Streamlit Command Center
    Command:
    py -3 -m streamlit run app.py

Observed in UI:

  • Command Center with KPI cards and triage queue
  • Risk filters and approval-only filter
  • Incident spotlight card
  • Role-based view (analyst / manager / soc_lead)
  • Detailed Reports tab with feedback form
  • Scenario Lab with presets (Balanced, Credential Stuffing, Account Takeover, Benign Login)
  • Compare mode for deterministic vs enriched outputs (if provider enabled)
  6. Scenario Lab proof of reasoning
    Action:
  • Run “Account Takeover” preset, then “Benign Login” preset

Observed:

  • Risk and severity change appropriately
  • Action recommendations become stricter for high-risk scenarios
  • Human approval requirement toggles based on severity/actions
  7. API service and integration readiness
    Command:
    py -3 -m uvicorn api:app --host 0.0.0.0 --port 8000 --reload

Endpoints verified:

  • POST /triage
  • POST /batch-triage
  • POST /feedback
  • GET /runs

Observed:

  • API returns structured triage outputs (see the example call after this step)
  • Feedback endpoint stores analyst notes
  • /runs returns recent audited triage records
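
A hypothetical client call against the local /triage endpoint might look like the snippet below. The request fields and the keys read from the response are illustrative; the repository's API schema is authoritative.

```python
# Example call to the locally running API started in this step.
import requests

alert = {
    "alert_id": "ALERT-2026-2001",
    "user": "j.doe@example.com",
    "failed_logins": 14,
    "impossible_travel": True,
}

resp = requests.post("http://localhost:8000/triage", json=alert, timeout=30)
resp.raise_for_status()
report = resp.json()
print(report.get("severity"), report.get("risk_score"), report.get("case_id"))
```
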
  8. Reliability check
    Command:
    py -3 -m unittest discover -s tests -p "test_*.py" -v

Observed:

  • All tests pass
  • Core logic, parsing, enrichment config, storage, and rendering paths validated

Conclusion:
SentinelSage delivers a reproducible, policy-driven, explainable triage workflow with deterministic safety controls, optional LLM enrichment, and enterprise-ready audit/feedback integration.

Primary Programming Language

Python

Key Technologies Used

  • Python 3.10+ (core implementation)
  • Streamlit (interactive SOC Command Center UI)
  • FastAPI + Uvicorn (integration-ready API service)
  • SQLite (audit trail and analyst feedback persistence)
  • Policy-driven scoring engine via JSON config (config/policy.json)
  • Optional OpenAI GPT enrichment (OpenAI Responses API)
  • Optional Azure AI Foundry enrichment
  • Multi-source ingestion pipeline (JSON, CSV, API URL)

Submission Type

Individual

Team Members

No response

Submission Requirements

  • My project meets the track-specific challenge requirements
  • My repository includes a comprehensive README.md with setup instructions
  • My code does not contain hardcoded API keys or secrets
  • I have included demo materials (video or screenshots)
  • My project is my own work with proper attribution for any third-party code
  • I agree to the Code of Conduct
  • I have read and agree to the Disclaimer
  • My submission does NOT contain any confidential, proprietary, or sensitive information
  • I confirm I have the rights to submit this content and grant the necessary licenses

Quick Setup Summary

  1. Clone the repository and move into the project folder.
  2. Install dependencies: py -3 -m pip install -r requirements.txt
  3. Run deterministic batch demo:
    py -3 -m src.main --input data/extended_alerts.json --output data/reports_extended --summary-output data/triage_summary_extended.md --provider none
  4. Launch the dashboard:
    py -3 -m streamlit run app.py
  5. (Optional) Start API service:
    py -3 -m uvicorn api:app --host 0.0.0.0 --port 8000 --reload
  6. (Optional) Enable GPT/Foundry enrichment by filling .env variables, then run with --provider openai or --provider foundry.
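
The enrichment fallback in step 6 follows a simple pattern: call the provider only when a key is configured and the call succeeds, otherwise return the deterministic result unchanged. The sketch below is illustrative; the environment-variable names and helper function are placeholders, not the project's actual identifiers.

```python
# Illustrative enrichment fallback. AZURE_AI_API_KEY and call_llm_enrichment
# are placeholder names.
import os

def call_llm_enrichment(report, api_key):
    # Stand-in for the real provider call (OpenAI Responses API or Azure AI Foundry).
    raise RuntimeError("provider unavailable")

def enrich_or_fallback(report, provider="openai"):
    env_var = "OPENAI_API_KEY" if provider == "openai" else "AZURE_AI_API_KEY"
    api_key = os.getenv(env_var)
    if not api_key:
        return report                 # no key configured: stay deterministic
    try:
        return call_llm_enrichment(report, api_key)
    except Exception:
        return report                 # provider error: fall back gracefully
```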

Technical Highlights

  • Built a policy-driven triage engine (config/policy.json) so scoring thresholds/weights can be tuned without code changes.
  • Implemented explainable reasoning outputs: evidence trace, executive summary, timeline, uncertainty notes, follow-up questions, and counterfactual analysis.
  • Added case correlation (case_id) and PII-aware masking for safer operational reporting.
  • Designed deterministic safety guardrails with human-in-the-loop enforcement for high-risk/destructive actions.
  • Added optional AI enrichment layers (OpenAI GPT / Azure AI Foundry) with automatic fallback to deterministic mode when keys are missing or calls fail.
  • Implemented multi-source ingestion (JSON, CSV, API URL) for realistic SOC input workflows.
  • Built an interactive Streamlit Command Center with role-based views, scenario presets, compare mode, and analyst feedback capture.
  • Added FastAPI endpoints (/triage, /batch-triage, /feedback, /runs) for integration into external systems and automation pipelines.
  • Implemented SQLite audit logging for triage runs and feedback to support traceability and continuous improvement (see the sketch after this list).
  • Maintained reproducibility and reliability with automated unit tests across parsing, scoring, rendering, enrichment config, and storage paths.
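
As a rough picture of the audit trail mentioned above, the sketch below writes one row per triage run into a SQLite table. The table name and columns are assumptions rather than the schema actually used by data/audit.db.

```python
# Minimal audit-trail writer sketch (hypothetical table/column names).
import json
import sqlite3
from datetime import datetime, timezone

def log_triage_run(db_path, alert_id, case_id, risk, severity, report):
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            """CREATE TABLE IF NOT EXISTS triage_runs (
                   run_at TEXT, alert_id TEXT, case_id TEXT,
                   risk INTEGER, severity TEXT, report_json TEXT)"""
        )
        conn.execute(
            "INSERT INTO triage_runs VALUES (?, ?, ?, ?, ?, ?)",
            (datetime.now(timezone.utc).isoformat(), alert_id, case_id,
             risk, severity, json.dumps(report)),
        )
        conn.commit()
    finally:
        conn.close()
```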

Challenges & Learnings

Main challenge: balancing strong automation with safe operational behavior. In SOC workflows, over-aggressive auto-containment can cause business disruption, while under-reacting increases security risk.

How it was handled:

  • Kept deterministic policy-driven scoring as the baseline for reproducibility and auditability.
  • Added strict human-in-the-loop guardrails for high-risk and destructive actions.
  • Introduced explainability layers (evidence trace, timeline, counterfactuals, uncertainty notes) so analysts can validate decisions quickly.

Another challenge: making the system useful for different audiences (analysts vs managers) without losing technical depth.

  • Solved with role-oriented views and executive summaries in the UI/report outputs.

Key learnings:

  • Explainability and safety controls are as important as model intelligence.
  • Policy configurability is critical for real-world adoption because risk tolerance differs by organization.
  • Optional LLM enrichment works best as an additive layer; core decision safety should not depend on model availability.
  • Audit + feedback loops are essential for continuous improvement and trust in production-like environments.

Contact Information

minhlong1510.dna@gmail.com

Country/Region

Viet Nam
