Skip to content

FTS18/Matcha-AI-DTU

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

124 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Matcha-AI-DTU

GitHub Issues GitHub Pull Requests GitHub Last Commit Build Status Security Hardening GSSoC '26 GitHub Contributors GitHub License CI Build Status Container Registry

Matcha-AI-DTU is a comprehensive AI-powered sports video analysis platform. It processes sports footage (specifically soccer/football), tracks the ball and players, automatically detects goals and events using SoccerNet + custom vision models, generates insightful commentary using Google Gemini LLMs, synthesizes high-quality audio voiceovers using a 3-tier neural TTS pipeline (Kokoro-82M → edge-tts → silence), and builds a complete analytics suite including player heatmaps, ball speed estimation, and team colour detection.

This is a monorepo built with Next.js, NestJS, and FastAPI, orchestrated with Docker and Turborepo.


Architecture Overview

The system is broken down into three main applications:

graph TD
    subgraph "Core & Tooling"
        Shared["packages/shared (ApiClient/Utils)"]
        Database["packages/database (Prisma Client)"]
        Env["packages/env (Strict Validation)"]
        Contracts["packages/contracts (Zod Schemas)"]
        UI["packages/ui (Standardized Components)"]
        Theme["packages/theme (Design Tokens)"]
    end

    subgraph "Services"
        Orchestrator["services/orchestrator (NestJS Hub)"]
        Inference["services/inference (FastAPI Engine)"]
    end

    subgraph "Data Layer"
        PG["PostgreSQL (Storage)"]
        RDS["Redis (Cache/PubSub)"]
    end

    Web -.-> UI
    Web -.-> Theme
    Web -.-> Shared
    Mobile -.-> Shared
    Orchestrator ==> Database
    Orchestrator ==> Contracts
    Orchestrator ==> Env
    Web ==> Orchestrator
    Mobile ==> Orchestrator
    Orchestrator ==> Inference
    Database --- PG
    Orchestrator --- RDS
Loading
  1. Frontend Web (apps/web): A modern Next.js 14.2 interface (Security Hardened). Consumes @matcha/ui for standardized components.
  2. Frontend Mobile (apps/mobile): An Expo (React Native) app sharing business logic via @matcha/shared.
  3. Orchestrator (services/orchestrator): NestJS backend for API orchestration and database management.
  4. Shared Tooling (packages/*):
    • @matcha/ui: React component system (VideoPlayer, MatchReportPDF, ScoreBadge).
    • @matcha/theme: Global design system (Tailwind config, Brand colors, Fonts).
    • @matcha/shared: Universal API client, WebSocket registries, and logical utilities.
    • @matcha/database: Shared Prisma schema and generated client.
    • @matcha/contracts: Centralized Zod validation schemas for all API payloads.
    • @matcha/env: Strict, boot-time environment variable validation via T3-Env.
  5. Inference Engine (services/inference): Python FastAPI AI pipeline (YOLO, SoccerNet, Gemini).

Key Features

  • Automated Video Analysis: Upload raw sports footage and let the system automatically analyze the content through a 5-phase AI pipeline.
  • Advanced Goal Detection: Custom GoalDetectionEngine built with Kalman filtering, homography projection, and a finite state machine that auto-calibrates to the goal line, tracks the ball across frames, and confirms goals with high precision.
  • Action Event Recognition: SoccerNet-trained model detects 11 event types: GOAL, SAVE, TACKLE, FOUL, CORNER, YELLOW_CARD, RED_CARD, PENALTY, OFFSIDE, CELEBRATION, HIGHLIGHT. Falls back to motion-peak detection when SoccerNet returns too few results.
  • AI Commentary Generation: Context-aware, natural-sounding commentary (40–60 words per event) generated by Gemini 2.0 Flash, describing the on-field action with late-game intensity boosts.
  • 3-Tier Neural TTS Pipeline:
      1. Kokoro-82M (hexgrad/Kokoro-82M) — #1 ranked in TTS-Spaces-Arena, British female sports voice
      1. Microsoft edge-tts (en-GB-RyanNeural) — British male broadcaster, always available, no key needed
      1. FFmpeg silent audio — absolute fallback
  • Real-Time Progress Tracking: WebSocket integration provides users with live updates across the 5-phase analysis pipeline, including per-event streaming as events are detected.
  • Player Heatmap: OpenCV-rendered top-down pitch diagram showing where each team concentrated their play throughout the entire match, with Gaussian-blurred density overlays coloured by auto-detected jersey colour.
  • Ball Speed Estimation: YOLO ball tracking data is used to estimate the peak ball speed across the match (95th-percentile of pixel delta → km/h). Displayed in the Analytics tab.
  • Team Colour Detection: NumPy K-Means clustering on jersey crops automatically identifies which colour belongs to which team. Displayed as hex swatches.
  • Highlight Reel Generation: Top-N non-overlapping clips are cut, overlaid with scrolling text, mixed with TTS commentary + crowd ambience + background music, and concatenated into a single MP4 via FFmpeg.
  • Context Score Intelligence: Every event is scored 0–10 via a weighted formula balancing event type importance, motion intensity, temporal position, and detection confidence. Late goals are doubly weighted.
  • Shared Architecture: A dedicated @matcha/shared package centralizes TypeScript interfaces, API clients, WebSocket message registries, and formatters for universal type safety across web, mobile, and orchestrator.
  • Secure Authentication: Built-in JWT-based authentication via NestJS AuthModule and bcrypt, linking analyzed matches to distinct user accounts.
  • Universal Design & PDF Generation: Reusable Tailwind-driven React components (ScoreBadge, CopyButton) and standard logical themes applied globally, with an integrated @react-pdf/renderer module capable of instantly generating rich, visual Match Reports as downloadable PDFs.

Quick Start (The Matcha CLI)

The easiest way to develop is using our built-in Matcha CLI (Makefile).

# 1. Clone the repo
git clone https://github.com/FTs18/Matcha-AI-DTU.git && cd Matcha-AI-DTU

# 2. Setup your .env files (see Step 4 below)

# 3. Start Infrastructure (Postgres, Redis, MinIO)
make up

# 4. Validate Locally (Optional but Recommended)
make check

# 5. Install & Start Development (Frontend + API + AI Engine)
npm install
make dev

Pro Tip: Run make help to see all available shortcuts!


One-Click Development (Dev Containers)

This project supports VS Code Dev Containers.

  1. Open this project in VS Code.
  2. Click "Reopen in Container" when prompted.
  3. Everything (Node, Python, Docker, PostgreSQL) is pre-installed and configured for you.

Prerequisites

Tool Minimum Version Install Link
Node.js 18+ https://nodejs.org (choose LTS)
Python 3.9+ https://www.python.org/downloads/
Docker Desktop Latest https://www.docker.com/products/docker-desktop/
Git Latest https://git-scm.com/downloads
Make Latest See SETUP.md
FFmpeg Latest https://ffmpeg.org/download.html
NVIDIA GPU + CUDA 12.4 (optional) Boosts YOLO inference speed

Python Note: The local inference engine has been tested and confirmed working on Python 3.9 (the macOS system default). Python 3.10+ also works. Do not use 3.7 or 3.8.


Step 1 — Clone & Prepare

git clone https://github.com/YOUR-ORG/Matcha-AI-DTU.git
cd Matcha-AI-DTU

# Create the uploads directory (git-ignored, must be created manually)
mkdir uploads

Step 2 — Start Infrastructure (Docker)

Make sure Docker Desktop is open and running, then:

docker-compose up -d --build

This starts:

  • PostgreSQL on port 5433 (not 5432 — intentional to avoid local conflicts)
  • Redis on port 6380

Step 3 — Initialize Database & Demo Data

make db-migrate
make seed

Step 4 — Install Node.js Dependencies & Build

# Install all monorepo packages
npm install

# Build all shared TypeScript packages
npx turbo run build

Then apply the database schema:

cd services/orchestrator
npx prisma migrate deploy   # use 'migrate dev' on a fresh local DB
cd ../..

Step 4 — Configure Environment Variables

services/orchestrator/.env

PORT=4000
CORS_ORIGIN=http://localhost:3000
DATABASE_URL="postgresql://matcha_user:matcha_password@localhost:5433/matcha_db?schema=public"
INFERENCE_URL=http://localhost:8000
GEMINI_API_KEY=your_google_ai_studio_key   # https://aistudio.google.com/app/apikey
HF_TOKEN=hf_your_token_here               # optional, from huggingface.co/settings/tokens

apps/web/.env.local

NEXT_PUBLIC_API_URL=http://localhost:4000/api/v1

services/inference/.env

ORCHESTRATOR_URL=http://localhost:4000/api/v1
GEMINI_API_KEY=your_google_ai_studio_key
HF_TOKEN=hf_your_token_here

Step 5 — Set Up the Python Inference Engine

cd services/inference

macOS / Linux:

python3 -m venv venv
source venv/bin/activate

Windows (PowerShell):

py -3.11 -m venv venv
.\venv\Scripts\activate

Then install dependencies:

pip install --upgrade pip

# CPU only (works on all machines):
pip install -r requirements.txt

# OR — with CUDA 12.4 GPU acceleration (NVIDIA only):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt

Go back to project root:

cd ../..

Step 6 — Run the Full Stack

One command starts everything — Frontend, Orchestrator, AND the Python Inference Engine:

# From project root
npx turbo run dev

This works because services/inference/package.json now has a dev script that calls ./venv/bin/python -m uvicorn ... directly — no shell activation needed. Turbo runs all three services concurrently and streams their logs together.

First run only: YOLO will automatically download model weights (~22 MB for yolov8s-pose.pt and ~6 MB for yolov8n.pt). This takes ~30 seconds with a good internet connection and is cached for all future runs.

Windows users: The dev script uses ./venv/bin/python which is the macOS/Linux path. On Windows, run the inference manually:

cd services/inference
.\venv\Scripts\activate
python -m uvicorn main:app --host 0.0.0.0 --port 8000 --reload

Or run npm run dev:win from inside services/inference.

If you prefer separate terminals:

Terminal Command Port
Frontend cd apps/web && npm run dev 3000
Orchestrator cd services/orchestrator && npm run dev 4000
Inference (macOS/Linux) cd services/inference && ./venv/bin/python -m uvicorn main:app --port 8000 --reload 8000
Inference (Windows) cd services/inference && .\venv\Scripts\python.exe -m uvicorn main:app --port 8000 8000

Step 7 — Verify

# Check Orchestrator is healthy
curl http://localhost:4000/api/v1/health
# Expected: {"status":"ok","service":"orchestrator",...}

# Check Inference is alive
curl http://localhost:8000/

Then open http://localhost:3000 (or 3001 if 3000 was busy — check terminal output).



Project Structure

Matcha-AI-DTU/
├── apps/
│   ├── web/                # Next.js 15 Web App
│   └── mobile/             # Expo (React Native) Mobile App
├── packages/
│   ├── contracts/          # Zod validation schemas (Universal)
│   ├── database/           # Prisma schema & Client (Shared)
│   ├── env/                # T3-Env validation (Safe env access)
│   ├── shared/             # API clients & logical utilities
│   ├── theme/              # Tailwind & Design tokens
│   └── ui/                 # React component library
├── services/
│   ├── orchestrator/       # NestJS API (Port 4000)
│   │   └── prisma/         # Database schema & migrations
│   └── inference/          # Python FastAPI AI Pipeline (Port 8000)
│       ├── app/core/       # Pipeline modules (analysis, heatmap, goal detection)
│       └── AI_PIPELINE.md  # Detailed 5-phase pipeline documentation
├── uploads/                # Global assets (raw/, processed/, analytics/, audio/)
├── docs/
│   ├── ARCHITECTURE.md     # Full system architecture
│   ├── API.md              # HTTP & WebSocket API contracts
│   └── CONTRIBUTING.md     # Contribution guidelines & dev setup
└── docker-compose.yml      # PostgreSQL & Redis infrastructure

Maintenance & Security

This project prioritizes long-term stability and functional reliability. We maintain a strict versioning policy for core frameworks to ensure compatibility across our AI pipeline and reporting tools.

For details on our security posture, known vulnerabilities, and why certain major upgrades (like React 19 or Next.js 16) are currently deferred, please refer to our CONTRIBUTING.md.


Documentation

Document Description
docs/ARCHITECTURE.md Full system architecture — data flow, DB schema, analytics pipeline, TTS tiers, frontend tree, env var reference
docs/API.md Complete HTTP + WebSocket API contracts with request/response bodies, event type table, and score formula
docs/CONTRIBUTING.md Contributing guidelines, local dev setup, adding new features, mobile responsiveness guide
services/inference/AI_PIPELINE.md Deep-dive into all 5 pipeline phases: YOLO, SoccerNet, Gemini, TTS, and analytics
docs/ONBOARDING.md Onboarding guide for new developers
docs/ALGORITHMS.md Detailed AI logic and mathematical foundations
API Docs (Swagger) http://localhost:4000/api/docs (Orchestrator)
AI API Docs http://localhost:8000/docs (Inference Engine)

Troubleshooting

  • Ports already in use: Use netstat -ano | Select-String ":3000" to find and kill conflicting operations.
  • Postgres/Redis connection failures: Ensure Docker daemon is running and containers haven't exited (docker ps -a).
  • Python Import errors (TTS/Torch): Double-check that your venv is activated. Run pip install -r requirements.txt again.
  • Analysis stuck at 0%: Ensure ORCHESTRATOR_URL is correct in the Inference .env. The callback route may be unreachable. Check the Python terminal for error logs.
  • Kokoro TTS not working: Set HF_TOKEN in services/inference/.env. Without it, anonymous HuggingFace rate limits may block Kokoro — the system will automatically fall back to edge-tts.
  • Heatmap not showing in Analytics tab: The heatmap is only generated if YOLO detected players during the analysis. Re-analyze the match. Check the inference logs for Heatmap saved → or Heatmap generation failed:.
  • prisma generate fails with EPERM: This means the orchestrator is running and has the Prisma DLL locked. Stop the service, run npx prisma generate, then restart.

Contributors

We are deeply grateful to everyone who has contributed to Matcha AI. Whether it is fixing a bug, adding a new feature, or improving documentation, your support makes this project possible.

Our Contributors


Built for Matcha-AI-DTU

About

Matcha AI is an automated sports broadcast assistant that transforms raw match footage into a professional, engaging viewing experience. By leveraging computer vision and multi-modal AI, it generates live commentary and emotionally meaningful highlights, bringing elite production quality to local and amateur sports.

Topics

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors