feat(notifications): Enhanced enterprise notifications stack with multi-channel delivery and monitoring #459
…ti-channel delivery and monitoring

- Replace basic notification setup with complete enterprise notification solution
- Add Gotify for self-hosted push notifications with user management
- Add NTFY for simple HTTP pub/sub notifications with topics
- Add Apprise-API for 75+ notification services (Telegram, Slack, Discord, Email, SMS, etc.)
- Add webhook receiver for custom integrations with signature verification
- Include comprehensive monitoring (Grafana, Prometheus, AlertManager)
- Add message queuing with Redis for reliable delivery
- Add PostgreSQL for persistent storage and message history
- Provide setup and validation scripts for easy deployment
- Enterprise-grade features: rate limiting, retry logic, priority handling, security hardening

This enhanced solution provides a complete, production-ready notification platform suitable for homelabs and businesses, offering multi-channel delivery and enterprise reliability features missing from the current implementation.
Pull request overview
Adds an “enterprise notifications stack” bundle under stacks/notifications/ with setup/validation automation and extensive documentation intended to support multi-channel delivery (Gotify/NTFY/Apprise/webhooks) and optional monitoring.
Changes:
- Added a setup script that generates service configs (Gotify/NTFY/Apprise/Prometheus/Alertmanager/webhook receiver) and attempts to start containers via Docker Compose.
- Added a validation script that checks container health and basic HTTP endpoints / message flow.
- Added documentation and an environment template for configuring the stack, plus an “original” compose file snapshot.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 17 comments.
| File | Description |
|---|---|
| stacks/notifications/scripts/validate-notifications.sh | New validation harness for container + endpoint checks. |
| stacks/notifications/scripts/setup-notifications.sh | New setup automation that generates configs and starts services. |
| stacks/notifications/README.md | New end-to-end documentation for the notifications stack. |
| stacks/notifications/docker-compose.original.yml | Added compose file copy (“original”). |
| stacks/notifications/.env.example | Added env template for stack configuration. |
| # Check port availability | ||
| local ports=(8080 8081 8000 3002 9091 9093 6379 5432 8082 25 587) | ||
| for port in "${ports[@]}"; do | ||
| if netstat -tuln | grep ":$port " >/dev/null; then | ||
| print_warning "Port $port is already in use" | ||
| fi | ||
| done |
check_prerequisites uses netstat to check port availability, but the script doesn't verify that netstat exists. On hosts without net-tools, every probe fails with "command not found". Because the pipeline sits in an if condition (and its exit status is grep's), set -e does not abort the script; the check simply reports no conflicts, which is misleading. Consider using ss when available, or gate the port check behind a command -v netstat/ss test.
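Another way to drop the netstat dependency is to probe the ports directly. The sketch below is in Python rather than the setup script's bash, and `port_in_use` and `busy_ports` are hypothetical helpers, not part of the PR. It only detects TCP listeners reachable on the loopback address, so it is narrower than `netstat -tuln`, which also lists UDP sockets.

```python
import socket
from typing import Iterable, List


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise,
        # so a refused connection does not raise.
        return sock.connect_ex((host, port)) == 0


def busy_ports(ports: Iterable[int]) -> List[int]:
    """Return the subset of `ports` that already have a TCP listener."""
    return [port for port in ports if port_in_use(port)]
```

Calling `busy_ports([8080, 8081, 8000, 3002, 9091, 9093, 6379, 5432, 8082, 25, 587])` with the port list from the hunk above would return the ports worth warning about.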
| mkdir -p "$PROJECT_ROOT"/data/{notification-server,ntfy/cache,apprise,grafana,prometheus,alertmanager,redis,postgres,webhook/logs,smtp} | ||
| mkdir -p "$PROJECT_ROOT"/config/{notification-server,ntfy,apprise,grafana,prometheus,alertmanager,webhook} | ||
| mkdir -p "$PROJECT_ROOT"/logs | ||
|
|
||
| chmod -R 755 "$PROJECT_ROOT"/config | ||
| chmod -R 755 "$PROJECT_ROOT"/data |
The script applies chmod -R 755 to config/ and data/. These directories can contain credentials (Gotify admin password, Alertmanager SMTP creds, Apprise URLs), and 755 makes them world-readable. Please tighten defaults (e.g., 700 on dirs + 600 on files, or at least restrict sensitive files) and document expected ownership/permissions for containers.
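A minimal sketch of the tighter default (700 on directories, 600 on files), written in Python rather than the script's bash. `tighten_permissions` is a hypothetical helper, and the sketch assumes the calling user owns the tree; containers that run as a different UID would additionally need matching ownership, which this does not handle.

```python
import os
import stat

DIR_MODE = stat.S_IRWXU                   # 0o700: owner may read, write, traverse
FILE_MODE = stat.S_IRUSR | stat.S_IWUSR   # 0o600: owner may read and write


def tighten_permissions(root: str) -> None:
    """Restrict `root` and everything below it to the owning user."""
    os.chmod(root, DIR_MODE)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # do not chmod symlink targets
                os.chmod(path, DIR_MODE)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                os.chmod(path, FILE_MODE)
```

Applying it to `config/` and `data/` separately would leave room for looser modes on directories that genuinely need to be shared.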
| responseheaders: | ||
| Access-Control-Allow-Origin: "*" | ||
| Access-Control-Allow-Methods: "GET,POST,OPTIONS" | ||
| Access-Control-Allow-Headers: "*" | ||
| stream: | ||
| allowedorigins: | ||
| - "*" | ||
| pingperiodseconds: 45 | ||
| keepaliveperiodseconds: 0 | ||
| database: | ||
| dialect: sqlite3 | ||
| connection: /etc/gotify/data/gotify.db | ||
| automigrate: true | ||
| defaultuser: | ||
| name: admin | ||
| pass: changeme | ||
| passstrength: 10 | ||
| uploadedimagesdir: /etc/gotify/data/images | ||
| pluginsdir: /etc/gotify/plugins | ||
| cors: | ||
| alloworigins: | ||
| - "*" | ||
| allowmethods: |
Gotify config is generated with a hard-coded default admin password (defaultuser.pass: changeme) and very permissive CORS/response headers (*). This is a risky default for something described as production-ready. Consider templating defaultuser from .env.notifications (or prompting) and defaulting CORS to a restricted origin list (or disabled unless explicitly enabled).
| @app.route('/webhook/<channel>', methods=['POST']) | ||
| def webhook_receiver(channel): | ||
| """Receive webhook notifications""" | ||
| signature = request.headers.get('X-Signature') | ||
| payload = request.get_data() | ||
|
|
||
| if not verify_signature(payload, signature): | ||
| return jsonify({'error': 'Invalid signature'}), 401 | ||
|
|
In the generated webhook receiver, verify_signature(payload, signature) is called even when X-Signature is missing. hmac.compare_digest(expected, None) raises a TypeError, leading to a 500 instead of a clean 401. Handle missing/invalid signatures explicitly before calling compare_digest (e.g., treat absent header as unauthorized).
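A sketch of the explicit guard, under stated assumptions: the PR's actual `verify_signature` body is not shown here, so the hex-encoded HMAC-SHA256 scheme and the extra `secret` parameter below are illustrative choices, not the author's implementation.

```python
import hashlib
import hmac
from typing import Optional


def verify_signature(payload: bytes, signature: Optional[str], secret: bytes) -> bool:
    """Return True only if `signature` is the hex HMAC-SHA256 of `payload`.

    Assumed scheme: the sender puts hmac_sha256(secret, body).hexdigest()
    in the X-Signature header.
    """
    # A missing or empty header is simply unauthorized; never hand None
    # to compare_digest, which would raise TypeError.
    if not signature:
        return False
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # Compare as bytes: compare_digest rejects non-ASCII str arguments.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

With this shape the handler's existing `if not verify_signature(...)` branch returns 401 for absent, empty, malformed, and wrong signatures alike.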
| # Load environment variables | ||
| if [ -f .env.notifications ]; then | ||
| export $(grep -v '^#' .env.notifications | xargs) | ||
| fi |
Environment loading uses export $(grep ... | xargs) which is fragile (breaks on spaces/quotes) and can mis-handle values containing # or =; it’s also easier to accidentally introduce shell injection. Prefer set -a; source .env.notifications; set +a (with a controlled, trusted file) or a safer line-by-line parser.
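To illustrate the line-by-line alternative, here is a minimal literal parser in Python (not the script's bash). `parse_env_file` is a hypothetical helper; it performs no shell evaluation, and it deliberately ignores inline comments, escapes, and `export` prefixes, so it is a sketch rather than a full `.env` implementation.

```python
import re
from typing import Dict

_KEY = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines literally; nothing is ever executed or expanded."""
    env: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or full-line comment
        key, sep, value = line.partition("=")  # split on the first '=' only
        key = key.strip()
        if not sep or not _KEY.match(key):
            continue  # not a well-formed assignment; skip it
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        env[key] = value
    return env
```

Values containing spaces, `#`, or `=` survive intact because the line is split only once and never passed through word splitting.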
| ### Webhook Receiver | ||
| - **Purpose**: Custom webhook endpoint for integrations | ||
| - **Endpoint**: http://<server-ip>:8082/webhook/<channel> | ||
| - **Features**: Signature verification, rate limiting, logging | ||
| - **Configuration**: `config/webhook/` |
The README lists “rate limiting” as a Webhook Receiver feature, but the generated Flask app only verifies an HMAC signature and prints the payload; it doesn’t implement request rate limiting. Either implement rate limiting (e.g., via Flask-Limiter / reverse-proxy limits) or adjust the feature list to avoid overstating capabilities.
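If the feature is kept, a dependency-free option is a small in-process sliding-window limiter. The sketch below is an assumption-laden illustration, not the PR's code: `SlidingWindowLimiter` is a hypothetical class, state lives in one process's memory (so it is not shared across workers or restarts), and the choice of key (for example the client address or the channel name) is left to the caller.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window_seconds` for each key."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the limiter can be tested
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit; caller should answer HTTP 429
        hits.append(now)
        return True
```

A limiter built as `SlidingWindowLimiter(100, 3600)` would mirror the `WEBHOOK_RATE_LIMIT=100/hour` value in the env template; the handler would call `allow(...)` before processing and return 429 when it yields False.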
|
|
||
| --- | ||
|
|
||
| *Last updated: $(date)* |
*Last updated: $(date)* won’t be evaluated in Markdown and will render literally, which is confusing/outdated. Prefer a static date string, omit the line, or use a docs build step that actually renders the timestamp.
| # General Configuration | ||
| TIMEZONE=UTC | ||
| NOTIFICATIONS_NETWORK_SUBNET=172.23.0.0/24 | ||
| NOTIFICATIONS_NETWORK_GATEWAY=172.23.0.1 | ||
|
|
||
| # Notification Server (Gotify) Configuration | ||
| NOTIFICATION_ADMIN_USER=admin | ||
| NOTIFICATION_ADMIN_PASSWORD=changeme_CHANGE_IN_PRODUCTION | ||
| NOTIFICATION_PASSWORD_STRENGTH=10 | ||
| NOTIFICATION_SERVER_URL=http://localhost:8080 |
This env example doesn’t define TZ or DOMAIN, but the compose file currently in the repo (docker-compose.yml) references both (TZ=${TZ:-...}, ntfy.${DOMAIN}, apprise.${DOMAIN}). As a result, users following the README will likely end up with the default timezone and an unset DOMAIN. Either add TZ/DOMAIN to the env example or update the compose/configs to use TIMEZONE consistently.
| # AlertManager Configuration (Optional) | ||
| ALERTMANAGER_ENABLED=false | ||
| ALERTMANAGER_SMTP_FROM=${ALERTMANAGER_SMTP_FROM:-alerts@example.com} | ||
| ALERTMANAGER_SMTP_SMARTHOST=${ALERTMANAGER_SMTP_SMARTHOST:-smtp.example.com:587} | ||
| ALERTMANAGER_SMTP_AUTH_USERNAME=${ALERTMANAGER_SMTP_AUTH_USERNAME} | ||
| ALERTMANAGER_SMTP_AUTH_PASSWORD=${ALERTMANAGER_SMTP_AUTH_PASSWORD} | ||
|
|
||
| # Redis Configuration (Optional) | ||
| REDIS_ENABLED=false | ||
| REDIS_PASSWORD=redispassword_CHANGE | ||
| REDIS_MAXMEMORY=256mb | ||
| REDIS_MAXMEMORY_POLICY=allkeys-lru | ||
|
|
||
| # PostgreSQL Configuration (Optional) | ||
| POSTGRES_ENABLED=false | ||
| POSTGRES_DB=notifications | ||
| POSTGRES_USER=notifications | ||
| POSTGRES_PASSWORD=postgrespassword_CHANGE | ||
| POSTGRES_MAX_CONNECTIONS=100 | ||
|
|
||
| # SMTP Relay Configuration (Optional) | ||
| SMTP_RELAY_ENABLED=false | ||
| SMTP_RELAY_HOST=smtp.gmail.com | ||
| SMTP_RELAY_PORT=587 | ||
| SMTP_RELAY_USERNAME=your-email@gmail.com | ||
| SMTP_RELAY_PASSWORD=your-app-password | ||
|
|
||
| # Webhook Configuration | ||
| WEBHOOK_SECRET=webhooksecret_CHANGE | ||
| WEBHOOK_MAX_BODY_SIZE=10M | ||
| WEBHOOK_RATE_LIMIT=100/hour | ||
|
|
||
| # Security Configuration | ||
| ENABLE_HTTPS=false | ||
| SSL_CERT_PATH=./certs | ||
| SSL_CERT_DOMAIN=notifications.local | ||
| ENABLE_AUTHENTICATION=true | ||
| API_KEY_GENERATION=true | ||
|
|
||
| # Channel Configuration | ||
| TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN} | ||
| TELEGRAM_CHAT_ID=${TELEGRAM_CHAT_ID} | ||
| SLACK_WEBHOOK_URL=${SLACK_WEBHOOK_URL} | ||
| DISCORD_WEBHOOK_URL=${DISCORD_WEBHOOK_URL} | ||
| PUSHOVER_USER_KEY=${PUSHOVER_USER_KEY} | ||
| PUSHOVER_API_TOKEN=${PUSHOVER_API_TOKEN} |
Many variables are set to self-references (e.g., TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN}) or bash-style defaults (${VAR:-default}). That only works if the file is evaluated by a shell, but it won’t behave as expected if users feed it to tooling that parses .env files literally (including Docker Compose --env-file). Consider using plain KEY= placeholders (and document that the setup script sources the file) to avoid surprising behavior.
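A small lint along these lines could flag the affected entries before they surprise anyone. This is a sketch in Python with a hypothetical helper name, `find_shell_expansions`; it only looks for `${...}` in values and makes no claim about how any particular tool treats them.

```python
import re
from typing import List, Tuple

_BRACED_EXPANSION = re.compile(r"\$\{[^}]*\}")


def find_shell_expansions(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, key) for each assignment whose value contains ${...}."""
    findings: List[Tuple[int, str]] = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and non-assignments
        key, _, value = line.partition("=")
        if _BRACED_EXPANSION.search(value):
            findings.append((lineno, key.strip()))
    return findings
```

Run against the template above, it would report entries such as `TELEGRAM_BOT_TOKEN` and `ALERTMANAGER_SMTP_FROM` while leaving plain values like `TIMEZONE=UTC` alone.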
| services: | ||
| ntfy: | ||
| image: binwiederhier/ntfy:v2.11.0 | ||
| container_name: ntfy | ||
| restart: unless-stopped | ||
| networks: | ||
| - proxy | ||
| volumes: | ||
| - ntfy-data:/var/lib/ntfy | ||
| - ntfy-cache:/var/cache/ntfy | ||
| environment: | ||
| - TZ=${TZ:-Asia/Shanghai} | ||
| command: serve | ||
| labels: | ||
| - traefik.enable=true | ||
| - "traefik.http.routers.ntfy.rule=Host(`ntfy.${DOMAIN}`)" | ||
| - traefik.http.routers.ntfy.entrypoints=websecure | ||
| - traefik.http.routers.ntfy.tls=true | ||
| - traefik.http.services.ntfy.loadbalancer.server.port=80 | ||
| healthcheck: |
docker-compose.original.yml appears to be a duplicate of the existing docker-compose.yml (same services, networks, volumes). Keeping two nearly-identical compose files tends to drift and confuses which one is authoritative. Consider removing this file or clearly documenting why it exists and how it should be used (e.g., rename to docker-compose.yml.example / move to docs).
Enhanced enterprise notifications stack with Gotify (push notifications), NTFY (HTTP pub/sub), Apprise-API (75+ services), webhook receiver, and comprehensive monitoring. This PR competes for the Notifications Stack bounty.