
Analytics Starter

Production-grade analytics instrumentation for greenfield projects. No silent failures. No dropped events. Rock-solid from day one.

TypeScript License: MIT


Why this exists

Most teams discover analytics problems weeks after launch: missing events, schema drift, PII leaks, flaky vendors. This starter treats analytics as an SRE problem, not a tracking snippet.


What is this?

Analytics Starter is an opinionated, production-ready monorepo that demonstrates how to build reliable, validated, and observable analytics into a new application from the ground up.

Built for teams who want:

  • Zero silent failures — invalid events are quarantined, not dropped
  • Type-safe tracking — event taxonomy validated at compile-time and runtime
  • PII protection — automatic detection and scrubbing of sensitive data
  • Real-time quality metrics — dashboards showing event health and validation failures
  • Resilience built-in — offline queues, retries, circuit breakers, and deduplication
  • Developer-friendly — strongly typed SDK with autocomplete and typo detection

This is not a toy demo—it's a scaffold a senior engineer or CTO would use to start a new product.


Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                         CLIENT (Browser)                         │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │  Analytics Client SDK (@analytics-client)                   │ │
│  │  • Type-safe event tracking                                 │ │
│  │  • Schema validation (client-side)                          │ │
│  │  • Offline queue + retry logic                              │ │
│  │  • PII detection                                             │ │
│  │  • Event deduplication (idempotency)                        │ │
│  └─────────────────────┬──────────────────────────────────────┘ │
└────────────────────────┼────────────────────────────────────────┘
                         │ HTTPS POST
                         ▼
┌─────────────────────────────────────────────────────────────────┐
│                      BACKEND API (Fastify)                       │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │  Validation Pipeline                                        │ │
│  │  1. Event name ✓                                            │ │
│  │  2. Schema validation ✓                                     │ │
│  │  3. Business rules ✓                                        │ │
│  │  4. PII detection ✓                                         │ │
│  │  5. Deduplication ✓                                         │ │
│  └─────────────────────┬──────────────────────────────────────┘ │
│                        │                                         │
│              ┌─────────┴──────────┐                             │
│              ▼                    ▼                             │
│     ┌─────────────────┐  ┌─────────────────┐                   │
│     │  Valid Event     │  │ Invalid Event   │                   │
│     └────────┬─────────┘  └────────┬────────┘                   │
│              │                     │                            │
└──────────────┼─────────────────────┼────────────────────────────┘
               │                     │
               ▼                     ▼
┌──────────────────────┐  ┌──────────────────────┐
│  events_validated    │  │ events_quarantined   │
│  (Postgres)          │  │  (Postgres)          │
│  • event_name        │  │  • raw_payload       │
│  • payload (JSONB)   │  │  • failure_reasons   │
│  • schema_version    │  │  • client_info       │
│  • adapter_status    │  │  • timestamp         │
│  • timestamps        │  │                      │
└──────────────────────┘  └──────────────────────┘
          │
          ▼
┌──────────────────────┐
│  Analytics Adapters  │
│  (Segment, Amplitude,│
│   Mixpanel, etc.)    │
│  + Circuit Breaker   │
└──────────────────────┘

Quick Start

# 1. Clone and install
git clone https://github.com/MacFall7/taxonomy-tap.git
cd taxonomy-tap
pnpm install

# 2. Configure environment
cp apps/api/.env.example apps/api/.env
cp apps/web/.env.example apps/web/.env
# Edit apps/api/.env with your Postgres connection string

# 3. Set up database
pnpm db:migrate

# 4. Start all services
pnpm dev

That's it! The following will be running:

  • Web app on http://localhost:5173 — E-commerce demo + analytics dashboard
  • API server on http://localhost:3000 — Event ingestion and validation

Prerequisites: Node.js >= 18, pnpm >= 8, Postgres database

Try it out

  1. Browse the demo store at http://localhost:5173

    • Click through products
    • Add items to cart
    • Complete a checkout
  2. View analytics dashboard at http://localhost:5173/analytics

    • See validated events in real-time
    • Inspect quarantined events (try submitting invalid data)
    • Monitor adapter health and circuit breaker status
    • View PII detection alerts

Project Structure

taxonomy-tap/
├── apps/
│   ├── web/              # React e-commerce demo + analytics dashboard
│   └── api/              # Fastify backend for event ingestion
├── packages/
│   └── analytics-client/ # Reusable analytics SDK (publishable to npm)
├── infra/
│   └── db/               # Postgres migrations and schema
├── docs/
│   ├── OVERVIEW.md       # Detailed project overview
│   ├── ARCHITECTURE.md   # System design and data flow
│   ├── TAXONOMY.md       # Event taxonomy and schema definitions
│   └── HARDENING_NOTES.md # Reliability principles and tradeoffs
├── package.json          # Monorepo root with workspace config
└── pnpm-workspace.yaml   # pnpm workspace configuration

Core Features

1. Type-Safe Event Tracking

The analytics client provides compile-time and runtime type safety:

import { trackEvent } from '@taxonomy-tap/analytics-client';

// ✅ Valid event with autocomplete
trackEvent('product_viewed', {
  product_id: 'abc-123',
  product_name: 'Organic Coffee',
  price: 12.99,
  currency: 'USD'
});

// ❌ TypeScript error: unknown event
trackEvent('prodcut_viewed', { /* ... */ });  // Typo caught at compile time!

// ❌ Runtime error: missing required field
trackEvent('product_viewed', {
  product_id: 'abc-123'
  // Missing required fields
});
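
One way the compile-time and runtime checks could fit together is sketched below. The `EventMap` interface, the `requiredFields` table, and the `missingFields` helper are hypothetical names chosen for illustration; they are not taken from the SDK's actual source.

```typescript
// Hypothetical event map: each key is an event name, each value its payload shape.
interface EventMap {
  product_viewed: { product_id: string; product_name: string; price: number; currency: string };
  item_added_to_cart: { product_id: string; quantity: number; cart_total: number };
}

// Required fields per event, consulted at runtime (assumed to mirror the types above).
const requiredFields: Record<keyof EventMap, readonly string[]> = {
  product_viewed: ['product_id', 'product_name', 'price', 'currency'],
  item_added_to_cart: ['product_id', 'quantity', 'cart_total'],
};

// Compile time: eventName must be a key of EventMap, so a typo does not type-check.
// Runtime: returns the required fields absent from the payload (empty list = valid).
function missingFields<K extends keyof EventMap>(
  eventName: K,
  payload: Partial<EventMap[K]>,
): string[] {
  const present = new Set(Object.keys(payload));
  return requiredFields[eventName].filter((field) => !present.has(field));
}

// missingFields('prodcut_viewed', {}) would be rejected by the compiler,
// because 'prodcut_viewed' is not a key of EventMap.
```

Keeping the event names as keys of a single interface is what gives editors autocomplete; the runtime table covers payloads that arrive from untyped code.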

2. Quarantine System

Invalid events are never silently dropped. They're quarantined with full context:

  • Raw payload
  • Validation failure reasons (categorized)
  • Client metadata (browser, IP, timestamp)
  • Stack traces (in dev mode)

View quarantined events in the dashboard to fix tracking issues proactively.
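
The routing idea can be sketched as follows. This is a minimal illustration, not the project's ingestion code: the in-memory arrays stand in for the `events_validated` and `events_quarantined` tables, and `knownEvents`, `collectFailures`, and `ingest` are assumed names.

```typescript
interface IncomingEvent {
  event_name: string;
  payload: Record<string, unknown>;
}

interface QuarantineRecord {
  raw_payload: unknown;
  failure_reasons: string[];
  received_at: string;
}

// In-memory stand-ins for the two Postgres tables (assumption for this sketch).
const validatedStore: IncomingEvent[] = [];
const quarantinedStore: QuarantineRecord[] = [];

// Hypothetical taxonomy of accepted event names.
const knownEvents = new Set(['product_viewed', 'order_completed']);

// Collects every failure reason instead of stopping at the first one.
function collectFailures(raw: unknown): string[] {
  if (typeof raw !== 'object' || raw === null) return ['malformed_body'];
  const candidate = raw as Partial<IncomingEvent>;
  const reasons: string[] = [];
  if (typeof candidate.event_name !== 'string' || !knownEvents.has(candidate.event_name)) {
    reasons.push('unknown_event');
  }
  if (typeof candidate.payload !== 'object' || candidate.payload === null) {
    reasons.push('missing_payload');
  }
  return reasons;
}

// Every input ends up in exactly one store; nothing is discarded.
function ingest(raw: unknown): 'validated' | 'quarantined' {
  const reasons = collectFailures(raw);
  if (reasons.length === 0) {
    validatedStore.push(raw as IncomingEvent);
    return 'validated';
  }
  quarantinedStore.push({
    raw_payload: raw,
    failure_reasons: reasons,
    received_at: new Date().toISOString(),
  });
  return 'quarantined';
}
```

Storing the untouched raw payload next to the reasons is what makes a quarantined event debuggable later.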

3. PII Protection

Automatic detection and scrubbing of:

  • Email addresses
  • Credit card numbers
  • Phone numbers
  • Custom patterns (configurable)

Configurable actions: scrub, hash, or reject events containing PII.
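
A rough sketch of pattern-based detection with two of the three actions (scrub and reject) is shown below. The regular expressions are deliberately simple assumptions and will produce false positives and negatives; `piiPatterns` and `applyPiiPolicy` are illustrative names, and the hash action is left out to keep the example dependency-free.

```typescript
// Simplified patterns (assumptions for this sketch, not the project's rules).
const piiPatterns: { label: string; pattern: RegExp }[] = [
  { label: 'email', pattern: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  // 13 to 16 digits, optionally separated by spaces or hyphens.
  { label: 'credit_card', pattern: /\b\d(?:[ -]?\d){12,15}\b/g },
];

type PiiAction = 'scrub' | 'reject';

interface PiiResult {
  accepted: boolean;
  findings: string[]; // entries look like "field:label"
  payload: Record<string, unknown>;
}

// Scans top-level string fields only; nested objects are not traversed here.
function applyPiiPolicy(payload: Record<string, unknown>, action: PiiAction): PiiResult {
  const findings: string[] = [];
  const cleaned: Record<string, unknown> = {};
  for (const key of Object.keys(payload)) {
    const value = payload[key];
    if (typeof value !== 'string') {
      cleaned[key] = value;
      continue;
    }
    let text = value;
    for (const { label, pattern } of piiPatterns) {
      const redacted = text.replace(pattern, '[REDACTED_' + label.toUpperCase() + ']');
      if (redacted !== text) findings.push(key + ':' + label);
      text = redacted;
    }
    cleaned[key] = text;
  }
  if (findings.length > 0 && action === 'reject') {
    // Rejected events keep their original payload so they can be quarantined as-is.
    return { accepted: false, findings, payload };
  }
  return { accepted: true, findings, payload: cleaned };
}
```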

4. Resilience Layer

  • Offline queue: Events tracked while offline are queued and sent when reconnected
  • Exponential backoff: Failed requests retry with increasing delays
  • Deduplication: Events are hashed to prevent double-tracking
  • Circuit breakers: Unhealthy adapters are temporarily disabled to prevent cascading failures
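
Hash-based deduplication might look like the sketch below. It assumes flat payloads, a 32-bit FNV-1a fingerprint computed over UTF-16 code units, and a caller-supplied time window; the `fnv1a` and `Deduplicator` names and the windowing rule are illustrative, and the README does not say which hash or window the project uses.

```typescript
// 32-bit FNV-1a over UTF-16 code units, returned as 8 hex characters.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ('0000000' + hash.toString(16)).slice(-8);
}

class Deduplicator {
  private readonly seen = new Map<string, number>();
  private readonly windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  // True when an identical event was already seen inside the window.
  // Sorting the keys makes the fingerprint independent of property order
  // (this replacer trick only handles flat payloads).
  isDuplicate(eventName: string, payload: Record<string, unknown>, nowMs: number): boolean {
    const canonical = JSON.stringify(payload, Object.keys(payload).sort());
    const key = fnv1a(eventName + '|' + canonical);
    const last = this.seen.get(key);
    this.seen.set(key, nowMs);
    return last !== undefined && nowMs - last < this.windowMs;
  }
}
```

A 32-bit hash can collide, so a real deployment would more likely pair a stronger digest with an explicit event ID.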

5. Quality Metrics Dashboard

Real-time visibility into:

  • Event success vs. failure rates
  • Quarantine trends by reason
  • PII detection alerts
  • Adapter health status
  • Latency percentiles (p50, p95, p99)
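
For reference, the percentiles can be computed from raw latency samples with the nearest-rank method, sketched below. The `percentile` helper is a hypothetical name, and the dashboard may well use a different estimator or compute these in SQL.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it. Expects 0 < p <= 100.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('percentile of empty sample set');
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

// Example latencies in milliseconds (made-up values).
const latenciesMs = [12, 15, 20, 22, 30, 45, 50, 80, 120, 400];
const p50 = percentile(latenciesMs, 50); // 30
const p95 = percentile(latenciesMs, 95); // 400
```

With only ten samples, p95 and p99 both land on the slowest request, which is why tail percentiles need a reasonably large window to be meaningful.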

Using the Analytics Client

Installation (for external projects)

npm install @taxonomy-tap/analytics-client

Setup

import { initAnalytics } from '@taxonomy-tap/analytics-client';

const analytics = initAnalytics({
  apiEndpoint: 'https://your-api.com/ingest',
  enableOfflineQueue: true,
  enablePiiDetection: true,
  debug: process.env.NODE_ENV === 'development'
});

Track Events

import { trackEvent } from '@taxonomy-tap/analytics-client';

// Product viewed
trackEvent('product_viewed', {
  product_id: 'prod_123',
  product_name: 'Wireless Headphones',
  category: 'Electronics',
  price: 99.99,
  currency: 'USD'
});

// Add to cart
trackEvent('item_added_to_cart', {
  product_id: 'prod_123',
  quantity: 1,
  cart_total: 99.99
});

// Checkout started
trackEvent('checkout_started', {
  cart_total: 99.99,
  item_count: 1
});

// Order completed
trackEvent('order_completed', {
  order_id: 'ord_456',
  revenue: 99.99,
  currency: 'USD',
  item_count: 1
});

Documentation

  • Overview — High-level concepts and goals
  • Architecture — System design, data flow, and component interaction
  • Taxonomy — Event catalog and schema definitions
  • Hardening Notes — Reliability principles and design tradeoffs

Production Hardening

This project is built with production reliability in mind. Key hardening features:

Client-Side Resilience

  • 🔄 Offline queue — Events tracked offline are queued in localStorage and synced on reconnect
  • 🔁 Exponential backoff — Failed requests retry with increasing delays (1s → 2s → 4s → 8s)
  • 🔒 Idempotency — Event IDs prevent duplicate tracking across retries
  • 🚦 Circuit breaker — Auto-disable unhealthy endpoints to prevent cascading failures
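
The retry schedule above (1s, 2s, 4s, 8s) can be sketched as a capped doubling delay. `backoffDelayMs` and `sendWithRetry` are assumed names, the cap and retry count are taken from the schedule in the list, and a real client would usually add random jitter as well.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 8000): number {
  return Math.min(baseMs * Math.pow(2, attempt), maxMs);
}

// Tries once, then retries up to maxRetries times with growing delays.
// Resolves to false when every attempt failed, so the caller can keep the
// event in its offline queue instead of losing it.
async function sendWithRetry(
  send: () => Promise<void>,
  maxRetries: number = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<boolean> {
  for (let attempt = 0; ; attempt++) {
    try {
      await send();
      return true;
    } catch {
      if (attempt >= maxRetries) return false;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Passing `sleep` as a parameter lets tests replace the timer with a no-op, so the retry logic can be exercised without waiting.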

Server-Side Validation

  • Multi-layer validation — Schema → PII → Business rules → Deduplication
  • 🛡️ PII detection — Automatic scanning for emails, credit cards, phone numbers
  • 🔍 Quarantine system — Invalid events stored with full failure context, never silently dropped
  • 📊 Quality metrics — Real-time dashboards for validation success/failure rates

Adapter Delivery

  • 🔌 Circuit breakers — Per-adapter fault isolation (default: 5 failures, 60s timeout)
  • 🎯 Fire-and-forget — Adapter delivery is async and non-blocking
  • 📈 Health tracking — Monitor adapter status and recovery in real-time
  • 🔧 Configurable — Adjust thresholds via environment variables
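
A per-adapter breaker with the defaults quoted above (5 failures, 60s timeout) could be modelled as a small state machine. This is a sketch under assumptions: the `CircuitBreaker` class, its method names, and the half-open behaviour are illustrative and not taken from the project's source.

```typescript
type BreakerState = 'closed' | 'open' | 'half_open';

class CircuitBreaker {
  private failures = 0;
  private openedAtMs = 0;
  private state: BreakerState = 'closed';
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;

  constructor(failureThreshold: number = 5, resetTimeoutMs: number = 60000) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
  }

  // While open, delivery is skipped until the timeout elapses; after that the
  // breaker moves to half-open and lets trial requests through.
  canSend(nowMs: number): boolean {
    if (this.state === 'open' && nowMs - this.openedAtMs >= this.resetTimeoutMs) {
      this.state = 'half_open';
    }
    return this.state !== 'open';
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = 'closed';
  }

  // A failed trial in half-open re-opens immediately; otherwise the breaker
  // opens once the consecutive failure count reaches the threshold.
  recordFailure(nowMs: number): void {
    this.failures += 1;
    if (this.state === 'half_open' || this.failures >= this.failureThreshold) {
      this.state = 'open';
      this.openedAtMs = nowMs;
    }
  }

  currentState(): BreakerState {
    return this.state;
  }
}
```

One breaker instance per adapter keeps a failing vendor from affecting delivery to the others.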

Database Layer

  • 🏊 Connection pooling — Efficient Postgres connection management
  • 🗄️ JSONB storage — Flexible event payload storage without schema migrations
  • 📝 Audit trail — Full event history with timestamps and metadata
  • 🔐 Prepared statements — Protection against SQL injection

For detailed design decisions and tradeoffs, see HARDENING_NOTES.md.


Development

# Run all apps in dev mode
pnpm dev

# Run only the web app
pnpm dev:web

# Run only the API server
pnpm dev:api

# Build all packages and apps
pnpm build

# Build only the analytics client
pnpm build:client

# Run tests
pnpm test

# Lint all code
pnpm lint

# Type-check all code
pnpm typecheck

Deployment

Frontend (Web App)

Deploy to Vercel or Netlify:

cd apps/web
pnpm build
# Follow Vercel/Netlify deployment instructions

Backend (API)

Deploy to Railway, Render, or Fly.io:

cd apps/api
pnpm build
# Follow platform-specific deployment instructions

Database

Use a managed Postgres provider, then run migrations against your hosted database:

DATABASE_URL=postgresql://user:pass@host/db pnpm db:migrate

License

MIT © 2025


Credits

Built with:


Questions or feedback? Open an issue or start a discussion!
