Email Journal

Email Journal is a Node.js/TypeScript proof of concept that resolves a project by quote ID or R-number, reads email and project history data from PostgreSQL, and renders a single self-contained HTML report. AWS Bedrock can optionally be used to tighten the narrative copy, while the timeline and key decision history are extracted deterministically from stored project data.

If no database is configured, the project can run in bundled mock mode for immediate demonstration.

For the latest implementation notes and handoff context, see ../Docs/project-mailbox-email-journal-handoff.md.

Report Sections

The generated HTML report contains four accordion sections:

| Section | Contents |
| --- | --- |
| Project Summary | Study name, client, project manager, latest email date |
| Timeline & Key Milestones | Category filter chips + paginated dated table of quote status changes, billing status changes, and key email milestones |
| Scope & Execution Details | Client update groups with survey counts, targets, completes, and LOI |
| Current Status Against Scope | Email date range, latest client update action, stat cards |

The report header also shows: client name, end client, contact name/email, quote type, current quote status, current billing status, and stored email count.

Timeline & Key Milestones

Email subjects are not used as milestone text directly. Instead the message body is scanned for business-relevant phrases and distilled into short labels with details. Rows are deduplicated by milestone label — when the same milestone appears multiple times only the most recent entry is shown.
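The deduplication rule can be sketched as follows. This is a minimal illustration, not the service's actual code: the `TimelineRow` shape and the `dedupeByLabel` name are assumptions introduced here.

```typescript
// Hypothetical row shape; the real report model may differ.
interface TimelineRow {
  date: string; // ISO 8601 timestamp
  label: string; // e.g. "Cost Approval"
  details: string;
}

// Keep only the most recent row per milestone label, newest first.
function dedupeByLabel(rows: TimelineRow[]): TimelineRow[] {
  const latest = new Map<string, TimelineRow>();
  for (const row of rows) {
    const seen = latest.get(row.label);
    if (!seen || Date.parse(row.date) > Date.parse(seen.date)) {
      latest.set(row.label, row);
    }
  }
  return Array.from(latest.values()).sort(
    (a, b) => Date.parse(b.date) - Date.parse(a.date),
  );
}
```

A `Map` keyed by label keeps the pass linear in the number of rows; sorting afterwards restores a newest-first order for display.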

The timeline table uses three columns:

  • Date
  • Milestone (with inline category badge)
  • Details

Timeline Milestone Extraction

The body scan maps matched phrases to milestone labels and details as follows:

| Pattern matched | Milestone shown | Details column |
| --- | --- | --- |
| local currency in body | Local Currency Costs | Currency amounts found (e.g. USD 50, USD 40) |
| final costs confirmed/approved | Final Costs | Currency amounts if present |
| costs approved | Cost Approval | Approved |
| approval + costs | Cost Approval | Requested |
| soft/full launch approved | Soft/Full Launch | Approved |
| kickoff | Kickoff | Discussed |
| end of project / project closed | Project Close | Discussed |
| incentive/honoraria increase | Incentive/Honoraria Increase | Discussed |
| changes implemented, test link ready, etc. | Title-cased label | |

HTML markup is stripped from message bodies before pattern matching. Repeated messages matching the same milestone category are collapsed to the single most-recent entry.
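A rough sketch of the strip-then-match flow, assuming a small ordered rule list. The regular expressions, the `MilestoneRule` shape, and the function names are illustrative assumptions covering only a subset of the table; the service's real patterns are not shown in this README.

```typescript
// Hypothetical rule shape; the patterns are a simplified subset of the table.
interface MilestoneRule {
  pattern: RegExp;
  label: string;
  details: (text: string) => string;
}

// Crude tag stripper for illustration; a real service would likely use an HTML parser.
function stripHtml(html: string): string {
  return html
    .replace(/<[^>]*>/g, " ")
    .replace(/&nbsp;/gi, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// Collect amounts such as "USD 50" from the text, joined with commas.
function currencyAmounts(text: string): string {
  const found = text.match(/\b[A-Z]{3} ?\d+(?:[.,]\d+)*/g) ?? [];
  return found.join(", ");
}

// Order matters: "final costs approved" must be tested before "costs approved".
const RULES: MilestoneRule[] = [
  { pattern: /local currency/i, label: "Local Currency Costs", details: currencyAmounts },
  { pattern: /final costs (?:confirmed|approved)/i, label: "Final Costs", details: currencyAmounts },
  { pattern: /costs approved/i, label: "Cost Approval", details: () => "Approved" },
  { pattern: /\bkick-?off\b/i, label: "Kickoff", details: () => "Discussed" },
  { pattern: /end of project|project closed/i, label: "Project Close", details: () => "Discussed" },
];

// Return the first matching milestone, or null when no rule applies.
function extractMilestone(body: string): { label: string; details: string } | null {
  const text = stripHtml(body);
  for (const rule of RULES) {
    if (rule.pattern.test(text)) {
      return { label: rule.label, details: rule.details(text) };
    }
  }
  return null;
}
```

For example, `extractMilestone("<p>Costs in local currency: USD 50 and USD 40</p>")` yields the label `Local Currency Costs` with details `USD 50, USD 40`.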

Billing and quote status changes from q_quote_statuschange_audit and q_quote_status_log are also included, labelled as Billing Status and Quote Status with the status value in the Details column.

The server no longer trims the detailed timeline to the latest 12 email-derived rows. The UI still paginates the table, but the email-derived milestones and decisions reflect the full stored email history for the project.

Setup

  1. Copy .env.example to .env and fill in PostgreSQL and AWS values:
    PGHOST=...
    PGPORT=5432
    PGDATABASE=...
    PGUSER=...
    PGPASSWORD=...
    BEDROCK_ENABLED=false
    BEDROCK_MODEL_ID=anthropic.claude-3-haiku-20240307-v1:0
    BEDROCK_MODEL_ID_FAST=anthropic.claude-3-haiku-20240307-v1:0
    BEDROCK_MODEL_ID_ACCURATE=anthropic.claude-3-5-sonnet-20241022-v2:0
    AWS_REGION=us-east-1
  2. Install dependencies:
    cd email-journal
    npm install
  3. Optionally verify live connections before generating reports:
    npm run check:connections
  4. Start the local server:
    npm run dev
  5. Open a report in the browser:
    http://localhost:3000/report/R88545
    http://localhost:3000/report/806607
    http://localhost:3000/report/R20185

CLI Usage

Generate a report and write it to disk:

npm run report -- 806607 --out report-output/806607.html
npm run report -- R88545 --out report-output/R88545.html

The CLI skips overwriting an output file when the report fingerprint is unchanged from the last run.
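One way such a skip could work is sketched below. It assumes the fingerprint is a SHA-256 hash of the rendered HTML kept in a sidecar file next to the output; both the hashing input and the `.sha256` sidecar are assumptions made for illustration, and the CLI may compute and store its fingerprint differently.

```typescript
import { createHash } from "node:crypto";
import { existsSync, readFileSync, writeFileSync } from "node:fs";

// Hash the rendered HTML; the real fingerprint may be derived from report data instead.
function fingerprint(html: string): string {
  return createHash("sha256").update(html, "utf8").digest("hex");
}

// Write the report only when its fingerprint differs from the stored one.
// Returns true when the file was written, false when the write was skipped.
function writeIfChanged(outPath: string, html: string): boolean {
  const sidecar = `${outPath}.sha256`; // hypothetical sidecar location
  const next = fingerprint(html);
  if (
    existsSync(outPath) &&
    existsSync(sidecar) &&
    readFileSync(sidecar, "utf8").trim() === next
  ) {
    return false;
  }
  writeFileSync(outPath, html, "utf8");
  writeFileSync(sidecar, next, "utf8");
  return true;
}
```

Comparing a stored hash avoids reading and diffing the full previous report, and leaves the output file's modification time untouched when nothing changed.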

Pass --use-bedrock to force a Bedrock narrative pass on this run:

npm run report -- 806607 --use-bedrock

API Endpoints

| Method | Path | Description |
| --- | --- | --- |
| GET | /health | Liveness check |
| GET | /api/report/:identifier | Returns report data as JSON |
| GET | /report/:identifier | Returns the full HTML report |

:identifier is either a numeric quote ID (e.g. 806607) or an R-number (e.g. R88545).
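The two identifier forms could be told apart with a small parser like the one below. It is a sketch under assumptions: the README only shows R-numbers as "R" followed by digits, so that format, the case-insensitive match, and the `ReportIdentifier` type are illustrative rather than the service's actual validation.

```typescript
// Hypothetical discriminated union for the two identifier forms.
type ReportIdentifier =
  | { kind: "quoteId"; quoteId: number }
  | { kind: "rNumber"; rNumber: string };

// Returns null for anything that is neither all digits nor "R" + digits.
function parseIdentifier(raw: string): ReportIdentifier | null {
  const value = raw.trim();
  if (/^\d+$/.test(value)) {
    return { kind: "quoteId", quoteId: Number(value) };
  }
  if (/^R\d+$/i.test(value)) {
    return { kind: "rNumber", rNumber: value.toUpperCase() };
  }
  return null;
}
```

Rejecting everything else up front keeps arbitrary path segments from reaching the database lookup.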

Optional query parameters on both report endpoints:

| Parameter | Default | Description |
| --- | --- | --- |
| maxEmails | 12 | Max email rows fetched for the recent-email narrative pool |
| maxTimelineItems | 12 | Max combined timeline rows shown |
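As a usage illustration, a caller might build report URLs with these parameters as follows. The `reportUrl` helper and `ReportQuery` type are hypothetical conveniences, not part of the project; only the paths and parameter names come from the tables above.

```typescript
// Optional query parameters accepted by both report endpoints.
interface ReportQuery {
  maxEmails?: number;
  maxTimelineItems?: number;
}

// Build a URL for the HTML report ("/report/") or the JSON data ("/api/report/").
function reportUrl(
  baseUrl: string,
  identifier: string,
  query: ReportQuery = {},
  format: "html" | "json" = "html",
): string {
  const path = format === "json" ? "/api/report/" : "/report/";
  const params: string[] = [];
  if (query.maxEmails !== undefined) params.push(`maxEmails=${query.maxEmails}`);
  if (query.maxTimelineItems !== undefined) {
    params.push(`maxTimelineItems=${query.maxTimelineItems}`);
  }
  const qs = params.length > 0 ? `?${params.join("&")}` : "";
  // Trim trailing slashes from the base so the path is not doubled up.
  return `${baseUrl.replace(/\/+$/, "")}${path}${encodeURIComponent(identifier)}${qs}`;
}
```

For instance, `reportUrl("http://localhost:3000", "R88545", { maxEmails: 50 })` produces `http://localhost:3000/report/R88545?maxEmails=50`.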

AWS Bedrock

Email Journal now supports two Bedrock model roles:

  • BEDROCK_MODEL_ID_FAST handles broad-sweep milestone classification and defaults to the legacy BEDROCK_MODEL_ID value when unset.
  • BEDROCK_MODEL_ID_ACCURATE handles narrative refreshes and falls back to BEDROCK_MODEL_ID_FAST, then BEDROCK_MODEL_ID, when unset.

This keeps the existing single-model setup working while allowing a cheaper classifier and a stronger narrative model.
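The fallback chain described above can be expressed compactly. This is a sketch of the resolution order only; the `resolveBedrockModels` name is an assumption, as is treating an empty string the same as an unset variable.

```typescript
// The three model variables from .env; all are optional.
interface BedrockEnv {
  BEDROCK_MODEL_ID?: string;
  BEDROCK_MODEL_ID_FAST?: string;
  BEDROCK_MODEL_ID_ACCURATE?: string;
}

// Treat blank values as unset (an assumption, not confirmed by the README).
function nonEmpty(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// fast:     BEDROCK_MODEL_ID_FAST -> BEDROCK_MODEL_ID
// accurate: BEDROCK_MODEL_ID_ACCURATE -> BEDROCK_MODEL_ID_FAST -> BEDROCK_MODEL_ID
function resolveBedrockModels(env: BedrockEnv): {
  fast: string | undefined;
  accurate: string | undefined;
} {
  const legacy = nonEmpty(env.BEDROCK_MODEL_ID);
  const fast = nonEmpty(env.BEDROCK_MODEL_ID_FAST) ?? legacy;
  const accurate = nonEmpty(env.BEDROCK_MODEL_ID_ACCURATE) ?? fast;
  return { fast, accurate };
}
```

With only `BEDROCK_MODEL_ID` set, both roles resolve to that one model, which is how the single-model setup keeps working.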

When BEDROCK_ENABLED=true, the service runs a cost-controlled incremental pass before any model call for the narrative sections:

  1. Only email rows with a message_id greater than the last checkpoint are inspected.
  2. Message bodies are normalized — HTML stripped, quoted-thread prefixes removed, disclaimers cut.
  3. Each normalized message is hashed; already-seen hashes are skipped.
  4. Only messages containing a recognised event phrase (approvals, costs, lifecycle milestones, operational updates) are selected as candidates.
  5. A compact excerpt around the matched phrase (±220 chars) is sent to the model, not the full body.
  6. The model is asked to update only sections where the new excerpts add a new fact.
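Steps 1 to 5 can be sketched as a single selection pass. Everything below is illustrative: the `EmailRow`, `Checkpoint`, and `Candidate` shapes, the phrase list, the disclaimer marker, and the use of SHA-256 are assumptions, and the model call in step 6 is left out.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shapes; field names are assumptions.
interface EmailRow { messageId: number; body: string }
interface Checkpoint { lastMessageId: number; seenHashes: string[] }
interface Candidate { messageId: number; excerpt: string }

// Simplified stand-ins for the service's phrase list and disclaimer detection.
const EVENT_PHRASE = /costs approved|final costs|kick-?off|project closed|soft launch|full launch/i;
const DISCLAIMER_MARKER = /confidentiality notice/i;
const EXCERPT_RADIUS = 220;

// Step 2: drop quoted-thread lines, strip HTML tags, cut a trailing disclaimer.
function normalize(body: string): string {
  const unquoted = body
    .split(/\r?\n/)
    .filter((line) => !/^\s*>/.test(line))
    .join("\n");
  const text = unquoted.replace(/<[^>]*>/g, " ");
  const disclaimerAt = text.search(DISCLAIMER_MARKER);
  const kept = disclaimerAt >= 0 ? text.slice(0, disclaimerAt) : text;
  return kept.replace(/\s+/g, " ").trim();
}

function selectCandidates(
  rows: EmailRow[],
  checkpoint: Checkpoint,
): { candidates: Candidate[]; next: Checkpoint } {
  const seen = new Set(checkpoint.seenHashes);
  const candidates: Candidate[] = [];
  let lastMessageId = checkpoint.lastMessageId;

  for (const row of rows) {
    if (row.messageId <= checkpoint.lastMessageId) continue; // step 1
    lastMessageId = Math.max(lastMessageId, row.messageId);

    const text = normalize(row.body); // step 2
    const hash = createHash("sha256").update(text).digest("hex"); // step 3
    if (seen.has(hash)) continue;
    seen.add(hash);

    const match = EVENT_PHRASE.exec(text); // step 4
    if (!match) continue;

    // Step 5: keep up to 220 characters on each side of the matched phrase.
    const start = Math.max(0, match.index - EXCERPT_RADIUS);
    const end = Math.min(text.length, match.index + match[0].length + EXCERPT_RADIUS);
    candidates.push({ messageId: row.messageId, excerpt: text.slice(start, end) });
  }

  return { candidates, next: { lastMessageId, seenHashes: Array.from(seen) } };
}
```

Hashing the normalized text rather than the raw body means a reply that only re-quotes an earlier message hashes to the same value and is skipped before any excerpt is built.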

The checkpoint is stored locally under .cache/bedrock/quote-<id>.json and persists the last processed message ID, content hashes, and the accepted narrative. The timeline and Key Decisions table are regenerated deterministically from current database state on each report build and are not sourced from the Bedrock checkpoint.
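A file-backed checkpoint along these lines might look like the sketch below. Only the path pattern and the three kinds of persisted data come from the description above; the `BedrockCheckpoint` field names, the narrative shape, and the helper functions are assumptions.

```typescript
import { existsSync, mkdirSync, readFileSync, writeFileSync } from "node:fs";
import { dirname, join } from "node:path";

// Hypothetical on-disk shape; actual field names may differ.
interface BedrockCheckpoint {
  lastMessageId: number;
  contentHashes: string[];
  narrative: Record<string, string>; // section name -> accepted text
}

// Resolves to .cache/bedrock/quote-<id>.json by default.
function checkpointPath(quoteId: number, cacheRoot: string = ".cache/bedrock"): string {
  return join(cacheRoot, `quote-${quoteId}.json`);
}

// Return an empty checkpoint when no file exists yet.
function loadCheckpoint(quoteId: number, cacheRoot?: string): BedrockCheckpoint {
  const file = checkpointPath(quoteId, cacheRoot);
  if (!existsSync(file)) {
    return { lastMessageId: 0, contentHashes: [], narrative: {} };
  }
  return JSON.parse(readFileSync(file, "utf8")) as BedrockCheckpoint;
}

// Create the cache directory on demand, then write the checkpoint as JSON.
function saveCheckpoint(
  quoteId: number,
  checkpoint: BedrockCheckpoint,
  cacheRoot?: string,
): void {
  const file = checkpointPath(quoteId, cacheRoot);
  mkdirSync(dirname(file), { recursive: true });
  writeFileSync(file, JSON.stringify(checkpoint, null, 2), "utf8");
}
```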

The validated single-model default remains anthropic.claude-3-haiku-20240307-v1:0.

Recommended Model Matrix

Use the model pair that matches the project phase:

| Use case | Fast model | Accurate model | Why |
| --- | --- | --- | --- |
| Default day-to-day report generation | anthropic.claude-3-haiku-20240307-v1:0 | unset | Cheapest option; current prompts and token budgets were designed around Haiku-class costs |
| Higher-confidence narrative refreshes | anthropic.claude-3-haiku-20240307-v1:0 | Sonnet-class model available in your Bedrock account | Keeps broad classification cheap while improving borderline narrative rewrites |
| Large-scale future drill-down summarization | Haiku-class model | Sonnet-class model | Use only for scoped user-triggered analysis, not whole-project reruns |

The service is intentionally optimized for short, filtered excerpts and deterministic fallbacks. Start with Haiku for both roles and introduce a Sonnet-class accurate model only if you see missed nuance or unstable JSON output in narrative refreshes.

Data Sources

All data is read from the m3gr PostgreSQL schema:

| Table | Used for |
| --- | --- |
| q_quotes | Project lookup, survey name, quote type |
| q_ref_client | Client name (joined on client_id and end_client) |
| q_quote_status / q_quote_status_log | Current and historical quote status |
| q_quote_statuschange_audit / q_billing_status | Current and historical billing status |
| q_quote_history / q_quote_history_event_type | Internal quote history events |
| quote_messages | Email content and metadata |
| staff | Project manager name |
| q_client_update_groups / q_client_update_group_surveys | Client update groups |
| q_quote_targets | Target counts, completes, LOI |
| client_update_history | Latest client update export action |

Production Deployment (ColdFusion Integration)

The full integration contract is documented in ../Docs/cf-integration-contract.md. The short version:

Architecture

  • Email Journal runs as an internal Node.js service (Kubernetes/ECS container). It is never exposed to the public internet.
  • MRBeta (ColdFusion) owns the UI surface. A CF page uses <cfhttp> to proxy the pre-rendered HTML from the Node app into the page.
  • Authentication and session management remain entirely in ColdFusion — the Node app has no login layer.

ColdFusion Proxy Snippet

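<!--- Internal Node host; never exposed to the public internet --->
<cfset nodeBase = "http://email-journal-internal:3000">
<cfhttp method="GET" url="#nodeBase#/report/#urlEncodedFormat(rNumber)#" result="mailboxResp" timeout="120">
<!--- statusCode is a string such as "200 OK"; val() reads the leading number --->
<cfif val(mailboxResp.statusCode) eq 200>
  <!--- Emit the pre-rendered report HTML as-is --->
  <cfoutput>#mailboxResp.fileContent#</cfoutput>
<cfelse>
  <cfoutput><p>Unable to load email journal. (#encodeForHTML(mailboxResp.statusCode)#)</p></cfoutput>
</cfif>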

Set STATIC_BASE_URL in .env to the internal Node host URL so the stylesheet link is fully qualified (required when the page is served through the CF proxy):

STATIC_BASE_URL=http://projectmailbox-internal:3000
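To illustrate why the setting matters, here is a sketch of how an asset link might be qualified. The `qualifyAsset` helper and the `/static/report.css` path are hypothetical; the README does not show how the service builds its stylesheet link.

```typescript
// Prefix an asset path with STATIC_BASE_URL when it is set; otherwise keep it relative.
function qualifyAsset(staticBaseUrl: string | undefined, assetPath: string): string {
  const base = (staticBaseUrl ?? "").trim().replace(/\/+$/, "");
  if (base === "") {
    return assetPath; // a relative link resolves against whichever host served the page
  }
  return `${base}/${assetPath.replace(/^\/+/, "")}`;
}
```

Behind the CF proxy a relative link would resolve against the ColdFusion host, which does not serve the Node app's static files, so a fully qualified link is needed there.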

Container Persistence

Local caches (the .cache/bedrock/ checkpoint files and the in-memory report LRU) are wiped on every container restart. Before deploying to any containerised environment:

  1. Run the DB migration to create the checkpoint table:
    psql -f scripts/migrations/001_bedrock_checkpoints.sql
  2. Set CHECKPOINT_STORAGE=db in the container's environment variables.

This moves Bedrock summary checkpoints into m3gr.mailbox_bedrock_checkpoints (survives restarts). The in-memory report LRU is intentionally short-lived; it rebuilds quickly from the DB on each pod start.

Admin Page

The milestone keyword manager (/admin/milestone-keywords) uses dynamic XHR and cannot be proxied via <cfhttp>. Recommended approach: internal-only access, direct to the Node host URL. See the integration contract doc for alternatives.
