🚨 Lighthouse: [Best Practice/Perf Fix] Optimize concurrent COUNT queries by ldsgroups225 · Pull Request #293 · ldsgroups225/Yeko

ldsgroups225 · 2026-04-12T18:33:54Z

Impact on performance

Reduced Database Roundtrips: Consolidates multiple concurrent connections to the database into a single query per metric group.
Eliminated Redundant Table Scans: By leveraging PostgreSQL's FILTER (WHERE ...) clause inside count(), the database can calculate multiple subsets of counts (e.g., active, inactive, graduated students) in a single pass over the table data.
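
To make the single-pass idea concrete, here is a minimal in-memory sketch of what `count(*) filter (where ...)` computes. It is only a model of the semantics, not how PostgreSQL executes the query, and the `StudentRow` shape and the `countWithFilters` helper are assumptions made up for illustration.

```typescript
// Hypothetical row shape; the real `students` table has more columns.
type StudentRow = { status: "active" | "inactive" | "graduated" };

// Each named predicate plays the role of one `FILTER (WHERE ...)` clause.
type Filters<R> = Record<string, (row: R) => boolean>;

// Visit every row exactly once and update all counters along the way.
function countWithFilters<R>(rows: R[], filters: Filters<R>): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const name of Object.keys(filters)) counts[name] = 0;

  let total = 0;
  for (const row of rows) {
    total += 1; // count(*)
    for (const [name, predicate] of Object.entries(filters)) {
      // count(*) filter (where <predicate>)
      if (predicate(row)) counts[name] = (counts[name] ?? 0) + 1;
    }
  }
  return { ...counts, total };
}

const sampleStudents: StudentRow[] = [
  { status: "active" },
  { status: "active" },
  { status: "inactive" },
  { status: "graduated" },
];

const studentCounts = countWithFilters(sampleStudents, {
  active: (row) => row.status === "active",
  inactive: (row) => row.status === "inactive",
  graduated: (row) => row.status === "graduated",
});

console.log(studentCounts);
```

Running three separate count queries would correspond to looping over `sampleStudents` three times, once per predicate.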

Technical Rationale

Previously, analytics endpoints (e.g., getAnalyticsOverview, getSchoolsPerformance, getEnrollmentStats, and getEnrollmentGrowth) ran several independent count aggregations inside a Promise.all block. Although these executed in parallel at the Node layer, each one was dispatched as its own SQL query, so a single request occupied several connections from the pool and made the database scan the same tables repeatedly. Grouping them into a single SELECT with conditional aggregation (count(*) filter (where ...)) reduces the number of queries per metric group from N to one and lets PostgreSQL compute every count in a single pass over the table.
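
As a rough illustration of the consolidated statement, the sketch below assembles the SQL text for one metric group. The `buildFilteredCountQuery` helper, the `Metric` type, and the `created_at` column name are assumptions for this example; the PR itself builds the query through Drizzle's `sql` template rather than by string assembly.

```typescript
// Hypothetical metric description: a result alias plus an optional condition.
// Conditions use $1, $2, ... placeholders; values are bound separately, never inlined.
type Metric = { alias: string; where?: string };

function buildFilteredCountQuery(table: string, metrics: Metric[]): string {
  const columns = metrics.map(({ alias, where }) =>
    where === undefined
      ? `count(*) AS ${alias}`
      : `count(*) FILTER (WHERE ${where}) AS ${alias}`,
  );
  return `SELECT ${columns.join(", ")} FROM ${table}`;
}

// $1 = start of the current period, $2 = start of the previous period.
const schoolsQuery = buildFilteredCountQuery("schools", [
  { alias: "total" },
  { alias: "current_period", where: "created_at >= $1" },
  { alias: "previous_period", where: "created_at >= $2 AND created_at < $1" },
]);

console.log(schoolsQuery);
```

The three counts that previously needed three statements now travel in one, which is where the saving in round trips comes from.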

PR created automatically by Jules for task 16578561623722322682 started by @ldsgroups225

Summary by CodeRabbit

Performance
- Optimized analytics query performance to compute metrics more efficiently with reduced database load.
Chores
- Updated internal schema formatting and documentation.

Replaced multiple parallel database `count()` queries in `packages/data-ops/src/queries/analytics.ts` with single aggregated queries utilizing PostgreSQL's `FILTER (WHERE ...)` clause. This change eliminates redundant connection overhead and redundant table scans when fetching complex analytics metrics across the same table for different conditions.

google-labs-jules · 2026-04-12T18:33:56Z

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.

For security, I will only act on instructions from the user who triggered this task.

chatgpt-codex-connector · 2026-04-12T18:34:02Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

coderabbitai · 2026-04-12T18:34:09Z

📝 Walkthrough

The PR consolidates multiple concurrent count queries in analytics functions into a single aggregated PostgreSQL query using SQL FILTER (WHERE ...) expressions through Drizzle, while also standardizing quote formatting in the auth schema definitions.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Analytics Query Consolidation `packages/data-ops/src/queries/analytics.ts`, `.jules/bolt.md` | Refactored three analytics query functions (schools, students, growth stats) to replace multiple concurrent `Promise.all` count queries with single aggregated queries using SQL `count(*) filter (where ...)` expressions, reducing database round-trips and eliminating separate table scans. |
| Auth Schema Formatting `packages/data-ops/src/drizzle/auth-schema.ts` | Standardized quote style from double to single quotes throughout table/column/index names, reordered imports from `drizzle-orm`, and reformatted `pgTable` definitions without altering schema structure or exported entity signatures. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

🚨 Lighthouse: Optimize Analytics Queries for Maximum Parallelization and Batching #126: Also consolidates multiple per-metric queries into single aggregated SQL queries in packages/data-ops/src/queries/analytics.ts, following the same N+1 elimination pattern.
🚨 Lighthouse: [Best Practice/Perf Fix] Parallelize analytics queries #87: Modifies the same analytics query implementations (getAnalyticsOverview, getEnrollmentStats, getEnrollmentGrowth) with a different optimization approach using Promise.all parallelization.

Poem

🐰 One hop beats many hops, we've learned,
Aggregate queries—efficiency earned!
Single scans where many once thrived,
Database bliss—our metrics now arrived! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title references optimizing concurrent COUNT queries, which is the central change across analytics.ts and the journal entry. However, it uses an emoji and categorical label that add noise rather than clarity. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

packages/data-ops/src/queries/analytics.ts (1)
344-357: ⚠️ Potential issue | 🔴 Critical

Enforce mandatory tenant scoping for student aggregates.

These queries run on students with an optional schoolId predicate, which allows unscoped cross-school counts when schoolId is missing. Make schoolId required (or fail early) and always apply where(eq(students.schoolId, schoolId)).

As per coding guidelines: Every query on school-scoped tables MUST include where(eq(table.schoolId, schoolId)) for multi-tenant isolation.

Also applies to: 413-423
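
One way to fail early, sketched under assumptions: `requireSchoolId` and `MissingSchoolIdError` are hypothetical names that do not appear in the repository, and the project may prefer returning an error value over throwing.

```typescript
// Hypothetical error type for a missing tenant identifier.
class MissingSchoolIdError extends Error {
  constructor() {
    super("schoolId is required for school-scoped queries");
    this.name = "MissingSchoolIdError";
  }
}

// Narrow an optional schoolId to a non-empty string,
// or fail before any query is built.
function requireSchoolId(schoolId: string | null | undefined): string {
  if (schoolId === null || schoolId === undefined || schoolId.trim() === "") {
    throw new MissingSchoolIdError();
  }
  return schoolId;
}

// A present id passes through unchanged.
console.log(requireSchoolId("school_123"));

// A missing id is rejected instead of silently producing cross-school counts.
try {
  requireSchoolId(undefined);
} catch (error) {
  console.log((error as Error).name);
}
```

With a guard like this at the top of the function, the narrowed value can be passed unconditionally to `where(eq(students.schoolId, schoolId))`, and any additional predicates can be combined with `and(...)`.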
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/data-ops/src/queries/analytics.ts` around lines 344 - 357, The
student aggregation allows unscoped queries when schoolId is missing; update the
logic in this function so tenant scoping is mandatory: if schoolId is falsy,
throw/return early with an error, and always include where(eq(students.schoolId,
schoolId)) (instead of the optional baseConditions) in the db.select and the
other student queries (referencing symbols: students, baseConditions, db.select
and the second query block around lines 413-423) so every students table query
enforces multi-tenant isolation.

🧹 Nitpick comments (1)

packages/data-ops/src/queries/analytics.ts (1)

416-423: Remove single-item Promise.all in getEnrollmentGrowth.

This now wraps only one DB call; replacing it with a direct await makes intent clearer.
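
The difference in destructuring can be sketched with a stand-in query. `fakeGrowthQuery` is a hypothetical placeholder for the Drizzle call and, like `db.select()`, resolves to an array of rows; the numbers are made up.

```typescript
type GrowthStats = { current: number; previous: number };

// Stand-in for the real query: resolves to an array containing one row.
async function fakeGrowthQuery(): Promise<GrowthStats[]> {
  return [{ current: 12, previous: 9 }];
}

// Before: a one-element Promise.all forces a two-level destructure.
async function viaPromiseAll(): Promise<GrowthStats | undefined> {
  const [[growthStats]] = await Promise.all([fakeGrowthQuery()]);
  return growthStats;
}

// After: a direct await needs only a single-level destructure.
async function viaDirectAwait(): Promise<GrowthStats | undefined> {
  const [growthStats] = await fakeGrowthQuery();
  return growthStats;
}
```

Both functions resolve to the same row; the second simply drops the wrapper array.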

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/data-ops/src/queries/analytics.ts` around lines 416 - 423, The
Promise.all wrapper around a single DB call in getEnrollmentGrowth is
unnecessary; replace awaiting Promise.all([...]) with a direct await of
db.select(...) and adjust the destructuring accordingly. Specifically, call
db.select({...}).from(students).where(baseConditions.length > 0 ?
and(...baseConditions) : undefined) with await, and assign the result to
growthStats using a single-level destructure (replace the current const
[[growthStats]] = await Promise.all([...]) pattern). Keep the same selected
fields (current using currentStartDate and previous using previousStartDate) and
the same baseConditions/db/select/students symbols.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.jules/bolt.md:
- Line 1: The journal heading currently uses the literal shell expression
"$(date +%Y-%m-%d) - Optimize concurrent COUNT queries in analytics"; replace
the $(date +%Y-%m-%d) token with an explicit ISO date (e.g., 2026-04-12) so the
markdown contains a concrete date string in the heading; update the heading line
that begins with "$(date +%Y-%m-%d)" to the chosen fixed date followed by " -
Optimize concurrent COUNT queries in analytics".

In `@packages/data-ops/src/queries/analytics.ts`:
- Around line 47-57: The DB calls and Promise.all in analytics.ts must be
converted to return ResultAsync<T, DatabaseError> and have errors logged via
tapLogErr; replace the raw await/Promise usage around db.select(...) and the
parallel calls to getDailyActiveUsers, getUserActivityCount,
getMonthlyActiveUsers, getAverageResponseTime with ResultAsync.fromPromise(...)
(or ResultAsync.all/sequence) and chain .mapErr(tapLogErr(logger,
'analytics:<describe operation>')) so the function returns ResultAsync rather
than a raw Promise; ensure each call (e.g., the db.select block referencing
schools.createdAt, getDailyActiveUsers(startDate),
getUserActivityCount(startDate, now), getUserActivityCount(previousStartDate,
startDate), getMonthlyActiveUsers(...), getAverageResponseTime(startDate)) is
wrapped and combined via ResultAsync utilities and that the final returned value
uses .mapErr(tapLogErr(...)) to satisfy the coding guideline.

---

Outside diff comments:
In `@packages/data-ops/src/queries/analytics.ts`:
- Around line 344-357: The student aggregation allows unscoped queries when
schoolId is missing; update the logic in this function so tenant scoping is
mandatory: if schoolId is falsy, throw/return early with an error, and always
include where(eq(students.schoolId, schoolId)) (instead of the optional
baseConditions) in the db.select and the other student queries (referencing
symbols: students, baseConditions, db.select and the second query block around
lines 413-423) so every students table query enforces multi-tenant isolation.

---

Nitpick comments:
In `@packages/data-ops/src/queries/analytics.ts`:
- Around line 416-423: The Promise.all wrapper around a single DB call in
getEnrollmentGrowth is unnecessary; replace awaiting Promise.all([...]) with a
direct await of db.select(...) and adjust the destructuring accordingly.
Specifically, call db.select({...}).from(students).where(baseConditions.length >
0 ? and(...baseConditions) : undefined) with await, and assign the result to
growthStats using a single-level destructure (replace the current const
[[growthStats]] = await Promise.all([...]) pattern). Keep the same selected
fields (current using currentStartDate and previous using previousStartDate) and
the same baseConditions/db/select/students symbols.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cf2ff56b-3737-4ae0-a661-ba22e78c061d

📥 Commits

Reviewing files that changed from the base of the PR and between 81815b8 and 4d08318.

📒 Files selected for processing (3)

.jules/bolt.md
packages/data-ops/src/drizzle/auth-schema.ts
packages/data-ops/src/queries/analytics.ts

coderabbitai · 2026-04-12T18:37:02Z

-## 2025-02-23 - [Parallelize Independent Queries with Promise.all()]
-**Learning:** Sequential `for...of` loops that `await` independent database queries (like iterating over active school years to fetch report card discovery units) create severe N+1 bottlenecks.
-**Action:** Always replace sequential `await` loops for read-only queries with `Promise.all()` mapped over the array to allow the database driver to process the independent queries concurrently, effectively eliminating the N+1 I/O wait.
+## $(date +%Y-%m-%d) - Optimize concurrent COUNT queries in analytics


⚠️ Potential issue | 🟡 Minor

Use a concrete date in the journal heading.

$(date +%Y-%m-%d) will remain literal text in markdown unless a preprocessing step replaces it. Consider committing an explicit date.
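
If the journal entry is generated by tooling, the date can be resolved before the heading is written. A minimal sketch, assuming hypothetical `isoDate` and `journalHeading` helpers and a UTC calendar date:

```typescript
// Format a Date as an ISO calendar date (YYYY-MM-DD) in UTC.
function isoDate(date: Date): string {
  return date.toISOString().slice(0, 10);
}

// Build the markdown heading with the date already expanded.
function journalHeading(title: string, date: Date = new Date()): string {
  return `## ${isoDate(date)} - ${title}`;
}

const journalLine = journalHeading(
  "Optimize concurrent COUNT queries in analytics",
  new Date("2026-04-12T18:33:54Z"),
);

console.log(journalLine);
// prints "## 2026-04-12 - Optimize concurrent COUNT queries in analytics"
```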

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.jules/bolt.md at line 1, The journal heading currently uses the literal shell expression "$(date +%Y-%m-%d) - Optimize concurrent COUNT queries in analytics"; replace the $(date +%Y-%m-%d) token with an explicit ISO date (e.g., 2026-04-12) so the markdown contains a concrete date string in the heading; update the heading line that begins with "$(date +%Y-%m-%d)" to the chosen fixed date followed by " - Optimize concurrent COUNT queries in analytics".

coderabbitai · 2026-04-12T18:37:02Z

+    db.select({
+      total: sql<number>`count(*)`,
+      currentPeriod: sql<number>`count(*) filter (where ${schools.createdAt} >= ${startDate})`,
+      previousPeriod: sql<number>`count(*) filter (where ${schools.createdAt} >= ${previousStartDate} and ${schools.createdAt} < ${startDate})`,
+    }).from(schools),
    getDailyActiveUsers(startDate),
    getUserActivityCount(startDate, now),
    getUserActivityCount(previousStartDate, startDate),
    getMonthlyActiveUsers(new Date(now.getTime() - 365 * 24 * 60 * 60 * 1000)),
    getAverageResponseTime(startDate),
  ])


⚠️ Potential issue | 🟠 Major

Wrap updated DB operations in ResultAsync with tapLogErr.

The updated query paths still use raw await/Promise chains instead of the required ResultAsync<T, DatabaseError> flow with .mapErr(tapLogErr(logger, context)).

As per coding guidelines: Every query function MUST return ResultAsync<T, DatabaseError> from '@praha/byethrow' and use tapLogErr for error logging.

Also applies to: 124-141, 351-358, 419-423
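
The general shape of that flow can be sketched with stand-in types. Nothing below is the `@praha/byethrow` API: `Result`, `fromPromise`, `DatabaseError`, `Logger`, and this `tapLogErr` are hand-rolled placeholders, with the `tapLogErr(logger, context)` signature guessed from the comment above.

```typescript
// Stand-in types, NOT the @praha/byethrow API.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };

class DatabaseError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "DatabaseError";
  }
}

type Logger = { error: (context: string, error: unknown) => void };

// Hypothetical helper: log the error under a context label, then pass it through.
function tapLogErr<E>(logger: Logger, context: string): (error: E) => E {
  return (error) => {
    logger.error(context, error);
    return error;
  };
}

// Turn a promise that may reject into a promise of Result that never rejects.
async function fromPromise<T>(
  promise: Promise<T>,
  onError: (error: DatabaseError) => DatabaseError,
): Promise<Result<T, DatabaseError>> {
  try {
    return { ok: true, value: await promise };
  } catch (caught) {
    const message = caught instanceof Error ? caught.message : String(caught);
    return { ok: false, error: onError(new DatabaseError(message)) };
  }
}

// Usage with an in-memory logger and a simulated failing query.
const loggedContexts: string[] = [];
const memoryLogger: Logger = {
  error: (context) => {
    loggedContexts.push(context);
  },
};

const failedOverview = fromPromise(
  Promise.reject(new Error("connection refused")),
  tapLogErr<DatabaseError>(memoryLogger, "analytics:getAnalyticsOverview"),
);
```

The caller receives a value it must inspect (`ok` or not) instead of a promise that may reject, and the failure is logged once at the point where it is converted.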

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/data-ops/src/queries/analytics.ts` around lines 47 - 57, The DB calls and Promise.all in analytics.ts must be converted to return ResultAsync<T, DatabaseError> and have errors logged via tapLogErr; replace the raw await/Promise usage around db.select(...) and the parallel calls to getDailyActiveUsers, getUserActivityCount, getMonthlyActiveUsers, getAverageResponseTime with ResultAsync.fromPromise(...) (or ResultAsync.all/sequence) and chain .mapErr(tapLogErr(logger, 'analytics:<describe operation>')) so the function returns ResultAsync rather than a raw Promise; ensure each call (e.g., the db.select block referencing schools.createdAt, getDailyActiveUsers(startDate), getUserActivityCount(startDate, now), getUserActivityCount(previousStartDate, startDate), getMonthlyActiveUsers(...), getAverageResponseTime(startDate)) is wrapped and combined via ResultAsync utilities and that the final returned value uses .mapErr(tapLogErr(...)) to satisfy the coding guideline.

ldsgroups225 wants to merge 1 commit into master from lighthouse/optimize-analytics-counts-16578561623722322682