Skip to content

fix(codex): use real activity timestamps and local calendar days#222

Merged
NovakPAai merged 2 commits into
vakovalskii:mainfrom
embogomolov:fix/codex-activity-timezone
May 26, 2026
Merged

fix(codex): use real activity timestamps and local calendar days#222
NovakPAai merged 2 commits into
vakovalskii:mainfrom
embogomolov:fix/codex-activity-timezone

Conversation

@embogomolov
Copy link
Copy Markdown
Contributor

@embogomolov embogomolov commented May 25, 2026

Summary

Fixes Codex activity dates, daily hours, and local-day streak calculation.

Codex detailed sessions were using session-level timestamps for user messages instead of each JSONL entry timestamp. This could collapse daily Codex hours to 0 and hide active days in long-running sessions.

Changes

  • Use embedded Codex JSONL timestamps for session bounds; file mtime is only a fallback when no timestamps exist.
  • Attribute Codex user messages to their own entry timestamps and local calendar days.
  • Compute today, current streak, week keys, and burn-rate day from local calendar dates instead of UTC date slicing.
  • Send local timezone metadata with leaderboard sync payload.
  • Rename temp cache files from codedash-* to codbash-* so stale daily/analytics caches from older timestamp logic do not mask the fix after upgrade.

Test plan

  • node --test test/codex-activity-timezone.test.js
  • node --test test/pi-session.test.js
  • node --test test/*.test.js

@embogomolov embogomolov force-pushed the fix/codex-activity-timezone branch from b03099e to f87924c Compare May 25, 2026 16:09
@NovakPAai
Copy link
Copy Markdown
Collaborator

Code Review — PR #222

@embogomolov solid fix for a real bug. The UTC→local transition is the correct approach. A few findings below.


HIGH

1. NaN timestamp propagation in _computeSessionDailyBreakdown (codex branch)

After this PR, the codex branch sets ts = parseTimestampMs(entry.timestamp || entry.ts) which returns NaN for entries without timestamps. The existing fallback guard:

if (!ts || ts < 1000000000000) ts = s.first_ts;

does not catch NaN — because NaN < 1000000000000 evaluates to false. A NaN timestamp then flows into addMsg(day, ts)tsByDay[day].first = NaN(NaN - NaN) / 3600000 = NaN → the entire day's hours become NaN and propagate to the leaderboard.

Fix:

ts = parseTimestampMs(entry.timestamp || entry.ts);
if (!Number.isFinite(ts)) ts = s.first_ts;

2. normalizeTimestampMs threshold is correct but fragile — needs a comment

The < 1_000_000_000_000 boundary works for current Codex output (seconds-epoch values in the 1.7B range are correctly multiplied by 1000, while ms-epoch values in the 1.7T range pass through). But the magic number is non-obvious and will confuse future contributors. Consider either:

  • Adding a comment explaining the heuristic, or
  • Using < 10_000_000_000 (year 2286 in seconds) as a safer boundary that cannot misclassify any realistic millisecond timestamp

MEDIUM

3. Timezone data duplicated in leaderboard sync payload (server.js)

timezone and utcOffsetMinutes appear both at the top level and inside stats:

const payload = {
  timezone: stats.timezone || '',        // here
  utcOffsetMinutes: stats.utcOffsetMinutes, // and here
  stats: {
    timezone: stats.timezone || '',        // duplicate
    utcOffsetMinutes: stats.utcOffsetMinutes, // duplicate
  },
};

One of the two should be removed — the leaderboard server sees both, wastes bandwidth, and may cause confusion about which is authoritative. Top-level is the more conventional location for metadata.

4. _computeCostAnalytics does not guard parseLocalDayStart result

const d = parseLocalDayStart(s.date);

If s.date is malformed, d is new Date(NaN). The subsequent weekStart.setDate(d.getDate() - d.getDay()) silently propagates NaN into the week key. Add: if (isNaN(d.getTime())) continue;


LOW

5. process.env.TZ in test is fragile on some Node/platform combos

Setting TZ after process start is not portable on all platforms. It works here because it's set before any require, which is the safest moment. A comment explaining this constraint would help future contributors avoid moving the require above the TZ assignment.

6. getLocalTimezone and getUtcOffsetMinutes are exported but not tested

Both are now part of __test exports. A simple assertion like assert.equal(data.__test.getUtcOffsetMinutes(Date.parse('2026-05-24T00:00:00Z')), 180) inside the Moscow TZ test would confirm the sign convention.

7. lastTs = 0 sentinel value undocumented

The initial value was changed from stat.mtimeMs (always > 0) to 0. The fallback check lastTs <= 0 works correctly, but a brief comment noting "0 is sentinel for no timestamp seen" would aid comprehension since this is a non-obvious change from the previous behavior.


Summary

Severity Count
CRITICAL 0
HIGH 2
MEDIUM 2
LOW 3

The NaN propagation in HIGH-1 is the one to fix before merge — it will silently corrupt daily hours for any Codex entry that lacks a timestamp field. The rest are robustness improvements.

Copy link
Copy Markdown
Owner

@vakovalskii vakovalskii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ✅ Welcome @embogomolov — focused fix that addresses a real analytics bug.

Verified locally:

  • node --test test/codex-activity-timezone.test.js → 4/4 pass
  • Full suite: 131/0/2 pass

Spot-checks:

  • normalizeTimestampMs threshold (1e12) correctly distinguishes seconds-since-epoch from milliseconds-since-epoch — values < year 2001 in ms get *1000. Codex stores seconds, others ms, so this rationalises the input
  • parseCodexSessionFile was using stat.mtimeMs as the initial firstTs/lastTs — wrong because mtime is when the file was last touched, not when the conversation started. Walking JSONL entries first and falling back to mtime only when no embedded timestamps exist is the right inversion
  • Replacing new Date(s.date).toISOString().slice(0,10) with fmtLocalDay() in week-key computation fixes the UTC day-boundary skew that was hiding active days for late-evening users
  • Adding timezone + utcOffsetMinutes to leaderboard sync payload lets the leaderboard render local days correctly

Squash-merging.

@vakovalskii
Copy link
Copy Markdown
Owner

Approved and ready — but #221 (Pi support) just merged and now this conflicts on data.js / server.js (both touched Codex scanning + leaderboard payload).

Could you rebase onto main and force-push?

git fetch origin && git rebase origin/main
# resolve in scanCodexSessions / syncLeaderboard (keep both — Pi paths and timezone metadata are orthogonal)
git push --force-with-lease

I'm about to cut 7.5.0 so this one will likely land in 7.5.1. Won't take long after rebase.

@embogomolov embogomolov force-pushed the fix/codex-activity-timezone branch from f87924c to 7d425df Compare May 25, 2026 20:21
@NovakPAai
Copy link
Copy Markdown
Collaborator

@embogomolov None of the 7 review findings (posted 25 May) have been addressed — no new commits since the review.

Merge blocker — HIGH-1 (NaN propagation in _computeSessionDailyBreakdown): without a fix, daily hours silently become NaN for any Codex entry missing a timestamp field.

Also waiting on rebase onto main (conflict after #221 merge).

@embogomolov
Copy link
Copy Markdown
Contributor Author

@NovakPAai Thanks. I think this is looking at the old PR head.

The branch was rebased and force-pushed after the #221 conflict. Current head is 7d425df, with two commits:

  • 84d9219 fix(codex): use real activity timestamps and local calendar days
  • 7d425df chore(cache): use codbash temp cache names

HIGH-1 is fixed in the current code. Codex entry timestamps now go through parseEntryTimestampMs(), _computeSessionDailyBreakdown() only writes finite timestamps into tsByDay, and invalid/missing entry timestamps cannot turn daily hours into NaN. There is a regression test for this: Codex daily breakdown never stores NaN timestamps.

The other review notes are also addressed: safer epoch cutoff, top-level timezone metadata only, malformed date handling in cost analytics, TZ test comment, timezone/offset assertions, and explicit timestamp fallback logic.

Verified locally:

  • node --test test/codex-activity-timezone.test.js
  • node --test test/pi-session.test.js
  • node --test test/*.test.js

GitHub currently shows the PR as mergeable, not dirty.

@embogomolov
Copy link
Copy Markdown
Contributor Author

@NovakPAai Any updates?

The status of the each fix has been explicitly described above

Thank you

@NovakPAai NovakPAai merged commit 2fdd5a5 into vakovalskii:main May 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants