Speed up historical forecast parsing by avoiding pd.json_normalize (#38)#44

Merged
jcofield merged 1 commit into main from
faster-historical-forecast-parsing
Nov 2, 2025

Conversation

@jcofield (Contributor) commented Mar 14, 2025

What

Remove the use of pd.json_normalize within a for loop.

Why

Calling pd.json_normalize on every item is about 10x slower than assigning each item's fields directly from the JSON.

How

Use direct assignment within multiple for loops.
This change was successfully tested in #38. It is also currently passing tests on #42, which has only one other, unrelated change.
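
For readers who have not seen the diff, here is a minimal sketch of the two patterns. The payload shape, the field names (`generated_at`, `forecast`, `point_time`, `value`), and both function names are assumptions made for illustration; they are not copied from this repository's code.

```python
import pandas as pd

# Hypothetical payload shape, assumed for illustration only.
payload = {
    "data": [
        {
            "generated_at": "2025-01-01T00:00:00+00:00",
            "forecast": [
                {"point_time": "2025-01-01T00:00:00+00:00", "value": 100.0},
                {"point_time": "2025-01-01T00:05:00+00:00", "value": 101.5},
            ],
        },
        {
            "generated_at": "2025-01-01T00:05:00+00:00",
            "forecast": [
                {"point_time": "2025-01-01T00:05:00+00:00", "value": 102.0},
            ],
        },
    ]
}


def parse_with_json_normalize(payload):
    """Slow pattern: one pd.json_normalize call (and one DataFrame) per entry."""
    frames = []
    for entry in payload["data"]:
        frames.append(
            pd.json_normalize(entry, record_path="forecast", meta=["generated_at"])
        )
    return pd.concat(frames, ignore_index=True)


def parse_with_direct_assignment(payload):
    """Faster pattern: collect plain dicts in nested loops, build one DataFrame."""
    rows = []
    for entry in payload["data"]:
        generated_at = entry["generated_at"]
        for point in entry["forecast"]:
            rows.append(
                {
                    "point_time": point["point_time"],
                    "value": point["value"],
                    "generated_at": generated_at,
                }
            )
    return pd.DataFrame(rows, columns=["point_time", "value", "generated_at"])


slow = parse_with_json_normalize(payload)
fast = parse_with_direct_assignment(payload)
```

Both functions yield the same rows; the second avoids the per-call setup cost of `pd.json_normalize` and the final `pd.concat` by constructing a single DataFrame at the end.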

@sam-watttime (Contributor)

Description from the original PR (which was merged into #42): pd.json_normalize is not efficient to use in a loop. In profiling, the time spent in calls to _parse_historical_forecast_json dropped from 10.8s to 0.38s for 1% of 1 year of forecasts.

This passes all tests and reproduces only the behavior that pd.json_normalize was performing; that function simply has significantly more overhead.
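
A measurement like the one quoted above can be taken with the standard-library profiler. The sketch below is only an illustration of that approach: `_parse_stub` is a hypothetical stand-in, not the real `_parse_historical_forecast_json`, and the numbers it prints say nothing about this PR.

```python
import cProfile
import io
import pstats


def _parse_stub(n_entries=1000):
    """Hypothetical stand-in for the parsing function being profiled."""
    rows = []
    for i in range(n_entries):
        rows.append({"index": i, "value": float(i)})
    return rows


# Profile only the call of interest.
profiler = cProfile.Profile()
profiler.enable()
result = _parse_stub()
profiler.disable()

# Write the stats to a string buffer, sorted by cumulative time,
# and restrict the report to entries whose name matches "_parse_stub".
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).sort_stats("cumulative")
stats.print_stats("_parse_stub")
report = buffer.getvalue()
print(report)
```

Running the same harness before and after a change, on the same input, gives cumulative times for the function that can be compared directly.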

@jcofield mentioned this pull request Mar 14, 2025
@jcofield force-pushed the faster-historical-forecast-parsing branch from a1cfdee to 98b8bf3 on March 15, 2025 01:08
jcofield added a commit that referenced this pull request Nov 2, 2025
@jcofield force-pushed the faster-historical-forecast-parsing branch from 98b8bf3 to a8ed68e on November 2, 2025 02:11
@jcofield merged commit a818571 into main Nov 2, 2025
1 check passed

2 participants