perf: enable Gunicorn preload_app to reduce memory per worker #12364

jordanrfrazier merged 12 commits into langflow-ai:release-1.9.0
Conversation
Pull request overview
Enables Gunicorn app preloading to reduce per-worker memory usage via Copy-on-Write sharing after fork.
Changes:
- Adds `preload_app: True` to the Gunicorn options passed to `LangflowApplication`.
- Updates the generated component index asset, including dependency version entries and the recorded sha256.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `src/backend/base/langflow/__main__.py` | Enables Gunicorn `preload_app` to preload the app in the master process before forking workers. |
| `src/lfx/src/lfx/_assets/component_index.json` | Updates generated component index metadata (dependency versions + sha256). |
Comments suppressed due to low confidence (2)
src/lfx/src/lfx/_assets/component_index.json:1
- This PR is described as enabling Gunicorn `preload_app`, but it also includes changes to the generated component index (e.g., `google` dependency version entries and the asset `sha256`). If this file is expected to change, please update the PR description to mention it and why; otherwise, consider reverting this file (or moving it into a separate PR) to keep the change set focused and easier to review.
```python
"certfile": ssl_cert_file_path,
"keyfile": ssl_key_file_path,
"log_level": log_level.lower() if log_level is not None else "info",
"preload_app": True,
```
Enabling preload_app can change runtime behavior because all import-time side effects run in the master process and are then forked into workers. To reduce operational risk and ease debugging/rollbacks, consider making this configurable (e.g., env var/CLI flag defaulting to enabled on supported platforms) and documenting any fork-safety assumptions (no sockets/FDs or background threads created at import time).
@ogabrielluiz thank you for the earlier help! Seems like this is much easier! Hope this one looks good as well :-)
@jordanrfrazier please let us know, we would love to get your approval :-D Thank you!
@severfire Thanks for the find, this looks very promising. I just clicked around quickly and saw some articles on this flag, and I think we'll want to spend some time testing it before making it default behavior. In particular, I'm worried we'll run into the same "ghost" process issue detailed in this article. So here's what I can do now - I'm going to make this a configurable environment variable that defaults to the existing behavior, but that will give you and others the ability to toggle it on and see how it works. If all goes well, I'll make this the default setting for v1.10.
@jordanrfrazier a configurable environment variable that defaults to the existing behavior sounds great!
@jordanrfrazier - I will work on potential ghosts. Opus 4.6 gave me possible solutions for dealing with "ghosts" when `preload_app` is enabled:
| Ghost Source | Runs in Master? | Fix |
|---|---|---|
| Prometheus `start_http_server` | Yes | Move to lifespan or use multiprocess mode |
| `sentry_sdk.init()` | Yes | Defer to `post_fork` or lifespan |
| OTEL instrumentation | Yes (partial) | Defer exporter init to lifespan |
| PG version check | Yes (before app) | Explicitly `engine.dispose()` |
| DB engine, Redis, services | No (lifespan) | Already safe |

Langflow is in a better position than Rippling's Django monolith because the ASGI lifespan pattern already defers heavy initialization to each worker. The main action items are the Prometheus server, the Sentry SDK, and adding the fail-fast guardrail to prevent regressions.
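The "fail-fast guardrail" mentioned above could look something like the following sketch: a check run in the Gunicorn master just before workers are forked (e.g., from a `when_ready` hook) that refuses to start if import-time side effects left non-daemon background threads alive, one common source of ghosts. This is an illustration, not Langflow's actual code:

```python
import threading


def assert_fork_safe() -> None:
    """Raise if import-time side effects started non-daemon threads.

    Intended to run in the Gunicorn master before forking, so a
    fork-unsafe import fails loudly at startup instead of producing
    ghost processes in workers. Illustrative sketch only.
    """
    ghosts = [
        t for t in threading.enumerate()
        if t is not threading.main_thread() and not t.daemon
    ]
    if ghosts:
        names = ", ".join(t.name for t in ghosts)
        raise RuntimeError(f"non-daemon threads alive before fork: {names}")
```

A fuller guardrail might also check for open listening sockets or live DB connections, but a thread check alone already catches the Prometheus and Sentry cases from the table.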
@severfire Awesome. I'll get the current configurable state in for v1.9, and I've created an internal task tracking the follow-up for v1.10. If you have any findings on whether it will be an issue or not, please open a PR or issue and tag me.
@jordanrfrazier Sounds great! Thank you! I will create a new branch based on it and work on ghosts there.
merge conflicts

jordanrfrazier merged commit afbc6b0 into langflow-ai:release-1.9.0
@jordanrfrazier Hi, was this released in https://github.com/langflow-ai/langflow/releases/tag/v1.9.0? I do not see it on the list.
@severfire It was, yes. Not sure why it's not shown on the release; will check. Update: yeah, the new release process generated the logs from the last release. I'll get the team to fix it up.
@jordanrfrazier Thanks! Waiting for the update then :-) I made some fixes for the ghosting in #12587, and for another issue I found with the job queue in #12588.
Commits:
- fix: enable preload_app option in LangflowApplication configuration
- [autofix.ci] apply automated fixes
- [autofix.ci] apply automated fixes
- Make flag configurable
- [autofix.ci] apply automated fixes
- [autofix.ci] apply automated fixes (attempt 2/3)
- [autofix.ci] apply automated fixes (attempt 3/3)

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Gabriel Luiz Freitas Almeida <gabriel@logspace.ai>
Co-authored-by: ogabrielluiz <gabriel@langflow.org>
Co-authored-by: Jordan Frazier <jordan.frazier@datastax.com>
Co-authored-by: Jordan Frazier <122494242+jordanrfrazier@users.noreply.github.com>
Description: Adds `"preload_app": True` to the Gunicorn options for non-Windows environments.
Why?
Previously, each Gunicorn worker spawned a completely independent Python interpreter, duplicating the entire module and import footprint in RAM. By preloading the app in the master process before forking, the OS (Linux/macOS) can use Copy-on-Write (CoW) to share read-only memory—such as Python bytecode, class definitions, and import-time structures—across all worker processes.
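The preload-then-fork model can be illustrated with a toy sketch (not Langflow code): the parent process builds its "import footprint" once, and a forked child reads those structures without rebuilding them, letting the OS share the underlying pages copy-on-write:

```python
import os

# Stand-in for import-time structures (bytecode, class defs, caches)
# that Gunicorn's master builds once when preload_app is enabled.
PRELOADED = {f"module_{i}": i for i in range(50_000)}

pid = os.fork()  # POSIX only; this is the step Gunicorn performs per worker
if pid == 0:
    # Child: the dict already exists; reading it touches the parent's
    # pages via copy-on-write instead of repeating the work.
    os._exit(0 if PRELOADED["module_42"] == 42 else 1)

_, status = os.waitpid(pid, 0)
child_ok = os.WEXITSTATUS(status) == 0
```

Pages only stay shared while they are read-only; writes (including reference-count updates by the interpreter) gradually fault private copies into each worker, so the savings are largest for rarely-touched import-time data.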
Impact & Safety:
- Lower memory usage: significantly reduces the baseline RAM required for each additional worker.
- Faster worker boot: workers start up faster since the app imports are already resolved.
- Connection safe: because Langflow cleanly initializes its stateful connections (database pools, telemetry clients, MCP services) inside the async lifespan context manager, connections are safely created post-fork in each worker's dedicated event loop. No sockets or file descriptors are incorrectly shared.
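The lifespan pattern described above can be sketched with stdlib tools alone; `FakePool` is a toy stand-in for Langflow's real database/telemetry services, not their actual API:

```python
import asyncio
from contextlib import asynccontextmanager


class FakePool:
    """Toy stand-in for a connection pool created post-fork."""

    def __init__(self) -> None:
        self.open = True

    async def close(self) -> None:
        self.open = False


@asynccontextmanager
async def lifespan(state: dict):
    # Entered inside each worker after fork, so the pool binds to that
    # worker's own event loop and is never shared across processes.
    state["pool"] = FakePool()
    try:
        yield state
    finally:
        await state["pool"].close()


async def demo() -> bool:
    state: dict = {}
    async with lifespan(state):
        assert state["pool"].open  # connections exist only inside the lifespan
    return state["pool"].open  # closed again on worker shutdown
```

Because nothing stateful is created at import time, the master process that runs under `preload_app` never owns a socket or file descriptor that a forked worker could inherit.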