fix(pytorch): preserve numpy dtype when converting to torch.Tensor in runners#5608

Open
1fanwang wants to merge 1 commit into
bentoml:mainfrom
1fanwang:fix/pytorch-runnable-preserve-dtype

Description

PytorchRunnable._mapping (in src/bentoml/_internal/frameworks/common/pytorch.py) and the parallel call in detectron.py build the input tensor with torch.Tensor(arr, device=...). That is the tensor type constructor (capital T): it ignores the source dtype and always produces a float32 tensor. As a result, every runner call silently converts float16, int64, bool, and other non-float32 inputs to float32 before they reach the model.

This breaks half-precision and integer-input models in production. Switching to torch.from_numpy(arr).to(device) preserves the source dtype and stays zero-copy where possible, which is safe under the inference_mode_ctx guard already wrapping the call.
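The difference between the two calls can be seen outside BentoML. The snippet below is a minimal sketch, not code from this PR: the array names are illustrative, and it assumes a CPU-only run with PyTorch's default dtype left at float32.

```python
import numpy as np
import torch

# Illustrative int64 input, e.g. token ids for an embedding model.
arr = np.array([1, 2, 3], dtype=np.int64)

# Legacy type constructor: the numpy dtype is discarded.
legacy = torch.Tensor(arr)

# from_numpy keeps the numpy dtype; .to("cpu") returns the same tensor
# when it already lives on the CPU, so no copy is made here.
preserved = torch.from_numpy(arr).to("cpu")

print(legacy.dtype)     # torch.float32
print(preserved.dtype)  # torch.int64

# The same holds for half precision.
half = np.ones(4, dtype=np.float16)
print(torch.Tensor(half).dtype)      # torch.float32
print(torch.from_numpy(half).dtype)  # torch.float16

# On CPU the tensor shares memory with the source array.
arr[0] = 42
print(preserved[0].item())  # 42
```

Moving to a non-CPU device with .to(...) does copy the data, but the dtype is still carried over.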

Closes #4266.

Verification

Before the fix, the new parametrized regression test fails:

test_pytorch_runnable_preserves_numpy_dtype[float16] FAILED
  assert torch.float32 == torch.float16

Post-fix, all 11 cases pass: 8 numpy dtypes (float16/32/64, int8/32/64, uint8, bool) plus an int64 pandas DataFrame, asserting the dtype handed to the model survives the runner boundary.
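For readers who want to reproduce the dtype check without the BentoML test harness, here is a rough standalone sketch. The helper to_tensor is a hypothetical stand-in for the fixed conversion branch, not the actual _mapping code, and the pandas DataFrame case is left out to keep the dependencies to numpy and torch.

```python
import numpy as np
import torch


def to_tensor(item, device="cpu"):
    """Hypothetical stand-in for the fixed numpy branch of the runnable."""
    return torch.from_numpy(item).to(device)


# The eight numpy dtypes named in the PR, with the torch dtype each should keep.
EXPECTED = {
    np.float16: torch.float16,
    np.float32: torch.float32,
    np.float64: torch.float64,
    np.int8: torch.int8,
    np.int32: torch.int32,
    np.int64: torch.int64,
    np.uint8: torch.uint8,
    np.bool_: torch.bool,
}


def check_dtypes_preserved():
    """Assert every input dtype survives conversion; return the case count."""
    for np_dtype, torch_dtype in EXPECTED.items():
        arr = np.zeros((2, 3), dtype=np_dtype)
        out = to_tensor(arr)
        assert out.dtype == torch_dtype, (np_dtype, out.dtype)
    return len(EXPECTED)


print(check_dtypes_preserved())  # 8
```

Replacing the body of to_tensor with torch.Tensor(item) makes the check fail on the first non-float32 dtype, which mirrors the pre-fix failure shown above.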

ruff check and ruff format --check both pass cleanly.

`torch.Tensor(arr)` is the tensor *type constructor* and ignores the numpy
dtype of its argument, silently producing a `float32` tensor. When a saved
PyTorch model expects `float16`, `int64`, or `bool` inputs, the runnable
hands it a `float32` tensor and the model raises a dtype mismatch
(see bentoml#4266).

Switch the numpy/pandas branches of `_mapping` in `common/pytorch.py` to
`torch.from_numpy(item).to(self.device_id)`, which preserves the source
dtype. The same one-line pattern in `detectron.py` is fixed for the same
reason.

`tests/integration/frameworks/test_pytorch_unit.py` gains a parametrised
regression covering float16/float32/float64/int8/int32/int64/uint8/bool
numpy arrays and an int64 pandas DataFrame; all assert the resulting
torch dtype matches the input.

Closes bentoml#4266
@1fanwang 1fanwang requested a review from a team as a code owner May 12, 2026 10:40
@1fanwang 1fanwang requested review from parano and removed request for a team May 12, 2026 10:40
Development

Successfully merging this pull request may close these issues.

bug: numpy to torch.Tensor conversion does not preserve dtype when using np.float16
