fix(pytorch): preserve numpy dtype when converting to torch.Tensor in runners #5608
Open
1fanwang wants to merge 1 commit into
`torch.Tensor(arr)` is the tensor *type constructor* and ignores the numpy dtype of its argument, silently producing a `float32` tensor. When a saved PyTorch model expects `float16`, `int64`, or `bool` inputs, the runnable hands it a `float32` tensor and the model raises a dtype mismatch (see bentoml#4266).

Switch the numpy/pandas branches of `_mapping` in `common/pytorch.py` to `torch.from_numpy(item).to(self.device_id)`, which preserves the source dtype. The same one-line pattern in `detectron.py` is fixed for the same reason.

`tests/integration/frameworks/test_pytorch_unit.py` gains a parametrised regression covering float16/float32/float64/int8/int32/int64/uint8/bool numpy arrays and an int64 pandas DataFrame; all assert the resulting torch dtype matches the input.

Closes bentoml#4266
Description
`PytorchRunnable._mapping` (in `src/bentoml/_internal/frameworks/common/pytorch.py`) and the parallel call in `detectron.py` build the input tensor with `torch.Tensor(arr, device=...)`. That is the tensor type constructor (capital `T`): it ignores the source dtype and always produces a `float32` tensor. So every runner call silently converts `float16`, `int64`, `bool`, etc. inputs to `float32` before they reach the model.

This breaks half-precision and integer-input models in production. Switching to `torch.from_numpy(arr).to(device)` preserves the source dtype and stays zero-copy where possible, which is safe under the `inference_mode_ctx` guard already wrapping the call.

Closes #4266.
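To make the difference concrete, here is a minimal standalone sketch, not the code from this PR. The helper name `to_tensor_preserving_dtype` is hypothetical and exists only for illustration, and the `float32` result assumes PyTorch's default dtype has not been changed with `torch.set_default_dtype`.

```python
import numpy as np
import torch


def to_tensor_preserving_dtype(arr: np.ndarray, device: str = "cpu") -> torch.Tensor:
    """Hypothetical helper mirroring the pattern the PR switches to."""
    # from_numpy keeps the array's dtype; .to(device) only moves the data.
    return torch.from_numpy(arr).to(device)


arr = np.array([1, 2, 3], dtype=np.int64)

# Type constructor: the numpy dtype is dropped in favour of the default dtype.
legacy = torch.Tensor(arr)

# from_numpy: the numpy dtype carries over to the tensor.
fixed = to_tensor_preserving_dtype(arr)

print(legacy.dtype)  # torch.float32 (with the default dtype unchanged)
print(fixed.dtype)   # torch.int64
```

A model that indexes an embedding table with these values would accept `fixed` but reject `legacy`, which is the kind of dtype mismatch the linked issue describes.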
Verification
Pre-fix, the new parametrized regression fails:
Post-fix, all 11 cases pass: 8 numpy dtypes (`float16/32/64`, `int8/32/64`, `uint8`, `bool`) plus an `int64` pandas DataFrame, asserting the dtype handed to the model survives the runner boundary. `ruff check` + `ruff format --check` clean.
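The shape of such a dtype round-trip check can be sketched as follows. This is an illustrative stand-in, not the test added in `test_pytorch_unit.py`: it calls `torch.from_numpy` directly instead of going through the runner, it omits the pandas case, and the names `EXPECTED`, `from_numpy_dtype`, and `check_dtype_round_trip` are made up for this example.

```python
import numpy as np
import torch

# The eight numpy dtypes listed above, paired with the torch dtype expected
# after conversion.
EXPECTED = [
    (np.float16, torch.float16),
    (np.float32, torch.float32),
    (np.float64, torch.float64),
    (np.int8, torch.int8),
    (np.int32, torch.int32),
    (np.int64, torch.int64),
    (np.uint8, torch.uint8),
    (np.bool_, torch.bool),
]


def from_numpy_dtype(np_dtype) -> torch.dtype:
    """Convert a small array of the given numpy dtype and report the tensor dtype."""
    arr = np.zeros((2, 3), dtype=np_dtype)
    return torch.from_numpy(arr).dtype


def check_dtype_round_trip() -> list:
    """Return (numpy dtype, expected, actual) for every case that does not match."""
    mismatches = []
    for np_dtype, torch_dtype in EXPECTED:
        actual = from_numpy_dtype(np_dtype)
        if actual != torch_dtype:
            mismatches.append((np_dtype, torch_dtype, actual))
    return mismatches


print(check_dtype_round_trip())
```

In a pytest suite the same table would typically feed `pytest.mark.parametrize`, so that each dtype is reported as its own test case.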