
fix failure of pytest tests/models/paddleocr_vl/test_modeling_paddleocr_vl.py::PaddleOCRVLModelTest::test_flash_attn_2_fp32_ln #43001

Closed

sywangyi wants to merge 2 commits into huggingface:main from sywangyi:paddleocr_vl

Conversation

@sywangyi
Contributor

quantization_config is not set for the text and visual parts of the model, so the datatype conversion before flash attention is not performed.

pytest tests/models/paddleocr_vl/test_modeling_paddleocr_vl.py::PaddleOCRVLModelTest::test_flash_attn_2_fp32_ln

The failure looks like:

../bk/hub/models--kernels-community--flash-attn2/snapshots/172e23272e585d3c0d97124bc690593af81a0b95/build/torch29-cxx11-cu128-x86_64-linux/flash_attn2/flash_attn_interface.py:1199: in flash_attn_func
    return FlashAttnFunc.apply(
/mnt/disk0/wangyi/miniforge3/envs/transformers/lib/python3.11/site-packages/torch/autograd/function.py:581: in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../bk/hub/models--kernels-community--flash-attn2/snapshots/172e23272e585d3c0d97124bc690593af81a0b95/build/torch29-cxx11-cu128-x86_64-linux/flash_attn2/flash_attn_interface.py:837: in forward
    out_padded, softmax_lse, S_dmask, rng_state = _wrapped_flash_attn_forward(
/mnt/disk0/wangyi/miniforge3/envs/transformers/lib/python3.11/site-packages/torch/_ops.py:1255: in __call__
    return self._op(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
/mnt/disk0/wangyi/miniforge3/envs/transformers/lib/python3.11/site-packages/torch/_library/autograd.py:111: in autograd_impl
    result = forward_no_grad(*args, Metadata(keyset, keyword_only_args))
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/mnt/disk0/wangyi/miniforge3/envs/transformers/lib/python3.11/site-packages/torch/_library/autograd.py:40: in forward_no_grad
    result = op.redispatch(keyset & _C._after_autograd_keyset, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/mnt/disk0/wangyi/miniforge3/envs/transformers/lib/python3.11/site-packages/torch/_ops.py:848: in redispatch
    return self._handle.redispatch_boxed(keyset, *args, **kwargs)  # type: ignore[return-value]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/mnt/disk0/wangyi/miniforge3/envs/transformers/lib/python3.11/site-packages/torch/_library/custom_ops.py:343: in backend_impl
    result = self._backend_fns[device_type](*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/mnt/disk0/wangyi/miniforge3/envs/transformers/lib/python3.11/site-packages/torch/_compile.py:53: in inner
    return disable_fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
/mnt/disk0/wangyi/miniforge3/envs/transformers/lib/python3.11/site-packages/torch/_dynamo/eval_frame.py:1044: in _fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
/mnt/disk0/wangyi/miniforge3/envs/transformers/lib/python3.11/site-packages/torch/_library/custom_ops.py:376: in wrapped_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
../bk/hub/models--kernels-community--flash-attn2/snapshots/172e23272e585d3c0d97124bc690593af81a0b95/build/torch29-cxx11-cu128-x86_64-linux/flash_attn2/flash_attn_interface.py:94: in _flash_attn_forward
    out, softmax_lse, S_dmask, rng_state = flash_attn_gpu.fwd(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <OpOverloadPacket(op='_flash_attn_9e27194.fwd')>
args = (tensor([[[[0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0],
         ...0, 0,  ..., 0, 0, 0],
          [0, 0, 0,  ..., 0, 0, 0]]]], device='cuda:0', dtype=torch.uint8), None, None, 0.0, ...)
kwargs = {}

    def __call__(self, /, *args: _P.args, **kwargs: _P.kwargs) -> _T:
        # overloading __call__ to ensure torch.ops.foo.bar()
        # is still callable from JIT
        # We save the function ptr as the `op` attribute on
        # OpOverloadPacket to access it here.

        # Directly calling OverloadPacket goes into C++, which will check
        # the schema and cause an error for torchbind op when inputs consist of FakeScriptObject so we
        # intercept it here and call TorchBindOpverload instead.
        if self._has_torchbind_op_overload and _must_dispatch_in_python(args, kwargs):
            return _call_overload_packet_from_python(self, *args, **kwargs)
>       return self._op(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^
E       RuntimeError: FlashAttention only support fp16 and bf16 data type

/mnt/disk0/wangyi/miniforge3/envs/transformers/lib/python3.11/site-packages/torch/_ops.py:1255: RuntimeError
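For context, FlashAttention only supports fp16/bf16 inputs, so when layer norms run in fp32 (as in test_flash_attn_2_fp32_ln) the query/key/value tensors have to be cast back to the compute dtype right before the kernel call. The snippet below is a minimal, illustrative sketch of that kind of guard; the helper name and the default target dtype are assumptions for illustration, not the actual transformers code.

import torch

def cast_fp32_qkv_for_flash_attn(query, key, value, target_dtype=torch.bfloat16):
    # Illustrative sketch: FlashAttention only supports fp16/bf16. When layer
    # norms run in fp32, q/k/v arrive here as fp32 and must be cast back to
    # the compute dtype before calling the kernel. In practice the target
    # dtype would be inferred from the model weights or, for quantized
    # models, from the quantization config.
    if query.dtype == torch.float32:
        query, key, value = (t.to(target_dtype) for t in (query, key, value))
    return query, key, value

Per the description above, because quantization_config was not set on the text and visual parts, this conversion was skipped and fp32 tensors reached the kernel, producing the RuntimeError shown.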


Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: paddleocr_vl

Collaborator

@ArthurZucker ArthurZucker left a comment


In general we should never have to fix quantization in model-specific code! cc @SunMarc
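A hedged sketch of what a generic (non model-specific) fix could look like: propagating the top-level quantization_config to the text/vision sub-configs in shared code, so dtype handling before flash attention sees it for every composite model. The helper name and attribute layout here are assumptions for illustration, not the actual transformers API.

def propagate_quantization_config(config):
    # Hypothetical helper: copy the top-level quantization_config onto
    # composite sub-configs so that shared dtype handling before flash
    # attention works regardless of which sub-model is running.
    quant = getattr(config, "quantization_config", None)
    for name in ("text_config", "vision_config"):
        sub = getattr(config, name, None)
        if sub is not None and getattr(sub, "quantization_config", None) is None:
            sub.quantization_config = quant
    return config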

@yao-matrix
Contributor

@SunMarc, could you please suggest a way to handle this? Thanks very much.

Member

@SunMarc SunMarc left a comment


Sorry, this regression was most likely caused by #42882. I will fix it.

@SunMarc
Member

SunMarc commented Jan 7, 2026

Please give the PR above a try!

@sywangyi
Contributor Author

sywangyi commented Jan 8, 2026

Please give the PR above a try!

Yes, the PR above fixes it.

@sywangyi sywangyi closed this Jan 8, 2026
