Remove old validation code for dequantization of nvfp4 #5592

Merged
protonu merged 14 commits into main from pbasu_bq_py_cleanup
Dec 2, 2025

@protonu (Collaborator) commented Nov 25, 2025

Stacked on top of #5591.

This removes the old validation code in favor of the new nvfp4 dequantization that was added in the above-mentioned PR.
No tests needed.

github-actions bot commented Nov 25, 2025

Review updated until commit c6cf36a

Description

  • Remove old nvfp4 dequantization validation code (unpack_fp4_bytes, e2m1_to_fp32)

  • Replace with new dequantization functions (fp4_to_fp32, unpack_fp4)

  • Update test files to use new function signatures

  • Simplify dequantize_to_dtype implementation using dequantize_fp4

Changes walkthrough

Relevant files

Enhancement

test_cutlass_nvfp4_gemm.py: Update test imports and function calls (+4/-3)
tests/python/direct/test_cutlass_nvfp4_gemm.py

  • Update imports to use fp4_to_fp32 and unpack_fp4 instead of unpack_fp4_bytes
  • Modify function calls to use new signature with view(torch.uint8)

test_python_frontend.py: Update test imports and function calls (+3/-2)
tests/python/direct/test_python_frontend.py

  • Update imports to use fp4_to_fp32 and unpack_fp4 instead of unpack_fp4_bytes
  • Modify function calls to use new signature with view(torch.uint8)

narrow_precision.py: Remove old dequantization validation code (+4/-47)
tests/python/direct_utils/narrow_precision.py

  • Remove old validation code: kE2M1ToFloatArray, e2m1_to_fp32, unpack_fp4_bytes
  • Update dequantize_to_dtype to use new dequantize_fp4 function
  • Simplify implementation by removing manual unpacking logic

    PR Reviewer Guide

    Here are some key observations to aid the review process:

    🧪 PR contains tests
    ⚡ Recommended focus areas for review
    Missing dequantize_fp4 function

    The code references a dequantize_fp4 function that is not defined in this file. This function appears to be part of the new dequantization implementation that should be available. Need to verify this function exists and is properly imported.

    out = dequantize_fp4(
        tensor_fp4.view(torch.uint8), tensor_sf, (6.0 * 448.0) / global_scale
    )
    Import consistency

    The imports are changed to use fp4_to_fp32 and unpack_fp4 from python.direct_utils, but these functions should be verified to exist and provide the same functionality as the removed unpack_fp4_bytes.

    fp4_to_fp32,
    unpack_fp4,
    Import path verification

    The imports are changed to use fp4_to_fp32 and unpack_fp4 from python.direct_utils.narrow_precision. Need to ensure these functions are properly exported from this module and provide equivalent functionality.

    fp4_to_fp32,
    unpack_fp4,
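
The third argument in the `dequantize_fp4` call quoted in the first focus area, `(6.0 * 448.0) / global_scale`, is easier to read under one assumption: that the global scale follows the common NVFP4 convention `global_scale = (6.0 * 448.0) / amax`, where 6.0 is the largest E2M1 magnitude and 448.0 is the largest finite FP8 E4M3 value. If that holds, the expression simply recovers the tensor's absolute maximum. The helper names below are hypothetical and do not come from the PR; this is only a sketch of the arithmetic.

```python
# Largest representable magnitudes of the two formats involved.
FLOAT4_E2M1_MAX = 6.0
FLOAT8_E4M3_MAX = 448.0


def global_scale_from_amax(amax):
    """Assumed convention: global_scale = (6.0 * 448.0) / amax."""
    return (FLOAT4_E2M1_MAX * FLOAT8_E4M3_MAX) / amax


def amax_from_global_scale(global_scale):
    """Inverse of the above; mirrors the expression passed in the quoted call."""
    return (FLOAT4_E2M1_MAX * FLOAT8_E4M3_MAX) / global_scale


# Round trip: the expression in the call undoes the assumed definition.
recovered = amax_from_global_scale(global_scale_from_amax(3.0))
```

With `amax = 3.0`, the assumed global scale is `2688.0 / 3.0 = 896.0`, and dividing `2688.0` by it gives back `3.0`.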

    @protonu (Collaborator, Author) commented Nov 25, 2025

    !test

    @protonu requested a review from jjsjann123 on November 25, 2025 21:47
    @protonu marked this pull request as ready for review on November 25, 2025 21:47
    @greptile-apps bot (Contributor) commented Nov 25, 2025

    Greptile Overview

    Greptile Summary

    This PR removes deprecated FP4 (nvfp4) dequantization code in favor of the newer functions (fp4_to_fp32, unpack_fp4, dequantize_fp4) that were added in PR #5591.

    • Removed kE2M1ToFloatArray, e2m1_to_fp32, and unpack_fp4_bytes functions from narrow_precision.py
    • Refactored dequantize_to_dtype to delegate to dequantize_fp4 instead of manually implementing the dequantization logic
    • Updated test files to use the new fp4_to_fp32(unpack_fp4(...)) pattern instead of the removed unpack_fp4_bytes
    • The new implementation is mathematically equivalent to the old code, using a lookup table approach for FP4→FP32 conversion
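
As a rough illustration of the lookup-table approach mentioned in the last bullet, here is a minimal pure-Python sketch. It is not the code from `narrow_precision.py`: the names `unpack_nibbles` and `nibbles_to_float` are hypothetical, and the nibble order (low nibble first) is an assumption. The sixteen table entries follow from the E2M1 format itself (one sign bit, two exponent bits, one mantissa bit).

```python
# E2M1 magnitudes indexed by the low three bits of a 4-bit code.
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
# Full 16-entry table: codes 8..15 have the sign bit set.
E2M1_LUT = E2M1_MAGNITUDES + tuple(-m for m in E2M1_MAGNITUDES)


def unpack_nibbles(packed):
    """Split each byte into two 4-bit codes (assumed order: low nibble first)."""
    codes = []
    for byte in packed:
        codes.append(byte & 0x0F)
        codes.append(byte >> 4)
    return codes


def nibbles_to_float(codes):
    """Map each 4-bit E2M1 code to its float value via the lookup table."""
    return [E2M1_LUT[code] for code in codes]


values = nibbles_to_float(unpack_nibbles(bytes([0x21, 0xF7])))
```

Under these assumptions, `bytes([0x21, 0xF7])` decodes to `[0.5, 1.0, 6.0, -6.0]`.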

    Confidence Score: 5/5

    • This PR is safe to merge - it's a straightforward code cleanup that removes redundant validation code in favor of newer, equivalent implementations.
    • The changes remove duplicate code and consolidate FP4 dequantization into the newer functions. Mathematical equivalence between old and new implementations has been verified. The changes are limited to test utility files and don't affect production code paths.
    • No files require special attention

    Important Files Changed

    File Analysis

    | Filename | Score | Overview |
    |---|---|---|
    | tests/python/direct_utils/narrow_precision.py | 5/5 | Removed deprecated kE2M1ToFloatArray, e2m1_to_fp32, and unpack_fp4_bytes functions. Refactored dequantize_to_dtype to use the newer dequantize_fp4 function. The new implementation is mathematically equivalent to the old code. |
    | tests/python/direct/test_cutlass_nvfp4_gemm.py | 5/5 | Updated imports and usages to use new fp4_to_fp32 and unpack_fp4 functions instead of removed unpack_fp4_bytes. The new function chain produces equivalent results. |
    | tests/python/direct/test_python_frontend.py | 5/5 | Updated imports and test reference calculation to use new fp4_to_fp32(unpack_fp4(...)) instead of removed unpack_fp4_bytes. Functionally equivalent change. |

    Sequence Diagram

    sequenceDiagram
        participant Test as Test Code
        participant DP as narrow_precision.py
        participant U as unpack_fp4()
        participant F as fp4_to_fp32()
        participant D as dequantize_fp4()
    
        Note over Test,D: Old Flow (Removed)
        Test->>DP: unpack_fp4_bytes(tensor)
        DP->>DP: e2m1_to_fp32() per element
        DP-->>Test: float32 tensor
    
        Note over Test,D: New Flow
        Test->>U: unpack_fp4(tensor.view(uint8))
        U-->>Test: unpacked nibbles
        Test->>F: fp4_to_fp32(unpacked)
        F-->>Test: float32 tensor (via LUT)
    
        Note over Test,D: dequantize_to_dtype (Refactored)
        Test->>DP: dequantize_to_dtype(tensor, sf, scale)
        DP->>D: dequantize_fp4(tensor, sf, amax)
        D->>U: unpack_fp4()
        D->>F: fp4_to_fp32()
        D-->>DP: scaled float32 tensor
        DP-->>Test: reshaped result
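
The refactored flow in the diagram can be sketched end to end in plain Python. This is an illustration built on assumptions, not the PR's `dequantize_fp4`: the function name `dequantize_blocks`, the low-nibble-first packing, the flat list of per-block scale factors, and the final scaling by `amax / (6.0 * 448.0)` are all assumed here. NVFP4 scales blocks of 16 elements; `block_size` defaults to that but stays adjustable so small inputs can be checked by hand.

```python
# 16-entry E2M1 table: codes 0..7 are positive, 8..15 carry the sign bit.
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
E2M1_LUT = E2M1_MAGNITUDES + tuple(-m for m in E2M1_MAGNITUDES)


def dequantize_blocks(packed, block_scales, amax, block_size=16):
    """Unpack FP4 codes, look them up, then apply block and global scaling.

    Hypothetical sketch: each element is multiplied by the scale factor of
    the block it falls in and by an assumed global factor amax / (6.0 * 448.0).
    """
    global_factor = amax / (6.0 * 448.0)
    out = []
    for byte in packed:
        # Assumed packing: low nibble holds the earlier element.
        for code in (byte & 0x0F, byte >> 4):
            block = len(out) // block_size
            out.append(E2M1_LUT[code] * block_scales[block] * global_factor)
    return out


# Tiny hand-checkable case: two blocks of two elements each.
result = dequantize_blocks(bytes([0x21, 0xF7]), [2.0, 0.5], amax=1344.0, block_size=2)
```

Here the global factor is `1344.0 / 2688.0 = 0.5`, so the decoded values `[0.5, 1.0, 6.0, -6.0]` with block scales `2.0` and `0.5` become `[0.5, 1.0, 1.5, -1.5]`.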
    
    @greptile-apps greptile-apps bot left a comment

    Additional Comments (1)

    1. tests/python/direct_utils/narrow_precision.py, lines 14-54

      logic: removing unpack_fp4_bytes, e2m1_to_fp32, and kE2M1ToFloatArray will break tests that still import and use these functions:

      • tests/python/direct/test_python_frontend.py:2700 uses unpack_fp4_bytes
      • tests/python/direct/test_cutlass_nvfp4_gemm.py:155-156 uses unpack_fp4_bytes

      these test files need to be updated to either use the new dequantize_fp4 function or have unpack_fp4_bytes remain available

    1 file reviewed, 1 comment


    protonu and others added 2 commits November 25, 2025 16:53
    Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
    @protonu (Collaborator, Author) commented Nov 25, 2025

    !test

    @greptile-apps greptile-apps bot left a comment

    3 files reviewed, no comments

    @jjsjann123 (Collaborator) left a comment

    LGTM

    @greptile-apps greptile-apps bot left a comment

    3 files reviewed, no comments

    Base automatically changed from pbasu_bq_py to main December 2, 2025 17:15
    @protonu (Collaborator, Author) commented Dec 2, 2025

    !test

    @greptile-apps greptile-apps bot left a comment

    3 files reviewed, no comments

    @protonu (Collaborator, Author) commented Dec 2, 2025

    !test

    @greptile-apps greptile-apps bot left a comment

    3 files reviewed, no comments

    @protonu merged commit 3050026 into main on Dec 2, 2025
    61 checks passed
    @protonu deleted the pbasu_bq_py_cleanup branch on December 2, 2025 23:31
    2 participants