
Revert "Pipe quantize kernel through FusionExecutorCache (#4760)"#4854

Merged
wujingyue merged 1 commit into main from wjy/revert on Jul 26, 2025
Conversation

@wujingyue
Collaborator

@wujingyue wujingyue commented Jul 26, 2025

This reverts commit a70e0f5.

Fixes #4852. The potential proper fix #4853 seems to hit issues. So I created this to fix CI sooner.


It broke
```
_bn && pytest tests/python/opinfo/test_legacy_ops.py -k test_correctness_abs_float64 -s
```
@wujingyue wujingyue requested review from naoyam and zasdfgbnm July 26, 2025 00:03
@github-actions

Description

  • Reverted commit affecting quantize kernel

  • Corrected function calls and return types

  • Updated precision calculation and tests


Changes walkthrough 📝

Relevant files

Bug fix (13 files)

| File | Change | Lines |
|---|---|---|
| fusion_segmenter.cpp | Updated precision calculation function calls | +5/-5 |
| utils.cpp | Updated precision calculation function implementation | +4/-3 |
| alias.cpp | Corrected ViewOp argument order | +1/-1 |
| matmul.cpp | Removed sub-byte data type check | +0/-9 |
| normalization_inner_outer.cpp | Removed sub-byte data type check | +0/-10 |
| normalization_utils.cpp | Removed sub-byte data type check | +0/-9 |
| reduction.cpp | Removed sub-byte data type check | +0/-10 |
| resize.cpp | Removed sub-byte data type check | +0/-9 |
| transpose.cpp | Removed sub-byte data type check | +0/-9 |
| utils.cpp | Updated precision calculation function call | +1/-1 |
| test_gpu3.cpp | Updated precision calculation test cases | +7/-7 |
| test_low_precision_recipe.cpp | Updated quantize test cases and fusion handling | +14/-30 |
| helpers.cu | Removed device functions for sub-byte types | +0/-20 |

Documentation (1 file)

| File | Change | Lines |
|---|---|---|
| utils.h | Updated function comment for precision calculation | +3/-3 |

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 No relevant tests
⚡ Recommended focus areas for review

Incorrect Precision Values

The precision values in the test cases seem incorrect. The expected values for tv1_precision and tv2_precision are set to 2 and 4, and 4 and 2 respectively, which do not match the typical bit sizes for data types like Half and Int.

```
auto tv1_precision = ir_utils::getPrecisionOfProducerConsumerTensors(
    tv1->definition()->as<UnaryOp>());
ASSERT_TRUE(tv1_precision.has_value());
EXPECT_EQ(tv1_precision->first, 2);
EXPECT_EQ(tv1_precision->second, 4);

auto tv2_precision = ir_utils::getPrecisionOfProducerConsumerTensors(
    tv2->definition()->as<UnaryOp>());
ASSERT_TRUE(tv2_precision.has_value());
EXPECT_EQ(tv2_precision->first, 4);
EXPECT_EQ(tv2_precision->second, 2);

// Precision of type Index is not possible to determine until lowering
auto tv4_precision = ir_utils::getPrecisionOfProducerConsumerTensors(
    tv4->definition()->as<UnaryOp>());
ASSERT_FALSE(tv4_precision.has_value());
```
Function Name Change

The function getPrecisionOfProducerConsumerTensors has been renamed from getPrecisionOfProducerConsumerTensorsBit. Ensure that all references to the old function name have been updated to avoid any compilation errors.

```
std::optional<std::pair<int64_t, int64_t>> getPrecisionOfProducerConsumerTensors(
    UnaryOp* uop) {
  NVF_CHECK(uop != nullptr);
  NVF_CHECK(
      uop->getUnaryOpType() == UnaryOpType::Cast,
      "Invalid expr: ",
      uop->toString());

  auto inp_tv = ir_utils::getTvInput(uop);
  auto out_tv = ir_utils::getTvOutput(uop);
  if (inp_tv == nullptr || out_tv == nullptr) {
    return std::nullopt;
  }

  auto inp_dtype = inp_tv->dtype().type;
  auto out_dtype = out_tv->dtype().type;
  auto inp_prim_type = std::get_if<PrimDataType>(&inp_dtype);
  auto out_prim_type = std::get_if<PrimDataType>(&out_dtype);

  if (inp_prim_type == nullptr || out_prim_type == nullptr ||
      *inp_prim_type == PrimDataType::Index ||
      *out_prim_type == PrimDataType::Index) {
    return std::nullopt;
  }

  return std::make_pair(
      primDataTypeSizeByte(*inp_prim_type),
      primDataTypeSizeByte(*out_prim_type));
}
```
Removed Functions

Several device-side overloads, such as fmax and abs for __half and __bfloat, have been removed. Verify that these functions are no longer needed or that their functionality has been replaced elsewhere.

```
__device__ float fmax(float a, float b) {
  // check and propagate NaN
  if (a != a) {
    return a;
  } else { // If b is nan, it will be returned in the next line
    return a > b ? a : b;
  }
}

__device__ constexpr int min(int a, int b) {
  return a > b ? b : a;
}
```

@wujingyue
Collaborator Author

The potential proper fix #4853 seems to hit issues. So I created this to fix CI sooner.

@wujingyue
Collaborator Author

!test

@wujingyue wujingyue merged commit d268897 into main Jul 26, 2025
37 of 39 checks passed
@wujingyue wujingyue deleted the wjy/revert branch July 26, 2025 02:18
nsarka pushed a commit to nsarka/Fuser that referenced this pull request Jul 28, 2025
zasdfgbnm added a commit that referenced this pull request Jul 28, 2025


Development

Successfully merging this pull request may close these issues.

test_correctness_abs_float64 - AssertionError: Tensor-likes are not close!
