Merged
40 commits
c045997
Blockwise float8 quantizer and quantized tensor class.
kwyss-nvidia Feb 12, 2025
bf9c137
Apply linting changes.
kwyss-nvidia Mar 6, 2025
9ce2034
Alignment for 1D scaling for GEMM edge case.
kwyss-nvidia Feb 27, 2025
86d4be8
MR feedback.
kwyss-nvidia Mar 10, 2025
03f88f4
Change API name.
kwyss-nvidia Mar 10, 2025
60e86c0
Fix merge conflict with name change.
kwyss-nvidia Mar 10, 2025
00dffe2
Use common tensor map API.
kwyss-nvidia Mar 10, 2025
f6b5392
Change API to use two scaling mode enums.
kwyss-nvidia Mar 10, 2025
33f2ed0
Fix typo.
kwyss-nvidia Mar 11, 2025
125342d
Update some call sites.
kwyss-nvidia Mar 11, 2025
035e1c9
Tests for torch tensor API surface.
kwyss-nvidia Mar 12, 2025
cc86afb
Reuse scale calculation between quantizer refs.
kwyss-nvidia Mar 12, 2025
a815b2a
Save memory by dropping reference to saved tensors.
kwyss-nvidia Mar 12, 2025
86dbaa8
Remove constexpr parameters from kernel.
kwyss-nvidia Mar 13, 2025
8ad7107
Merge conflict from rebase.
kwyss-nvidia Mar 17, 2025
2d6a379
Add shape implementations for block scaling.
kwyss-nvidia Mar 19, 2025
2306611
Move benchmark to te_playground
kwyss-nvidia Apr 1, 2025
fff1818
Remove amax_epsilon and pow_2_scales from tensor.
kwyss-nvidia Apr 1, 2025
4de7aac
Lint changes.
kwyss-nvidia Apr 1, 2025
e6316e9
Fixup MR changes that broke.
kwyss-nvidia Apr 1, 2025
fd951d8
Safer ifdef in kernel.
kwyss-nvidia Apr 1, 2025
cf0021a
Documentation prose.
kwyss-nvidia Apr 2, 2025
32cc5b4
Reuse compute_scale function from Current Scaling.
kwyss-nvidia Apr 2, 2025
d23ae3b
Bugfix on inf_value scale refactor.
kwyss-nvidia Apr 2, 2025
9dafe5e
Remove qopt calls from test.
kwyss-nvidia Apr 2, 2025
29d22ca
Update pytest list.
kwyss-nvidia Apr 2, 2025
279f791
Add copyright to reference scale calc.
kwyss-nvidia Apr 2, 2025
fff1c6b
Use ptx.cuh functions instead of cde.
kwyss-nvidia Apr 2, 2025
9284a9e
Update shape logic with allocation and reuse shape.
kwyss-nvidia Apr 2, 2025
b52a44d
Usage defaults MR feedback.
kwyss-nvidia Apr 2, 2025
18d80bd
Copyright and header guard.
kwyss-nvidia Apr 3, 2025
18f19bb
Updating torch dispatch code.
kwyss-nvidia Apr 3, 2025
572c04b
Fix exception type.
kwyss-nvidia Apr 3, 2025
bac9348
Use TypeInfo
kwyss-nvidia Apr 3, 2025
93d2bf5
MR feedback.
kwyss-nvidia Apr 4, 2025
15f1007
Update CS scale update test to use updated ref impl
timmoon10 Apr 4, 2025
994e71c
Merge branch 'main' into kwyss/subchannel_quantize_dequantize
timmoon10 Apr 4, 2025
1a34a86
Update JAX scaling mode enum
timmoon10 Apr 4, 2025
51f7b29
Skip tests on Lovelace
timmoon10 Apr 4, 2025
58ed18a
Merge branch 'main' into kwyss/subchannel_quantize_dequantize
timmoon10 Apr 4, 2025
2 changes: 2 additions & 0 deletions qa/L0_pytorch_unittest/test.sh
@@ -30,6 +30,8 @@ PYTORCH_JIT=0 NVTE_TORCH_COMPILE=0 NVTE_ALLOW_NONDETERMINISTIC_ALGO=0 python3 -m
 python3 -m pytest -v -s $TE_PATH/tests/pytorch/test_jit.py || test_fail "test_jit.py"
 python3 -m pytest -v -s $TE_PATH/tests/pytorch/test_fused_rope.py || test_fail "test_fused_rope.py"
 python3 -m pytest -v -s $TE_PATH/tests/pytorch/test_float8tensor.py || test_fail "test_float8tensor.py"
+python3 -m pytest -v -s $TE_PATH/tests/pytorch/test_float8blockwisetensor.py || test_fail "test_float8blockwisetensor.py"
+python3 -m pytest -v -s $TE_PATH/tests/pytorch/test_float8_blockwise_scaling_exact.py || test_fail "test_float8_blockwise_scaling_exact.py"
 python3 -m pytest -v -s $TE_PATH/tests/pytorch/test_gqa.py || test_fail "test_gqa.py"
 python3 -m pytest -v -s $TE_PATH/tests/pytorch/test_fused_optimizer.py || test_fail "test_fused_optimizer.py"
 python3 -m pytest -v -s $TE_PATH/tests/pytorch/test_multi_tensor.py || test_fail "test_multi_tensor.py"
1 change: 1 addition & 0 deletions tests/cpp/operator/CMakeLists.txt
@@ -11,6 +11,7 @@ add_executable(test_operator
 test_cast_mxfp8_gated_swiglu.cu
 test_qdq.cu
 test_cast_mxfp8.cu
+test_cast_float8blockwise.cu
 test_dequantize_mxfp8.cu
 test_transpose.cu
 test_cast_transpose.cu