@xiaolil1 (Owner) commented Jul 30, 2025

Draft PR, for internal review only.
Accuracy has been verified with unit tests and models, including gemv, gemm, and multi-batched gemm for the NF4 and FP4 data types.

@xiaolil1 force-pushed the xiaoli/dequant_gemm branch from 09eba29 to 4a23b63 on July 30, 2025 08:15
@xiaolil1 force-pushed the xiaoli/dequant_gemm branch from 4a23b63 to a4c77ad on August 7, 2025 14:20
@xiaolil1 changed the title from "Add draft 4bit_dequant_gemm_cutlass kernel" to "Add draft gemm_4bit_cutlass kernel" on Aug 11, 2025
@xiaolil1 force-pushed the xiaoli/dequant_gemm branch 5 times, most recently from c9010b0 to a2407c0, on August 19, 2025 13:04
@xiaolil1 force-pushed the xiaoli/dequant_gemm branch 7 times, most recently from 2d7b5fe to 3a6959d, on August 27, 2025 05:52
@xiaolil1 force-pushed the xiaoli/dequant_gemm branch from 3a6959d to e236452 on September 3, 2025 14:16