
support flashmask V4#8

Open
ckl117 wants to merge 4 commits into zoooo0820:support_eb_sm100_fp8 from ckl117:support_eb_sm100_fp8

Conversation


@ckl117 ckl117 commented Jan 20, 2026

1. Pull the flash-attention repository with the optimized JIT compilation for FA4

# PR not yet merged: https://github.com/PaddlePaddle/flash-attention/pull/101
git clone -b opt_compile_key https://github.com/ckl117/flash-attention.git
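After cloning, the `flashmask` directory inside the repository is what step 3 below adds to `PYTHONPATH`. A quick local sanity check, assuming the clone landed at `/root/flash-attention` (override with `REPO=` if it is elsewhere):

```shell
# Verify the cloned repo exposes the flashmask package that PYTHONPATH will point at.
REPO=${REPO:-/root/flash-attention}
if [ -d "$REPO/flashmask" ]; then
    echo "found flashmask package at $REPO/flashmask"
else
    echo "flashmask directory not found under $REPO; re-check the clone" >&2
fi
```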

2. Rebuild fastdeploy

3. Add the following environment variables before starting the service

export PYTHONPATH=/root/flash-attention/flashmask:$PYTHONPATH
export FD_ATTENTION_BACKEND=FLASH_ATTN
# Only C16 is supported; make sure the model's config.json does not specify a cache quantization type
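Since this path only supports C16 (no KV-cache quantization), it can help to guard the launch against a misconfigured model directory. The sketch below is illustrative: the key name `cache_quant_type` is an assumption for the example, not a field confirmed by this PR, so adjust it to whatever your model's config.json actually uses.

```shell
# Hedged guard for the C16-only constraint: refuse to proceed if the model's
# config.json requests a cache quantization type.
# NOTE: the "cache_quant_type" key name is hypothetical, not taken from this PR.
CONFIG=${MODEL_DIR:-.}/config.json
# For this self-contained sketch, write a minimal config with no cache quant set.
printf '{"architectures": ["ErnieForCausalLM"]}\n' > "$CONFIG"
python3 - "$CONFIG" <<'PY'
import json, sys
cfg = json.load(open(sys.argv[1]))
if "cache_quant_type" in cfg:
    sys.exit("only C16 is supported: remove cache_quant_type from config.json")
print("config OK: no cache quantization set")
PY
```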

Start the service...
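Putting the steps above together, a minimal launch wrapper might look like the following. The actual FastDeploy serving command is not shown in this PR, so the last line is only a placeholder comment, not a real invocation:

```shell
#!/bin/sh
# Environment required by the FlashMask V4 (FA4) attention path in this PR:
# the flashmask package on PYTHONPATH, and the FLASH_ATTN backend selected.
export PYTHONPATH=/root/flash-attention/flashmask:$PYTHONPATH
export FD_ATTENTION_BACKEND=FLASH_ATTN

# Fail fast if the backend variable did not take effect.
[ "$FD_ATTENTION_BACKEND" = "FLASH_ATTN" ] || { echo "backend not set" >&2; exit 1; }
echo "env ready: FD_ATTENTION_BACKEND=$FD_ATTENTION_BACKEND"

# ...start the FastDeploy service here (launch command not shown in this PR)...
```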

@zoooo0820 zoooo0820 force-pushed the support_eb_sm100_fp8 branch 3 times, most recently from 3712222 to e967af8 Compare February 2, 2026 08:01