@ZhangLirong-amd
Contributor

No description provided.

@ZhangLirong-amd
Contributor Author

ZhangLirong-amd commented Jul 29, 2025

Custom op definitions only support schema types such as Tensor, Tensor[], int[], int, float, and bool. Types such as Enum and arbitrary objects are not supported.
Ops that the custom op schema does not support and that we need to change:

register_graph_buffers: list[str] and list[list[int]] are not supported as input schema types. This is the only op remaining.
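
As a rough illustration of the restriction described above, the sketch below checks a function's type hints against the handful of types named in this comment. The `Tensor` stand-in class, the `SUPPORTED` set, the helper `unsupported_params`, and both example signatures are assumptions made for illustration only; they are not the actual aiter or PyTorch definitions, and the real schema rules may accept more types.

```python
import typing


class Tensor:
    """Stand-in for torch.Tensor so the sketch runs without PyTorch."""


# Only the types named in the comment above; the real rule set may be larger.
SUPPORTED = {Tensor, list[Tensor], list[int], int, float, bool}


def unsupported_params(fn):
    """Return {param_name: annotation} for parameters outside SUPPORTED."""
    hints = typing.get_type_hints(fn)
    hints.pop("return", None)  # only inputs are checked in this sketch
    return {name: tp for name, tp in hints.items() if tp not in SUPPORTED}


# Hypothetical signature: only the two list types come from the comment.
def register_graph_buffers_like(
    fa: int, handles: list[str], offsets: list[list[int]]
) -> None: ...


# Hypothetical signature that uses only supported types.
def fmoe_like(out: Tensor, x: Tensor, topk: int, scale: float) -> None: ...


flagged = unsupported_params(register_graph_buffers_like)
print(sorted(flagged))                # parameters that would need a new type
print(unsupported_params(fmoe_like))  # nothing flagged
```

In this sketch only `handles` and `offsets` are flagged, which corresponds to the list[str] and list[list[int]] inputs mentioned for register_graph_buffers.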

Finished ops:
fmoe_int8_g1u0
fmoe_g1u1
fmoe_g1u1_tkw1
fmoe_fp8_blockscale_g1u1
moe_stage1_g1u1
mha_varlen_bwd: dict
hipb_mm
hipb_findallsols

For the fake function, we cannot infer the output shape in @compile_ops, so every op that previously returned a Tensor now returns None.
The following ops need their API changed to return None.
Some of them are already finished (marked done); the rest remain:

add
sub
div
sigmoid: done
tanh: done
all_reduce_asm_
all_reduce_rmsnorm_
all_reduce_rmsnorm_quant_
get_graph_buffer_ipc_meta
allocate_meta_buffer
get_meta_buffer_ipc_handle
module_mha_varlen_fwd
hipb_mm
rocb_mm
mha_varlen_fwd: list
pa_fwd_naive: done
pa_fwd_asm: done
module_moe_asm

These may not be all of the problematic ops; more may turn up during testing.
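
One way to picture the "return None" convention described above is an out-parameter style: the caller allocates the output, the op writes into it and returns None, and the fake function then has no shape to infer. The sketch below is only an illustration under that assumption. `Buffer`, `add_`, `add_fake`, and the wrapper `add` are hypothetical names, plain Python lists stand in for tensors, and none of this is taken from the PR's code.

```python
class Buffer:
    """Stand-in for a preallocated tensor: a shape plus flat storage."""

    def __init__(self, shape):
        self.shape = tuple(shape)
        size = 1
        for dim in self.shape:
            size *= dim
        self.data = [0.0] * size


def add_(out: Buffer, a: Buffer, b: Buffer) -> None:
    """Kernel stand-in: writes a + b into `out` and returns None."""
    for i, (x, y) in enumerate(zip(a.data, b.data)):
        out.data[i] = x + y


def add_fake(out: Buffer, a: Buffer, b: Buffer) -> None:
    """Fake counterpart: nothing to infer, `out` already carries the shape."""
    return None


def add(a: Buffer, b: Buffer) -> Buffer:
    """Python-level wrapper: allocates the output, then calls the op."""
    out = Buffer(a.shape)
    add_(out, a, b)
    return out
```

With this layout the shape logic lives in the Python wrapper, where it runs eagerly, rather than in the fake function.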

Update: some newly found problematic ops that need to return None:

mha_batch_prefill: list
fmha_v3_fwd: list
fmha_v3_bwd
mha_bwd
mha_fwd

@valarLip self-assigned this Jul 31, 2025
@ZhangLirong-amd force-pushed the zlr/dev_custom branch 5 times, most recently from 21e73dc to d5f9235, August 1, 2025 02:47
@ZhangLirong-amd
Contributor Author

ZhangLirong-amd commented Aug 1, 2025

Ops with accuracy problems:

add
mul
sub
div
get_graph_buffer_ipc_meta
get_meta_buffer_ipc_handle
aiter_sigmoid
aiter_tanh
pa_fwd_naive
pa_fwd
mha_batch_prefill
mha_bwd
mha_fwd
mha_varlen_bwd
mha_varlen_fwd
all_reduce_asm
all_reduce_rmsnorm
all_reduce_rmsnorm_quant
fmha_v3_bwd
fmha_v3_fwd
fmha_v3_varlen_bwd

RocSolIdxBlas: rocb_mm

module_moe_asm

hipb_mm

@ZhangLirong-amd force-pushed the zlr/dev_custom branch 2 times, most recently from f9aa765 to 60c2990, August 3, 2025 15:48
@ZhangLirong-amd changed the title from "[WIP]Enable custom op and avoid graph breaks" to "Enable custom op and avoid graph breaks" Aug 4, 2025
@valarLip merged commit b65200e into main Aug 5, 2025
14 checks passed
@valarLip deleted the zlr/dev_custom branch August 5, 2025 01:50
zhuyuhua-v pushed a commit that referenced this pull request Sep 17, 2025
* first commit

* revert some conflict

* revert some conflict2

* support custom op define schema for some ops

* support some of op return None value

* support gemm return None

* support other op for custom

* commit on mha, gemm, moe

* fix pa test

* commit for enable op

* add mha op multi return support

* support reduce

* support mha fwd

* add support mha fwd and mha_v3

* support mhd bwd and reformat files

* fix ci error and support mha

* rewrite ops

* reformat

* fix ci

* fix ci

* skip three ops in custom

* add cpu backend

* support rms_norm op

* support hipb_mm and moe gate

* fix bug

* fix bug with comment

* support mha_v3_varlen

* use common func to reduce code in mha

* reformat

* fix some bug in ci

* fix some bug in ci

* fix rms norm bug

* fix ci

* fix ci

* fix moe bug
divakar-amd pushed a commit that referenced this pull request Oct 21, 2025