ARMSVE Block SVE-Intrinsic Kernels for GCC 8-9 #543
devinamatthews merged 4 commits into flame:master
SVE-Intrinsic-based kernels ought not to use asm in their names.
Affected configs: a64fx.
  (
    1,
-   BLIS_PACKM_8XK_KER, BLIS_DOUBLE, bli_dpackm_armsve256_asm_8xk,
+   BLIS_PACKM_8XK_KER, BLIS_DOUBLE, bli_dpackm_armsve256_int_8xk,
Shouldn't the 10xk kernel also be used here?
The 10xk kernel is for VL = 512 bits.
Packing kernels are VL-specific, I'm afraid.
  #include "blis.h"

  #ifdef __ARM_FEATURE_SVE
  #if __has_include(<arm_sve.h>)
If I understand correctly, the availability of intrinsics really only affects A64fx, right? Can we branch on an A64fx-specific macro here instead of using __has_include which might not be as portable?
That's right. BLIS_FAMILY_A64FX should work for this case.
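A minimal sketch of the guard under discussion, assuming (as stated above) that BLIS_FAMILY_A64FX is defined when the a64fx configuration is built. The names SKETCH_USE_SVE_INTRINSIC_KERNELS and sketch_sve_intrinsic_kernels_enabled are hypothetical and exist only for illustration; they are not BLIS identifiers. The point is that the decision uses only plain macro tests, so it does not rely on the compiler understanding __has_include.

```c
/* Sketch only: decide at compile time whether SVE-intrinsic kernels
 * should be declared. Uses plain defined() tests instead of
 * __has_include, which some compilers do not support. */

/* Hypothetical helper macro (not a BLIS name). */
#if defined(__ARM_FEATURE_SVE) && !defined(BLIS_FAMILY_A64FX)
  /* SVE is available and we are not building the a64fx family,
   * so <arm_sve.h> is assumed to be usable. */
  #define SKETCH_USE_SVE_INTRINSIC_KERNELS 1
#else
  /* Either no SVE at all, or an a64fx build where the compiler
   * (e.g. GCC 8-9) may lack <arm_sve.h>. */
  #define SKETCH_USE_SVE_INTRINSIC_KERNELS 0
#endif

/* Hypothetical query function so callers can inspect the decision. */
int sketch_sve_intrinsic_kernels_enabled(void)
{
    return SKETCH_USE_SVE_INTRINSIC_KERNELS;
}
```

Compiling with -DBLIS_FAMILY_A64FX makes the function return 0 regardless of SVE support; on a non-SVE target it also returns 0.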
    3,
    BLIS_PACKM_10XK_KER, BLIS_DOUBLE, bli_dpackm_armsve512_asm_10xk,
-   BLIS_PACKM_12XK_KER, BLIS_DOUBLE, bli_dpackm_armsve512_asm_12xk,
+   BLIS_PACKM_12XK_KER, BLIS_DOUBLE, bli_dpackm_armsve512_int_12xk,
Is the 12xk kernel actually used in this config? If not, it can be moved to the "old" directory.
|
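OK, so it looks like we should:
1. Remove references to the 12xk kernel in the context initialization.
2. Move the 12xk kernel to an "old" subdirectory (optional).
3. Use #ifndef BLIS_FAMILY_A64FX instead of __has_include(<arm_sve.h>).
Another question: why is the 10xk packing kernel not registered for m_r_d == 8?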
|
The packm kernel is not VL-agnostic.
I'm currently using m_r_d to differentiate the SVE VL cases.
On Tue, Oct 5, 2021 at 13:52 Devin Matthews wrote:
> OK, so it looks like we should:
> 1. Remove references to the 12xk kernel in the context initialization.
> 2. Move the 12xk kernel to an "old" subdirectory (optional).
> 3. Use #ifndef BLIS_FAMILY_A64FX instead of __has_include(<arm_sve.h>).
> Another question: why is the 10xk packing kernel not registered for m_r_d == 8?
--
Xu RuQing (許 ルーキン)
Todo Group, Department of Physics, Graduate School of Science, The University of Tokyo
|
Right, but the two expected outcomes are 8x10 (256 bit) and 16x10 (512 bit), right? Then you need a 10xk packing kernel in both cases.
|
@devinamatthews That is... ugh... not implemented yet. Unlike GEMM, the PACKM kernels are only implemented for some vector-length cases. (Note the kernel names.) So I have to fall back to
|
Riiiiight, because the size is the same but your vectors aren't. |
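The exchange above can be sketched as a small lookup, under stated assumptions. The 8x10 and 16x10 shapes are taken from the comment above; the formula m_r = 2 * (VL / 64) is an assumption chosen only because it reproduces those two shapes. The table lists just the two kernel names visible in this PR's diff, and every sketch_* identifier is hypothetical. A NULL result stands for "no VL-specific packm kernel implemented", which is the 256-bit, 10xk case being discussed.

```c
#include <stddef.h>

/* Hypothetical table entry (not a BLIS type): an optimized packm
 * kernel is tied to both a vector length and a panel dimension. */
typedef struct {
    size_t      vl_bits;     /* SVE vector length the kernel was written for */
    size_t      panel_dim;   /* panel width it packs (the "8" in 8xk) */
    const char *kernel_name; /* name as it appears in the diff above */
} sketch_packm_entry_t;

/* Only the kernels named in this conversation are listed. */
static const sketch_packm_entry_t sketch_packm_table[] = {
    { 256,  8, "bli_dpackm_armsve256_int_8xk"  },
    { 512, 10, "bli_dpackm_armsve512_asm_10xk" },
};

/* ASSUMPTION: double-precision m_r spans two vectors of 64-bit
 * elements, giving 8 at 256 bits and 16 at 512 bits. */
size_t sketch_dgemm_mr(size_t vl_bits)
{
    return 2 * (vl_bits / 64);
}

/* Returns the kernel name for (vl_bits, panel_dim), or NULL when no
 * VL-specific kernel is listed for that combination. */
const char *sketch_find_packm(size_t vl_bits, size_t panel_dim)
{
    size_t n = sizeof(sketch_packm_table) / sizeof(sketch_packm_table[0]);
    for (size_t i = 0; i < n; ++i) {
        if (sketch_packm_table[i].vl_bits == vl_bits &&
            sketch_packm_table[i].panel_dim == panel_dim) {
            return sketch_packm_table[i].kernel_name;
        }
    }
    return NULL;
}
```

Here sketch_find_packm(256, 10) returns NULL even though a 10xk kernel exists for 512 bits: the panel size is the same, but the kernel was written for a different vector length.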
|
SVE isn't great for short vectorization 😢 |
|
Agreed. It's quite handy when doing inner products though ;) |
Done :D Seems that Travis CI is not triggered. |
|
@fgvanzee we're out of credits again 😭 |
|
FYI, I've set up this kind of thing in my dev branch :D https://github.com/xrq-phys/blis/blob/main-dev/.github/workflows/auto-release.yml
|
Credits are replenished. I also created a Google Calendar reminder (on a 20-day cycle). |
Trying to address the a64fx part of #535.
Problem: The directive #if __has_include(<arm_sve.h>) will be inserted into blis.h. This will break some compilers (e.g., when we compile BLIS with GCC but want to link it to apps compiled with the vendor CC, the latter would break if it's not GCC 5 or Clang-compatible).
__has_include(<arm_sve.h>) works?
<arm_sve.h> exists?
Maybe it's better to leave all kernels defined and raise runtime errors when <arm_sve.h> is not found?
The above approach is deprecated.
See Devin's comments for our more stable way.