Arm64 & ArmSVE: Remove Generic Stride Assembly #608

Merged: devinamatthews merged 4 commits into flame:master from xrq-phys:arm-nogen on Feb 7, 2022

xrq-phys (Collaborator) commented Feb 3, 2022

I removed assembly-level generic stride support from the armv8a & armsve kernels to cope with #583.

  • Test for NEON.
  • Test for SVE.
  • Use predicates only in the m-dimension.
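
For readers following along, here is a minimal sketch of one way generic stride can be handled outside the assembly: the kernel writes a contiguous temporary tile, and plain C scatters it into C using the row and column strides. This is an assumption about the general approach, not the code from this PR or #583. All names here (MR, NR, ukernel_contig, gemm_tile_gen_stride) are hypothetical, and the C loop in ukernel_contig only stands in for an assembly microkernel.

```c
#include <assert.h>
#include <stddef.h>

enum { MR = 4, NR = 4 };  /* hypothetical micro-tile dimensions */

/* Stand-in for an assembly microkernel that supports only contiguous,
   column-major output: ct[i + j*MR] = sum_p a[i + p*MR] * b[p*NR + j]. */
static void ukernel_contig(size_t k, const double *a, const double *b,
                           double *ct)
{
    for (size_t j = 0; j < NR; ++j)
        for (size_t i = 0; i < MR; ++i) {
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p)
                acc += a[i + p * MR] * b[p * NR + j];
            ct[i + j * MR] = acc;
        }
}

/* C := beta*C + alpha*A*B for one MR x NR tile of C, where C may have
   arbitrary row stride rs and column stride cs. The strides are applied
   here in C, so the kernel itself never sees them. */
static void gemm_tile_gen_stride(size_t k, double alpha,
                                 const double *a, const double *b,
                                 double beta, double *c,
                                 ptrdiff_t rs, ptrdiff_t cs)
{
    double ct[MR * NR];  /* contiguous temporary tile */

    assert(a != NULL && b != NULL && c != NULL);
    ukernel_contig(k, a, b, ct);

    for (size_t j = 0; j < NR; ++j)
        for (size_t i = 0; i < MR; ++i) {
            double *cij = c + (ptrdiff_t)i * rs + (ptrdiff_t)j * cs;
            *cij = beta * *cij + alpha * ct[i + j * MR];
        }
}
```

The extra copy costs some bandwidth, but it only runs in the uncommon generic-stride case and keeps the assembly limited to the contiguous layouts.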

xrq-phys marked this pull request as ready for review February 4, 2022 18:40
No need to query MR during kernel runtime.

xrq-phys (Collaborator, Author) commented Feb 5, 2022

@devinamatthews Hi Devin.

Sorry for the one-month delay. I think I've put the SVE kernels into a better form for #583.

xrq-phys (Collaborator, Author) commented Feb 5, 2022

[Image: zgemm]

devinamatthews (Member) commented

@xrq-phys looks like the problem I had in #609 is fixed by appending %= to labels as we did on mac. Can you update this PR to add %= to all armsve/a64fx labels so I don't get a messy merge by doing it in #609?

Otherwise, is this PR ready to go?

For clang (& armclang?) compilation.

Hopefully solves flame#609.

xrq-phys (Collaborator, Author) commented Feb 7, 2022

@devinamatthews Adopted the same set of labeling macros as for Apple. Hope I didn't miss any instruction.
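
As background on the %= suffix discussed above: in GCC/Clang extended inline assembly, %= expands to a number that is unique to each instance of the asm statement, so a label stays unique even when the enclosing function is inlined several times. The sketch below is only an illustration. The LABEL macro and the functions bump and bump_twice are hypothetical and are not the macros used in the BLIS kernels.

```c
#include <assert.h>

/* Hypothetical helper: builds a label whose name ends in "%=", which the
   compiler replaces with a number unique to each asm statement instance. */
#define LABEL(name) "." #name "%=:\n\t"

static inline __attribute__((always_inline)) int bump(int x)
{
    assert(x >= 0);  /* demo restriction: non-negative counters only */

    /* Extended asm (note the colons), so "%=" is substituted. With a
       fixed label name, two inlined copies of this function could emit
       the same label twice and the assembler would reject the file. */
    __asm__ volatile (
        LABEL(BUMP_ENTRY)
        "nop\n\t"
        : /* no outputs */
        : /* no inputs */
        : "memory");

    return x + 1;
}

/* Inlines bump() twice into one function body. */
int bump_twice(int x)
{
    return bump(bump(x));
}
```

Branch instructions that target such a label need the same %= suffix on the label name so that both ends expand to the same number within one asm statement.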

devinamatthews merged commit 2f3872e into flame:master Feb 7, 2022