[AArch64] NEON, SVE2 and SME2 instruction support with tests #439
Merged
75 commits
51ade58
Fixed execution logic for UMINP and UMAXP neon instructions.
FinnWilkinson 6a11d7d
Implemented ldrsb (32-bit, Post) instruction with test.
FinnWilkinson 520324c
Fixed implementation of NEON CMHS instruction.
FinnWilkinson 2b4a886
Implemented UCVTF (fixed-point to float) instruction with test.
FinnWilkinson e43ada7
Implemented UCVTF (fixed-point to float) helper function.
FinnWilkinson 4773af8
Implemented UDOT (by element) NEON instructions with tests.
FinnWilkinson 50a8a20
Implemented LD1 (NEON 8h x2, post index) instruction with tests.
FinnWilkinson 6696d5f
Implemented NEON UMLAL (32 to 64 bit) instruction with tests.
FinnWilkinson bb5096a
Implemented NEON UMLAL2 (32 to 64 bit) instruction with tests.
FinnWilkinson 09d6506
Implemented NEON ST1 (single vector, post index) instruction with tests.
FinnWilkinson f6e7c03
Implemented NEON LD1 (single vector, post index, 8b) instruction with…
FinnWilkinson 74e9b47
Implemented SVE LD1RQB (imm offset) instruction with tests.
FinnWilkinson 4daf705
Implemented SVE LD1RQB (reg offset) instruction with tests.
FinnWilkinson 810a324
Implemented SVE UDOT (4-way, indexed) instruction and tests.
FinnWilkinson 2db08ae
Implemented SVE ZIP1+2 (byte) instructions and tests.
FinnWilkinson 7ac89e8
Implemented SVE faddv (float and double) instructions and tests.
FinnWilkinson bb73761
Implemented SVE PTRUE (as counter) instructions with tests.
FinnWilkinson 9febab0
Added paciasp and autiasp empty execution logic.
FinnWilkinson b45d8d7
Implemented NEON UMULL (uint16 to uint32) instruction and tests.
FinnWilkinson 6383d98
Implemented RDSVL and tests.
FinnWilkinson 6416237
Implemented ZERO {zt0} instruction with test.
FinnWilkinson 9a3dc35
Implemented ld1d (4 consec vecs, uint64) SVE instruction with tests, …
FinnWilkinson 4873657
Implemented ld1d (2 consec vecs, uint64) SVE instruction with tests.
FinnWilkinson b82ec90
Implemented SME mova (tile to vec, 4 regs, 8-bit) instruction with te…
FinnWilkinson 89d7501
Implemented pred-as-counter to pred_as_mask function, and added unit …
FinnWilkinson ad5bd87
Implemented st1d (2 consec vecs, uint64) SVE2 instruction with tests.
FinnWilkinson 9bf115a
Implemented st1d (2 consec vecs, uint64, scalar offset) SVE2 instruct…
FinnWilkinson ff8bb58
Implemented LD1W (2 vec and 4 vec, imm offset) SVE2 instructions with…
FinnWilkinson c40e9f4
Implemented LD1W (2 vec, scalar offset) SVE2 instruction with tests.
FinnWilkinson 5f4fd1c
Implemented ST1W (2 vec, imm and scalar offset) SVE2 instructions wit…
FinnWilkinson 7a717e1
Implemented LD1B (2 vec, imm and scalar offset) SVE2 instructions wit…
FinnWilkinson 6dca410
Implemented UMOPA (8-bit to 32-bit widening uint) SME instruction wit…
FinnWilkinson 8b1f9e7
Implemented LD1B (4 vec, imm offset) SVE2 instruction with tests.
FinnWilkinson 5325d3f
Implemented UDOT (4-way, VGx4 8-bit to 32-bit widening, indexed vecto…
FinnWilkinson 7125a40
Implemented MOVA (array to vecs, 4 registers) SME instruction with te…
FinnWilkinson c6da568
Implemented ST1W (4 vec, imm offset) SVE2 instructions with tests.
FinnWilkinson 7e2f9a4
Fixed SVE udot execution logic.
FinnWilkinson 6772b66
Fixed issue with LD1B SVE2 (4 vec) instruction.
FinnWilkinson ab80ba7
Implemented FMLA (float, double, VGx4, indexed) SME instruction with …
FinnWilkinson 9e762b8
Implemented st1d (4 consec vecs, uint64, imm offset) SVE2 instruction…
FinnWilkinson 7de0082
Added NEON bf16 UDOT (by element) instruction execution logic and BF1…
FinnWilkinson 14a79d8
Implemented ld1b (4 strided vectors, imm and reg offset) instructions…
FinnWilkinson 2db03bc
Implemented UVDOT (VGx4 8-bit to 32-bit widening, indexed vector) SME…
FinnWilkinson 68038b7
Implemented ST4W (imm offset) SVE instruction with tests.
FinnWilkinson 4a8f3f6
Implemented LD1W (4 vec, scalar offset) SVE2 instruction with tests.
FinnWilkinson 3d5b288
Implemented FMLA (float, VGx4) SME instruction with tests.
FinnWilkinson b9dcabe
Implemented MOVA (array to vecs, 2 registers) SME instruction with te…
FinnWilkinson b988e01
Implemented FADD (float, vgx2) SME instruction with tests.
FinnWilkinson 4f75ffe
Implemented LD1D (4 vec, scalar offset) SVE2 instruction with tests.
FinnWilkinson f35472b
Implemented FMLA (double, VGx4) SME instruction with tests.
FinnWilkinson 1bf3306
Implemented FADD (double, vgx2) SME instruction with tests.
FinnWilkinson 4effde4
Implemented LD1H (Single vec, imm offset) SVE instruction with tests.
FinnWilkinson 40bba12
Added SVE bf16 DOT (indexed) instruction execution logic.
FinnWilkinson 3932360
Implemented LD1H (two vec, imm and scalar offset) SVE instruction wit…
FinnWilkinson 5aad523
Implemented BFMOPA (widening) SME instruction.
FinnWilkinson 430c775
Minor UMAXP fix.
FinnWilkinson a01c2fc
Fixed function comment.
FinnWilkinson 9790c6e
Updated BF16 comment.
FinnWilkinson 5bc9330
Implemented NEON UDOT (by vector) instruction with tests.
FinnWilkinson 1fd130c
Implemented SVE UDOT (by vector, 4-way) instruction with tests.
FinnWilkinson 81ddba7
Implemented SVE ST4W (scalar offset) instruction with tests, and chan…
FinnWilkinson 4c99a0f
Implemented LD1B (4 vec, scalar offset) SVE2 instruction with tests.
FinnWilkinson 0d74234
Implemented UDOT (4-way, VGx4 8-bit to 32-bit widening) SME instructi…
FinnWilkinson 40a0fa4
Implemented ADD (uint32, vgx2, vectors and ZA), SME instruction with …
FinnWilkinson 950de41
Implemented ZIP (4 vectors) SVE2 instruction with tests.
FinnWilkinson 03a95e7
Attended PR comments.
FinnWilkinson 6729363
Minor bug fixes.
FinnWilkinson 850b741
Attended PR comments.
FinnWilkinson 1d04096
Updated multi-vector load logic.
FinnWilkinson 246d39a
CI CD fixes.
FinnWilkinson 0ec0b8d
CI CD fixes pt2.
FinnWilkinson 6110bce
Attended PR comments.
FinnWilkinson affba83
Merge remote-tracking branch 'origin/dev' into sme-loops-support
FinnWilkinson aa7d937
Merge remote-tracking branch 'origin/dev' into sme-loops-support
FinnWilkinson 2e34160
Bracket removed.
FinnWilkinson