Add support for AMX instructions #5818
Merged
68 commits
036b037 Add support for AMX tile instructions (jwlawson)
2bd3453 Make AMX transform opt-in with memory type (jwlawson)
34e1a4c Clean up tiled_matmul test (jwlawson)
dfeac55 Handle AMX intrinsic attributes better (jwlawson)
a9f84de Format (jwlawson)
da04b0a Fix test to behave like other tests (jwlawson)
9923040 Add doc and missing load check (jwlawson)
1a1d10a Format (jwlawson)
d54fe24 Throw error if user requests AMX for invalid operation (jwlawson)
228beda Add Tile lowering pass to makefile (jwlawson)
673480a Use spaces in Makefile (jwlawson)
f91d79f Place AMX instrinsics into a separate module (x86_amx.ll)
9fc4366 Merge branch 'master' into pr/5818 (steven-johnson)
16a1c7b Fix CreateAlignedLoad() call in CodeGen_X86 (steven-johnson)
b3b1dfa Merge branch 'master' into tile_matmul
03f894a Merge branch 'master' into tile_matmul
97890dd Merge branch 'master' into tile_matmul
293ca90 Merge branch 'master' into tile_matmul
4b950e5 Merge branch 'master' into tile_matmul
d0e5123 fix exporting to module (frengels)
9189120 add llvm funcs for su, us, uu amx variants (frengels)
75c4262 add other amx intrinsics to intrinsic_defs (frengels)
09f7551 match with unsigned 8 bit integers (frengels)
5dd1471 match for 32 bit integer and guard unsigned amx on llvm 13 (frengels)
7e45c29 adjust test to cover unsigned tile operations (frengels)
4ab681b guard properly with llvm 12 (frengels)
6339ae7 create explicit error if failed to use tile operations (frengels)
525e11e pass types as template params rather than boolean (frengels)
5a21484 clang-format patch
7950614 add x86_amx to makefile's runtime components (frengels)
4a6c10c make tiled_matmul compatible with c++11 (frengels)
02c375a Merge remote-tracking branch 'upstream/master' into tile_matmul (frengels)
f985644 add mattrs required for amx (frengels)
7df3d5e Merge pull request #3 from frengels/tile_matmul (mcleary)
e3f1ef6 fix formatting issues (frengels)
57b6080 remove outdated FIXME comments (frengels)
ef9d544 Merge pull request #4 from frengels/tile_matmul (mcleary)
9c6bfbc Merge branch 'master' into tile_matmul
b5c46d2 Merge branch 'master' into tile_matmul
5af3dae Merge remote-tracking branch 'upstream/master' into tile_matmul (frengels)
55098f3 add bf16 tile operations to the runtime (frengels)
9f078f0 create a schedule that should map to amx (frengels)
e73702d create full amx-bf16 schedule (frengels)
16217b4 allow amx operations to yield f32s (frengels)
c722610 accept 32 bit float stores (frengels)
66885f1 add support for bf16 (frengels)
97fb022 add missing bf16 intrinsics (frengels)
f6ba739 fix striding error when loading matrix (frengels)
c7278b6 add checks to verify bf16 result (frengels)
5e81a72 fix scaling of col_bytes on matmul call (frengels)
ea74fe2 move brace to previous line (frengels)
a854dc9 derive result type using a function rather than lambda (frengels)
d820737 Merge remote-tracking branch 'upstream/master' into tile_matmul_bf16 (frengels)
26014d2 run clang tidy and format (frengels)
34557cb have tile_store return i32 (frengels)
5ad06e0 make is_3d_tile_index robust to indexing changes (frengels)
f0f9f3e Merge remote-tracking branch 'upstream/master' into tile_matmul_bf16 (frengels)
7cab155 apply formatting suggestions (frengels)
8f83544 both first and second can be const qualified (frengels)
b1e1452 remove trailing whitespace in unformatted section (frengels)
95d38f0 Merge branch 'master' into pr/5818 (steven-johnson)
14df0bc make requested style changes (frengels)
8b63d77 rename NewMatmul -> Matmul (frengels)
6a5eeaa fix warning about missing return value (frengels)
cc7c97d use get_1d_tile_index to handle special case (frengels)
014f0c6 add correctness test for AMX instructions (frengels)
655dbdf correctness part has been separated out (frengels)
abc660b remove unused variables (frengels)