[SVE] Implement scalable vectors in TVM #16347
cbalint13
left a comment
One minor nit to the code part.
Hi @ekalda, very nice work! I can't comment much, but I think this implementation is a careful step toward SVE.
@tvm-bot rerun
This prototype is to accompany the open SVE RFC. It implements the design outlined in the RFC.
The main changes to the stack include:
1. tir.split can accept an expression with vscale as a factor
2. LoopVectorizer can create Ramp and Broadcast nodes with scalable lanes
3. BufferLoad and BufferStore nodes can accept an optional predicate which is created in LoopVectorizer
4. LLVM codegen can lower the scalable predicated vectors into llvm.masked.* intrinsics
The prototype is currently missing tir.tile and TVMScript parser support for predicates.
Co-authored-by: Luke Hutton <luke.hutton@arm.com>
Co-authored-by: Neil Hickey <neil.hickey@arm.com>
Change-Id: I7d90c8b8396ba7a2b609a91bea2fe5f599d5cb96
Change-Id: I2e994ffdacaf1dacdc875c2cfd62be433e6952c6
This commit adds support for expressing and printing buffer loads/stores
in TVMScript.
The buffer API has been extended with load and store methods which
support passing a predicate parameter to BufferLoad/Store. When the
printer encounters a predicated BufferLoad/Store, it will print with
the .load/.store syntax as opposed to the shorthand [...] syntax as
it is easier to represent predicates.
Extending the functionality of vload and vstore was considered but they
do not currently support expressing loading/storing of non-consecutive
values and such a change will result in many changes across the
codebase.
An example of a predicated load and store in TVMScript:
```
A.load(
[T.Ramp(i_0 * 4, 1, 4)],
predicate=T.get_active_lane_mask("int1x4", i_0 * 4, 14),
)
B.store(
T.Broadcast(T.float32(1), 4),
[T.Ramp(i_0 * 4, 1, 4)],
predicate=T.get_active_lane_mask("int1x4", i_0 * 4, 14),
)
```
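To make the predicate in the example above concrete, here is a minimal plain-Python sketch of what a predicated load and store do over a 14-element buffer with 4 lanes. It does not use TVM: `ramp`, `broadcast`, `get_active_lane_mask`, `predicated_load` and `predicated_store` are hypothetical helpers written for illustration, and their signatures differ from the TVMScript calls. The mask rule (lane `i` is active when `base + i < limit`) follows the semantics of LLVM's `llvm.get.active.lane.mask`; the `fill` value for inactive lanes is an assumption of this sketch.

```python
def ramp(base, stride, lanes):
    """Indices base, base + stride, ... (one per lane), like a Ramp node."""
    return [base + stride * i for i in range(lanes)]


def broadcast(value, lanes):
    """The same value in every lane, like a Broadcast node."""
    return [value] * lanes


def get_active_lane_mask(base, limit, lanes):
    """Lane i is active when base + i < limit (hypothetical helper)."""
    return [base + i < limit for i in range(lanes)]


def predicated_load(buf, indices, mask, fill=0.0):
    """Read only the active lanes; inactive lanes get `fill` (an assumption)."""
    return [buf[idx] if active else fill for idx, active in zip(indices, mask)]


def predicated_store(buf, values, indices, mask):
    """Write only the active lanes; inactive lanes leave memory untouched."""
    for idx, value, active in zip(indices, values, mask):
        if active:
            buf[idx] = value


N, LANES = 14, 4
A = [float(i) for i in range(N)]
B = [0.0] * N
loaded = []

# 14 is not a multiple of 4, so the last iteration has two inactive lanes.
for i_0 in range((N + LANES - 1) // LANES):
    indices = ramp(i_0 * LANES, 1, LANES)
    mask = get_active_lane_mask(i_0 * LANES, N, LANES)
    lanes = predicated_load(A, indices, mask)
    loaded.extend(v for v, active in zip(lanes, mask) if active)
    predicated_store(B, broadcast(1.0, LANES), indices, mask)
```

In the final iteration the indices 14 and 15 fall outside both buffers, and the mask is what keeps the load and store from touching them.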
Change-Id: I1305c1b5d052ad109232604c6660e40d4a566dd6
Change-Id: I8a893377d7361bed8840a645fbaff81ab401ccb8
Plumb this argument through the te.split implementation to achieve feature
parity with tir.split.
Change-Id: I92707bc08e857b9fd5678153d998aeecebcce228
This prototype is to accompany the open SVE RFC. It implements the design outlined in the RFC.
The main changes to the stack include:
1. `tir.split` can accept an expression with vscale as a factor
2. `LoopVectorizer` can create `Ramp` and `Broadcast` nodes with scalable lanes
3. `BufferLoad` and `BufferStore` nodes can accept an optional predicate which is created in `LoopVectorizer`
4. LLVM codegen can lower the scalable predicated vectors into `llvm.masked.*` intrinsics

The prototype is currently missing `tir.tile` and TVMScript parser support for predicates.
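As a rough illustration of why a vscale-dependent split factor needs predication, the following plain-Python sketch models a loop of 14 iterations split by `4 * vscale` and vectorized over the inner axis. It is not TVM code: `vectorized_fill` and `base_lanes` are hypothetical names, and the real vscale is a hardware constant unknown at compile time, which is simulated here by trying several values.

```python
def vectorized_fill(n, vscale, value=1.0, base_lanes=4):
    """Fill n elements using vectors of base_lanes * vscale lanes (a sketch)."""
    lanes = base_lanes * vscale  # scalable lane count, e.g. 4 * vscale
    out = [0.0] * n
    masks = []
    # Outer loop produced by splitting the original loop by `lanes`.
    for outer in range((n + lanes - 1) // lanes):
        base = outer * lanes
        # Predicate: a lane is active only while it is inside the loop extent.
        mask = [base + lane < n for lane in range(lanes)]
        masks.append(mask)
        for lane, active in enumerate(mask):
            if active:
                out[base + lane] = value
    return out, masks


# The same loop gives the same result whatever the vector length turns out to be.
results = {vs: vectorized_fill(14, vs) for vs in (1, 2, 4)}
```

The number of outer iterations and the shape of the tail mask change with vscale, but the stored result does not, which is the property the predicate is there to guarantee.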