[Hexagon] 3-stage pipeline; multi queue async DMA for cache read / write #12954
CC @masahi
sched = tvm.testing.parameter("cache_read", "cache_read_write")

@tvm.testing.fixture
With this PR, we could technically test any n-stage pipeline, correct (doesn't have to be limited to 3-stage)?
Correct. This allows any number of cache_read and cache_write stages to be lowered to Async DMA on Hexagon. Note that there is a known issue with cache_read for an op with multiple inputs in the same stage, which will be addressed in a future PR. That PR will change the compute in this test from a + 1 to a + b and add support for lowering the cache_read of both a and b in the same stage to Async DMA.
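To make the 3-stage idea concrete, here is a minimal pure-Python sketch of a read / compute / write software pipeline for the a + 1 compute mentioned above. It is a toy model, not the TVM schedule or the Hexagon lowering: the function name `pipelined_add_one`, the `chunk` parameter, and the staging dictionaries are all hypothetical, and the stages run one after another here, whereas Async DMA would let the copies overlap with compute.

```python
def pipelined_add_one(a, chunk):
    """Toy 3-stage pipeline for a + 1: read -> compute -> write, one chunk per stage.

    Assumes len(a) is a multiple of chunk (an assumption of this sketch).
    """
    n_chunks = len(a) // chunk
    out = [None] * len(a)
    read_buf = {}   # chunk index -> data staged by the "cache_read" stage
    write_buf = {}  # chunk index -> results waiting for the "cache_write" stage
    trace = []      # order in which the stages fired, for inspection

    # Two extra steps at the end drain the compute and write stages.
    for step in range(n_chunks + 2):
        r, c, w = step, step - 1, step - 2
        if 0 <= w < n_chunks:
            # Stage 2 ("cache_write"): copy finished results to the output.
            out[w * chunk:(w + 1) * chunk] = write_buf.pop(w)
            trace.append(("write", w))
        if 0 <= c < n_chunks:
            # Stage 1 (compute): a + 1 on the chunk read in the previous step.
            write_buf[c] = [x + 1 for x in read_buf.pop(c)]
            trace.append(("compute", c))
        if 0 <= r < n_chunks:
            # Stage 0 ("cache_read"): stage the next input chunk.
            read_buf[r] = a[r * chunk:(r + 1) * chunk]
            trace.append(("read", r))
    return out, trace


result, trace = pipelined_add_one(list(range(8)), chunk=2)
```

In steady state each step touches three different chunks (write chunk i-2, compute chunk i-1, read chunk i), which is what makes the read and write copies candidates for asynchronous transfer.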
tmoreau89 left a comment
Thanks for the PR Adam, LGTM (left a few nits)
@tvm-bot rerun
Thanks @adstraw, the PR has been merged.
…ite (apache#12954)
* [Hexagon] 3-stage pipeline; multi queue async DMA for cache rd / wr
* add cache_write (no cache_read) schedule to python test
Add `HexagonUserDMA` support for multiple virtual queues, which enables Async DMA for both `cache_read` and `cache_write` while keeping a single descriptor chain to preserve overall FIFO ordering between virtual queues. Tested with runtime unit tests and at the Python level for a simple operator.
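The interaction between virtual queues and a single descriptor chain can be sketched with a small Python model. This is an illustrative toy only, not the `HexagonUserDMA` runtime code: the class `ToyUserDMA`, its methods `copy` and `wait`, and the list-based "memory" are assumptions made for this example. It shows the one property described above: because every descriptor goes onto one chain, copies retire in global issue order even when a caller waits on only one queue.

```python
from collections import deque


class ToyUserDMA:
    """Toy model: several virtual queues share one FIFO descriptor chain."""

    def __init__(self, num_queues):
        self.chain = deque()               # (queue_id, descriptor) in issue order
        self.in_flight = [0] * num_queues  # unfinished copies per virtual queue

    def copy(self, queue_id, dst, dst_off, src, src_off, length):
        """Append a descriptor for queue_id; no data is moved yet."""
        self.chain.append((queue_id, (dst, dst_off, src, src_off, length)))
        self.in_flight[queue_id] += 1

    def _retire_oldest(self):
        """Complete the descriptor at the head of the shared chain."""
        queue_id, (dst, dst_off, src, src_off, length) = self.chain.popleft()
        dst[dst_off:dst_off + length] = src[src_off:src_off + length]
        self.in_flight[queue_id] -= 1

    def wait(self, queue_id, max_in_flight):
        """Retire descriptors until queue_id has at most max_in_flight pending."""
        while self.in_flight[queue_id] > max_in_flight:
            self._retire_oldest()


src = list(range(8))
cache_a = [0] * 8
cache_b = [0] * 8

dma = ToyUserDMA(num_queues=2)
dma.copy(0, cache_a, 0, src, 0, 4)  # queue 0, issued first
dma.copy(1, cache_b, 0, src, 4, 4)  # queue 1, issued second
dma.copy(0, cache_a, 4, src, 4, 4)  # queue 0, issued third

# Waiting on queue 1 also retires the queue 0 copy that is ahead of it in
# the chain; the later queue 0 copy stays pending.
dma.wait(1, 0)
pending_after_first_wait = list(dma.in_flight)
```

A per-queue in-flight counter lets a read stage and a write stage wait independently, while the shared chain keeps the hardware-visible ordering simple.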