@JackWeiw
Contributor

The compact_buffer_region pass changes the shared buffer's strides so that stride[0] is

T.int64(72) * T.min((n + T.int64(63)) // T.int64(64) * T.int64(64), T.int64(96)) and stride[1] is T.int64(72),
but the LowerOpaqueBlock pass then reports an error:
InternalError: Check failed: (is_zero(floormod(buffer->strides[i - 1], buffer->strides[i]))) is false:

For a more detailed discussion, see here.
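
As a sanity check on the two stride expressions quoted above, here is a minimal plain-Python sketch. It does not use TVM, and the names `stride0`, `STRIDE1`, and `divisible_for_all` are hypothetical, introduced only for illustration. It evaluates the strides for concrete values of `n` and checks that stride[0] is a multiple of stride[1], which suggests the failing check may come from the divisibility not being proven symbolically rather than from the strides being incompatible.

```python
def stride0(n: int) -> int:
    # Mirrors T.int64(72) * T.min((n + 63) // 64 * 64, 96) from the description.
    return 72 * min((n + 63) // 64 * 64, 96)


# Mirrors stride[1] = T.int64(72) from the description.
STRIDE1 = 72


def divisible_for_all(max_n: int) -> bool:
    # Numeric counterpart of is_zero(floormod(strides[0], strides[1])),
    # checked for every n in [1, max_n].
    return all(stride0(n) % STRIDE1 == 0 for n in range(1, max_n + 1))


if __name__ == "__main__":
    print(stride0(1), stride0(65))
    print(divisible_for_all(4096))
```

Because `stride0` always has 72 as a factor, the remainder is zero for every `n` tried here. This is only a numeric spot check, not a proof about what the pass does internally.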

@JackWeiw
Contributor Author

CC @Lunderberg @wrongtest-intellif

Contributor

@Lunderberg Lunderberg left a comment

Thank you for retargeting on main, and looks good to me! The current CI failure is due to incorrect python formatting, and looks like you just need to run the black formatter on test_tir_transform_lower_opaque_block.py.

@JackWeiw
Contributor Author

> Thank you for retargeting on main, and looks good to me! The current CI failure is due to incorrect python formatting, and looks like you just need to run the black formatter on test_tir_transform_lower_opaque_block.py.

This PR has passed all checks. Could you please approve it so that it can be merged?
I will open another PR to fix a bug in the InjectPTXAsyncCopy pass after this PR has been merged.
Thank you very much.

3 participants