Skip to content

Vectorization cleanup#393

Merged
zasdfgbnm merged 15 commits intomainfrom
vectorization-cleanup
May 26, 2023
Merged

Vectorization cleanup#393
zasdfgbnm merged 15 commits intomainfrom
vectorization-cleanup

Conversation

@zasdfgbnm
Copy link
Collaborator

@zasdfgbnm zasdfgbnm commented May 23, 2023

Currently, vectorization analysis is done by first calling getInnerDimVectorizableWidth to compute teh vectorization factor of the innermost dimension, then expand that to contiguous merge of getMaxVectorizableWidth. However, these two functions are very similar, almost a copy-paste of each other. Therefore, it makes no sense to use getInnerDimVectorizableWidth then getMaxVectorizableWidth, we should remove getInnerDimVectorizableWidth and go directly to getMaxVectorizableWidth.

This PR removes getInnerDimVectorizableWidth, anded a boolean option contig_merge to getMaxVectorizableWidth to reproduce the behavior of getInnerDimVectorizableWidth. The contig_merge=false is only used in the transpose scheduler, where the which dimensions are contiguously merge to which ID is more complicated and is not considered in heuristics.

Unfortunately, because the variable name change, I can not diff the kernel to check if they match exactly or not, but I do briefly skimmed through the diff by eye, and I don't find any change in the vectorization factor.

The V100 failure is OOM.

return vectorize_size;
}

size_t getExpandedVectorization(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A big portion of getVectorizationFactor has been removed, so I just cut-paste the function body of this function into getVectorizationFactor and remove this function.

@zasdfgbnm zasdfgbnm marked this pull request as draft May 23, 2023 17:25
@zasdfgbnm
Copy link
Collaborator Author

!build

@zasdfgbnm
Copy link
Collaborator Author

!build

@zasdfgbnm zasdfgbnm marked this pull request as ready for review May 24, 2023 21:06
@zasdfgbnm zasdfgbnm requested a review from naoyam May 24, 2023 21:08
@zasdfgbnm
Copy link
Collaborator Author

!build

Copy link
Collaborator

@naoyam naoyam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes make sense to me. Have you found any change in actual vectorization?

@zasdfgbnm
Copy link
Collaborator Author

The changes make sense to me. Have you found any change in actual vectorization?

No, I didn't find any. But I didn't check programmatically. I just used my eye to inspect the output of

difft --language cpp ../Fuser3/old-kernels/ new-kernels/ | grep load

and didn't find any.

@zasdfgbnm zasdfgbnm merged commit 88a782b into main May 26, 2023
@zasdfgbnm zasdfgbnm deleted the vectorization-cleanup branch May 26, 2023 04:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants