Fix problem with clang-14.0.0 and reference gemm ukr. #854
Merged
devinamatthews merged 1 commit into master on Feb 7, 2025
Details:
- clang 14.0.0 apparently makes some invalid assumptions about whether or not the AB microtile is initialized in the `gemm` reference microkernel. This leads to the "scale by alpha" part doing something strange (all sorts of random and even NaN values pop up). I do not know why this only manifested for `ztrsm` on `skx` (in `zgemm_skx_ref` via `zgemmtrsm_skx_ref`). See #852.
- Aliasing the AB microtile (in the proper datatype) as a pointer to a raw character array, and then initializing the character array with `= { 0 }`, convinces the compiler to do the right thing.
- The problem did not occur in 14.0.6 or 15.0.7. It may be that only a narrow band of versions is problematic.
- This commit adds the char array workaround and fixes #852.
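For illustration, here is a minimal sketch of the kind of workaround described above. It is not the code from this commit: `zgemm_ref_sketch`, `ztype`, `MR`, `NR`, and the packing layout are all hypothetical stand-ins, and the real reference microkernel is generated from macros with different tile sizes and alignment handling. The only point is the pattern: the AB microtile lives in a raw `char` array initialized with `= { 0 }`, and the kernel accesses it through a pointer of the proper datatype.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical microtile dimensions and datatype (assumptions for this sketch). */
enum { MR = 2, NR = 2 };
typedef double complex ztype;

/* Sketch of a reference gemm microkernel: C := beta*C + alpha*(A*B).
   Assumed layout: a holds MR elements per k step, b holds NR elements
   per k step, and c is an MR x NR row-major tile. */
void zgemm_ref_sketch(size_t k, ztype alpha, const ztype *a, const ztype *b,
                      ztype beta, ztype *c)
{
    assert(a != NULL && b != NULL && c != NULL);

    /* The workaround pattern: back the AB microtile with a raw character
       array that is explicitly zero-initialized, then alias it through a
       pointer of the proper datatype. */
    _Alignas(ztype) char ab_raw[MR * NR * sizeof(ztype)] = { 0 };
    ztype *ab = (ztype *)ab_raw;

    /* Accumulate the rank-k product into the (already zeroed) microtile. */
    for (size_t l = 0; l < k; ++l)
        for (size_t i = 0; i < MR; ++i)
            for (size_t j = 0; j < NR; ++j)
                ab[i * NR + j] += a[l * MR + i] * b[l * NR + j];

    /* The "scale by alpha" step, which misbehaved under clang 14.0.0. */
    for (size_t i = 0; i < MR * NR; ++i)
        ab[i] *= alpha;

    /* Update the output tile: C := beta*C + AB. */
    for (size_t i = 0; i < MR * NR; ++i)
        c[i] = beta * c[i] + ab[i];
}
```

Calling the sketch with `k = 1`, `a = { I, 2 }`, `b = { I, 1 }`, `alpha = 2`, `beta = 1`, and `c` filled with ones should leave `c` as `{ -1, 1+2i, 1+4i, 5 }`.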
devinamatthews added a commit that referenced this pull request on Feb 7, 2025 (cherry picked from commit 028be42)
devinamatthews added a commit that referenced this pull request on Feb 7, 2025 (cherry picked from commit 028be42) (cherry picked from commit a0d7f26)
Linked issue: #852, "NaN encountered in SKX ztrsm (no 1m)".