Windows ARM runner and build fixes #5979
Merged
ggerganov merged 4 commits into ggml-org:master on Mar 11, 2024
ggerganov
reviewed
Mar 10, 2024
#define GGML_F16x8_ZERO vdupq_n_f16(0.0f)
#define GGML_F16x8_SET1(x) vdupq_n_f16(x)
- #define GGML_F16x8_LOAD(x) vld1q_f16((const __fp16 *)(x))
+ #define GGML_F16x8_LOAD(x) vld1q_f16((const ggml_fp16_internal_t *)(x))
Member
It would call vld1q_f16 with uint16_t * argument.
Does this produce correct results?
Member
> Does this produce correct results?

Hm, yes - no reason not to.
Wouldn't it be better if, instead of introducing ggml_fp16_internal_t, we simply changed these to:
diff --git a/ggml.c b/ggml.c
index 80efa6f2..ac0d15b2 100644
--- a/ggml.c
+++ b/ggml.c
@@ -857,7 +857,7 @@ inline static float vaddvq_f32(float32x4_t v) {
#define GGML_F16x8 float16x8_t
#define GGML_F16x8_ZERO vdupq_n_f16(0.0f)
#define GGML_F16x8_SET1(x) vdupq_n_f16(x)
- #define GGML_F16x8_LOAD(x) vld1q_f16((const __fp16 *)(x))
+ #define GGML_F16x8_LOAD(x) vld1q_f16((const uint16_t *)(x))
#define GGML_F16x8_STORE vst1q_f16
#define GGML_F16x8_FMA(a, b, c) vfmaq_f16(a, b, c)
#define GGML_F16x8_ADD vaddq_f16
@@ -900,7 +900,7 @@ inline static float vaddvq_f32(float32x4_t v) {
#define GGML_F32Cx4 float32x4_t
#define GGML_F32Cx4_ZERO vdupq_n_f32(0.0f)
#define GGML_F32Cx4_SET1(x) vdupq_n_f32(x)
- #define GGML_F32Cx4_LOAD(x) vcvt_f32_f16(vld1_f16((const __fp16 *)(x)))
+ #define GGML_F32Cx4_LOAD(x) vcvt_f32_f16(vld1_f16((const uint16_t *)(x)))
#define GGML_F32Cx4_STORE(x, y) vst1_f16(x, vcvt_f16_f32(y))
#define GGML_F32Cx4_FMA(a, b, c) vfmaq_f32(a, b, c)
#define GGML_F32Cx4_ADD vaddq_f32
Contributor
Author
We'll get some warnings (like incompatible pointer types assigning to 'const __fp16 *' from 'const uint16_t *') from GGML_F16_VEC_LOAD in this case.
But if it's ok, I'll commit the change.
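The trade-off in this thread can be illustrated with a minimal sketch of a storage-type alias in the spirit of ggml_fp16_internal_t. This is not the PR's actual code: the names fp16_storage_t, FP16_PTR and fp16_bits_at are hypothetical, and the sketch assumes that a 16-bit integer is an acceptable stand-in wherever `__fp16` is not declared (as the C2065 error reports for MSVC). Because the cast always targets the alias, the pointer type follows whatever half type the toolchain exposes.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical alias: native half type where the compiler declares it,
 * otherwise a 16-bit integer with the same size and layout. */
#if defined(__ARM_NEON) && !defined(_MSC_VER)
typedef __fp16 fp16_storage_t;
#else
typedef uint16_t fp16_storage_t;
#endif

/* Compile-time check: the cast below is only layout-safe for a 16-bit type. */
typedef char fp16_storage_is_16_bits[sizeof(fp16_storage_t) == 2 ? 1 : -1];

/* Hypothetical helper mirroring the cast inside the LOAD macros. */
#define FP16_PTR(x) ((const fp16_storage_t *)(x))

/* Read the raw bits of element i; memcpy avoids any value conversion. */
static uint16_t fp16_bits_at(const fp16_storage_t *p, size_t i) {
    uint16_t bits;
    memcpy(&bits, p + i, sizeof bits);
    return bits;
}
```

With a buffer such as `uint16_t buf[2] = {0x3C00, 0xC000};` (the half-precision encodings of 1.0 and -2.0), `fp16_bits_at(FP16_PTR(buf), 0)` returns the first element's bits unchanged on either branch of the typedef.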
ggerganov
approved these changes
Mar 11, 2024
Member
ggerganov
left a comment
Let's keep it like proposed
NeoZhangJianyu pushed a commit to NeoZhangJianyu/llama.cpp that referenced this pull request on Mar 12, 2024:
* windows arm ci
* fix `error C2078: too many initializers` with ggml_vld1q_u32 macro for MSVC ARM64
* fix `warning C4146: unary minus operator applied to unsigned type, result still unsigned`
* fix `error C2065: '__fp16': undeclared identifier`
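For readers unfamiliar with warning C4146, here is a small sketch of the kind of pattern that triggers it and one common way to avoid it. The commit's actual change is not shown on this page, so the functions below (lowest_set_bit, negate_u32) are hypothetical illustrations only. The idea is that unsigned negation is well defined modulo 2^32, so the unary minus can be rewritten as a subtraction from zero or as a two's-complement expression with the same result.

```c
#include <stdint.h>

/* Isolate the lowest set bit of x.
 * The familiar idiom `x & -x` applies unary minus to an unsigned value,
 * which is what C4146 warns about. `0u - x` yields the same value
 * modulo 2^32 without using the unary minus operator. */
static uint32_t lowest_set_bit(uint32_t x) {
    return x & (0u - x);
}

/* Equivalent negation written as two's complement: invert, then add one. */
static uint32_t negate_u32(uint32_t x) {
    return ~x + 1u;
}
```

Both forms keep the arithmetic unsigned throughout, so the result does not change; only the spelling that the compiler objects to is removed.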
I would like to add several fixes for building the repo on Windows ARM, as well as a CI runner for automated build control.
* `error C2078: too many initializers` with `ggml_vld1q_u32`, fixed with a macro for MSVC ARM64
* `error C2065: '__fp16': undeclared identifier`
* `warning C4146: unary minus operator applied to unsigned type, result still unsigned`
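As a rough illustration of the first item, the sketch below shows why a four-element initializer can be rejected and how a macro can work around it. It rests on an assumption not stated on this page: that on MSVC ARM64 the 128-bit vector type initializes as two 64-bit lanes, so four 32-bit initializers are "too many". The type vec128_sketch_t, the macro VEC128_FROM_U32 and the helper lane_u32 are hypothetical stand-ins, not the actual ggml_vld1q_u32 definition.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit vector whose first (and therefore
 * brace-initialized) member is a pair of 64-bit lanes. */
typedef union {
    uint64_t u64[2];
    uint32_t u32[4];
} vec128_sketch_t;

/* Pack four 32-bit values into two 64-bit initializers:
 * w and x share lane 0, y and z share lane 1 (low half first). */
#define VEC128_FROM_U32(w, x, y, z)                 \
    { { (uint64_t)(w) | ((uint64_t)(x) << 32),      \
        (uint64_t)(y) | ((uint64_t)(z) << 32) } }

/* Extract 32-bit element i (0..3) by shifting, independent of endianness. */
static uint32_t lane_u32(const vec128_sketch_t *v, int i) {
    return (uint32_t)(v->u64[i / 2] >> (32 * (i % 2)));
}
```

A declaration such as `vec128_sketch_t v = VEC128_FROM_U32(1u, 2u, 3u, 4u);` then supplies exactly two initializers while still encoding all four values, which `lane_u32(&v, i)` can read back.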