Fix locale-dependent float printing in GGUF metadata#17331
Fix locale-dependent float printing in GGUF metadata#17331JohannesGaessler merged 3 commits intoggml-org:masterfrom
Conversation
|
This was a fun one to track down! The locale dependency was subtle enough that most people probably wouldn't notice it unless they happened to be running with a non-English locale. I'm glad I could spot the inconsistency and fix it. |
|
@ggerganov Please let me know if you need further help to finish this PR? |
|
@JohannesGaessler Can you confirm it fixes #10613? |
|
This makes the prints in |
1036665 to
d654be7
Compare
|
Originally all binaries were in the |
yeah, just now I fixed it. |
|
I'm wondering if there is any tricks to avoid adding this to every example. It adds a bit of boilerplate everywhere IMO. Since most examples initialize |
Yes, this is a good idea to put this inside the constructor. I am working on this |
|
Ah sorry, |
|
I don't think we use |
6fd289e to
7196124
Compare
|
@JohannesGaessler for all examples don't use it, we can set it explicitly on main. I think doing it in |
JohannesGaessler
left a comment
There was a problem hiding this comment.
I don't see us changing global settings anywhere in common_params_parse. That is fundamentally not something that a library should be doing upon use unless it is done by an explicit and dedicated function.
|
Technically the global settings are set by I'm ok with both approaches (either explicitly set on |
fbe1865 to
050fc7f
Compare
|
What are we waiting on for this PR? Did I miss something? |
|
Sorry, IIRC there were CI failures where I wasn't sure whether they were being caused by this PR. I meant to look into it again and then forgot. |
|
I want to followup for this PR. please get it reviewed. |
|
In that case, please rebase on top of master, resolve the merge conflicts, and ensure that there are no CI failures. |
Add std::setlocale(LC_NUMERIC, "C") to all 16 binaries in the tools/ directory to ensure consistent floating-point formatting.
a4ef097 to
1fb6374
Compare
|
I am not sure the failure we are seeing is related to this change. |
JohannesGaessler
left a comment
There was a problem hiding this comment.
Yes, the CI failures seem to be unrelated.
* Set C locale for consistent float formatting across all binaries. * Add C locale setting to all tools binaries Add std::setlocale(LC_NUMERIC, "C") to all 16 binaries in the tools/ directory to ensure consistent floating-point formatting. * Apply suggestion from @JohannesGaessler --------- Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* Set C locale for consistent float formatting across all binaries. * Add C locale setting to all tools binaries Add std::setlocale(LC_NUMERIC, "C") to all 16 binaries in the tools/ directory to ensure consistent floating-point formatting. * Apply suggestion from @JohannesGaessler --------- Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* Set C locale for consistent float formatting across all binaries. * Add C locale setting to all tools binaries Add std::setlocale(LC_NUMERIC, "C") to all 16 binaries in the tools/ directory to ensure consistent floating-point formatting. * Apply suggestion from @JohannesGaessler --------- Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
I was running some llama.cpp examples on a system with a German locale (de_DE) and noticed something odd - when llama-cli printed out the model metadata, all the float values had commas as decimal separators (like "0,000000") instead of periods. But when I ran llama-perplexity on the same model, it used periods normally.
After some digging, I found the issue was in the gguf_data_to_str() function in llama-impl.cpp. It was using std::to_string() to format floats, which respects the system's LC_NUMERIC locale setting. So depending on which tool you used and what locale it was running with, you'd get different formatting.
I've changed it to use std::ostringstream with std::locale::classic() instead, which always formats floats with a period as the decimal separator, regardless of the system locale. This should make the output consistent across all tools and locales.
Fixes #10613