Skip to content

util: Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17)#20452

Merged
laanwj merged 1 commit into
bitcoin:masterfrom
practicalswift:remove-atoi
Oct 4, 2021
Merged

util: Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17)#20452
laanwj merged 1 commit into
bitcoin:masterfrom
practicalswift:remove-atoi

Conversation

@practicalswift
Copy link
Copy Markdown
Contributor

Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17).

@laanwj
Copy link
Copy Markdown
Member

laanwj commented Nov 22, 2020

Note that we have a function ParseInt32 (as well as 64 and UInt variants) for this and I once tried to use it in more places, see #17385.

However these didn't turn out to be actually locale-independent. Maybe this can replace them eventually.

Comment thread src/util/strencodings.h Outdated
@laanwj
Copy link
Copy Markdown
Member

laanwj commented Nov 22, 2020

locale-independent std::from_chars(…) (C++17).

This is good to hear at least. From https://en.cppreference.com/w/cpp/utility/from_chars#Notes :

Unlike other parsing functions in C++ and C libraries, std::from_chars is locale-independent, non-allocating, and non-throwing.

Concept ACK.

Comment thread src/util/strencodings.h Outdated
@DrahtBot
Copy link
Copy Markdown
Contributor

DrahtBot commented Nov 22, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #15936 (Unify bitcoin-qt and bitcoind persistent settings by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@laanwj
Copy link
Copy Markdown
Member

laanwj commented Nov 22, 2020

I think we need to make a choice here (either one or both):

  • Make an emulation of all broken and surprising atoi behavior, but call it something else than ToIntegral, something like LocaleIndependentAtoi, make it clear it should not be used in new code, it's only meant for preserving atoi behavior for backward compatibility reasons.
  • Make a sane integer parsing function that can eventually replace Parse[U]IntXX or their contents in a locale-independent way. I'm fine with the name then.

@practicalswift
Copy link
Copy Markdown
Contributor Author

practicalswift commented Nov 22, 2020

@laanwj

I think we need to make a choice here (either one or both):

  • Make an emulation of all broken and surprising atoi behavior, but call it something else than ToIntegral, something like LocaleIndependentAtoi, make it clear it should not be used in new code, it's only meant for preserving atoi behavior for backward compatibility reasons.
  • Make a sane integer parsing function that can eventually replace Parse[U]IntXX or their contents in a locale-independent way. I'm fine with the name then.

Good point. I thought about those options as well. I think we should do both. This PR is meant as a pure refactoring: it is not meant to change any behaviour that is defined by atoi.

I've now made it more clear that this PR is meant as a quirk-by-quirk compatible replacement for atoi by calling the function LocaleIndependentAtoi :)

Good point about naming: we should keep the name ToIntegral for a future sane version that doesn't emulate atoi or any other legacy function.

@laanwj
Copy link
Copy Markdown
Member

laanwj commented Nov 22, 2020

Okay, agree. I guess it could replace atoi as well as atoi64 in that case (as it's parametrized on type)?

@practicalswift
Copy link
Copy Markdown
Contributor Author

@laanwj Done! I've now replaced all instances of atoi64(s) with LocaleIndependentAtoi<int64_t>(s).

Comment thread src/util/strencodings.h Outdated
@laanwj
Copy link
Copy Markdown
Member

laanwj commented Nov 23, 2020

@laanwj Done! I've now replaced all instances of atoi64(s) with LocaleIndependentAtoi<int64_t>(s).

Thanks, looks good to me now except the documentation nit.

@practicalswift practicalswift force-pushed the remove-atoi branch 2 times, most recently from f14668e to 04271a9 Compare November 23, 2020 15:50
@practicalswift practicalswift changed the title Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17) util: Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17) Nov 24, 2020
maflcko pushed a commit that referenced this pull request Nov 24, 2020
… leading +/-/0:s

05c1095 test: Add testing of ParseInt/ParseUInt edge cases with leading +/-/0:s (practicalswift)

Pull request description:

  Add testing of `ParseInt`/`ParseUInt` edge cases with leading `+`/`-`/`0`:s.

  Context: While working on #20457 and #20452 I noticed some edge cases which our unit tests are currently not covering.

ACKs for top commit:
  MarcoFalke:
    review ACK 05c1095
  laanwj:
    Code review ACK 05c1095
  jonatack:
    ACK 05c1095
  promag:
    Code review ACK 05c1095.

Tree-SHA512: bdfb94d8fa0293512dbba89907cb6dd0f8b1418d878267dd6d49c8c397a0e5b9714441345565d41a6a909a1cda052ef7cccece822f355ff604fcf85f2dc8136f
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Nov 24, 2020
…es with leading +/-/0:s

05c1095 test: Add testing of ParseInt/ParseUInt edge cases with leading +/-/0:s (practicalswift)

Pull request description:

  Add testing of `ParseInt`/`ParseUInt` edge cases with leading `+`/`-`/`0`:s.

  Context: While working on bitcoin#20457 and bitcoin#20452 I noticed some edge cases which our unit tests are currently not covering.

ACKs for top commit:
  MarcoFalke:
    review ACK 05c1095
  laanwj:
    Code review ACK 05c1095
  jonatack:
    ACK 05c1095
  promag:
    Code review ACK 05c1095.

Tree-SHA512: bdfb94d8fa0293512dbba89907cb6dd0f8b1418d878267dd6d49c8c397a0e5b9714441345565d41a6a909a1cda052ef7cccece822f355ff604fcf85f2dc8136f
@laanwj
Copy link
Copy Markdown
Member

laanwj commented Nov 25, 2020

I don't get the CI error. Is charconv non-standard in some way?

In file included from primitives/transaction.cpp:10:0:
./util/strencodings.h:16:10: fatal error: charconv: No such file or directory
 #include <charconv>
          ^~~~~~~~~~
compilation terminated.

PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Sep 24, 2021
…es with leading +/-/0:s

05c1095 test: Add testing of ParseInt/ParseUInt edge cases with leading +/-/0:s (practicalswift)

Pull request description:

  Add testing of `ParseInt`/`ParseUInt` edge cases with leading `+`/`-`/`0`:s.

  Context: While working on bitcoin#20457 and bitcoin#20452 I noticed some edge cases which our unit tests are currently not covering.

ACKs for top commit:
  MarcoFalke:
    review ACK 05c1095
  laanwj:
    Code review ACK 05c1095
  jonatack:
    ACK 05c1095
  promag:
    Code review ACK 05c1095.

Tree-SHA512: bdfb94d8fa0293512dbba89907cb6dd0f8b1418d878267dd6d49c8c397a0e5b9714441345565d41a6a909a1cda052ef7cccece822f355ff604fcf85f2dc8136f
fanquake added a commit that referenced this pull request Sep 28, 2021
…irements

182de7b ci: update minimum compiler requirements for std::filesystem (fanquake)
04f5baf doc: update minimum compiler requirements for std::filesystem (fanquake)

Pull request description:

  This increases the minimum required compiler versions to Clang 7 and GCC 8.1. This has been split out of #20744 (migration to `std::filesystem`), as it's also a requirement for some other changes, such as #20452 or #20457 which want to make use of `std::from_chars`. As well as #20435, which is also `std::filesystem` related. Given that the `std::filesystem` changes are moving ahead, splitting out this change to let other PRs take advantage of the new requirements seems worthwhile.

  Clang 7 has been available in Debian since [Stretch (oldoldstable)](https://packages.debian.org/stretch/clang-7) and in Ubuntu since [Bionic (18.04)](https://packages.ubuntu.com/bionic-updates/clang-7). GCC 8 has been available in Debian since [Buster (oldstable)](https://packages.debian.org/buster/gcc) and in Ubuntu since [Bionic (18.04)](https://packages.ubuntu.com/bionic/gcc-8). CentOS 8 also packages GCC 8.

  The CI changes here give us one build with GCC 8, and another using Clang 7 on top of libc++.

  Note that the minimum required libc++ in dependencies.md is unchanged as, at least for `<filesystem>`, and the `*_chars` use cases, libc++ 7 [should be sufficient](https://en.cppreference.com/w/cpp/compiler_support/17).

  I've tested that building `<filesystem>` code using Clang 7 & libc++ works. i.e `clang++-7 -std=c++17 fs.cpp -stdlib=libc++ -lc++fs`. Also that building `<filesystem>` code with Clang 7 and libstdc++ 8 works. i.e `clang++-7 -std=c++17  fs.cpp -lstdc++fs`.

ACKs for top commit:
  MarcoFalke:
    review ACK 182de7b

Tree-SHA512: 5bc151c4be58005711eed6bd8a091f3417f75a0218c11c08cffff9d749edadd965726bb7856a8e693e96e69ed0596989cda1aac4b29fb6d30705b1687a5b3363
@practicalswift
Copy link
Copy Markdown
Contributor Author

Now that #20460 has been merged I think this PR should be ready for final review :)

See also related PR #20457.

…from_chars(…) (C++17)

test: Add test cases for LocaleIndependentAtoi

fuzz: Assert legacy atoi(s) == LocaleIndependentAtoi<int>(s)

fuzz: Assert legacy atoi64(s) == LocaleIndependentAtoi<int64_t>(s)
@jonatack
Copy link
Copy Markdown
Member

Concept ACK

@laanwj
Copy link
Copy Markdown
Member

laanwj commented Sep 30, 2021

Code review ACK 4343f11

Copy link
Copy Markdown
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 4343f11

Comment thread src/rpc/mining.cpp
Comment thread src/util/strencodings.h
// LocaleIndependentAtoi is provided for backwards compatibility reasons.
//
// New code should use the ParseInt64/ParseUInt64/ParseInt32/ParseUInt32 functions
// which provide parse error feedback.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really important comment. I hope we can get rid of these functions entirely at some point. The implicit error handling behavior of atoi and atoi64 is pretty much always undesirable.
(so I would normally comment "these function names are long and clunky, please shorten them" but it's good in this case, we want using them to be ugly 😄 )

@laanwj laanwj merged commit cdb4dfc into bitcoin:master Oct 4, 2021
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Oct 4, 2021
Copy link
Copy Markdown
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

concept ack

Comment thread src/core_read.cpp
Comment thread src/qt/rpcconsole.cpp
Comment thread src/util/strencodings.h
Comment thread src/util/moneystr.cpp
@maflcko
Copy link
Copy Markdown
Member

maflcko commented Oct 5, 2021

Follow up in #23184

Copy link
Copy Markdown
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

withdrawn my comments

Comment thread src/core_read.cpp
Comment thread src/qt/rpcconsole.cpp
Comment thread src/util/moneystr.cpp
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants