8. String to Integer (atoi) #69
Open: hayashi-ay wants to merge 9 commits into main from hayashi-ay-patch-58
Commits:
- f2d1bc2 Create 8. String to Integer (atoi).md (hayashi-ay)
- 6e33c00 Update 8. String to Integer (atoi).md (hayashi-ay)
- a87b9c8 Update 8. String to Integer (atoi).md (hayashi-ay)
- 6b8e489 Update 8. String to Integer (atoi).md (hayashi-ay)
- 6ef0062 Update 8. String to Integer (atoi).md (hayashi-ay)
- 14c45b2 Update 8. String to Integer (atoi).md (hayashi-ay)
- fc06dc3 Update 8. String to Integer (atoi).md (hayashi-ay)
- 4580d68 Update 8. String to Integer (atoi).md (hayashi-ay)
- aa33437 Update 8. String to Integer (atoi).md (hayashi-ay)

@@ -0,0 +1,190 @@

Reading the string one character at a time is enough. There are three phases:
- Skip leading whitespace.
- Determine the sign.
- Read the digits.

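Stop as soon as the value goes past INT_MAX or INT_MIN. Python's int has, in theory, no upper bound, so intermediate overflow does not need to be considered.
Even when implementing this in C or C++, reserving a type wider than 32 bits (long) makes the problem go away. ← That said, overflow of the long itself still has to be handled properly somewhere. In that case, check whether the carry computation would overflow before performing it.
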
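The "check before the carry computation" idea can be sketched in isolation. This is a minimal sketch rather than code from the PR: the helper name `would_overflow` and its `limit` parameter are hypothetical, and it assumes a non-negative accumulator.

```python
INT_MAX = (1 << 31) - 1  # 2147483647


def would_overflow(num: int, digit: int, limit: int = INT_MAX) -> bool:
    """Return True if num * 10 + digit would exceed limit.

    Hypothetical helper: it never computes num * 10 + digit itself,
    so the same comparison also works with fixed-width integers.
    """
    return num > limit // 10 or (num == limit // 10 and digit > limit % 10)


print(would_overflow(214748364, 7))  # False: 2147483647 still fits
print(would_overflow(214748364, 8))  # True: 2147483648 would not fit
```

The caller would clamp to INT_MAX (or INT_MIN) when the helper returns True and only otherwise perform the multiplication.
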
> Integers have unlimited precision.

https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex

1st

Implementation of isdigit: https://github.com/python/cpython/blob/2305ca51448552542b2414186252123a8dc87db7/Objects/bytes_methods.c#L154
`str.isspace()` can also be used to detect whitespace. For LeetCode's requirements, comparing against `' '` is enough.

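The difference between the two whitespace checks shows up only on input that contains whitespace other than `' '`. The sketch below compares them; the function names `skip_spaces_only` and `skip_any_whitespace` are hypothetical and are not part of the PR.

```python
def skip_spaces_only(s: str) -> int:
    """Skip only the ASCII space character, which is all LeetCode requires."""
    index = 0
    while index < len(s) and s[index] == ' ':
        index += 1
    return index


def skip_any_whitespace(s: str) -> int:
    """Skip every character for which str.isspace() is true (tabs, newlines, ...)."""
    index = 0
    while index < len(s) and s[index].isspace():
        index += 1
    return index


text = " \t 42"
print(skip_spaces_only(text))     # 1: stops at the tab
print(skip_any_whitespace(text))  # 3: stops at the first digit
```
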
```python
class Solution:
    def myAtoi(self, s: str) -> int:
        def skip_white_spaces():
            index = 0
            while index < len(s) and s[index] == ' ':
                index += 1
            return index

        def get_sign(begin):
            index = begin
            if index == len(s):
                return index, 1
            if s[index] == '-':
                return index + 1, -1
            if s[index] == '+':
                return index + 1, 1
            return index, 1

        index = skip_white_spaces()
        index, sign = get_sign(index)

        INT_MAX = (1 << 31) - 1
        INT_MIN = - INT_MAX - 1

        number = 0
        while index < len(s) and s[index].isdigit():
            digit = ord(s[index]) - ord('0')
            number = number * 10 + sign * digit
            if number >= INT_MAX:
                return INT_MAX
            if number <= INT_MIN:
                return INT_MIN
            index += 1
        return number
```

Review comment on `def get_sign(begin):` Since this parses characters to obtain the index and the sign, I think the word `parse` fits better than `get`.

> In a 32-bit environment, evaluating (1 << 31) already overflows.

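2nd

A version that checks for overflow before doing the multiplication for the next digit. I also wrote it without splitting it into functions. The processing flows straight from top to bottom, so it may be fine not to extract functions.
If anything, since pos has to be passed around, extracting functions might make it harder to follow.
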
```python
class Solution:
    def myAtoi(self, s: str) -> int:
        pos = 0
        # skip white spaces
        while pos < len(s) and s[pos].isspace():
            pos += 1
        if pos == len(s):
            return 0
        # determine the sign
        sign = 1
        if s[pos] == '+' or s[pos] == '-':
            if s[pos] == '-':
                sign = -1
            pos += 1
        INT_MAX = (1 << 31) - 1
        INT_MIN = - (1 << 31)

        cutoff = INT_MAX // 10
        cutlim = INT_MAX % 10
        if sign == -1:
            cutlim = -INT_MIN % 10

        num = 0
        while pos < len(s) and s[pos].isdigit():
            digit = ord(s[pos]) - ord('0')
            if num > cutoff or (num == cutoff and digit > cutlim):
                if sign == 1:
                    return INT_MAX
                else:
                    return INT_MIN
            num = num * 10 + digit
            pos += 1

        return num * sign
```

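> Because sign is multiplied in only at the end, I suspect this overflows partway through when the value is INT_MIN. Probably so. Changing the check to `digit >= cutlim` would fix it, but then a legitimate in-range INT_MIN can no longer be told apart from an INT_MIN produced by overflow.
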
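To make that concern concrete, the sketch below replays the 2nd version's accumulation for `-2147483648` and reports the value `num` reaches. Python's int never overflows, so the comparison against INT_MAX merely stands in for what a 32-bit signed type could not hold. The helper name `max_intermediate` is hypothetical.

```python
INT_MAX = (1 << 31) - 1
INT_MIN = -(1 << 31)


def max_intermediate(digits: str, sign: int) -> int:
    """Replay the 2nd version's loop and return the largest value num takes.

    Hypothetical helper for illustration; digits must contain only 0-9.
    num never decreases, so its final value is also its largest.
    """
    cutoff = INT_MAX // 10
    cutlim = INT_MAX % 10 if sign == 1 else -INT_MIN % 10
    num = 0
    for ch in digits:
        digit = ord(ch) - ord('0')
        if num > cutoff or (num == cutoff and digit > cutlim):
            break  # the real code returns INT_MAX / INT_MIN here
        num = num * 10 + digit
    return num


peak = max_intermediate("2147483648", sign=-1)
print(peak > INT_MAX)  # True: the magnitude 2147483648 does not fit in 32 bits
```
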
3rd
```python
class Solution:
    def myAtoi(self, s: str) -> int:
        pos = 0
        # ignore white spaces
        while pos < len(s) and s[pos].isspace():
            pos += 1
        if pos == len(s):
            return 0

        # get sign
        sign = 1
        if s[pos] == '+' or s[pos] == '-':
            if s[pos] == '-':
                sign = -1
            pos += 1
        if pos == len(s):
            return 0

        # convert to number
        INT_MAX = (1 << 31) - 1
        INT_MIN = - (1 << 31)

        num = 0
        while pos < len(s) and s[pos].isdigit():
            digit = ord(s[pos]) - ord('0')
            num = num * 10 + sign * digit
            if num > INT_MAX:
                return INT_MAX
            if num < INT_MIN:
                return INT_MIN
            pos += 1
        return num
```

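> `int(s[pos])` is also an option instead of `ord(s[pos]) - ord('0')`. That said, if you go that far you start to feel you might as well write something like `int(s[index:])`, so for this problem not using int seems to be more in the spirit of the exercise.
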
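For the ASCII digits that LeetCode supplies, the two conversions agree. The sketch below checks that, and also illustrates a caveat that lies outside the problem's input constraints: `str.isdigit()` is true for some non-ASCII characters, where the two conversions diverge. The variable names are illustrative only.

```python
for ch in "0123456789":
    # For ASCII digits both conversions give the same value.
    assert int(ch) == ord(ch) - ord('0')

fullwidth_one = "\uff11"  # FULLWIDTH DIGIT ONE
print(fullwidth_one.isdigit())        # True
print(int(fullwidth_one))             # 1
print(ord(fullwidth_one) - ord('0'))  # 65249

superscript_two = "\u00b2"  # SUPERSCRIPT TWO
print(superscript_two.isdigit())      # True
print(superscript_two.isdecimal())    # False
try:
    int(superscript_two)
except ValueError:
    print("int() rejects it")
```

If non-ASCII input were ever a concern, an explicit range check such as `'0' <= ch <= '9'` would be one way to keep the `ord` arithmetic safe.
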
4th

Behavior of floor division and modulo: https://docs.python.org/3/reference/expressions.html#binary-arithmetic-operations

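A quick check of how `//` and `%` behave on the two limits may help when reading the cutoff logic. This is only an illustration of Python's documented semantics (floor division rounds toward negative infinity, and the result of `%` takes the sign of the second operand).

```python
INT_MAX = (1 << 31) - 1
INT_MIN = -INT_MAX - 1

print(INT_MAX // 10, INT_MAX % 10)  # 214748364 7
print(INT_MIN // 10, INT_MIN % 10)  # -214748365 2
print(-INT_MIN % 10)                # 8, parsed as (-INT_MIN) % 10
print(-(INT_MIN % 10))              # -2
```

Because `INT_MIN % 10` is 2 rather than -8, the last digit of the negative limit has to be obtained from the magnitude, for example as `-INT_MIN % 10`.
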
```python
class Solution:
    def myAtoi(self, s: str) -> int:
        index = 0

        # skip white spaces
        while index < len(s) and s[index] == ' ':
            index += 1
        if index == len(s):
            return 0

        # determine sign
        sign = 1
        if s[index] == '+' or s[index] == '-':
            if s[index] == '-':
                sign = -1
            index += 1
        if index == len(s):
            return 0

        # convert to integer
        INT_MAX = (1 << 31) - 1
        INT_MIN = -INT_MAX - 1
        num = 0
        cutoff = INT_MAX // 10
        cutlim = INT_MAX % 10
        if sign == -1:
            # In two's complement the negative limit is one larger in magnitude, so adding 1 is enough;
            # a power of two minus 1 never ends in 9, so cutlim + 1 never reaches 10.
            # Taking the modulo for the negative case also works, but since the result
            # takes the sign of the second operand, the code would get more verbose.
            cutlim += 1
        while index < len(s) and s[index].isdigit():
            digit = ord(s[index]) - ord('0')
            if abs(num) > cutoff or (abs(num) == cutoff and digit > cutlim):
                return INT_MAX if sign == 1 else INT_MIN
            num = num * 10 + digit * sign
            index += 1
        return num
```

For the if part, the following flat form may also be fine, although `index += 1` ends up duplicated.

>     sign = 1
>     if s[index] == '+':
>         index += 1
>     elif s[index] == '-':
>         index += 1
>         sign = -1

Review comment: The size of long depends on the data model.
https://ja.wikipedia.org/wiki/64%E3%83%93%E3%83%83%E3%83%88
Please look up LP64, LLP64, and so on.

Reply: Thank you. So the C++ standard specifies a minimum number of bits for each type, and the actual size is determined by the data model chosen within those constraints.
The C++ standard sets the minimum size of long at 32 bits, so under a data model such as LP32 it is 32 bits. For long long, the specification guarantees at least 64 bits.
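
One way to see which sizes the local platform's data model picked, without leaving Python, is to ask `ctypes`. This is a small sketch that assumes 8-bit bytes; the printed values depend on the platform, so no expected output is shown.

```python
import ctypes

# Sizes in bits of the C types on the platform running this interpreter.
for name, ctype in [("int", ctypes.c_int),
                    ("long", ctypes.c_long),
                    ("long long", ctypes.c_longlong)]:
    print(f"{name}: {ctypes.sizeof(ctype) * 8} bits")
```

On a platform where long is reported as 32 bits, it would not be a safe wider accumulator for this problem, whereas long long would be.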