139. Word Break by tom4649 · Pull Request #37 · tom4649/Coding

tom4649 · 2026-03-29T21:59:19Z

https://leetcode.com/problems/word-break/description/

huyfififi · 2026-03-30T00:05:56Z

nit: sol1.py appears to be the one that gets TLE and sol1_failed.py the one that passes, so the file names may be swapped.

huyfififi · 2026-03-30T00:09:14Z

+        stripped_sub_strs = []
+        for word in wordDict:
+            if s.startswith(word):
+                stripped_sub_strs.append(s[len(word) :])


From the variable name I expected a str.strip() call, but no strip-style processing seems to happen here, so plain sub_strings would be fine, I think.

I agree. Applied.

huyfififi · 2026-03-30T00:09:49Z

+            if i == len_target:
+                return True
+            for word in wordDict:
+                if s.startswith(word, i) and can_break(i + len(word)):


I see, so startswith works without building the substring at all. Good to know 👀
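For reference, a minimal sketch of the memoized recursion this comment refers to, assuming the rest of the function looks like the diff above. `str.startswith(prefix, start)` compares in place from index `start`, so no `s[i:]` slice is allocated. The name `word_break` is a stand-in for `Solution.wordBreak`, and the sketch assumes every word is non-empty (as in the problem constraints) and Python 3.9+ for `functools.cache`.

```python
import functools
from typing import List


def word_break(s: str, wordDict: List[str]) -> bool:
    @functools.cache
    def can_break(i: int) -> bool:
        """Return whether s[i:] can be segmented into words from wordDict."""
        if i == len(s):
            return True
        for word in wordDict:
            # startswith(word, i) checks s[i:i + len(word)] == word without slicing.
            # Assumes word is non-empty; an empty word would recurse forever.
            if s.startswith(word, i) and can_break(i + len(word)):
                return True
        return False

    return can_break(0)
```

Because `can_break` is cached on `i`, each start index is expanded at most once.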

mamo3gr · 2026-03-30T07:16:58Z

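+> ... cannot be expressed with "a" * 2 and "a" * 4, so my guess is that naive backtracking fails on it.
+
+The claim "it can be written as a regular expression, so it is O(n)" presumably holds only when wordDict is treated as a constant. Still, the idea is worth keeping in mind.


only when wordDict is treated as a constant

What does this mean? If it means that the elements of the list are fixed, I would say they are already fixed at the moment the function is called.

I mean that it is treated as a constant in the complexity estimate.
If it is not a constant, I think the time to construct the automaton that accepts the regular expression has to be counted as well.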
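To make the regular-expression view concrete, here is a small sketch that builds the pattern `(?:w1|w2|...)*` from wordDict and matches the whole string. The function name `word_break_regex` is hypothetical. Note the caveat: Python's `re` module is a backtracking matcher rather than a compiled DFA, so this illustrates the formulation only and does not achieve the O(n) bound discussed above; on adversarial inputs it may take far longer than linear time.

```python
import re
from typing import List


def word_break_regex(s: str, wordDict: List[str]) -> bool:
    # Building the pattern already costs time proportional to the total
    # length of wordDict, which is the cost hidden by "wordDict is constant".
    # re.escape guards against words that contain regex metacharacters.
    pattern = "(?:" + "|".join(re.escape(word) for word in wordDict) + ")*"
    return re.fullmatch(pattern, s) is not None
```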

mamo3gr · 2026-03-30T07:20:41Z

+        @functools.cache
+        def can_break(i) -> bool:
+            if i == len_target:
+                return True


Personally, this looks like a redundant explanatory variable to me.

Suggested change

-            if i == len_target:
+            if i == len(s):

I think it makes almost no difference, but the running time should drop by a constant factor, so I will leave it as is.

mamo3gr · 2026-03-30T07:22:49Z

+        len_target = len(s)
+
+        @functools.cache
+        def can_break(i) -> bool:


You cannot tell what i means without reading the function body (although the call site lets you guess), so a short note up front would be considerate.

Suggested change

-        def can_break(i) -> bool:
+        def can_break(i: int) -> bool:
+            """returns whether s[i:] can be broken."""

I see. It is true that you cannot tell what the argument i refers to. I will adopt this.

5ky7 · 2026-04-08T14:30:10Z

+        len_target = len(s)
+
+        @functools.cache
+        def can_break(i) -> bool:


I had to infer from the code what i means, so renaming it to something like start_pos might be better.

If it caught the eye of both reviewers, the code evidently was not good.
I will rename it to start_pos.

5ky7 · 2026-04-08T14:43:31Z

+        frontier = [0]
+        visited = {0}
+        while frontier:
+            position = frontier.pop()


Here too, I would prefer start_position or first.
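As an illustration of the suggested naming, here is a sketch of how the iterative search might read in full. Only the first four lines appear in the diff above; the rest of the loop body is an assumption about the remaining code, and `word_break_dfs` is a hypothetical name.

```python
from typing import List


def word_break_dfs(s: str, wordDict: List[str]) -> bool:
    frontier = [0]  # start positions that still need to be expanded
    visited = {0}   # start positions already pushed onto the frontier
    while frontier:
        start_position = frontier.pop()
        if start_position == len(s):
            return True
        for word in wordDict:
            next_position = start_position + len(word)
            if next_position in visited:
                continue
            if s.startswith(word, start_position):
                visited.add(next_position)
                frontier.append(next_position)
    return False
```

Each position enters `frontier` at most once, so the loop body runs at most len(s) + 1 times.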

5ky7 · 2026-04-08T14:53:28Z

+- Let n = |s|, m = len(wordDict), l = max([len(word) for word in wordDict])
+
+### sol1.py
+- Time O(nml): can_break is memoized, so it is called at most O(n) times, and each call compares against every string in wordDict, which costs O(ml)


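Estimating a concrete running time from the complexity may help you notice a TLE before running the code.
This reference is helpful for how to make the estimate.
In this case,
$nml \le 300 \times 1000 \times 20 = 6\times 10^6$
so, given that Python runs at roughly 1e6 to 1e7 steps/sec, the estimate comes out to about 0.6 to 6 seconds in the worst case.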

Thank you, that was helpful. I want to get into the habit of making this estimate myself.
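The estimate can be written down as a few lines of arithmetic. This is only a back-of-the-envelope sketch: the bounds 300, 1000, and 20 are the constraint limits quoted in the comment above, the 1e6 to 1e7 steps/sec range is the rough figure from that comment rather than a measurement, and `estimate_seconds` is a hypothetical helper.

```python
def estimate_seconds(steps: int, steps_per_second: float) -> float:
    """Convert an operation count into a rough wall-clock estimate."""
    return steps / steps_per_second


n, m, l = 300, 1000, 20  # upper bounds on |s|, len(wordDict), and word length
steps = n * m * l        # worst-case count for an O(nml) solution

optimistic = estimate_seconds(steps, 1e7)   # fast end of the assumed range
pessimistic = estimate_seconds(steps, 1e6)  # slow end of the assumed range
print(f"{steps} steps: {optimistic:.1f} to {pessimistic:.1f} seconds")
```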

5ky7 · 2026-04-08T16:59:37Z

+
+@dataclasses.dataclass
+class TrieNode:
+    children: Dict[str, TrieNode] = dataclasses.field(default_factory=dict)


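There was a discussion (cf.) suggesting it might be faster to keep a separate List[TrieNode], make the dictionary itself a Dict[str, int], and use the mapped int to fetch the TrieNode from the List[TrieNode].

Python lists are dynamic arrays, as far as I recall, so I think the same argument applies. However, since Python is an interpreted language, its bottlenecks differ from those of C++, so it is unclear whether this is worth doing once readability is also taken into account...

I see, I can understand that cache hits would likely increase in C++.
Whether it is necessary in Python is indeed questionable.
I would like to become able to write code with considerations like this in mind.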
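A sketch of what the index-based layout could look like in Python. This is one possible reading of the idea, not the code from the referenced discussion: instead of a List[TrieNode], it stores each node's fields in parallel lists, and the class name `IndexedTrie` and its methods are hypothetical. Whether it is actually faster than the dataclass version in CPython would need measuring.

```python
from typing import Dict, List


class IndexedTrie:
    """Trie whose nodes live in flat lists; children map a character to a node index."""

    def __init__(self) -> None:
        self.children: List[Dict[str, int]] = [{}]  # index 0 is the root
        self.is_word_end: List[bool] = [False]

    def insert(self, word: str) -> None:
        node = 0
        for char in word:
            if char not in self.children[node]:
                # The new node's index is the current length of the node list.
                self.children[node][char] = len(self.children)
                self.children.append({})
                self.is_word_end.append(False)
            node = self.children[node][char]
        self.is_word_end[node] = True

    def contains(self, word: str) -> bool:
        node = 0
        for char in word:
            next_node = self.children[node].get(char)
            if next_node is None:
                return False
            node = next_node
        return self.is_word_end[node]
```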

5ky7 · 2026-04-08T17:07:41Z

+
+class Solution:
+    def wordBreak(self, s: str, wordDict: List[str]) -> bool:
+        len_target = len(s)


This is probably a matter of taste, but I think leaving it as len(s) is fine: the variable adds no information and is not any shorter.

5ky7 · 2026-04-08T17:09:10Z

+        root = TrieNode()
+        max_len_of_wordDict = 0
+        for word in wordDict:
+            max_len_of_wordDict = max(max_len_of_wordDict, len(w))


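Personally, I prefer to separate the logic that computes max_len_of_wordDict from the logic that builds the trie, since they are different concerns.
After this for loop, how about something like

max_len_wordDict = max(map(len, wordDict), default=0)

or

max_len_wordDict = max(len(word) for word in wordDict)

?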

I adopted the latter (and on top of that, w and word were mixed up).
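One difference between the two forms is worth noting: `max` raises `ValueError` on an empty iterable unless `default` is given. The problem constraints guarantee a non-empty wordDict, so either form works there; the helper names below are hypothetical and only illustrate the behavior.

```python
from typing import List


def max_word_length(wordDict: List[str]) -> int:
    # Returns 0 when wordDict is empty.
    return max(map(len, wordDict), default=0)


def max_word_length_strict(wordDict: List[str]) -> int:
    # Raises ValueError when wordDict is empty.
    return max(len(word) for word in wordDict)
```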

nodchip · 2026-04-10T10:05:33Z

+
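+The claim "it can be written as a regular expression, so it is O(n)" presumably holds only when wordDict is treated as a constant. Still, the idea is worth keeping in mind.
+
+> So I do think that a DP from the front is the "model answer".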


I have a feeling that the Viterbi algorithm is in the background here.
https://ja.wikipedia.org/wiki/%E3%83%93%E3%82%BF%E3%83%93%E3%82%A2%E3%83%AB%E3%82%B4%E3%83%AA%E3%82%BA%E3%83%A0

We are not computing a maximum probability, so I am not entirely sure, but it may indeed be similar in that it runs a DP with the index as the state.
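For comparison, a sketch of the front-to-back DP with the index as the state. Where Viterbi keeps the maximum probability of reaching each state, this version keeps only a boolean ("is this prefix reachable?"), so the resemblance is structural at best. The name `word_break_dp` and the use of a set with a maximum-length cutoff are choices made for this sketch, not taken from the PR.

```python
from typing import List


def word_break_dp(s: str, wordDict: List[str]) -> bool:
    words = set(wordDict)
    max_len = max(map(len, words), default=0)
    # reachable[end] is True when s[:end] can be segmented into words.
    reachable = [False] * (len(s) + 1)
    reachable[0] = True
    for end in range(1, len(s) + 1):
        # Only look back as far as the longest word.
        for start in range(max(0, end - max_len), end):
            if reachable[start] and s[start:end] in words:
                reachable[end] = True
                break
    return reachable[len(s)]
```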

tom4649 added 2 commits March 30, 2026 06:49

139. Word Break

b0eac6c

Improve readability of memo.md

ab18435

huyfififi reviewed Mar 30, 2026


mamo3gr reviewed Mar 30, 2026


Add suggested changes

0d2d321

5ky7 reviewed Apr 8, 2026


nodchip reviewed Apr 10, 2026


Add suggested changes

94403bf
