49. Group Anagrams by Ryotaro25 · Pull Request #13 · Ryotaro25/leetcode_first60

Ryotaro25 · 2024-06-11T14:53:03Z

Link to the problem
49. Group Anagrams
Problem statement (for premium problems)

Notes

Next problem to solve
https://leetcode.com/problems/intersection-of-two-arrays/

Folder structure
A folder is created for each LeetCode problem.
Each folder contains step1.cpp, step2.cpp, step3.cpp, ascii.cpp, and memo.md.

memo.md records what I noticed at each step.

liquo-rice · 2024-06-12T18:23:38Z

49.GroupAnagrams/ascii.cpp

+
+public:
+    vector<vector<string>> groupAnagrams(vector<string>& strs) {
+        unordered_map<string, vector<string>> sorted_to_group;


Is this variable name appropriate here?
Both unordered_map and map appear in this PR. Do you understand the difference between them?

@liquo-rice I will read through the documentation you pointed me to separately.

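@liquo-rice
I read the material you gave me and the page below. set and map are implemented with red-black trees, so search, deletion, and insertion are O(log n) in both the average and the worst case. unordered_set and unordered_map use hashing, so they are O(1) on average but O(n) in the worst case. https://chromium.googlesource.com/chromium/src/+/master/base/containers/README.md#Map-and-set-selection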

Guidance against using map and set

if the number of items will be large or unbounded and elements will be inserted/deleted outside of the containers constructor/destructor - they have O(n) performance on inserts and deletes of individual items.

Why unordered_set and unordered_map are not the default choice

In the common case, query performance is unlikely to be sufficiently higher than std::map to make a difference, insert performance is slightly worse, and the memory overhead is high. This makes sense mostly for large tables where you expect a lot of lookups.

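I could not work out what the uncommon case would be (perhaps when an unbounded amount of data is inserted?), but since the data size in this problem is bounded, I decided to use map.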
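
To make the map versus unordered_map comparison above concrete, here is a minimal sketch that runs the same grouping logic on both containers. The names GroupBySortedLetters, OrderedGroups, and HashedGroups are hypothetical and do not come from the PR. The only observable difference in this sketch is iteration order: std::map yields keys in ascending order, while std::unordered_map makes no ordering promise.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

using OrderedGroups = std::map<std::string, std::vector<std::string>>;
using HashedGroups = std::unordered_map<std::string, std::vector<std::string>>;

// Groups words by their sorted letters. Map can be either alias above,
// because both containers provide operator[] with the same meaning.
template <typename Map>
Map GroupBySortedLetters(const std::vector<std::string>& words) {
  Map groups;
  for (const std::string& word : words) {
    std::string key = word;             // copy, because sort works in place
    std::sort(key.begin(), key.end());  // anagrams share the same sorted key
    groups[key].push_back(word);
  }
  return groups;
}
```

With the ordered variant, the first group visited is always the one with the smallest key, which can make output deterministic at the cost of O(log n) lookups.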

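It would also be worth looking into what a hash table and a red-black tree actually are.

Implementing each of them once yourself would be a good exercise.

Did you replace std::map with red_black_tree? It does not seem to be used in 52bc545, so I am asking just in case.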

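@oda @liquo-rice
My apologies. I will fix it.

@oda @liquo-rice
To make for (const auto& [key, word_group] : sorted_to_group) work, I found that I need to implement an iterator. I spent the whole day on it, but I do not think I can implement it from scratch. Are there any references that would help with implementing an iterator for a red-black tree?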

You will need to know tree traversal. Practicing with this problem may help.
https://leetcode.com/problems/binary-search-tree-iterator/description/

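@liquo-rice Thank you for the practice problem. After five more arai60 problems I will reach the tree-related ones, so I plan to get that far first, finish the practice problem and the tree problems, and then continue implementing the red-black tree 🙇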
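
As a rough illustration of the tree-traversal idea mentioned above, the following sketch walks a plain binary search tree in ascending key order with an explicit stack. Node and InorderIterator are hypothetical names for this example. A real red-black tree node would also store a color and typically a parent pointer, and a standard-conforming iterator would need operator++, operator*, and comparison, none of which are shown here.

```cpp
#include <stack>

// Hypothetical node type for illustration only.
struct Node {
  int key;
  Node* left = nullptr;
  Node* right = nullptr;
  explicit Node(int k) : key(k) {}
};

// Yields keys in ascending order without recursion.
class InorderIterator {
 public:
  explicit InorderIterator(Node* root) { PushLeftSpine(root); }

  bool HasNext() const { return !pending_.empty(); }

  // Precondition: HasNext() is true.
  int Next() {
    Node* node = pending_.top();
    pending_.pop();
    // Everything in the right subtree comes after node and before its ancestors.
    PushLeftSpine(node->right);
    return node->key;
  }

 private:
  // Pushes node and all of its left descendants; the top is the smallest key.
  void PushLeftSpine(Node* node) {
    while (node != nullptr) {
      pending_.push(node);
      node = node->left;
    }
  }

  std::stack<Node*> pending_;
};
```

The stack holds at most one node per tree level, so memory use is proportional to the height of the tree.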

liquo-rice · 2024-06-12T18:23:58Z

49.GroupAnagrams/ascii.cpp

+            string key;
+            // Confirmed with ChatGPT
+            for (int count : counts) {
+                key += '#' + to_string(count);


Is there a more efficient way to concatenate the strings?

@liquo-rice
I found articles saying that the Ropes data structure can make string concatenation more efficient. I could not understand it from a single read, so I will incorporate it once I understand it 🙇
https://stackoverflow.com/questions/611263/efficient-string-concatenation-in-c
https://www.geeksforgeeks.org/ropes-data-structure-fast-string-concatenation/

I did not know what the Ropes data structure was either, but ostringstream looks like a good fit here.

@liquo-rice
I implemented it in step6.cpp. ostringstream was not covered in books such as 独習C++. Is it commonly used?

I also referred to the following two pages.
https://ameblo.jp/nana-2007-july/entry-10098557843.html
https://cplusplus.com/reference/sstream/ostringstream/

I think it is handy when you want to build a string through the ostream interface.
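
For reference, a minimal sketch of building the count-based key with std::ostringstream might look like the following. CountKey is a hypothetical name and this is not the content of step6.cpp. The sketch assumes the input contains only lowercase letters 'a' to 'z'.

```cpp
#include <array>
#include <sstream>
#include <string>

// Builds a key of the form "#c0#c1...#c25", one count per letter.
// Assumption: word contains only 'a'..'z'.
std::string CountKey(const std::string& word) {
  std::array<int, 26> counts{};  // value-initialized to all zeros
  for (char letter : word) {
    ++counts[letter - 'a'];
  }
  std::ostringstream key;
  for (int count : counts) {
    key << '#' << count;  // formatted insertion, no temporary std::string
  }
  return key.str();
}
```

Two words receive the same key exactly when they are anagrams, because the '#' separator keeps multi-digit counts from running together.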

@liquo-rice Understood. Thank you.

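Unlike strings in Python or Java, C++ strings are mutable. Appending with += therefore does not rebuild the whole string.

That said, reallocation can occur, just as when appending to a vector, so in some cases you may want to optimize that as well.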
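
One way to limit the reallocation mentioned above is to reserve capacity before appending. The sketch below is an illustration under a stated assumption: CountKeyReserved is a hypothetical name, and the estimate of 4 bytes per entry is a guess for this example, not a measured value.

```cpp
#include <array>
#include <string>

// Appends in place with += after reserving an estimated capacity up front.
std::string CountKeyReserved(const std::array<int, 26>& counts) {
  std::string key;
  // Assumption: '#' plus up to 3 digits per entry. Larger counts still work;
  // they may simply trigger a reallocation.
  key.reserve(counts.size() * 4);
  for (int count : counts) {
    key += '#';                    // appends one char to the existing buffer
    key += std::to_string(count);  // appends the digits
  }
  return key;
}
```

Splitting the append into two += calls also avoids the temporary string that '#' + to_string(count) would create on each iteration.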

Ropes use a tree so that the pieces are joined later. There are also other data structures such as the Finger Tree.

Since C++11, string is guaranteed to be contiguous in memory.
https://stackoverflow.com/questions/11752705/does-stdstring-have-a-null-terminator

Languages with immutable strings use some dedicated mechanism to build a String.

For example, Java's StringBuilder keeps an array and grows its capacity to * 2 + 2 each time.
https://hg.openjdk.org/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/lang/AbstractStringBuilder.java

C# uses a singly linked list that is linked from the tail.
https://referencesource.microsoft.com/#mscorlib/system/text/stringbuilder.cs

Thank you for the references. I will read them.

liquo-rice · 2024-06-12T18:26:08Z

49.GroupAnagrams/ascii.cpp

+
+        vector<vector<string>> group_anagrams;
+        for (auto [key, word_group] : sorted_to_group) {
+            group_anagrams.push_back(word_group);


word_group appears to be copied twice. Can you reduce the copies?
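
The two copies can be made visible with a small counting type. This is a sketch, not code from the PR: Tracked and both function names are hypothetical, and the code needs C++17 for structured bindings and inline static members. Binding with auto copies each map entry, and push_back of an lvalue copies it again. Binding with auto& and passing std::move(group) avoids both.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper type that counts how often it is copied.
struct Tracked {
  inline static int copies = 0;
  Tracked() = default;
  Tracked(const Tracked&) { ++copies; }
  Tracked(Tracked&&) noexcept = default;
  Tracked& operator=(const Tracked&) { ++copies; return *this; }
  Tracked& operator=(Tracked&&) noexcept = default;
};

// Mirrors the reviewed loop: by-value binding plus push_back of an lvalue.
int CopiesWithValueBinding(std::map<std::string, Tracked>& groups) {
  Tracked::copies = 0;
  std::vector<Tracked> result;
  result.reserve(groups.size());  // rule out copies caused by reallocation
  for (auto [key, group] : groups) {
    result.push_back(group);
  }
  return Tracked::copies;
}

// Binds by reference and moves each value out of the map.
int CopiesWithReferenceAndMove(std::map<std::string, Tracked>& groups) {
  Tracked::copies = 0;
  std::vector<Tracked> result;
  result.reserve(groups.size());
  for (auto& [key, group] : groups) {
    result.push_back(std::move(group));
  }
  return Tracked::copies;
}
```

After the second function runs, the values left in the map are in a moved-from state, so this pattern only fits when the map is not read again afterwards.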

liquo-rice · 2024-06-12T18:28:30Z

49.GroupAnagrams/step3.cpp

+        for (auto str : strs) {
+            string anagram = str;
+            sort(anagram.begin(), anagram.end());
+            sorted_to_group[anagram].push_back(str);
+        }


Can you reduce the unnecessary copies?

As a slightly easier question, please tell me what you think is copied, where, and how many times.

@oda @liquo-rice
As for the number of copies, my understanding was that each iteration makes one copy from strs to str and one from str to anagram.
Honestly, I am not sure, so I will look into it.

@oda @liquo-rice
I did not know how to investigate this, so I asked ChatGPT and then read the documentation that seemed relevant.
So push_back on a vector creates a copy and then inserts it.
https://en.cppreference.com/w/cpp/container/vector/push_back

I also looked through the page below, and I do not think sort makes any copies.
https://en.cppreference.com/w/cpp/algorithm/sort

So I think copies happen in the following three places:
・strs to str
・str to anagram
・push_back

@liquo-rice @oda
I uploaded a version revised according to what I was taught on Discord. It is step4.cpp.
f55e74e
I changed auto to a reference and made push_back avoid creating a copy.
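
A sketch of what such a revision could look like is shown below. It is not the content of step4.cpp. It assumes the input may not be modified, so one copy per word into its group remains, and the function name GroupAnagrams and the const reference parameter are choices made for this example.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

std::vector<std::vector<std::string>> GroupAnagrams(
    const std::vector<std::string>& strs) {
  std::map<std::string, std::vector<std::string>> sorted_to_group;
  for (const std::string& str : strs) {  // reference: the element is not copied
    std::string sorted = str;            // one copy, needed because sort is in place
    std::sort(sorted.begin(), sorted.end());
    // The key is moved into the map if it is new; str is copied into the group.
    sorted_to_group[std::move(sorted)].push_back(str);
  }

  std::vector<std::vector<std::string>> groups;
  groups.reserve(sorted_to_group.size());
  for (auto& [sorted, group] : sorted_to_group) {
    groups.push_back(std::move(group));  // takes over the buffer instead of copying
  }
  return groups;
}
```

Per word this leaves two string copies (the sortable key and the stored word) and no copies of whole groups.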

Notes
The result of move can be passed to vector's push_back
https://en.cppreference.com/w/cpp/container/vector/push_back

move
https://en.cppreference.com/w/cpp/utility/move

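@liquo-rice
I did not understand it, so I read the article below. My rough understanding is as follows 🙇
a = b
=> a copy of b is created and becomes accessible under the name a

a = move(b)
=> a allocates memory and b's contents are recorded in a
   the original b can no longer be used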
https://zenn.dev/mafafa/articles/cba24383d46900

@kazukiii
If you have any comments on std::move, I would appreciate them.

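@Ryotaro25
Well, personally I think understanding it a little more deeply will help you apply it elsewhere.
The other day I read the following, looking up unfamiliar terms as I went.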
https://cpprefjp.github.io//lang/cpp11/rvalue_ref_and_move_semantics.html

Briefly, here is what I understood.
First, std::move simply casts its argument to an rvalue reference. It does not actually move anything.
Assuming type T defines copy and move operations,

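T hoge = lvalue; -> the copy constructor is called
T hoge = rvalue; -> the move constructor is called
hoge = lvalue; -> the copy assignment operator is called
hoge = rvalue; -> the move assignment operator is called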

That is the general idea.

push_back can be understood by the same principle.

vector<T> hoge;
hoge.push_back(lvalue); -> the copy constructor is called
hoge.push_back(rvalue); -> the move constructor is called

That is the general idea.

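These move constructors and move assignment operators are what actually perform the move.
They are defined by the class designer, so writing them yourself once may deepen your understanding.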
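
The dispatch described above can be observed with a type that records which special member function ran. Probe and ObservedOperations are hypothetical names for this sketch. Note that declaring a new object with = is an initialization and runs a constructor, while = on an already existing object runs an assignment operator.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical type that remembers which special member function built or
// last assigned it.
struct Probe {
  std::string last_op = "default constructor";
  Probe() = default;
  Probe(const Probe&) : last_op("copy constructor") {}
  Probe(Probe&&) noexcept : last_op("move constructor") {}
  Probe& operator=(const Probe&) { last_op = "copy assignment"; return *this; }
  Probe& operator=(Probe&&) noexcept { last_op = "move assignment"; return *this; }
};

// Returns the operation selected in six situations, in order.
std::vector<std::string> ObservedOperations() {
  std::vector<std::string> ops;
  Probe source;

  Probe from_lvalue = source;             // initialization from an lvalue
  ops.push_back(from_lvalue.last_op);
  Probe from_rvalue = std::move(source);  // initialization from an rvalue
  ops.push_back(from_rvalue.last_op);

  Probe target;
  target = from_lvalue;                   // assignment from an lvalue
  ops.push_back(target.last_op);
  target = std::move(from_rvalue);        // assignment from an rvalue
  ops.push_back(target.last_op);

  std::vector<Probe> probes;
  probes.reserve(2);                      // no reallocation, so no extra moves
  probes.push_back(from_lvalue);          // lvalue argument
  ops.push_back(probes.back().last_op);
  probes.push_back(std::move(from_lvalue));  // rvalue argument
  ops.push_back(probes.back().last_op);
  return ops;
}
```

In every case std::move itself does nothing at run time. It only changes which overload the compiler selects.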

@liquo-rice
I would appreciate it if you could add anything I missed.

Thank you.

@Ryotaro25
As the Zenn article and kazukiii's explanation say, std::move is a cast to an rvalue reference. As practice, I suggest writing a simple std::vector-like class yourself and defining its copy/move constructors and assignment operators.
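
One possible shape for that exercise is sketched below. IntBuffer is a hypothetical, deliberately tiny class that owns a fixed-size array of int. It omits growth, iterators, and bounds checks. The point is only to contrast the copy operations, which allocate and duplicate, with the move operations, which take over the pointer.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical minimal owning buffer, for practice only.
class IntBuffer {
 public:
  explicit IntBuffer(std::size_t size)
      : size_(size), data_(new int[size]()) {}  // () zero-initializes
  ~IntBuffer() { delete[] data_; }

  // Copy constructor: allocate a new array and duplicate every element.
  IntBuffer(const IntBuffer& other)
      : size_(other.size_), data_(new int[other.size_]) {
    std::copy(other.data_, other.data_ + size_, data_);
  }

  // Move constructor: take over the pointer and leave other empty.
  IntBuffer(IntBuffer&& other) noexcept
      : size_(other.size_), data_(other.data_) {
    other.size_ = 0;
    other.data_ = nullptr;
  }

  // Copy assignment via copy-and-swap; also safe for self-assignment.
  IntBuffer& operator=(const IntBuffer& other) {
    IntBuffer copy(other);
    std::swap(size_, copy.size_);
    std::swap(data_, copy.data_);
    return *this;
  }

  // Move assignment: release the current array, then take over other's.
  IntBuffer& operator=(IntBuffer&& other) noexcept {
    if (this != &other) {
      delete[] data_;
      size_ = other.size_;
      data_ = other.data_;
      other.size_ = 0;
      other.data_ = nullptr;
    }
    return *this;
  }

  std::size_t size() const { return size_; }
  int& operator[](std::size_t index) { return data_[index]; }
  const int* data() const { return data_; }

 private:
  std::size_t size_;
  int* data_;
};
```

Comparing data() before and after each operation shows the difference directly: a copy ends up with a different address, a move keeps the original one.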

@kazukiii @liquo-rice
Thank you for the explanation. I will try writing one myself 🙇

nodchip · 2024-06-25T14:41:22Z

49.GroupAnagrams/ascii.cpp

+class Solution {
+private:
+    vector<int> count_alphabet(string s) {
+            vector<int> counts(26, 0);


The indentation here looks like 8 spaces.

@nodchip
Thank you for the review. I have uploaded the fixed version here.
52bc545

liquo-rice · 2024-06-25T15:22:19Z

49.GroupAnagrams/unordered_map.cpp

@@ -0,0 +1,66 @@
+class my_unordered_map {


Can you do this with template <typename K, typename V>?
Instead of hard-coding the number of buckets, can you make it grow dynamically?

@liquo-rice
Thank you for the review.
I have implemented it in b2870c2.
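
For comparison, a templated separate-chaining map that grows on demand could be sketched as follows. This is not the code in b2870c2. SimpleHashMap, the initial bucket count of 8, the maximum load factor of 1.0, and the use of std::hash<K> are all assumptions made for this example.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <utility>
#include <vector>

// Hypothetical separate-chaining map; names and growth policy are illustrative.
template <typename K, typename V>
class SimpleHashMap {
 public:
  SimpleHashMap() : buckets_(kInitialBucketCount) {}

  // Returns the value for key, inserting a default-constructed V if absent.
  V& operator[](const K& key) {
    if (V* found = Find(key)) {
      return *found;
    }
    // Grow before inserting if the new element would exceed the load factor.
    if (size_ + 1 > buckets_.size() * kMaxLoadFactor) {
      Rehash(buckets_.size() * 2);
    }
    Bucket& bucket = buckets_[IndexOf(key, buckets_.size())];
    bucket.emplace_back(key, V());
    ++size_;
    return bucket.back().second;
  }

  std::size_t size() const { return size_; }
  std::size_t bucket_count() const { return buckets_.size(); }

 private:
  using Bucket = std::list<std::pair<K, V>>;
  static constexpr std::size_t kInitialBucketCount = 8;
  static constexpr double kMaxLoadFactor = 1.0;

  static std::size_t IndexOf(const K& key, std::size_t bucket_count) {
    return std::hash<K>{}(key) % bucket_count;
  }

  V* Find(const K& key) {
    for (auto& entry : buckets_[IndexOf(key, buckets_.size())]) {
      if (entry.first == key) return &entry.second;
    }
    return nullptr;
  }

  // Moves every entry into a new bucket array of the requested size.
  void Rehash(std::size_t new_bucket_count) {
    std::vector<Bucket> new_buckets(new_bucket_count);
    for (Bucket& bucket : buckets_) {
      for (auto& entry : bucket) {
        new_buckets[IndexOf(entry.first, new_bucket_count)].push_back(
            std::move(entry));
      }
    }
    buckets_ = std::move(new_buckets);
  }

  std::vector<Bucket> buckets_;
  std::size_t size_ = 0;
};
```

Doubling the bucket count keeps the total rehashing work proportional to the number of insertions, which is the same amortization argument as for vector growth.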

liquo-rice · 2024-07-03T10:24:33Z

49.GroupAnagrams/unordered_map.cpp

-        buckets[index].emplace_back(key, vector<string>());
-        return buckets[index].back().second;
+  V& operator[](const K& key) {
+    // Double the size once the load factor exceeds 0.7


https://en.wikipedia.org/wiki/Hash_table
This page discusses the optimal load factor.
separate chaining: 1 to 3
open addressing: 0.6 to 0.75

@liquo-rice
Thank you for the reference.
The book なっとくアルゴリズム gave 0.7 (without stating a reason), and I had left it as is.
I will read it 🙇
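
As a point of reference from the standard library, std::unordered_map exposes load_factor() and max_load_factor(), and a newly constructed container has a maximum load factor of 1.0. The sketch below, with the hypothetical name PeakLoadFactor, records the highest load factor seen while inserting distinct keys.

```cpp
#include <algorithm>
#include <unordered_map>

// Inserts element_count distinct keys and returns the highest load factor
// observed after any insertion.
float PeakLoadFactor(int element_count) {
  std::unordered_map<int, int> table;  // default max_load_factor() is 1.0
  float peak = 0.0f;
  for (int i = 0; i < element_count; ++i) {
    table[i] = i;
    // load_factor() is size() divided by bucket_count().
    peak = std::max(peak, table.load_factor());
  }
  return peak;
}
```

A default of 1.0 sits in the range quoted above for separate chaining, whereas a threshold near 0.7 matches the range quoted for open addressing.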

liquo-rice · 2024-07-03T10:29:31Z

49.GroupAnagrams/unordered_map.cpp

+  // Make the member function const
+  size_t hash(const K& key) const {
+    size_t hash_code = 0;
+    for (char letter : key) {


Does this work when K is not a string?

It only works for string. I overlooked that and will look for a way to handle it.
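
One common way to lift the string-only restriction is to take the hash function as a template parameter that defaults to std::hash<K>, the same approach std::unordered_map uses. The sketch below is an assumption about how that could look, not the fix made in the PR. BucketIndex, LetterCounts, LetterCountsHash, and CountLetters are hypothetical names. std::hash covers built-in types and std::string, and a custom functor fills the gap for types such as std::array.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <string>

// Maps any key type to a bucket, given a hash functor for that type.
template <typename K, typename Hash = std::hash<K>>
std::size_t BucketIndex(const K& key, std::size_t bucket_count) {
  return Hash{}(key) % bucket_count;
}

using LetterCounts = std::array<int, 26>;

// Hypothetical hasher for LetterCounts, which std::hash does not support.
struct LetterCountsHash {
  std::size_t operator()(const LetterCounts& counts) const {
    std::size_t seed = 0;
    for (int count : counts) {
      // Mixing step in the style of boost::hash_combine.
      seed ^= std::hash<int>{}(count) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    }
    return seed;
  }
};

// Assumption: word contains only 'a'..'z'.
LetterCounts CountLetters(const std::string& word) {
  LetterCounts counts{};
  for (char letter : word) {
    ++counts[letter - 'a'];
  }
  return counts;
}
```

With this shape, int and std::string keys work through the default argument, and a count array can be used as a key by naming LetterCountsHash explicitly.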

finish

2416b56

liquo-rice reviewed Jun 12, 2024

Ryotaro25 added 2 commits June 16, 2024 12:04

add two files

f55e74e

mod type

3fa3102

rihib mentioned this pull request Jun 25, 2024

Group Anagrams rihib/leetcode#3

Merged

finish hash

44bebb7

nodchip reviewed Jun 25, 2024

liquo-rice reviewed Jun 25, 2024

Ryotaro25 added 2 commits July 3, 2024 11:54

implement hash

b2870c2

finish implementing map

52bc545

liquo-rice reviewed Jul 3, 2024

add search function

eb1a513

colorbox mentioned this pull request Oct 24, 2024

49. Group Anagrams colorbox/leetcode#26

Merged

added modified version

0a9752e

Ryotaro25 merged commit 9c0d394 into main May 4, 2025
1 check passed
