fix: Korean keyboard unclickable + German ß — universal diacritic normalization#155
Conversation
The Korean keyboard used Compatibility Jamo (U+3130 block) while the word list used Hangul Jamo (U+1100 block). These are visually identical but different Unicode codepoints, so keyboard input was silently rejected. Fix uses the existing diacritic_map system to normalize between the two encodings — the same universal mechanism used for accent normalization in European languages. Also adds compound jongseong and compound vowel keys to an extended keyboard layout so all Korean words are playable. Backend change (app.py) is universal: any language with a diacritic_map now automatically includes base characters in the accepted set, even when only variant forms appear in the word list. Fixes #153
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughThe PR refactors keyboard character handling and enhances Korean language support. It introduces a new utility function to aggregate typeable characters across all keyboard layouts, updates test coverage validation to use this function, modifies the Language class to normalize diacritical variants, replaces the Korean "double" keyboard layout with an expanded "full" layout, and adds comprehensive diacritical mappings for Korean. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
- Korean default keyboard now 5 rows with double consonants + all compound vowels (98.6% word coverage, up from 77%) - Compound jongseong available on "Full" layout (remaining 1.4%) - German: fix broken multi-char diacritic "ss"→"ß" to single-char "s"→"ß" which the char-by-char normalizer can actually process - Remove all keyboard coverage xfails (both ko and de now pass)
Words with compound final consonants (ㄺ, ㄻ, ㄼ, etc.) are not typeable on the default keyboard. They remain valid guesses (via the Full layout) but are excluded from daily word selection so every daily word is always solvable on the default keyboard. Also adds documentation for future Option C: decomposing compound jongseong into individual jamo with 6-cell grid (like kordle.kr), which would eliminate compound keys entirely. Notes other languages (Devanagari, Thai, Khmer) that may need similar treatment.
Maps physical key codes (event.code) to jamo characters, bypassing the OS IME which would otherwise compose syllable blocks. Uses the standard Korean 2-set (Dubeolsik) layout: Q→ㅂ, W→ㅈ, E→ㄷ, etc. with Shift for double consonants (Shift+Q→ㅃ, Shift+T→ㅆ, etc.). The physical_key_map config is universal — any language needing IME bypass can add one to their language_config.json.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/test_word_lists.py (1)
128-152:⚠️ Potential issue | 🟠 MajorAdd a default-layout coverage check for daily candidates.
Unioning all layouts is fine for “guessable somewhere”, but it no longer protects the Korean contract introduced here: daily-selectable words must still be typeable on the default
korean_2setlayout. A word missed inko_blocklist.txtwill now pass CI viakorean_fulland still ship as an unsolvable daily. Please keep this all-layout assertion for guessability, but add a second check over the daily candidate pool (load_daily_words(lang)orload_word_list(lang) - load_blocklist(lang)) against the default layout only.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/test_word_lists.py` around lines 128 - 152, The current test unions characters from all layouts via load_all_keyboard_chars(lang), which misses the requirement that daily-selectable words must be typeable on the default layout; add a second assertion that computes the daily candidate set (use load_daily_words(lang) or compute load_word_list(lang) minus load_blocklist(lang)) and verify every character in those daily words appears on the default keyboard layout (load_keyboard(lang, layout="korean_2set") or the project’s equivalent API); keep the existing all-layout check for guessability and add this default-layout check (reference test_keyboard_covers_all_word_characters, KEYBOARD_COVERAGE_XFAIL, load_word_list, load_daily_words, load_blocklist, load_keyboard, load_all_keyboard_chars).
🧹 Nitpick comments (2)
webapp/data/languages/ko/language_config.json (1)
56-63: Add the missingㅒ→ᅤnormalization entry.
ko_keyboard.jsonnow exposesㅒon both layouts, but this map jumps straight fromㅑtoㅓ. Adding the Hangul Jamo variant keeps the keyboard/config pair symmetric and avoids another Compatibility Jamo mismatch ifᅤever lands in the guessable lists.Possible change
"ㅑ": ["ᅣ"], + "ㅒ": ["ᅤ"], "ㅓ": ["ᅥ"],🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@webapp/data/languages/ko/language_config.json` around lines 56 - 63, The JSON vowel-to-Jamo map is missing the ㅒ→ᅤ normalization; add an entry mapping the Hangul character "ㅒ" to the compatibility Jamo array ["ᅤ"] (insert it between the existing "ㅑ": ["ᅣ"] and "ㅓ": ["ᅥ"] entries) so the language_config normalization matches the ko_keyboard layouts and avoids Compatibility Jamo mismatches.tests/conftest.py (1)
159-181: Keep this helper in lockstep with the runtime keyboard parser.
webapp/app.pystill accepts the legacy top-level dict keyboard shape, but this helper only handles{"layouts": ...}or raw lists. If a keyboard JSON uses the legacy form, the tests will under-report typeable chars compared to production.Possible alignment
- if isinstance(data, dict) and "layouts" in data: - for layout in data["layouts"].values(): - for row in layout.get("rows", []): - chars.update(k for k in row if k not in control_keys) + if isinstance(data, dict): + layouts = data.get("layouts") + if not isinstance(layouts, dict): + layouts = {k: v for k, v in data.items() if k != "default"} + for layout in layouts.values(): + rows = layout.get("rows", []) if isinstance(layout, dict) else layout + for row in rows: + chars.update(k for k in row if k not in control_keys) elif isinstance(data, list): for row in data: chars.update(k for k in row if k not in control_keys)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/conftest.py` around lines 159 - 181, The helper load_all_keyboard_chars currently only handles data shaped as {"layouts": ...} or a raw list and thus misses legacy top-level dict keyboard JSONs; update it to detect when data is a dict but lacks "layouts" and treat the dict's values as rows (or lists of rows) similarly to the other branches, iterating over those values and filtering out control_keys (use the existing control_keys set and LANGUAGES_DIR reference) so legacy keyboard files produce the same char set as the runtime parser.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@tests/test_word_lists.py`:
- Around line 128-152: The current test unions characters from all layouts via
load_all_keyboard_chars(lang), which misses the requirement that
daily-selectable words must be typeable on the default layout; add a second
assertion that computes the daily candidate set (use load_daily_words(lang) or
compute load_word_list(lang) minus load_blocklist(lang)) and verify every
character in those daily words appears on the default keyboard layout
(load_keyboard(lang, layout="korean_2set") or the project’s equivalent API);
keep the existing all-layout check for guessability and add this default-layout
check (reference test_keyboard_covers_all_word_characters,
KEYBOARD_COVERAGE_XFAIL, load_word_list, load_daily_words, load_blocklist,
load_keyboard, load_all_keyboard_chars).
---
Nitpick comments:
In `@tests/conftest.py`:
- Around line 159-181: The helper load_all_keyboard_chars currently only handles
data shaped as {"layouts": ...} or a raw list and thus misses legacy top-level
dict keyboard JSONs; update it to detect when data is a dict but lacks "layouts"
and treat the dict's values as rows (or lists of rows) similarly to the other
branches, iterating over those values and filtering out control_keys (use the
existing control_keys set and LANGUAGES_DIR reference) so legacy keyboard files
produce the same char set as the runtime parser.
In `@webapp/data/languages/ko/language_config.json`:
- Around line 56-63: The JSON vowel-to-Jamo map is missing the ㅒ→ᅤ
normalization; add an entry mapping the Hangul character "ㅒ" to the
compatibility Jamo array ["ᅤ"] (insert it between the existing "ㅑ": ["ᅣ"] and
"ㅓ": ["ᅥ"] entries) so the language_config normalization matches the ko_keyboard
layouts and avoids Compatibility Jamo mismatches.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0e4d480f-fdc2-4a21-9735-f8dc8bc01da9
📒 Files selected for processing (8)
tests/conftest.pytests/test_language_config.pytests/test_word_lists.pywebapp/app.pywebapp/data/languages/de/language_config.jsonwebapp/data/languages/ko/ko_blocklist.txtwebapp/data/languages/ko/ko_keyboard.jsonwebapp/data/languages/ko/language_config.json
Korean diacritic_map is for internal Unicode normalization (Compatibility Jamo ↔ Hangul Jamo), not player-visible accent variants. The sub-key hints (ᄇ/ᆸ) are meaningless to players and add visual noise. Adds hide_diacritic_hints config flag — keeps hints working for European languages (German ä/ö/ü/ß, etc.) where they're genuinely useful.
Same fix as German ß (PR #155): "oe"→["œ"] and "ae"→["æ"] were multi-char keys that the char-by-char normalizer silently ignored. Changed to single-char mappings: "o"→["ô","œ"] and "a"→["à","â","æ"]. Players can now type 'o' to match œ and 'a' to match æ in French words.
Same fix as German ß (PR #155): "oe"→["œ"] and "ae"→["æ"] were multi-char keys that the char-by-char normalizer silently ignored. Changed to: "o"→["ô","œ"] and "a"→["à","â","æ"]. Players can now type 'o' to match œ and 'a' to match æ in French words.
… words These words contain compound final consonants not typeable on the default keyboard, and were added to ko_blocklist.txt in PR #155.
Summary
Korean keyboard was completely non-functional (issue #153). Root cause: keyboard keys used Compatibility Jamo (U+3130) while the word list used Hangul Jamo (U+1100) — visually identical but different Unicode codepoints, so every key press was silently rejected.
Changes
1. Universal diacritic normalization fix (
app.py)acceptable_charactersto chars used in words, also include diacritic base characters whose variants appear in the word list2. Korean
diacritic_map(language_config.json)3. Korean keyboard expansion (
ko_keyboard.json)4. Korean compound jongseong blocklist (
ko_blocklist.txt)5. Physical keyboard support (
game.ts,physical_key_map)event.code) to jamo, bypassing OS IME compositionphysical_key_map6. German ß fix (
de/language_config.json)"ss": ["ß"](multi-char, ignored by normalizer) →"s": ["ß"]7. Test improvements
load_all_keyboard_chars(): coverage tests check ALL layouts, not just defaultFuture work (documented in code)
Test plan
uv run pytest tests/— 2033 passed, 0 failedpnpm test— 81 passed, 0 faileduv run ruff check+pnpm format— cleansmatches words withßFixes #153