Merged
Collaborator
related issue: #109
Contributor
CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.
ekmb reviewed on Oct 19, 2023
tests/nemo_text_processing/zh/data_text_normalization/test_cases_cardinal.txt
nemo_text_processing/text_normalization/zh/taggers/tokenize_and_classify.py
Fixed
nemo_text_processing/text_normalization/zh/verbalizers/verbalize_final.py
Fixed
* port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se>
* extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se>
* whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se>
* also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se>
* Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se>
---------
Signed-off-by: Jim O'Regan <joregan@kth.se>
Signed-off-by: Jim O’Regan <joregan@kth.se>
Signed-off-by: Alex Cui <alexcui1994@gmail.com>
Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
---------
Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Update measure.py Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
---------
Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu>
Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
* add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
* rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
* add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
---------
Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <alexcui1994@gmail.com>
Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* fix electronic username w/o . Signed-off-by: Evelina <ebakhturina@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com>
* disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com>
* fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com>
* update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com>
---------
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <alexcui1994@gmail.com>
Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
* Enable export for es_en codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
* Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
* Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
* Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
* Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
* Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
* Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
* Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
* Enable NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
* Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
* Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
---------
Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com>
* add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com>
* update ci Signed-off-by: Evelina <ebakhturina@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
---------
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* electronic verbalizer fallback (#81)
* 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com>
* add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com>
* update ci Signed-off-by: Evelina <ebakhturina@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
---------
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
* documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
* added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
---------
Signed-off-by: Evelina <ebakhturina@nvidia.com>
Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com>
Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan 
<joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * 
problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. 
if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfd. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb748. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* add TN italian Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com>
* modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com>
* correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com>
* fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com>
* fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com>
---------
Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com>
Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com>
Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com>
Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. 
Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. 
removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing 
unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for duplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unused imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixes according to document on national guideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and changes to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * 
removed unused imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previous TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com>
removing punct graph imported Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com>
mgrafu
reviewed
Apr 26, 2024
nemo_text_processing/text_normalization/zh/verbalizers/verbalizer_adapt.py
mgrafu
reviewed
Apr 26, 2024
Update on test issue for Docker file locations Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com>
Fixing style. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com>
Removing because it's not one of the semiotic classes. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com>
…mbol.py Removing because it's not one of the semiotic classes. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com>
Updating Jenkins date Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com>
ekmb
approved these changes
Apr 30, 2024
What does this PR do ?
This PR fixes a normalization issue in the grammar: it did not process a whole sentence, only input consisting of a single word.
Add a one line overview of what this PR aims to accomplish.
Now it reads the whole sentence to perform the normalization.
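To illustrate the difference between word-level and sentence-level behaviour, here is a minimal sketch in plain Python. It does not use the NeMo grammars or pynini; the names `DIGITS`, `normalize_token` and `normalize_sentence` are hypothetical, and the digit-by-digit reading is a toy stand-in for the real cardinal grammar.

```python
import re

# Toy digit-to-reading table (assumption: a stand-in for a real cardinal grammar).
DIGITS = {"0": "零", "1": "一", "2": "二", "3": "三", "4": "四",
          "5": "五", "6": "六", "7": "七", "8": "八", "9": "九"}


def normalize_token(token: str) -> str:
    """Word-level behaviour: accept only input that is entirely one number."""
    if not re.fullmatch(r"[0-9]+", token):
        raise ValueError(f"not a single number token: {token!r}")
    # Read the number digit by digit (toy rule, not a real cardinal reading).
    return "".join(DIGITS[d] for d in token)


def normalize_sentence(sentence: str) -> str:
    """Sentence-level behaviour: normalize each digit run, keep other text."""
    return re.sub(r"[0-9]+", lambda m: normalize_token(m.group()), sentence)


if __name__ == "__main__":
    print(normalize_sentence("我有3个苹果"))
```

In this sketch, `normalize_token` rejects a full sentence because it only accepts a lone number, while `normalize_sentence` scans the whole input and rewrites only the spans it recognizes.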
Before your PR is "Ready for review"
Pre checks:
git commit -s to sign.
pytest or (if your machine does not have GPU) pytest --cpu from the root folder (given you marked your test cases accordingly @pytest.mark.run_only_on('CPU')).
bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
pytest and Sparrowhawk here.
__init__.py for every folder and subfolder, including data folder which has .TSV files?
Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. to all newly added Python files?
Copyright 2015 and onwards Google, Inc.. See an example here.
try import: ... except: ...) if not already done.
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.