Resolves #126
Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
 $1,234.123~one thousand two hundred and thirty four point one two three dollars
 US $76.3 trillion~US seventy six point three trillion dollars
 US$76.3 trillion~seventy six point three trillion us dollars
+The price for each canned salmon is $5, each bottle of peanut butter is $3~The price for each canned salmon is five dollars, each bottle of peanut butter is three dollars
How did this one pass w/o Jenkins update?
Seems like the added test case was working w/o the weight update. Could you please replace it with the input from this issue (#126): "Thank you for the quantities. Now, lets talk about the pricing. The price for each canned salmon is \$5, each bottle of peanut butter is \$3"
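For readers unfamiliar with the test file, each line pairs a written input with its expected spoken form, separated by `~`. A minimal sketch of splitting such lines, assuming that layout; `parse_test_cases` is a hypothetical helper written for illustration and is not the repository's own test utility:

```python
from typing import List, Tuple


def parse_test_cases(lines: List[str]) -> List[Tuple[str, str]]:
    """Split 'written~spoken' lines into (input, expected) pairs.

    Hypothetical helper: it only illustrates the file layout seen in the diff.
    """
    cases = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on the first '~' only; everything after it is the expected output.
        written, sep, spoken = line.partition("~")
        if not sep:
            raise ValueError(f"missing '~' separator: {line!r}")
        cases.append((written, spoken))
    return cases


# Two rows copied from the diff context above.
SAMPLE = [
    "$1,234.123~one thousand two hundred and thirty four point one two three dollars",
    "US $76.3 trillion~US seventy six point three trillion dollars",
]
cases = parse_test_cases(SAMPLE)
```

A replacement test case built from the issue's sentence would use the same layout: the full input sentence, a `~`, then the expected normalized sentence.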
tbartley94 left a comment
Nothing major, so leaving an approving review for a quick merge.
This test doesn't work correctly for me with the original weights. Not sure how it passed the tests.
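One way to picture why a weight change can flip the result: when several taggers can accept the same token, the lowest total weight wins, so raising the weight of one tagger makes it less preferred. The sketch below is plain Python rather than the project's grammar code, and both the weight values and the assumption that the serial and money taggers compete for the `$5,` token are hypothetical:

```python
from typing import Dict


def pick_tagger(candidate_weights: Dict[str, float]) -> str:
    """Return the candidate with the lowest weight (lower weight = preferred)."""
    return min(candidate_weights, key=candidate_weights.get)


# Hypothetical weights, chosen only to illustrate the effect of the change.
before = {"money": 1.1, "serial": 1.05}  # serial is cheaper, so it wins
after = {"money": 1.1, "serial": 1.2}    # serial weight increased, money wins

winner_before = pick_tagger(before)
winner_after = pick_tagger(after)
```

If a test input already resolves to the intended tagger under the old weights, it passes with or without the change, which is the concern raised in the review.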
Quoted review (sent Wednesday, November 22, 2023 2:01:50 AM), Re: [NVIDIA/NeMo-text-processing] Increase weights for serial (en TN) (PR #128)
@ekmb requested changes on this pull request.
In tests/nemo_text_processing/en/data_text_normalization/test_cases_money.txt:
@@ -63,3 +63,4 @@ $1,925.21~one thousand nine hundred and twenty five dollars twenty one cents
$1,234.123~one thousand two hundred and thirty four point one two three dollars
US $76.3 trillion~US seventy six point three trillion dollars
US$76.3 trillion~seventy six point three trillion us dollars
+The price for each canned salmon is $5, each bottle of peanut butter is $3~The price for each canned salmon is five dollars, each bottle of peanut butter is three dollars
Seems like the added test case was working w/o the weight update. Could you please replace it with the input from this issue (#126): "Thank you for the quantities. Now, lets talk about the pricing. The price for each canned salmon is \$5, each bottle of peanut butter is \$3"
* Increase weights for serial (en TN)
  Resolves NVIDIA#126
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Add tests for fix
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Update Jenkinsfile cache path
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Update Jenkinsfile. Fix cache folder
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
---------
Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru>
* Increase weights for serial (en TN)
  Resolves #126
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Add tests for fix
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Update Jenkinsfile cache path
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Update Jenkinsfile. Fix cache folder
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
---------
Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix broken path for nondet whitelist (#124) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Increase weights for serial (en TN) (#128) * Increase weights for serial (en TN) Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126 Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Add tests for fix Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile cache path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile. 
Fix cache folder Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measures file for FR TN (#131) * add measures file Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update whitelist data Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * add fr tn tests Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh jenkins (#127) * Add SH tests to Jenkins Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkins tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add CI/CD tests for sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * docker build only if in test mode Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing variable Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix comments and remove arguments not required Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix commands not executing Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing arguments Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing quotes Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix incorrect path for tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Incorrect paths of tests and shunit2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix issues with paths as arguments to shunit Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Undo path change Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix intentional fail test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * revert redundant check for cased option Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix default path in export_grammars.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add interactive option Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add SH tests for cased EN ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update isort - fix precommit (#138) * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused imports Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian itn (#136) * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context for tests and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Revert "Added context for tests and fixed CodeQL errors" This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b. 
Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context to some test files and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unnecessary data Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * translated a few measurements to Armenian Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * adjusted some things for better readability and maintainer support Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed one test case and some issues Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix CI (#142) * fix whitelist deployment Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out tests to recreate grammars Signed-off-by: Evelina <ebakhturina@nvidia.com> * shorten test Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix jenkins Signed-off-by: Evelina <ebakhturina@nvidia.com> * cased for TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * revert debug changes Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix args default Signed-off-by: Evelina <ebakhturina@nvidia.com> * try parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * debug parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix sh tests for local SH launcher Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian TN (#137) * merged with main branch and fixed conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing some more conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixed a minor issue Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unused imports Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix: add "hy" language option for armenian Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> * added optional space for measurements after cardinals/decimals Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * added Armenian dot Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ara Yeroyan 
<60027241+Ara-Yeroyan@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Marathi ITN (#134) * Added Marathi ITN Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding jenkins test Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Travis Bartley <tbartley@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins fix (#150) * jenkins fix Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * missing _init_ for python Signed-off-by: Travis Bartley <tbartley@nvidia.com> * mislabled cache Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * r0.3.0 release (#151) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix text=line[text] to text=line[text_field] (#153) Signed-off-by: Sasha Meister <sasha.meister.work@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * use real string on docstring (#157) Signed-off-by: Kevin Sanders <kevin.sanders@dialpad.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh postprocess (#147) * 
Add support for postprocessor far in sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Choose between having a post processor or not Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update run_evaluate script for cased itn (#164) * update run_evaluate script for cased itn Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused function from ar tn decimals (#165) * remove unused function from ar tn decimals Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * ZH sentence-level TN (#112) * Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: 
Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana 
Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand 
Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed 
Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * 
increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * 
problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. 
if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci
* codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix numbers …
* IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix broken path for nondet whitelist (#124) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Increase weights for serial (en TN) (#128) * Increase weights for serial (en TN) Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126 Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Add tests for fix Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile cache path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile. 
Fix cache folder Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measures file for FR TN (#131) * add measures file Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update whitelist data Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * add fr tn tests Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh jenkins (#127) * Add SH tests to Jenkins Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkins tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add CI/CD tests for sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * docker build only if in test mode Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing variable Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix comments and remove arguments not required Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix commands not executing Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing arguments Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing quotes Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix incorrect path for tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Incorrect paths of tests and shunit2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix issues with paths as arguments to shunit Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Undo path change Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix intentional fail test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * revert redundant check for cased option Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix default path in export_grammars.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add interactive option Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add SH tests for cased EN ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update isort - fix precommit (#138) * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused imports Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian itn (#136) * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context for tests and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Revert "Added context for tests and fixed CodeQL errors" This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b. 
Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context to some test files and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unnecessary data Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * translated a few measurements to Armenian Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * adjusted some things for better readability and maintainer support Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed one test case and some issues Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix CI (#142) * fix whitelist deployment Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out tests to recreate grammars Signed-off-by: Evelina <ebakhturina@nvidia.com> * shorten test Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix jenkins Signed-off-by: Evelina <ebakhturina@nvidia.com> * cased for TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * revert debug changes Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix args default Signed-off-by: Evelina <ebakhturina@nvidia.com> * try parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * debug parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix sh tests for local SH launcher Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian TN (#137) * merged with main branch and fixed conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing some more conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixed a minor issue Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unused imports Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix: add "hy" language option for armenian Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> * added optional space for measurements after cardinals/decimals Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * added Armenian dot Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ara Yeroyan 
<60027241+Ara-Yeroyan@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Marathi ITN (#134) * Added Marathi ITN Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding jenkins test Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Travis Bartley <tbartley@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins fix (#150) * jenkins fix Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * missing _init_ for python Signed-off-by: Travis Bartley <tbartley@nvidia.com> * mislabled cache Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * r0.3.0 release (#151) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix text=line[text] to text=line[text_field] (#153) Signed-off-by: Sasha Meister <sasha.meister.work@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * use real string on docstring (#157) Signed-off-by: Kevin Sanders <kevin.sanders@dialpad.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh postprocess (#147) * 
Add support for postprocessor far in sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Choose between having a post processor or not Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update run_evaluate script for cased itn (#164) * update run_evaluate script for cased itn Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused function from ar tn decimals (#165) * remove unused function from ar tn decimals Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * ZH sentence-level TN (#112) * Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: 
Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana 
Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand 
Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed 
Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * 
increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * 
problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. 
if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <al…
* Increase weights for serial (en TN)
  Resolves #126
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Add tests for fix
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Update Jenkinsfile cache path
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Update Jenkinsfile. Fix cache folder
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
---------
Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix broken path for nondet whitelist (#124) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Increase weights for serial (en TN) (#128) * Increase weights for serial (en TN) Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126 Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Add tests for fix Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile cache path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile. 
Fix cache folder Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measures file for FR TN (#131) * add measures file Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update whitelist data Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * add fr tn tests Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh jenkins (#127) * Add SH tests to Jenkins Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkins tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add CI/CD tests for sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * docker build only if in test mode Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing variable Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix comments and remove arguments not required Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix commands not executing Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing arguments Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing quotes Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix incorrect path for tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Incorrect paths of tests and shunit2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix issues with paths as arguments to shunit Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Undo path change Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix intentional fail test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * revert redundant check for cased option Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix default path in export_grammars.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add interactive option Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add SH tests for cased EN ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update isort - fix precommit (#138) * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused imports Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian itn (#136) * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context for tests and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Revert "Added context for tests and fixed CodeQL errors" This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b. 
Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context to some test files and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unnecessary data Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * translated a few measurements to Armenian Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * adjusted some things for better readability and maintainer support Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed one test case and some issues Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix CI (#142) * fix whitelist deployment Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out tests to recreate grammars Signed-off-by: Evelina <ebakhturina@nvidia.com> * shorten test Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix jenkins Signed-off-by: Evelina <ebakhturina@nvidia.com> * cased for TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * revert debug changes Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix args default Signed-off-by: Evelina <ebakhturina@nvidia.com> * try parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * debug parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix sh tests for local SH launcher Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian TN (#137) * merged with main branch and fixed conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing some more conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixed a minor issue Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unused imports Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix: add "hy" language option for armenian Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> * added optional space for measurements after cardinals/decimals Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * added Armenian dot Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ara Yeroyan 
<60027241+Ara-Yeroyan@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Marathi ITN (#134) * Added Marathi ITN Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding jenkins test Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Travis Bartley <tbartley@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins fix (#150) * jenkins fix Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * missing _init_ for python Signed-off-by: Travis Bartley <tbartley@nvidia.com> * mislabled cache Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * r0.3.0 release (#151) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix text=line[text] to text=line[text_field] (#153) Signed-off-by: Sasha Meister <sasha.meister.work@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * use real string on docstring (#157) Signed-off-by: Kevin Sanders <kevin.sanders@dialpad.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh postprocess (#147) * 
Add support for postprocessor far in sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Choose between having a post processor or not Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update run_evaluate script for cased itn (#164) * update run_evaluate script for cased itn Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused function from ar tn decimals (#165) * remove unused function from ar tn decimals Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * ZH sentence-level TN (#112) * Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: 
Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana 
Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand 
Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed 
Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * 
increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * 
problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. 
if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <al…
* Increase weights for serial (en TN)

Resolves #126

Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>

* Add tests for fix

Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>

* Update Jenkinsfile cache path

Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>

* Update Jenkinsfile. Fix cache folder

Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>

---------

Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix broken path for nondet whitelist (#124) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Increase weights for serial (en TN) (#128) * Increase weights for serial (en TN) Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126 Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Add tests for fix Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile cache path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile. 
Fix cache folder Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measures file for FR TN (#131) * add measures file Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update whitelist data Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * add fr tn tests Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh jenkins (#127) * Add SH tests to Jenkins Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkins tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add CI/CD tests for sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * docker build only if in test mode Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing variable Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix comments and remove arguments not required Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix commands not executing Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing arguments Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing quotes Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix incorrect path for tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Incorrect paths of tests and shunit2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix issues with paths as arguments to shunit Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Undo path change Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix intentional fail test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * revert redundant check for cased option Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix default path in export_grammars.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add interactive option Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add SH tests for cased EN ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update isort - fix precommit (#138) * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused imports Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian itn (#136) * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context for tests and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Revert "Added context for tests and fixed CodeQL errors" This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b. 
Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context to some test files and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unnecessary data Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * translated a few measurements to Armenian Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * adjusted some things for better readability and maintainer support Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed one test case and some issues Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix CI (#142) * fix whitelist deployment Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out tests to recreate grammars Signed-off-by: Evelina <ebakhturina@nvidia.com> * shorten test Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix jenkins Signed-off-by: Evelina <ebakhturina@nvidia.com> * cased for TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * revert debug changes Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix args default Signed-off-by: Evelina <ebakhturina@nvidia.com> * try parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * debug parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix sh tests for local SH launcher Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian TN (#137) * merged with main branch and fixed conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing some more conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixed a minor issue Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unused imports Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix: add "hy" language option for armenian Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> * added optional space for measurements after cardinals/decimals Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * added Armenian dot Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ara Yeroyan 
<60027241+Ara-Yeroyan@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Marathi ITN (#134) * Added Marathi ITN Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding jenkins test Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Travis Bartley <tbartley@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins fix (#150) * jenkins fix Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * missing _init_ for python Signed-off-by: Travis Bartley <tbartley@nvidia.com> * mislabled cache Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * r0.3.0 release (#151) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix text=line[text] to text=line[text_field] (#153) Signed-off-by: Sasha Meister <sasha.meister.work@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * use real string on docstring (#157) Signed-off-by: Kevin Sanders <kevin.sanders@dialpad.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh postprocess (#147) * 
Add support for postprocessor far in sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Choose between having a post processor or not Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update run_evaluate script for cased itn (#164) * update run_evaluate script for cased itn Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused function from ar tn decimals (#165) * remove unused function from ar tn decimals Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * ZH sentence-level TN (#112) * Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: 
Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana 
Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand 
Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed 
Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * 
increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * 
problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. 
if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulars Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for language import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for Chinese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. Also removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated grammar according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar for money and updated according to last PR.
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the tagger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data naming error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unused imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unused import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unused variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unused imports Signed-off-by:
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed coding style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for duplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unused imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks.
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on national guideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and changes to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unused imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previous TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixes and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unused file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <al…
* IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix broken path for nondet whitelist (#124) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Increase weights for serial (en TN) (#128) * Increase weights for serial (en TN) Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126 Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Add tests for fix Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile cache path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile. 
Fix cache folder Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measures file for FR TN (#131) * add measures file Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update whitelist data Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * add fr tn tests Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh jenkins (#127) * Add SH tests to Jenkins Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkins tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add CI/CD tests for sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * docker build only if in test mode Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing variable Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix comments and remove arguments not required Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix commands not executing Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing arguments Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing quotes Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix incorrect path for tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Incorrect paths of tests and shunit2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix issues with paths as arguments to shunit Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Undo path change Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix intentional fail test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * revert redundant check for cased option Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix default path in export_grammars.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add interactive option Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add SH tests for cased EN ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update isort - fix precommit (#138) * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused imports Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian itn (#136) * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context for tests and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Revert "Added context for tests and fixed CodeQL errors" This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b. 
Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context to some test files and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unnecessary data Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * translated a few measurements to Armenian Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * adjusted some things for better readability and maintainer support Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed one test case and some issues Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix CI (#142) * fix whitelist deployment Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out tests to recreate grammars Signed-off-by: Evelina <ebakhturina@nvidia.com> * shorten test Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix jenkins Signed-off-by: Evelina <ebakhturina@nvidia.com> * cased for TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * revert debug changes Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix args default Signed-off-by: Evelina <ebakhturina@nvidia.com> * try parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * debug parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix sh tests for local SH launcher Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian TN (#137) * merged with main branch and fixed conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing some more conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixed a minor issue Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unused imports Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix: add "hy" language option for armenian Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> * added optional space for measurements after cardinals/decimals Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * added Armenian dot Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ara Yeroyan 
<60027241+Ara-Yeroyan@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Marathi ITN (#134) * Added Marathi ITN Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding jenkins test Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Travis Bartley <tbartley@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins fix (#150) * jenkins fix Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * missing _init_ for python Signed-off-by: Travis Bartley <tbartley@nvidia.com> * mislabled cache Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * r0.3.0 release (#151) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix text=line[text] to text=line[text_field] (#153) Signed-off-by: Sasha Meister <sasha.meister.work@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * use real string on docstring (#157) Signed-off-by: Kevin Sanders <kevin.sanders@dialpad.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh postprocess (#147) * 
Add support for postprocessor far in sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Choose between having a post processor or not Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update run_evaluate script for cased itn (#164) * update run_evaluate script for cased itn Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused function from ar tn decimals (#165) * remove unused function from ar tn decimals Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * ZH sentence-level TN (#112) * Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: 
Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana 
Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand 
Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed 
Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * 
increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * 
problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. 
if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <al…
* Increase weights for serial (en TN)
  Resolves #126
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Add tests for fix
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Update Jenkinsfile cache path
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Update Jenkinsfile. Fix cache folder
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
---------
Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Signed-off-by: Alex Cui <alexcui1994@gmail.com>
* IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix broken path for nondet whitelist (#124) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Increase weights for serial (en TN) (#128) * Increase weights for serial (en TN) Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126 Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Add tests for fix Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile cache path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile. 
Fix cache folder Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measures file for FR TN (#131) * add measures file Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update whitelist data Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * add fr tn tests Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh jenkins (#127) * Add SH tests to Jenkins Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkins tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add CI/CD tests for sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * docker build only if in test mode Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing variable Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix comments and remove arguments not required Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix commands not executing Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing arguments Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing quotes Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix incorrect path for tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Incorrect paths of tests and shunit2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix issues with paths as arguments to shunit Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Undo path change Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix intentional fail test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * revert redundant check for cased option Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix default path in export_grammars.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add interactive option Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add SH tests for cased EN ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update isort - fix precommit (#138) * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused imports Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian itn (#136) * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context for tests and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Revert "Added context for tests and fixed CodeQL errors" This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b. 
Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context to some test files and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unnecessary data Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * translated a few measurements to Armenian Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * adjusted some things for better readability and maintainer support Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed one test case and some issues Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix CI (#142) * fix whitelist deployment Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out tests to recreate grammars Signed-off-by: Evelina <ebakhturina@nvidia.com> * shorten test Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix jenkins Signed-off-by: Evelina <ebakhturina@nvidia.com> * cased for TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * revert debug changes Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix args default Signed-off-by: Evelina <ebakhturina@nvidia.com> * try parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * debug parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix sh tests for local SH launcher Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian TN (#137) * merged with main branch and fixed conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing some more conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixed a minor issue Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unused imports Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix: add "hy" language option for armenian Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> * added optional space for measurements after cardinals/decimals Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * added Armenian dot Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ara Yeroyan 
<60027241+Ara-Yeroyan@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Marathi ITN (#134) * Added Marathi ITN Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding jenkins test Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Travis Bartley <tbartley@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins fix (#150) * jenkins fix Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * missing _init_ for python Signed-off-by: Travis Bartley <tbartley@nvidia.com> * mislabled cache Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * r0.3.0 release (#151) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix text=line[text] to text=line[text_field] (#153) Signed-off-by: Sasha Meister <sasha.meister.work@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * use real string on docstring (#157) Signed-off-by: Kevin Sanders <kevin.sanders@dialpad.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh postprocess (#147) * 
Add support for postprocessor far in sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Choose between having a post processor or not Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update run_evaluate script for cased itn (#164) * update run_evaluate script for cased itn Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused function from ar tn decimals (#165) * remove unused function from ar tn decimals Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * ZH sentence-level TN (#112) * Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: 
Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana 
Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand 
Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed 
Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * 
increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * 
problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. 
if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <al…
* Increase weights for serial (en TN)
  Resolves NVIDIA#126
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Add tests for fix
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Update Jenkinsfile cache path
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Update Jenkinsfile. Fix cache folder
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
---------
Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Signed-off-by: Namrata Gachchi <ngachchi@nvidia.com>
* IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix broken path for nondet whitelist (#124) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Increase weights for serial (en TN) (#128) * Increase weights for serial (en TN) Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126 Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Add tests for fix Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile cache path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile. 
Fix cache folder Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measures file for FR TN (#131) * add measures file Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update whitelist data Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * add fr tn tests Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh jenkins (#127) * Add SH tests to Jenkins Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkins tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add CI/CD tests for sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * docker build only if in test mode Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing variable Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix comments and remove arguments not required Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix commands not executing Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing arguments Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing quotes Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix incorrect path for tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Incorrect paths of tests and shunit2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix issues with paths as arguments to shunit Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Undo path change Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix intentional fail test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * revert redundant check for cased option Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix default path in export_grammars.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add interactive option Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add SH tests for cased EN ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update isort - fix precommit (#138) * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused imports Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian itn (#136) * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context for tests and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Revert "Added context for tests and fixed CodeQL errors" This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b. 
Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context to some test files and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unnecessary data Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * translated a few measurements to Armenian Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * adjusted some things for better readability and maintainer support Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed one test case and some issues Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix CI (#142) * fix whitelist deployment Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out tests to recreate grammars Signed-off-by: Evelina <ebakhturina@nvidia.com> * shorten test Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix jenkins Signed-off-by: Evelina <ebakhturina@nvidia.com> * cased for TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * revert debug changes Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix args default Signed-off-by: Evelina <ebakhturina@nvidia.com> * try parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * debug parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix sh tests for local SH launcher Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian TN (#137) * merged with main branch and fixed conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing some more conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixed a minor issue Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unused imports Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix: add "hy" language option for armenian Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> * added optional space for measurements after cardinals/decimals Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * added Armenian dot Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ara Yeroyan 
<60027241+Ara-Yeroyan@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Marathi ITN (#134) * Added Marathi ITN Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding jenkins test Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Travis Bartley <tbartley@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins fix (#150) * jenkins fix Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * missing _init_ for python Signed-off-by: Travis Bartley <tbartley@nvidia.com> * mislabled cache Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * r0.3.0 release (#151) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix text=line[text] to text=line[text_field] (#153) Signed-off-by: Sasha Meister <sasha.meister.work@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * use real string on docstring (#157) Signed-off-by: Kevin Sanders <kevin.sanders@dialpad.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh postprocess (#147) * 
Add support for postprocessor far in sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Choose between having a post processor or not Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update run_evaluate script for cased itn (#164) * update run_evaluate script for cased itn Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused function from ar tn decimals (#165) * remove unused function from ar tn decimals Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * ZH sentence-level TN (#112) * Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: 
Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana 
Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand 
Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed 
Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * 
increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * 
problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. 
if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers with eight Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * rewrote tokenizer Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removed the file and replaced it with char in 1.8 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins file update Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * to fix tn bug@ xuesong Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * tn bug Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixeds and updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Alex Cui <alexcui1994@gmail.com> * adjustments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * testing commit Signed-off-by: Alex Cui <alexcui1994@gmail.com> * removing unsed file Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated test cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updating etst cases Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updates adapting to graphs Signed-off-by: Alex Cui <al…
* Increase weights for serial (en TN)
  Resolves NVIDIA#126
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Add tests for fix
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Update Jenkinsfile cache path
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
* Update Jenkinsfile. Fix cache folder
  Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
---------
Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Signed-off-by: Namrata Gachchi <ngachchi@nvidia.com>
* IT TN improvement on tests (#120) * add missing test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with time tests Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add sentence test cases Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * refine shortest path for irregular cardinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add single letter exception for roman numerals (#121) * add single letter exception for roman numerals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci dir Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix broken path for nondet whitelist (#124) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Increase weights for serial (en TN) (#128) * Increase weights for serial (en TN) Resolves https://github.com/NVIDIA/NeMo-text-processing/issues/126 Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Add tests for fix Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile cache path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Update Jenkinsfile. 
Fix cache folder Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measures file for FR TN (#131) * add measures file Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update whitelist data Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * add fr tn tests Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh jenkins (#127) * Add SH tests to Jenkins Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkins tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add CI/CD tests for sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * docker build only if in test mode Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing variable Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix comments and remove arguments not required Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix commands not executing Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing arguments Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Missing quotes Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix incorrect path for tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Incorrect paths of tests and shunit2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix issues with paths as arguments to shunit Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Undo path change Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix intentional fail test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * revert redundant check for cased option Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix default path in export_grammars.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add interactive option Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add SH tests for cased EN ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update isort - fix precommit (#138) * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * update isort version Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove unused imports Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian itn (#136) * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added Armenian ITN Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context for tests and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Revert "Added context for tests and fixed CodeQL errors" This reverts commit 2c804d941963c0be21d3aad07e6cd13568ab747b. 
Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * Added context to some test files and fixed CodeQL errors Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unnecessary data Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * translated a few measurements to Armenian Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * adjusted some things for better readability and maintainer support Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed one test case and some issues Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix CI (#142) * fix whitelist deployment Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * comment out tests to recreate grammars Signed-off-by: Evelina <ebakhturina@nvidia.com> * shorten test Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix jenkins Signed-off-by: Evelina <ebakhturina@nvidia.com> * cased for TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * revert debug changes Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix args default Signed-off-by: Evelina <ebakhturina@nvidia.com> * try parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * debug parallel Signed-off-by: Evelina <ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * rerun Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix sh tests for local SH launcher Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * enable all ci tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Armenian TN (#137) * merged with main branch and fixed conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixing some more conflicts Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * fixed a minor issue Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * deleted unused imports Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix: add "hy" language option for armenian Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> * added optional space for measurements after cardinals/decimals Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * added Armenian dot Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: David Sargsyan <d.sargsyan@ispras.ru> Signed-off-by: Ara Yeroyan <60027241+Ara-Yeroyan@users.noreply.github.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: David Sargsyan <d.sargsyan@ispras.ru> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Ara Yeroyan 
<60027241+Ara-Yeroyan@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Marathi ITN (#134) * Added Marathi ITN Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding jenkins test Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Chinmay Patil <chinmaypatil2000@gmail.com> Signed-off-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Signed-off-by: Travis Bartley <tbartley@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: tbartley94 <90423858+tbartley94@users.noreply.github.com> Co-authored-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * jenkins fix (#150) * jenkins fix Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * removing armenian to troubleshoot jenkins Signed-off-by: Travis Bartley <tbartley@nvidia.com> * missing _init_ for python Signed-off-by: Travis Bartley <tbartley@nvidia.com> * mislabled cache Signed-off-by: Travis Bartley <tbartley@nvidia.com> --------- Signed-off-by: Travis Bartley <tbartley@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * r0.3.0 release (#151) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Fix text=line[text] to text=line[text_field] (#153) Signed-off-by: Sasha Meister <sasha.meister.work@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * use real string on docstring (#157) Signed-off-by: Kevin Sanders <kevin.sanders@dialpad.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Sh postprocess (#147) * 
Add support for postprocessor far in sparrowhawk Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Choose between having a post processor or not Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * update run_evaluate script for cased itn (#164) * update run_evaluate script for cased itn Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * remove unused function from ar tn decimals (#165) * remove unused function from ar tn decimals Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update ci date Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * ZH sentence-level TN (#112) * Swedish telephone fix (#60) * port fix for telephone from swedish-itn branch Signed-off-by: Jim O'Regan <joregan@kth.se> * extend cardinal in non-deterministic mode Signed-off-by: Jim O'Regan <joregan@kth.se> * whitespace fixes Signed-off-by: Jim O'Regan <joregan@kth.se> * also fix in the verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * Update Jenkinsfile Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: 
Jim O’Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * log instead of print in graph_utils.py (#68) Signed-off-by: Enno Hermann <enno.hermann@idiap.ch> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * CER estimation speedup for audio-based text normalization (#73) * Replaced jiwer with editdistance to speed up CER estimation Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Vitaly Lavrukhin <vlavrukhin@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add measure coverage for TN and ITN (#62) * add measure coverage for TN and ITN Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * Remove unused imports Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update measure.py Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> --------- Signed-off-by: ealbasiri <ealbasiri@gradcenter.cuny.edu> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * upload es-ES, es-LA, fr-FR and it-IT g2p dicts (#63) * upload es-ES and fr-FR g2p dicts Signed-off-by: Mariana 
Graterol Fuenmayor <mgrafu@gmail.com> * add inits Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add NALA Spanish dict Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * rename Spanish and French dictionaries Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add Italian dictionary Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * add country codes from hu (#77) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix electronic case for username (#75) * fix electronic username w/o . Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix ar test Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * disable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci dirs, enable sv tests Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * 0.1.8 release (#79) Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Codeswitched ES/EN ITN (#78) * Initial commit for ES-EN codeswitched ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable export for es_en codeswitched ITN Signed-off-by: Anand 
Joseph <anajoseph@nvidia.com> * Add whitelist, update weights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add tests for en_es, zone tagged separately in es Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix path to test data for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update Jenkinsfile - enable ES/EN tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Add __init__.py files Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix issues with failed docker build - due to archiving of debian and issues with re2 Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove unused imports and variables Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update date Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Enable NBSP in sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update copyrights Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update cache path in for ES/EN CI/CD Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina <ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * minor normalize.py edit for usability (#84) * electronic verbalizer fallback (#81) * 0.1.8 release Signed-off-by: Evelina 
<ebakhturina@nvidia.com> * add elec fallback Signed-off-by: Evelina <ebakhturina@nvidia.com> * update ci Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * documentation edits for grammar/clarity Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * added --output_field flag for command line interface Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Linnea Pari Leaver <lleaver@lleaver-mlt.client.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Swedish ITN (#40) * force two digits for month Signed-off-by: Jim O'Regan <joregan@kth.se> * put it in a function, because I reject the garbage pre-commit.ci came up with Signed-off-by: Jim O'Regan <joregan@kth.se> * wrap some more pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * add graph pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * delete junk Signed-off-by: Jim O'Regan <joregan@kth.se> * my copyright Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser (copy from es) Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks Signed-off-by: Jim O'Regan <joregan@kth.se> * add date verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * add right tokens Signed-off-by: Jim O'Regan <joregan@kth.se> * some tweaks, more needed 
Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to ITN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaks to TN date tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * moved to tagger Signed-off-by: Jim O'Regan <joregan@kth.se> * nothing actually fixed here Signed-off-by: Jim O'Regan <joregan@kth.se> * now most tests pass Signed-off-by: Jim O'Regan <joregan@kth.se> * electronic Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fractions Signed-off-by: Jim O'Regan <joregan@kth.se> * extend Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * bare fractions is a bit of an overreach Signed-off-by: Jim O'Regan <joregan@kth.se> * whitelist Signed-off-by: Jim O'Regan <joregan@kth.se> * just inverting the TN whitelist tagger will not work/be useful Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from English Signed-off-by: Jim O'Regan <joregan@kth.se> * overwrite with version from en Signed-off-by: Jim O'Regan <joregan@kth.se> * add basic test case Signed-off-by: Jim O'Regan <joregan@kth.se> * fix call Signed-off-by: Jim O'Regan <joregan@kth.se> * swap tsv sides Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add optional_era variable Signed-off-by: Jim O'Regan <joregan@kth.se> * add test case Signed-off-by: Jim O'Regan <joregan@kth.se> * make deterministic default, like most of the others Signed-off-by: Jim O'Regan <joregan@kth.se> * also add lowercase versions Signed-off-by: Jim O'Regan <joregan@kth.se> * replacing NEMO_SPACE does not work either Signed-off-by: Jim O'Regan <joregan@kth.se> * 
increasing weight... did not work last time Signed-off-by: Jim O'Regan <joregan@kth.se> * tweaking test cases, in case it was a sentence splitting issue. It was not Signed-off-by: Jim O'Regan <joregan@kth.se> * put the full stops back Signed-off-by: Jim O'Regan <joregan@kth.se> * add filler words Signed-off-by: Jim O'Regan <joregan@kth.se> * try splitting this out to see if it makes a difference Signed-off-by: Jim O'Regan <joregan@kth.se> * aha, this part should be non-deterministic only Signed-off-by: Jim O'Regan <joregan@kth.se> * single line only Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert "increasing weight... did not work last time" This reverts commit 39b020b50db745dfd6b281c8cbca45a033926996. Signed-off-by: Jim O'Regan <joregan@kth.se> * disabling ITN here makes TN work again(?) Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "disabling ITN here makes TN work again(?)" This reverts commit be49d7d5c687876e51c2e9ce1cf1e01491df280f. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * changing the variable name fixes norm tests Signed-off-by: Jim O'Regan <joregan@kth.se> * change the variable names Signed-off-by: Jim O'Regan <joregan@kth.se> * add missing test tooling Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * copy telephone fixes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * add a piece for area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add country codes from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * extend any_read_digit for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * country/area codes for ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * first attempt Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * add to t&c Signed-off-by: Jim O'Regan <joregan@kth.se> * remove country codes for the time being, makes things ambiguous Signed-off-by: Jim O'Regan <joregan@kth.se> * basic test cases Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove trailing whitespace Signed-off-by: Jim O'Regan <joregan@kth.se> * Update __init__.py Signed-off-by: Jim O’Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transform of TN tests Signed-off-by: Jim O'Regan <joregan@kth.se> * basic transformation of TN decimal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * slight changes to date Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * include space Signed-off-by: Jim O'Regan <joregan@kth.se> * problem with tusen Signed-off-by: Jim O'Regan <joregan@kth.se> * 
problem with tusen was not that Signed-off-by: Jim O'Regan <joregan@kth.se> * add functions from hu Signed-off-by: Jim O'Regan <joregan@kth.se> * respect my own copyright xD Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading to constructor; had weirdness in this file, probably due to module-level python-suckage Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading, this has been an oddity before Signed-off-by: Jim O'Regan <joregan@kth.se> * try changing this year declaration Signed-off-by: Jim O'Regan <joregan@kth.se> * add year + era Signed-off-by: Jim O'Regan <joregan@kth.se> * eliminate more module-level data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "eliminate more module-level data loading" This reverts commit 6a26e5d927817e1308e818758196924441ff7b3a. Signed-off-by: Jim O'Regan <joregan@kth.se> * expose variables Signed-off-by: Jim O'Regan <joregan@kth.se> * extra param for itn mode Signed-off-by: Jim O'Regan <joregan@kth.se> * change call Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * change comment Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * fix parens Signed-off-by: Jim O'Regan <joregan@kth.se> * move data loading Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * adapt comments Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adapt/extend tests Signed-off-by: Jim O'Regan <joregan@kth.se> * fix dict init/change keys to something useful Signed-off-by: Jim O'Regan <joregan@kth.se> * initial stab at prefixed numbers Signed-off-by: Jim O'Regan <joregan@kth.se> * some adapting Signed-off-by: Jim O'Regan <joregan@kth.se> * insert kl. 
if absent Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comments Signed-off-by: Jim O'Regan <joregan@kth.se> * the relative prefixed times Signed-off-by: Jim O'Regan <joregan@kth.se> * + comments Signed-off-by: Jim O'Regan <joregan@kth.se> * enable time Signed-off-by: Jim O'Regan <joregan@kth.se> * space in both directions Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix hours to Signed-off-by: Jim O'Regan <joregan@kth.se> * split by before/after Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * fix if Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. 9 Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from en Signed-off-by: Jim O'Regan <joregan@kth.se> * keep only get_abs_path Signed-off-by: Jim O'Regan <joregan@kth.se> * imports Signed-off-by: Jim O'Regan <joregan@kth.se> * add trimmed file Signed-off-by: Jim O'Regan <joregan@kth.se> * fix imports Signed-off-by: Jim O'Regan <joregan@kth.se> * two abs_paths... could be fun Signed-off-by: Jim O'Regan <joregan@kth.se> * minutes/seconds Signed-off-by: Jim O'Regan <joregan@kth.se> * suffix Signed-off-by: Jim O'Regan <joregan@kth.se> * delete, not insert Signed-off-by: Jim O'Regan <joregan@kth.se> * one optional Signed-off-by: Jim O'Regan <joregan@kth.se> * export variable Signed-off-by: Jim O'Regan <joregan@kth.se> * kl. or one of suffix/zone Signed-off-by: Jim O'Regan <joregan@kth.se> * already disambiguated Signed-off-by: Jim O'Regan <joregan@kth.se> * closure Signed-off-by: Jim O'Regan <joregan@kth.se> * do not insert kl. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix test case Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix spelling Signed-off-by: Jim O'Regan <joregan@kth.se> * Delete measure.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Delete money.py Signed-off-by: Jim O’Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused test pieces Signed-off-by: Jim O'Regan <joregan@kth.se> * copy from es Signed-off-by: Jim O'Regan <joregan@kth.se> * add SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * add/update __init__ Signed-off-by: Jim O'Regan <joregan@kth.se> * blank line Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * fix lang Signed-off-by: Jim O'Regan <joregan@kth.se> * fix decimal verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix Signed-off-by: Jim O'Regan <joregan@kth.se> * remove year, conflicts with cardinal Signed-off-by: Jim O'Regan <joregan@kth.se> * space before, not after Signed-off-by: Jim O'Regan <joregan@kth.se> * fix cardinal tests Signed-off-by: Jim O'Regan <joregan@kth.se> * spurious deletion Signed-off-by: Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * unused imports Signed-off-by: Jim O'Regan <joregan@kth.se> * re-enable SV TN; enable SV ITN Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "re-enable SV TN; enable SV ITN" This reverts commit 3ce4dfde1f70a89afc274284f6e4c737b3fac95b. 
Signed-off-by: Jim O'Regan <joregan@kth.se> * fix singulras Signed-off-by: Jim O'Regan <joregan@kth.se> * add an export Signed-off-by: Jim O'Regan <joregan@kth.se> * change integer graph Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * move spaces Signed-off-by: Jim O'Regan <joregan@kth.se> * use cdrewrite Signed-off-by: Jim O'Regan <joregan@kth.se> * just EOS/BOS Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Jim O'Regan <joregan@kth.se> * omit en/ett, because they are also articles Signed-off-by: Jim O'Regan <joregan@kth.se> * uncomment Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * unused Signed-off-by: Jim O'Regan <joregan@kth.se> * strip spaces from decimal part Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * partial fix, not what I wanted Signed-off-by: Jim O'Regan <joregan@kth.se> * move comment Signed-off-by: Jim O'Regan <joregan@kth.se> * en/ett cannot work in itn case Signed-off-by: Jim O'Regan <joregan@kth.se> * be more deliberate in graph construction Signed-off-by: Jim O'Regan <joregan@kth.se> * accept both Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * +2 tests Signed-off-by: Jim O'Regan <joregan@kth.se> * (try to) accept singular quantities for plurals Signed-off-by: Jim O'Regan <joregan@kth.se> * retry Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * oops Signed-off-by: Jim O'Regan <joregan@kth.se> * replace Signed-off-by: Jim O'Regan 
<joregan@kth.se> * arcmap Signed-off-by: Jim O'Regan <joregan@kth.se> * version without ones Signed-off-by: Jim O'Regan <joregan@kth.se> * add another test Signed-off-by: Jim O'Regan <joregan@kth.se> * change graph Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of this, this is where it goes wrong Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * add a test Signed-off-by: Jim O'Regan <joregan@kth.se> * multiple states from both ones, try removing and readding Signed-off-by: Jim O'Regan <joregan@kth.se> * remove ones, see if that fixes at least the bare quantities Signed-off-by: Jim O'Regan <joregan@kth.se> * works in the repl, dunno why it still breaks Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate Signed-off-by: Jim O'Regan <joregan@kth.se> * move definition Signed-off-by: Jim O'Regan <joregan@kth.se> * simplify Signed-off-by: Jim O'Regan <joregan@kth.se> * tweak Signed-off-by: Jim O'Regan <joregan@kth.se> * another test Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * local declaration, seems to not be working Signed-off-by: Jim O'Regan <joregan@kth.se> * more tests Signed-off-by: Jim O'Regan <joregan@kth.se> * match verbaliser Signed-off-by: Jim O'Regan <joregan@kth.se> * fix last two failing tests Signed-off-by: Jim O'Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add missing tests for telephone and word Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused variable Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused imports Signed-off-by: 
Jim O'Regan <joregan@kth.se> * fix comment Signed-off-by: Jim O'Regan <joregan@kth.se> * get rid of convert_space, tests fail Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails Signed-off-by: Jim O'Regan <joregan@kth.se> * Revert "put convert_spaces back, change test file; pytest fails" This reverts commit a7bb7489137b8026aab02aff64df39e874630043. Signed-off-by: Jim O'Regan <joregan@kth.se> * put convert_spaces back, change test file; pytest fails, take 2 Signed-off-by: Jim O'Regan <joregan@kth.se> * deliberately remove spaces rather than have a non-determinism that comes out differently in sparrowhawk Signed-off-by: Jim O'Regan <joregan@kth.se> * try converting the non-breaking spaces in the shell script Signed-off-by: Jim O'Regan <joregan@kth.se> * wrong place Signed-off-by: Jim O'Regan <joregan@kth.se> * fix typo Signed-off-by: Jim O'Regan <joregan@kth.se> * fix path Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * export Signed-off-by: Jim O'Regan <joregan@kth.se> * remove unused Signed-off-by: Jim O'Regan <joregan@kth.se> * Update date.py Signed-off-by: Jim O’Regan <joregan@kth.se> * Update time.py Signed-off-by: Jim O’Regan <joregan@kth.se> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix comment Signed-off-by: Jim O’Regan <joregan@kth.se> * trim comments Signed-off-by: Jim O’Regan <joregan@kth.se> * remove commented line Signed-off-by: Jim O’Regan <joregan@kth.se> * en halv Signed-off-by: Jim O’Regan <joregan@kth.se> * Update test_sparrowhawk_inverse_text_normalization.sh Signed-off-by: Jim O’Regan <joregan@kth.se> --------- Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Jim O’Regan <joregan@kth.se> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Italian_TN (#67) * add TN italian Signed-off-by: 
GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix init Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix LOCATION Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * modify graph_utils Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * correct decimals Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix electronic Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> * fix measure Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> --------- Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> Signed-off-by: Giacomo Cavallini <giacomoleonemaria@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh itn (#74) * Add ZH ITN Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix copyrights and code cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Remove invalid tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Resolve CodeQL issues Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Cleanup Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix missing 'zh' option for ITN and correct comment Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update __init__.py Change to zh instead of en for the imports. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for decimal test data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * update for langauge import Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * update for Chinese punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * a new class for whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * PYNINI_AVAILABLE = False Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to file import format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * recreated due to format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * caught duplicates, removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates, arranges for CHInese Yuan updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates accordingly to the comments from last PR. Recreated some of the files due to format issues Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the hours_to and minute_to files used for back counting. ALso removed am and pm suffix files according to the last PR. Recreated some of them for format issue Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * re-added this file to avoid data file import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated gramamr according to last PR. Removed the acceptance of 千 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated according to last PR. Removed comma after decimal points Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for Fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * gramamr for money and updated according to last PR. 
Plus process of 元 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar. updates due to the updates in cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR comments. removing am and pm and allowing simple mandarin expression Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * arrangements Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added whitelist grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * word grammar for non-classified items Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated cardinal, decimal, time, itn data Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to last PR Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to the updates for cardinal grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for more Mandarin punctuations Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to last PR. removing am pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * adjustment on the weight Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the targger updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated accordingly to the time tagger Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates according to changes in tagger on am and pm Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * verbalizer for fraction Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for mandarin grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * kept this file because using English utils results in data namin error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * merge conflict Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed import os Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * deleted unsed variables Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: 
BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and edits based on pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue, reccreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * format issue recreated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fixed codeing style/format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed coding style and format Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicated graph for 毛 Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed the comment Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unnecessary comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * unnecessary comment removed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test file updated for more cases Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated with a comment explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated the file explaining why this file is kept Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added Mandarin as zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing for dplication Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * removed unused NEMO objects Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed duplicates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removing unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix test 
file failures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to fix file failtures Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failture Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to resolve test case failure Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adap to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adapt to grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates to adopt to cardinal grammar changes Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fix style Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixing pr checks Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed // for zhtn/itn cache Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Update inverse_normalize.py Added zh as a selection to pass Jenkins checks. 
Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: Alex Cui <alcui@nvidia.com> Co-authored-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * updated pynini_export.py file to create far files (#88) Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * readd Swedish (#87) Signed-off-by: Jim O'Regan <joregan@kth.se> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn 0712 (#89) * updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates and fixings according to document on natonal gideline Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * Decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fraction updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * money updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * ordinal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * punctuation grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * time gramamr updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * tokenizaer updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates on certificate Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * data updated and added due to updates and chanegs to the existing grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * cardinal updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * date grammar changed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * decimal grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updated Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * grammar added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * grammar updates Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test data added Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test python file edits Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for tn1.0 and previous tn grammar from contribution Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * test cases updated Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fixed Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * dates updated for init files Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated the date for zh Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed unsed imports Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * removed comments Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * added back the itn tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added back measure and math from previou TN Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated for tests reruns Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * updated weights Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zh tn char (#95) * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui 
<alexcui1994@gmail.com> * file name change Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * file name Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * code stle Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * fixed import error Signed-off-by: BuyuanCui <alexcui1994@gmail.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * audio-based TN fix for empty pred_text/text (#92) * fix for empty pred_text Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add unittests Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix path Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix pytest Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * pip 1.2.0 Signed-off-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * French tn (#91) * add tests for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add fr tn for cardinals, decimals, fractions and ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * delete it far files from tools Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * add languages to run_evaluate Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * remove ambiguous spacing Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * enable sh testing for fr tn Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix bug with ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile cache date Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * fix test for ordinals Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update tn cache for fr Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * resolve codeql issues Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Add whitelist_tech.tsv (#96) Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Zhitn 0727 (#93) * updates on itn grammar to pass sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updats for sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates fro sparrowhawk tests Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * coding style fix Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updates for coding style and sparrowhawk test Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * updated classes for tests on whitelist and word grammar Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for tests on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added for test on word Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on whitelist Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * added to run test on word 
Signed-off-by: BuyuanCui <alexcui1994@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update test_word.py Removed unused import. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_word.py Removed imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Removing imports according to CodeQL Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update test_whitelist.py Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> * Update Jenkinsfile changed zh cache to 07-27-23 as it is the latest update. Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> --------- Signed-off-by: BuyuanCui <alexcui1994@gmail.com> Signed-off-by: Buyuan(Alex) Cui <69030297+BuyuanCui@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Es tn romans fix (#98) * fix es tn roman exceptions Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update jenkinsfile Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update eval script for ITN Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * codeql fix Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Change docker image (#102) Change docker image to one including sparrowhawk Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Print warning instead exception (#97) * raise text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * text arg Signed-off-by: 
Nikolay Karpov <karpnv@gmail.com> * Failed text Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * add logger Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * logger Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * NeMo-text-processing Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * info level Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * rm raise Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Normalizer.select_verbalizer Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * Exception Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * verbose Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * restart ci Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Nikolay Karpov <nkarpov@nvidia.com> Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Nikolay Karpov <nkarpov@nvidia.com> Co-authored-by: Evelina <ebakhturina@nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * warning regardless of verbose flag (#107) * warning Signed-off-by: Nikolay Karpov <karpnv@gmail.com> * self.verbose Signed-off-by: Nikolay Karpov <karpnv@gmail.com> --------- Signed-off-by: Nikolay Karpov <karpnv@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Unpin 
setuptools (#106) Signed-off-by: Peter Plantinga <plantinga.peter@proton.me> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fixed warnings: File is not always closes. (#113) Signed-off-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Co-authored-by: Xuesong Yang <16880-xueyang@users.noreply.gitlab-master.nvidia.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * fix bug #111 (ar currencies) (#117) * fix bug #111 (ar currencies) Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> * update ci folder Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <mgrafu@gmail.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Logging clean up + IT TN fix (#118) * fix utils and it TN Signed-off-by: Evelina <ebakhturina@nvidia.com> * clean up Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix logging Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * fix format Signed-off-by: Evelina <ebakhturina@nvidia.com> * add IT TN to CI Signed-off-by: Evelina <ebakhturina@nvidia.com> * update patch Signed-off-by: Evelina <ebakhturina@nvidia.com> --------- Signed-off-by: Evelina <ebakhturina@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Alex Cui <alexcui1994@gmail.com> * Time_IT_TN (#105) * add time verbalizer Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add time tagger and verba Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * add pytest time Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci * codeQL Signed-off-by: GiacomoLeoneMaria <giacomoleonemaria@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix numbers …
What does this PR do?
Increases the weights for the serial grammar in English text normalization (en TN).
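As a rough illustration of why raising a weight changes the output: the grammars pick the lowest-cost analysis, so a higher weight on one class makes a competing class win. The sketch below is a toy model only. The class labels, the weight values, and the helper names `best_analysis` and `candidates_for_token` are hypothetical and are not taken from the actual grammars.

```python
# Toy illustration only: labels and weight values are hypothetical,
# not the real values used in the en TN grammars.

def best_analysis(candidates):
    """Return the label of the lowest-weight candidate (lower cost wins)."""
    return min(candidates, key=lambda c: c[1])[0]


def candidates_for_token(serial_weight, money_weight=1.0):
    """Two competing analyses of the same token, each with a cost."""
    return [("serial", serial_weight), ("money", money_weight)]


# With a low serial weight, the serial analysis is chosen.
before = best_analysis(candidates_for_token(serial_weight=0.9))
# After increasing the serial weight, the money analysis becomes cheaper.
after = best_analysis(candidates_for_token(serial_weight=1.1))

print(before, after)
```

The point of the sketch is only the direction of the effect: increasing the cost of one analysis leaves the others untouched but changes which one is selected.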
PR Type: