Add Vietnamese measure text normalization support #307
tbartley94 merged 12 commits into NVIDIA:staging_vi_tn
Conversation
- Added measure tagger and verbalizer for Vietnamese TN
- Updated money tagger and verbalizer to handle per-unit measurements
- Added test cases for measure normalization
- Updated fraction handling for better integration
- Added data files for measurements, prefixes, and per-unit bases

Signed-off-by: folivoramanh <palasek182@gmail.com>
return pynini.union(*patterns)

def _build_all_magnitude_patterns(self):
Any way you can insert preserve_order in any of these, or is that too complicated?
Not applicable to cardinal number patterns in this context. Vietnamese cardinal numbers have a deterministic structure: they follow a fixed left-to-right magnitude ordering (millions → thousands → hundreds → units), so there is no alternative ordering that needs preserving.
If it's deterministic, then you can enforce order preservation, no? That would make the graph more efficient.
preserve_order isn't applicable here. Vietnamese cardinal numbers output only a single integer property (plus an optional negative). With a deterministic single-property output there is nothing to preserve, since no permutation occurs.
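The disagreement above is about how many field orderings a verbalizer may have to consider for one token. The following is only a toy model of that idea in plain Python, not NeMo or pynini code: `candidate_orderings` is a hypothetical helper, and the field names are taken from the debug output and comments in this thread.

```python
from itertools import permutations


def candidate_orderings(fields, preserve_order=False):
    """Toy model: list the field orderings a verbalizer would have to try.

    `candidate_orderings` is a hypothetical helper for illustration only.
    With preserve_order=True the tagged order is the only candidate.
    """
    if preserve_order:
        return [tuple(fields)]
    return list(permutations(fields))


# A token with a single property has exactly one ordering either way.
print(len(candidate_orderings(["integer"])))                              # 1
# With an optional second property, two orderings exist unless order is pinned.
print(len(candidate_orderings(["negative", "integer"])))                  # 2
print(len(candidate_orderings(["negative", "integer"], preserve_order=True)))  # 1
```

In this toy model, pinning the order only changes anything once a token carries more than one field.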
from nemo_text_processing.text_normalization.vi.verbalizers.cardinal import CardinalFst as VCardinalFst
from nemo_text_processing.text_normalization.vi.verbalizers.date import DateFst as VDateFst
from nemo_text_processing.text_normalization.vi.verbalizers.decimal import DecimalFst as VDecimalFst
from nemo_text_processing.text_normalization.vi.verbalizers.fraction import FractionFst as VFractionFst
Why are we using alias imports?
The alias imports are used because the taggers and verbalizers have conflicting class names; those verbalizers are imported to build range_fst.
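The same pattern can be seen with two standard-library modules that export an identically named function. This is a minimal stand-in, assuming `json` and `pickle` play the roles of the tagger and verbalizer modules; it does not import anything from nemo_text_processing.

```python
# Both modules export a function called `dumps`; importing both without
# aliases would make the second import silently shadow the first.
from json import dumps as json_dumps
from pickle import dumps as pickle_dumps

record = {"unit": "kg", "value": 1}

as_text = json_dumps(record)     # JSON text (str)
as_bytes = pickle_dumps(record)  # pickled payload (bytes)

print(type(as_text).__name__, type(as_bytes).__name__)  # str bytes
```

An alternative with the same effect is to import the modules themselves and qualify each call, which avoids inventing alias names at the cost of longer expressions.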
nemo_text_processing/text_normalization/vi/verbalizers/money.py
@@ -0,0 +1,139 @@
# Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
I'm confused about how this significantly differs from the current post-processing behavior.
Here's an example in English without post-processing:
NeMo-text-processing :: DEBUG :: tokens { name: "It's" } tokens { electronic { domain: "123.000USD" } } tokens { name: "(" } tokens { measure { cardinal { integer: "one" } units: "kilogram" } } tokens { name: ")." } tokens { name: "kinda" } tokens { name: "good," } tokens { name: "(" } tokens { cardinal { integer: "one hundred and twenty three" } } tokens { name: ")" } tokens { name: "thanks." }
It's one two three dot zero zero zero USD ( one kilogram ). kinda good, ( one hundred and twenty three ) thanks.
When punctuation sits next to a non-word class, the system treats them as separate tokens. During verbalization, output += ' ' + Normalizer.select_verbalizer(verbalizer_lattice) (in normalize.py) then adds a space between every punctuation token and its neighbour (such as "(" and "one kilogram"). So post_processing.py fixes spacing issues inherent to the token-by-token verbalization architecture.
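The effect described here can be reproduced with a small, self-contained sketch. It is an assumption-laden illustration rather than the contents of normalize.py or post_processing.py: `join_tokens` and `fix_punct_spacing` are hypothetical names, and the regular expressions cover only the parentheses from the example above.

```python
import re


def join_tokens(verbalized):
    """Mimic token-by-token verbalization: every piece is prefixed with a space."""
    output = ""
    for piece in verbalized:
        output += " " + piece
    return output.strip()


def fix_punct_spacing(text):
    """Hypothetical post-processing step: drop spaces just inside parentheses."""
    text = re.sub(r"\(\s+", "(", text)   # "( one" -> "(one"
    text = re.sub(r"\s+\)", ")", text)   # "kilogram )" -> "kilogram)"
    return text


pieces = [
    "It's", "one two three dot zero zero zero USD", "(", "one kilogram", ").",
    "kinda", "good,", "(", "one hundred and twenty three", ")", "thanks.",
]

raw = join_tokens(pieces)
print(raw)
# It's one two three dot zero zero zero USD ( one kilogram ). kinda good, ( one hundred and twenty three ) thanks.
print(fix_punct_spacing(raw))
# It's one two three dot zero zero zero USD (one kilogram). kinda good, (one hundred and twenty three) thanks.
```

A real post-processing pass would also need rules for quotes and other punctuation; those are left out of this sketch.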
* Add Vietnamese measure text normalization support
  - Added measure tagger and verbalizer for Vietnamese TN
  - Updated money tagger and verbalizer to handle per-unit measurements
  - Added test cases for measure normalization
  - Updated fraction handling for better integration
  - Added data files for measurements, prefixes, and per-unit bases
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* add test case for range measure
* additional support for cardinal and remove duplicate test case
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* refractor cardinal and add test cases
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* remove duplicate lines in run_eval file
* refractor minor code
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* add measure support for unit per unit cases
* [pre-commit.ci] auto fixes from pre-commit.com hooks

---------

Signed-off-by: folivoramanh <palasek182@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Fix Jenkinsfile for CI (#325)
  * Fix Jenkinsfile for CI
  * Fix requirements for test
  * Update paths and docker
  * Fix docker name
  * Fix click version
  * Change path of grammars for sparrowhawk tests
  * Update paths in sh_test.sh
  * Update paths
  * Revert paths
* PR: Add Vietnamese text normalization for cardinal semiotic class (#289)
  * Add Vietnamese text normalization for cardinal semiotic class
  * Add missing init file
  * Fix Cardinal and optimize logic
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
* Ordinal and Decimal for Vietnamese TN (#290)
  * Add Vietnamese text normalization for ordinal and decimal semiotic classes
  * update sparrowhawk
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * refractor decimal code and docstring
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
* Vietnamese TN - Fraction (#296)
  * Fraction class for Vietnamese TN
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * Remove irrelavant test case
  * Remove irrelavant test case
* Date Semiotic Class for Vietnamese TN (#298)
  * Date for vietnamese TN
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * Add roman support and correct copyright header
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * change header to current year
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * change header time
* Time - semiotic class for Vietnamese TN (#302)
  * Time - semiotic class for Vietnamese TN
  * remove irrelevant import and comment
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * add comment and refractor pattern
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * Change the spaces to NEMO_SPACE for maintenance.
  * Change the spaces to NEMO_SPACE for maintenance.
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * Change the spaces to NEMO_SPACE for maintenance. - remove quote
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
* Add Vietnamese TN support for Money and Range semiotic classes (#304)
  * Add Vietnamese TN support for Money and Range semiotic classes
    - Add money.py tagger and verbalizer for Vietnamese currency handling
    - Add range.py tagger for numerical range processing
    - Add supporting data files for money (currency, currency_minor, per_unit)
    - Add quantity abbreviations and time units data
    - Update existing taggers and verbalizers for integration
    - Add comprehensive test cases for money and range functionality
    - Update tokenize_and_classify to include new semiotic classes
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * modify illogical test cases
  * refractor and simplify word and punctuation to avoid hardcoding
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * refractor code money range
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
* Add Vietnamese measure text normalization support (#307)
  * Add Vietnamese measure text normalization support
    - Added measure tagger and verbalizer for Vietnamese TN
    - Updated money tagger and verbalizer to handle per-unit measurements
    - Added test cases for measure normalization
    - Updated fraction handling for better integration
    - Added data files for measurements, prefixes, and per-unit bases
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * add test case for range measure
  * additional support for cardinal and remove duplicate test case
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * refractor cardinal and add test cases
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * remove duplicate lines in run_eval file
  * refractor minor code
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * add measure support for unit per unit cases
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
* Vietnamese MRC 1.0 fix case (#312)
  * fix and add cases
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
* Fix word range (#334)
  * fix range and quote
  * fix quote in post process
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * fix quote and range
* Date time itn (#333)
  * improve numeric semiotic classes
  * Fix Jenkinsfile for CI (#325), with the same nine commits as listed under #325 above
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * revert old codes
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * revert not inherit
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * improve date time
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * fix pynini union instead of union operator
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * improve measure, telephone, electronic
  * [pre-commit.ci] auto fixes from pre-commit.com hooks
  * change union operator to pynini union
  * [pre-commit.ci] auto fixes from pre-commit.com hooks

---------

Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
Signed-off-by: folivoramanh <palasek182@gmail.com>
Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* Add Vietnamese measure text normalization support
  - Added measure tagger and verbalizer for Vietnamese TN
  - Updated money tagger and verbalizer to handle per-unit measurements
  - Added test cases for measure normalization
  - Updated fraction handling for better integration
  - Added data files for measurements, prefixes, and per-unit bases
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* add test case for range measure
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* additional support for cardinal and remove duplicate test case
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* refractor cardinal and add test cases
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* remove duplicate lines in run_eval file
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* refractor minor code
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* add measure support for unit per unit cases
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
---------
Signed-off-by: folivoramanh <palasek182@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
* Fix Jenkinsfile for CI (NVIDIA#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * PR: Add Vietnamese text normalization for cardinal semiotic class (NVIDIA#289) * Add Vietnamese text normalization for cardinal semiotic class Signed-off-by: Mai Anh <palasek182@gmail.com> * Add missing init file Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix Cardinal and optimize logic Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Ordinal and Decimal for Vietnamese TN (NVIDIA#290) * Add Vietnamese text normalization for ordinal and decimal semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * update sparrowhawk Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor decimal code and docstring Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci 
--------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Vietnamese TN - Fraction (NVIDIA#296) * Fraction class for Vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove irrelavant test case Signed-off-by: Mai Anh <palasek182@gmail.com> * Remove irrelavant test case Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Date Semiotic Class for Vietnamese TN (NVIDIA#298) * Date for vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add roman support and correct copyright header Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header to current year Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header time Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Time - semiotic class for Vietnamese TN (NVIDIA#302) * Time - semiotic class for Vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * remove irrelevant import and comment Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add comment and refractor pattern 
Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: Mai Anh <palasek182@gmail.com> * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. - remove quote Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Add Vietnamese TN support for Money and Range semiotic classes (NVIDIA#304) * Add Vietnamese TN support for Money and Range semiotic classes - Add money.py tagger and verbalizer for Vietnamese currency handling - Add range.py tagger for numerical range processing - Add supporting data files for money (currency, currency_minor, per_unit) - Add quantity abbreviations and time units data - Update existing taggers and verbalizers for integration - Add comprehensive test cases for money and range functionality - Update tokenize_and_classify to include new semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * modify illogical test cases Signed-off-by: Mai Anh <palasek182@gmail.com> * refractor and simplify word and punctuation to avoid hardcoding Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor code money range Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks 
for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Add Vietnamese measure text normalization support (NVIDIA#307) * Add Vietnamese measure text normalization support - Added measure tagger and verbalizer for Vietnamese TN - Updated money tagger and verbalizer to handle per-unit measurements - Added test cases for measure normalization - Updated fraction handling for better integration - Added data files for measurements, prefixes, and per-unit bases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mai Anh <palasek182@gmail.com> * add test case for range measure Signed-off-by: Mai Anh <palasek182@gmail.com> * additional support for cardinal and remove duplicate test case Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor cardinal and add test cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate lines in run_eval file Signed-off-by: Mai Anh <palasek182@gmail.com> * refractor minor code Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add measure support for unit per unit cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Vietnamese MRC 1.0 fix case 
(NVIDIA#312) * fix and add cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix word range (NVIDIA#334) * fix range and quote Signed-off-by: Mai Anh <palasek182@gmail.com> * fix quote in post process Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix quote and range Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Date time itn (NVIDIA#333) * improve numeric semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix Jenkinsfile for CI (NVIDIA#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mai Anh <palasek182@gmail.com> * 
revert old codes Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert not inherit Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve date time Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix pynini union instead of union operator Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve measure, telephone, electronic Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change union operator to pynini union Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com>
* Add Vietnamese measure text normalization support - Added measure tagger and verbalizer for Vietnamese TN - Updated money tagger and verbalizer to handle per-unit measurements - Added test cases for measure normalization - Updated fraction handling for better integration - Added data files for measurements, prefixes, and per-unit bases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: folivoramanh <palasek182@gmail.com> * add test case for range measure Signed-off-by: folivoramanh <palasek182@gmail.com> * additional support for cardinal and remove duplicate test case Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor cardinal and add test cases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate lines in run_eval file Signed-off-by: folivoramanh <palasek182@gmail.com> * refractor minor code Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add measure support for unit per unit cases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
* Fix Jenkinsfile for CI (NVIDIA#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * PR: Add Vietnamese text normalization for cardinal semiotic class (NVIDIA#289) * Add Vietnamese text normalization for cardinal semiotic class Signed-off-by: folivoramanh <palasek182@gmail.com> * Add missing init file Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix Cardinal and optimize logic Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Ordinal and Decimal for Vietnamese TN (NVIDIA#290) * Add Vietnamese text normalization for ordinal and decimal semiotic classes Signed-off-by: folivoramanh <palasek182@gmail.com> * update sparrowhawk Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor decimal code and docstring Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for 
more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Vietnamese TN - Fraction (NVIDIA#296) * Fraction class for Vietnamese TN Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove irrelavant test case Signed-off-by: folivoramanh <palasek182@gmail.com> * Remove irrelavant test case Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Date Semiotic Class for Vietnamese TN (NVIDIA#298) * Date for vietnamese TN Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add roman support and correct copyright header Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header to current year Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header time Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Time - semiotic class for Vietnamese TN (NVIDIA#302) * Time - semiotic class for Vietnamese TN Signed-off-by: folivoramanh <palasek182@gmail.com> * remove irrelevant import and comment Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto 
fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add comment and refractor pattern Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: folivoramanh <palasek182@gmail.com> * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. - remove quote Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Add Vietnamese TN support for Money and Range semiotic classes (NVIDIA#304) * Add Vietnamese TN support for Money and Range semiotic classes - Add money.py tagger and verbalizer for Vietnamese currency handling - Add range.py tagger for numerical range processing - Add supporting data files for money (currency, currency_minor, per_unit) - Add quantity abbreviations and time units data - Update existing taggers and verbalizers for integration - Add comprehensive test cases for money and range functionality - Update tokenize_and_classify to include new semiotic classes Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * modify illogical test cases Signed-off-by: folivoramanh <palasek182@gmail.com> * refractor and simplify word and punctuation to avoid hardcoding Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * refractor code money range Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Add Vietnamese measure text normalization support (NVIDIA#307) * Add Vietnamese measure text normalization support - Added measure tagger and verbalizer for Vietnamese TN - Updated money tagger and verbalizer to handle per-unit measurements - Added test cases for measure normalization - Updated fraction handling for better integration - Added data files for measurements, prefixes, and per-unit bases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: folivoramanh <palasek182@gmail.com> * add test case for range measure Signed-off-by: folivoramanh <palasek182@gmail.com> * additional support for cardinal and remove duplicate test case Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor cardinal and add test cases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate lines in run_eval file Signed-off-by: folivoramanh <palasek182@gmail.com> * refractor minor code Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add measure support for unit per unit cases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see 
https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Vietnamese MRC 1.0 fix case (NVIDIA#312) * fix and add cases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix word range (NVIDIA#334) * fix range and quote Signed-off-by: folivoramanh <palasek182@gmail.com> * fix quote in post process Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix quote and range Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Date time itn (NVIDIA#333) * improve numeric semiotic classes Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix Jenkinsfile for CI (NVIDIA#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths 
Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: folivoramanh <palasek182@gmail.com> * revert old codes Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert not inherit Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve date time Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix pynini union instead of union operator Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve measure, telephone, electronic Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change union operator to pynini union Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
* Fix Jenkinsfile for CI (NVIDIA#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * PR: Add Vietnamese text normalization for cardinal semiotic class (NVIDIA#289) * Add Vietnamese text normalization for cardinal semiotic class Signed-off-by: Mai Anh <palasek182@gmail.com> * Add missing init file Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix Cardinal and optimize logic Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Ordinal and Decimal for Vietnamese TN (NVIDIA#290) * Add Vietnamese text normalization for ordinal and decimal semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * update sparrowhawk Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor decimal code and docstring Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci 
--------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Vietnamese TN - Fraction (NVIDIA#296) * Fraction class for Vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove irrelavant test case Signed-off-by: Mai Anh <palasek182@gmail.com> * Remove irrelavant test case Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Date Semiotic Class for Vietnamese TN (NVIDIA#298) * Date for vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add roman support and correct copyright header Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header to current year Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header time Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Time - semiotic class for Vietnamese TN (NVIDIA#302) * Time - semiotic class for Vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * remove irrelevant import and comment Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add comment and refractor pattern 
Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: Mai Anh <palasek182@gmail.com> * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. - remove quote Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Add Vietnamese TN support for Money and Range semiotic classes (NVIDIA#304) * Add Vietnamese TN support for Money and Range semiotic classes - Add money.py tagger and verbalizer for Vietnamese currency handling - Add range.py tagger for numerical range processing - Add supporting data files for money (currency, currency_minor, per_unit) - Add quantity abbreviations and time units data - Update existing taggers and verbalizers for integration - Add comprehensive test cases for money and range functionality - Update tokenize_and_classify to include new semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * modify illogical test cases Signed-off-by: Mai Anh <palasek182@gmail.com> * refractor and simplify word and punctuation to avoid hardcoding Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor code money range Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks 
for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Add Vietnamese measure text normalization support (NVIDIA#307) * Add Vietnamese measure text normalization support - Added measure tagger and verbalizer for Vietnamese TN - Updated money tagger and verbalizer to handle per-unit measurements - Added test cases for measure normalization - Updated fraction handling for better integration - Added data files for measurements, prefixes, and per-unit bases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mai Anh <palasek182@gmail.com> * add test case for range measure Signed-off-by: Mai Anh <palasek182@gmail.com> * additional support for cardinal and remove duplicate test case Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor cardinal and add test cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate lines in run_eval file Signed-off-by: Mai Anh <palasek182@gmail.com> * refractor minor code Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add measure support for unit per unit cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Vietnamese MRC 1.0 fix case 
(NVIDIA#312) * fix and add cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix word range (NVIDIA#334) * fix range and quote Signed-off-by: Mai Anh <palasek182@gmail.com> * fix quote in post process Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix quote and range Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Date time itn (NVIDIA#333) * improve numeric semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix Jenkinsfile for CI (NVIDIA#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mai Anh <palasek182@gmail.com> * 
revert old codes Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert not inherit Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve date time Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix pynini union instead of union operator Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve measure, telephone, electronic Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change union operator to pynini union Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com>
* PR: Add Vietnamese text normalization for cardinal semiotic class (#289) * Add Vietnamese text normalization for cardinal semiotic class Signed-off-by: folivoramanh <palasek182@gmail.com> * Add missing init file Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix Cardinal and optimize logic Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Ordinal and Decimal for Vietnamese TN (#290) * Add Vietnamese text normalization for ordinal and decimal semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * update sparrowhawk Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor decimal code and docstring Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Vietnamese TN - Fraction (#296) * Fraction class for Vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove irrelavant test case Signed-off-by: Mai Anh <palasek182@gmail.com> * Remove irrelavant test case Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Date Semiotic Class for Vietnamese TN (#298) * Date for vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add roman support and correct copyright header Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header to current year Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header time Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Time - semiotic class for Vietnamese TN (#302) * Time - semiotic class for Vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * remove irrelevant import and comment Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add comment and refractor pattern Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: Mai Anh <palasek182@gmail.com> * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. 
- remove quote Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Add Vietnamese TN support for Money and Range semiotic classes (#304) * Add Vietnamese TN support for Money and Range semiotic classes - Add money.py tagger and verbalizer for Vietnamese currency handling - Add range.py tagger for numerical range processing - Add supporting data files for money (currency, currency_minor, per_unit) - Add quantity abbreviations and time units data - Update existing taggers and verbalizers for integration - Add comprehensive test cases for money and range functionality - Update tokenize_and_classify to include new semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * modify illogical test cases Signed-off-by: Mai Anh <palasek182@gmail.com> * refractor and simplify word and punctuation to avoid hardcoding Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor code money range Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Add Vietnamese measure text normalization support (#307) * Add Vietnamese measure text normalization support - Added measure tagger 
and verbalizer for Vietnamese TN - Updated money tagger and verbalizer to handle per-unit measurements - Added test cases for measure normalization - Updated fraction handling for better integration - Added data files for measurements, prefixes, and per-unit bases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mai Anh <palasek182@gmail.com> * add test case for range measure Signed-off-by: Mai Anh <palasek182@gmail.com> * additional support for cardinal and remove duplicate test case Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor cardinal and add test cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate lines in run_eval file Signed-off-by: Mai Anh <palasek182@gmail.com> * refractor minor code Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add measure support for unit per unit cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Vietnamese MRC 1.0 fix case (#312) * fix and add cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix Jenkinsfile for CI (#325) (#327) * Fix Jenkinsfile for CI * Fix requirements for test * Update paths and docker * Fix docker name * Fix click version * Change path of grammars for sparrowhawk tests * Update paths in sh_test.sh * Update paths * Revert paths --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix word range (#334) * fix range and quote Signed-off-by: Mai Anh <palasek182@gmail.com> * fix quote in post process Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix quote and range Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Date time itn (#333) * improve numeric semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix Jenkinsfile for CI (#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * 
Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mai Anh <palasek182@gmail.com> * revert old codes Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert not inherit Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve date time Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix pynini union instead of union operator Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve measure, telephone, electronic Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change union operator to pynini union Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Staging vi tn signed off (#339) * Fix Jenkinsfile for CI (#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix 
requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * PR: Add Vietnamese text normalization for cardinal semiotic class (#289) * Add Vietnamese text normalization for cardinal semiotic class Signed-off-by: Mai Anh <palasek182@gmail.com> * Add missing init file Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix Cardinal and optimize logic Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Ordinal and Decimal for Vietnamese TN (#290) * Add Vietnamese text normalization for ordinal and decimal semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * update sparrowhawk Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor decimal code and docstring Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Vietnamese TN - Fraction (#296) * Fraction class for Vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove irrelavant test case Signed-off-by: Mai Anh <palasek182@gmail.com> * Remove irrelavant test case Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Date Semiotic Class for Vietnamese TN (#298) * Date for vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add roman support and correct copyright header Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header to current year Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header time Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Time - semiotic class for Vietnamese TN (#302) * Time - semiotic class for Vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * remove irrelevant import and comment Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add comment and refractor pattern Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: Mai Anh <palasek182@gmail.com> * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. - remove quote Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Add Vietnamese TN support for Money and Range semiotic classes (#304) * Add Vietnamese TN support for Money and Range semiotic classes - Add money.py tagger and verbalizer for Vietnamese currency handling - Add range.py tagger for numerical range processing - Add supporting data files for money (currency, currency_minor, per_unit) - Add quantity abbreviations and time units data - Update existing taggers and verbalizers for integration - Add comprehensive test cases for money and range functionality - Update tokenize_and_classify to include new semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * modify illogical test cases Signed-off-by: Mai Anh <palasek182@gmail.com> * refractor and simplify word and punctuation to avoid hardcoding Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor code money range Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> 
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Add Vietnamese measure text normalization support (#307) * Add Vietnamese measure text normalization support - Added measure tagger and verbalizer for Vietnamese TN - Updated money tagger and verbalizer to handle per-unit measurements - Added test cases for measure normalization - Updated fraction handling for better integration - Added data files for measurements, prefixes, and per-unit bases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mai Anh <palasek182@gmail.com> * add test case for range measure Signed-off-by: Mai Anh <palasek182@gmail.com> * additional support for cardinal and remove duplicate test case Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor cardinal and add test cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate lines in run_eval file Signed-off-by: Mai Anh <palasek182@gmail.com> * refractor minor code Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add measure support for unit per unit cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Vietnamese MRC 1.0 fix case (#312) * fix and add cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix word range (#334) * fix range and quote Signed-off-by: Mai Anh <palasek182@gmail.com> * fix quote in post process Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix quote and range Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Date time itn (#333) * improve numeric semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix Jenkinsfile for CI (#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mai Anh <palasek182@gmail.com> * revert old codes Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * revert not inherit Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve date time Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix pynini union instead of union operator Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve measure, telephone, electronic Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change union operator to pynini union Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Comma bugfix for En electronics (#332) * fix bug with commas and electronics Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update jenkins Signed-off-by: Mariana 
Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * remove unuse import (#340) Signed-off-by: Mai Anh <palasek182@gmail.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Update Jenkinsfile (#341) Only mount TestData from path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] pre-commit suggestions (#335) updates: - [github.com/pre-commit/pre-commit-hooks: v5.0.0 → v6.0.0](pre-commit/pre-commit-hooks@v5.0.0...v6.0.0) - [github.com/PyCQA/flake8: 7.2.0 → 7.3.0](PyCQA/flake8@7.2.0...7.3.0) - [github.com/PyCQA/isort: 6.0.1 → 6.1.0](PyCQA/isort@6.0.1...6.1.0) - https://github.com/psf/black → https://github.com/psf/black-pre-commit-mirror - [github.com/psf/black-pre-commit-mirror: 25.1.0 → 25.9.0](psf/black-pre-commit-mirror@25.1.0...25.9.0) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * update jenkins cache Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * fill missing lang in arg run (#347) Signed-off-by: Mai Anh <palasek182@gmail.com> Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv 
<105917641+anand-nv@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
* Add Vietnamese measure text normalization support
  - Added measure tagger and verbalizer for Vietnamese TN
  - Updated money tagger and verbalizer to handle per-unit measurements
  - Added test cases for measure normalization
  - Updated fraction handling for better integration
  - Added data files for measurements, prefixes, and per-unit bases
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* add test case for range measure
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* additional support for cardinal and remove duplicate test case
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* refractor cardinal and add test cases
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* remove duplicate lines in run_eval file
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* refractor minor code
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* add measure support for unit per unit cases
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
---------
Signed-off-by: folivoramanh <palasek182@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
* Fix Jenkinsfile for CI (#325)
  * Fix Jenkinsfile for CI
    Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
  * Fix requirements for test
    Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
  * Update paths and docker
    Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
  * Fix docker name
    Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
  * Fix click version
    Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
  * Change path of grammars for sparrowhawk tests
    Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
  * Update paths in sh_test.sh
    Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
  * Update paths
    Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
  * Revert paths
    Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
  ---------
  Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* PR: Add Vietnamese text normalization for cardinal semiotic class (#289)
  * Add Vietnamese text normalization for cardinal semiotic class
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * Add missing init file
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * Fix Cardinal and optimize logic
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  ---------
  Signed-off-by: folivoramanh <palasek182@gmail.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* Ordinal and Decimal for Vietnamese TN (#290)
  * Add Vietnamese text normalization for ordinal and decimal semiotic classes
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * update sparrowhawk
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * refractor decimal code and docstring
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  ---------
  Signed-off-by: folivoramanh <palasek182@gmail.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* Vietnamese TN - Fraction (#296)
  * Fraction class for Vietnamese TN
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * Remove irrelavant test case
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * Remove irrelavant test case
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  ---------
  Signed-off-by: folivoramanh <palasek182@gmail.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* Date Semiotic Class for Vietnamese TN (#298)
  * Date for vietnamese TN
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * Add roman support and correct copyright header
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * change header to current year
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * change header time
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  ---------
  Signed-off-by: folivoramanh <palasek182@gmail.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* Time - semiotic class for Vietnamese TN (#302)
  * Time - semiotic class for Vietnamese TN
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * remove irrelevant import and comment
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * add comment and refractor pattern
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * Change the spaces to NEMO_SPACE for maintenance.
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * Change the spaces to NEMO_SPACE for maintenance.
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * Change the spaces to NEMO_SPACE for maintenance.
    - remove quote
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  ---------
  Signed-off-by: folivoramanh <palasek182@gmail.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* Add Vietnamese TN support for Money and Range semiotic classes (#304)
  * Add Vietnamese TN support for Money and Range semiotic classes
    - Add money.py tagger and verbalizer for Vietnamese currency handling
    - Add range.py tagger for numerical range processing
    - Add supporting data files for money (currency, currency_minor, per_unit)
    - Add quantity abbreviations and time units data
    - Update existing taggers and verbalizers for integration
    - Add comprehensive test cases for money and range functionality
    - Update tokenize_and_classify to include new semiotic classes
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * modify illogical test cases
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * refractor and simplify word and punctuation to avoid hardcoding
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * refractor code money range
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  ---------
  Signed-off-by: folivoramanh <palasek182@gmail.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* Add Vietnamese measure text normalization support (#307)
  * Add Vietnamese measure text normalization support
    - Added measure tagger and verbalizer for Vietnamese TN
    - Updated money tagger and verbalizer to handle per-unit measurements
    - Added test cases for measure normalization
    - Updated fraction handling for better integration
    - Added data files for measurements, prefixes, and per-unit bases
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * add test case for range measure
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * additional support for cardinal and remove duplicate test case
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * refractor cardinal and add test cases
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * remove duplicate lines in run_eval file
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * refractor minor code
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * add measure support for unit per unit cases
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  ---------
  Signed-off-by: folivoramanh <palasek182@gmail.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* Vietnamese MRC 1.0 fix case (#312)
  * fix and add cases
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  ---------
  Signed-off-by: folivoramanh <palasek182@gmail.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* Fix word range (#334)
  * fix range and quote
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * fix quote in post process
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * fix quote and range
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  ---------
  Signed-off-by: folivoramanh <palasek182@gmail.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: folivoramanh <palasek182@gmail.com>
* Date time itn (#333)
  * improve numeric semiotic classes
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * Fix Jenkinsfile for CI (#325)
    * Fix Jenkinsfile for CI
      Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
    * Fix requirements for test
      Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
    * Update paths and docker
      Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
    * Fix docker name
      Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
    * Fix click version
      Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
    * Change path of grammars for sparrowhawk tests
      Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
    * Update paths in sh_test.sh
      Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
    * Update paths
      Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
    * Revert paths
      Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
    ---------
    Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * revert old codes
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * revert not inherit
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * improve date time
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * fix pynini union instead of union operator
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * improve measure, telephone, electronic
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  * change union operator to pynini union
    Signed-off-by: folivoramanh <palasek182@gmail.com>
  * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
  ---------
  Signed-off-by: folivoramanh <palasek182@gmail.com>
  Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
  Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
  Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
  Signed-off-by: folivoramanh <palasek182@gmail.com>
---------
Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
Signed-off-by: folivoramanh <palasek182@gmail.com>
Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
<105917641+anand-nv@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
* PR: Add Vietnamese text normalization for cardinal semiotic class (#289) * Add Vietnamese text normalization for cardinal semiotic class Signed-off-by: folivoramanh <palasek182@gmail.com> * Add missing init file Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix Cardinal and optimize logic Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Ordinal and Decimal for Vietnamese TN (#290) * Add Vietnamese text normalization for ordinal and decimal semiotic classes Signed-off-by: folivoramanh <palasek182@gmail.com> * update sparrowhawk Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor decimal code and docstring Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Vietnamese TN - Fraction (#296) * Fraction class for Vietnamese TN Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove irrelavant test case Signed-off-by: folivoramanh <palasek182@gmail.com> * Remove irrelavant test case Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol 
Fuenmayor <marianag@nvidia.com> * Date Semiotic Class for Vietnamese TN (#298) * Date for vietnamese TN Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add roman support and correct copyright header Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header to current year Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header time Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Time - semiotic class for Vietnamese TN (#302) * Time - semiotic class for Vietnamese TN Signed-off-by: folivoramanh <palasek182@gmail.com> * remove irrelevant import and comment Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add comment and refractor pattern Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: folivoramanh <palasek182@gmail.com> * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. 
- remove quote Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Add Vietnamese TN support for Money and Range semiotic classes (#304) * Add Vietnamese TN support for Money and Range semiotic classes - Add money.py tagger and verbalizer for Vietnamese currency handling - Add range.py tagger for numerical range processing - Add supporting data files for money (currency, currency_minor, per_unit) - Add quantity abbreviations and time units data - Update existing taggers and verbalizers for integration - Add comprehensive test cases for money and range functionality - Update tokenize_and_classify to include new semiotic classes Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * modify illogical test cases Signed-off-by: folivoramanh <palasek182@gmail.com> * refractor and simplify word and punctuation to avoid hardcoding Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor code money range Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Add Vietnamese measure text normalization support (#307) * Add Vietnamese measure text normalization support - Added measure tagger and verbalizer for Vietnamese TN - Updated money tagger 
and verbalizer to handle per-unit measurements - Added test cases for measure normalization - Updated fraction handling for better integration - Added data files for measurements, prefixes, and per-unit bases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: folivoramanh <palasek182@gmail.com> * add test case for range measure Signed-off-by: folivoramanh <palasek182@gmail.com> * additional support for cardinal and remove duplicate test case Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor cardinal and add test cases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate lines in run_eval file Signed-off-by: folivoramanh <palasek182@gmail.com> * refractor minor code Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add measure support for unit per unit cases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Vietnamese MRC 1.0 fix case (#312) * fix and add cases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol 
Fuenmayor <marianag@nvidia.com> * Fix Jenkinsfile for CI (#325) (#327) * Fix Jenkinsfile for CI * Fix requirements for test * Update paths and docker * Fix docker name * Fix click version * Change path of grammars for sparrowhawk tests * Update paths in sh_test.sh * Update paths * Revert paths --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Fix word range (#334) * fix range and quote Signed-off-by: folivoramanh <palasek182@gmail.com> * fix quote in post process Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix quote and range Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Date time itn (#333) * improve numeric semiotic classes Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix Jenkinsfile for CI (#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> 
Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: folivoramanh <palasek182@gmail.com> * revert old codes Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert not inherit Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve date time Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix pynini union instead of union operator Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve measure, telephone, electronic Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change union operator to pynini union Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Staging vi tn signed off (#339) * Fix Jenkinsfile for CI (#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * PR: Add Vietnamese text normalization for cardinal semiotic class (#289) * Add Vietnamese text normalization for cardinal semiotic class Signed-off-by: folivoramanh <palasek182@gmail.com> * Add missing init file Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix Cardinal and optimize logic Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Ordinal and Decimal for Vietnamese TN (#290) * Add Vietnamese text normalization for ordinal and decimal semiotic classes Signed-off-by: folivoramanh <palasek182@gmail.com> * update sparrowhawk Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor decimal code and docstring Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Vietnamese TN - Fraction (#296) * Fraction class for Vietnamese TN 
Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove irrelavant test case Signed-off-by: folivoramanh <palasek182@gmail.com> * Remove irrelavant test case Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Date Semiotic Class for Vietnamese TN (#298) * Date for vietnamese TN Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add roman support and correct copyright header Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header to current year Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header time Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Time - semiotic class for Vietnamese TN (#302) * Time - semiotic class for Vietnamese TN Signed-off-by: folivoramanh <palasek182@gmail.com> * remove irrelevant import and comment Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add comment and refractor pattern Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. 
Signed-off-by: folivoramanh <palasek182@gmail.com> * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. - remove quote Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Add Vietnamese TN support for Money and Range semiotic classes (#304) * Add Vietnamese TN support for Money and Range semiotic classes - Add money.py tagger and verbalizer for Vietnamese currency handling - Add range.py tagger for numerical range processing - Add supporting data files for money (currency, currency_minor, per_unit) - Add quantity abbreviations and time units data - Update existing taggers and verbalizers for integration - Add comprehensive test cases for money and range functionality - Update tokenize_and_classify to include new semiotic classes Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * modify illogical test cases Signed-off-by: folivoramanh <palasek182@gmail.com> * refractor and simplify word and punctuation to avoid hardcoding Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor code money range Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Add Vietnamese measure text normalization support (#307) * Add Vietnamese measure text normalization support - Added measure tagger and verbalizer for Vietnamese TN - Updated money tagger and verbalizer to handle per-unit measurements - Added test cases for measure normalization - Updated fraction handling for better integration - Added data files for measurements, prefixes, and per-unit bases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: folivoramanh <palasek182@gmail.com> * add test case for range measure Signed-off-by: folivoramanh <palasek182@gmail.com> * additional support for cardinal and remove duplicate test case Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor cardinal and add test cases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate lines in run_eval file Signed-off-by: folivoramanh <palasek182@gmail.com> * refractor minor code Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add measure support for unit per unit cases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Vietnamese MRC 1.0 fix case (#312) * fix and add cases Signed-off-by: folivoramanh <palasek182@gmail.com> * 
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix word range (#334) * fix range and quote Signed-off-by: folivoramanh <palasek182@gmail.com> * fix quote in post process Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix quote and range Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Date time itn (#333) * improve numeric semiotic classes Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix Jenkinsfile for CI (#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: folivoramanh <palasek182@gmail.com> * revert old codes Signed-off-by: folivoramanh 
<palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert not inherit Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve date time Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix pynini union instead of union operator Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve measure, telephone, electronic Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change union operator to pynini union Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Comma bugfix for En electronics (#332) * fix bug with commas and electronics Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mariana 
Graterol Fuenmayor <marianag@nvidia.com> * update jenkins Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * remove unuse import (#340) Signed-off-by: folivoramanh <palasek182@gmail.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Update Jenkinsfile (#341) Only mount TestData from path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] pre-commit suggestions (#335) updates: - [github.com/pre-commit/pre-commit-hooks: v5.0.0 → v6.0.0](https://github.com/pre-commit/pre-commit-hooks/compare/v5.0.0...v6.0.0) - [github.com/PyCQA/flake8: 7.2.0 → 7.3.0](https://github.com/PyCQA/flake8/compare/7.2.0...7.3.0) - [github.com/PyCQA/isort: 6.0.1 → 6.1.0](https://github.com/PyCQA/isort/compare/6.0.1...6.1.0) - https://github.com/psf/black → https://github.com/psf/black-pre-commit-mirror - [github.com/psf/black-pre-commit-mirror: 25.1.0 → 25.9.0](https://github.com/psf/black-pre-commit-mirror/compare/25.1.0...25.9.0) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update jenkins cache Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * fill missing lang in arg run (#347) Signed-off-by: folivoramanh <palasek182@gmail.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Staging vi tn DCO fixed (#354) * PR: Add Vietnamese text normalization for cardinal semiotic class (#289) * Add Vietnamese text normalization for cardinal semiotic class Signed-off-by: folivoramanh <palasek182@gmail.com> * Add missing init file Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix Cardinal and 
optimize logic Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Ordinal and Decimal for Vietnamese TN (#290) * Add Vietnamese text normalization for ordinal and decimal semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * update sparrowhawk Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor decimal code and docstring Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Vietnamese TN - Fraction (#296) * Fraction class for Vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove irrelavant test case Signed-off-by: Mai Anh <palasek182@gmail.com> * Remove irrelavant test case Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Date Semiotic Class for Vietnamese TN (#298) * Date for vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci * Add roman support and correct copyright header Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header to current year Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header time Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Time - semiotic class for Vietnamese TN (#302) * Time - semiotic class for Vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * remove irrelevant import and comment Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add comment and refractor pattern Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: Mai Anh <palasek182@gmail.com> * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. 
- remove quote Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Add Vietnamese TN support for Money and Range semiotic classes (#304) * Add Vietnamese TN support for Money and Range semiotic classes - Add money.py tagger and verbalizer for Vietnamese currency handling - Add range.py tagger for numerical range processing - Add supporting data files for money (currency, currency_minor, per_unit) - Add quantity abbreviations and time units data - Update existing taggers and verbalizers for integration - Add comprehensive test cases for money and range functionality - Update tokenize_and_classify to include new semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * modify illogical test cases Signed-off-by: Mai Anh <palasek182@gmail.com> * refractor and simplify word and punctuation to avoid hardcoding Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor code money range Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Add Vietnamese measure text normalization support (#307) * Add Vietnamese measure text normalization support - Added measure tagger 
and verbalizer for Vietnamese TN - Updated money tagger and verbalizer to handle per-unit measurements - Added test cases for measure normalization - Updated fraction handling for better integration - Added data files for measurements, prefixes, and per-unit bases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mai Anh <palasek182@gmail.com> * add test case for range measure Signed-off-by: Mai Anh <palasek182@gmail.com> * additional support for cardinal and remove duplicate test case Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor cardinal and add test cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate lines in run_eval file Signed-off-by: Mai Anh <palasek182@gmail.com> * refractor minor code Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add measure support for unit per unit cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Vietnamese MRC 1.0 fix case (#312) * fix and add cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix Jenkinsfile for CI (#325) (#327) * Fix Jenkinsfile for CI * Fix requirements for test * Update paths and docker * Fix docker name * Fix click version * Change path of grammars for sparrowhawk tests * Update paths in sh_test.sh * Update paths * Revert paths --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix word range (#334) * fix range and quote Signed-off-by: Mai Anh <palasek182@gmail.com> * fix quote in post process Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix quote and range Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Date time itn (#333) * improve numeric semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix Jenkinsfile for CI (#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * 
Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mai Anh <palasek182@gmail.com> * revert old codes Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert not inherit Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve date time Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix pynini union instead of union operator Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve measure, telephone, electronic Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change union operator to pynini union Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Staging vi tn signed off (#339)
--------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Comma bugfix for En electronics (#332) * fix bug with commas and electronics Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update jenkins Signed-off-by: Mariana
Graterol Fuenmayor <marianag@nvidia.com>
---------
Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
Signed-off-by: Mai Anh <palasek182@gmail.com>
* remove unuse import (#340)
Signed-off-by: Mai Anh <palasek182@gmail.com>
Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
Signed-off-by: Mai Anh <palasek182@gmail.com>
* Update Jenkinsfile (#341)
Only mount TestData from path
Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Signed-off-by: Mai Anh <palasek182@gmail.com>
* [pre-commit.ci] pre-commit suggestions (#335)
updates:
- [github.com/pre-commit/pre-commit-hooks: v5.0.0 → v6.0.0](https://github.com/pre-commit/pre-commit-hooks/compare/v5.0.0...v6.0.0)
- [github.com/PyCQA/flake8: 7.2.0 → 7.3.0](https://github.com/PyCQA/flake8/compare/7.2.0...7.3.0)
- [github.com/PyCQA/isort: 6.0.1 → 6.1.0](https://github.com/PyCQA/isort/compare/6.0.1...6.1.0)
- https://github.com/psf/black → https://github.com/psf/black-pre-commit-mirror
- [github.com/psf/black-pre-commit-mirror: 25.1.0 → 25.9.0](https://github.com/psf/black-pre-commit-mirror/compare/25.1.0...25.9.0)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Mai Anh <palasek182@gmail.com>
* update jenkins cache
Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
Signed-off-by: Mai Anh <palasek182@gmail.com>
* fill missing lang in arg run (#347)
Signed-off-by: Mai Anh <palasek182@gmail.com>
Signed-off-by: Mai Anh <palasek182@gmail.com>
---------
Signed-off-by: folivoramanh <palasek182@gmail.com>
Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
Signed-off-by: Mai Anh <palasek182@gmail.com>
Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com>
Co-authored-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
* update vi cache date
Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
* Refactor Vietnamese (#357)
* Refactor Vietnamese ITN taggers: modularize date, add data files, improve naming
- Modularize date.py year components for better readability
- Add weights to prevent non-deterministic behavior in insert operations
- Remove redundant YEAR_WEIGHT constant (use inline weights)
- Create zero_prefix.tsv and digit_special.tsv data files
- Rename delete_extra_space to delete_single_space in electronic.py for clarity
- Add delete_single_space to graph_utils for reuse
Signed-off-by: Mai Anh <palasek182@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* Refactor Vietnamese: PSA follow
Signed-off-by: Mai Anh <palasek182@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
---------
Signed-off-by: Mai Anh <palasek182@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* delete unuse import (#358)
Signed-off-by: Mai Anh <palasek182@gmail.com>
---------
Signed-off-by: folivoramanh <palasek182@gmail.com>
Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com>
Signed-off-by: Anand Joseph <anajoseph@nvidia.com>
Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Signed-off-by: Mai Anh <palasek182@gmail.com>
Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com>
Co-authored-by: Mai Anh <95898107+folivoramanh@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com>
Co-authored-by: Mai Anh <palasek182@gmail.com>
* PR: Add Vietnamese text normalization for cardinal semiotic class (#289) * Add Vietnamese text normalization for cardinal semiotic class Signed-off-by: folivoramanh <palasek182@gmail.com> * Add missing init file Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix Cardinal and optimize logic Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Ordinal and Decimal for Vietnamese TN (#290) * Add Vietnamese text normalization for ordinal and decimal semiotic classes Signed-off-by: folivoramanh <palasek182@gmail.com> * update sparrowhawk Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor decimal code and docstring Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Vietnamese TN - Fraction (#296) * Fraction class for Vietnamese TN Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove irrelavant test case Signed-off-by: folivoramanh <palasek182@gmail.com> * Remove irrelavant test case Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol 
Fuenmayor <marianag@nvidia.com> * Date Semiotic Class for Vietnamese TN (#298) * Date for vietnamese TN Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add roman support and correct copyright header Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header to current year Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header time Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Time - semiotic class for Vietnamese TN (#302) * Time - semiotic class for Vietnamese TN Signed-off-by: folivoramanh <palasek182@gmail.com> * remove irrelevant import and comment Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add comment and refractor pattern Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: folivoramanh <palasek182@gmail.com> * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. 
- remove quote Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Add Vietnamese TN support for Money and Range semiotic classes (#304) * Add Vietnamese TN support for Money and Range semiotic classes - Add money.py tagger and verbalizer for Vietnamese currency handling - Add range.py tagger for numerical range processing - Add supporting data files for money (currency, currency_minor, per_unit) - Add quantity abbreviations and time units data - Update existing taggers and verbalizers for integration - Add comprehensive test cases for money and range functionality - Update tokenize_and_classify to include new semiotic classes Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * modify illogical test cases Signed-off-by: folivoramanh <palasek182@gmail.com> * refractor and simplify word and punctuation to avoid hardcoding Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor code money range Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Add Vietnamese measure text normalization support (#307) * Add Vietnamese measure text normalization support - Added measure tagger and verbalizer for Vietnamese TN - Updated money tagger 
and verbalizer to handle per-unit measurements - Added test cases for measure normalization - Updated fraction handling for better integration - Added data files for measurements, prefixes, and per-unit bases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: folivoramanh <palasek182@gmail.com> * add test case for range measure Signed-off-by: folivoramanh <palasek182@gmail.com> * additional support for cardinal and remove duplicate test case Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor cardinal and add test cases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate lines in run_eval file Signed-off-by: folivoramanh <palasek182@gmail.com> * refractor minor code Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add measure support for unit per unit cases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Vietnamese MRC 1.0 fix case (#312) * fix and add cases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol 
Fuenmayor <marianag@nvidia.com> * Fix Jenkinsfile for CI (#325) (#327) * Fix Jenkinsfile for CI * Fix requirements for test * Update paths and docker * Fix docker name * Fix click version * Change path of grammars for sparrowhawk tests * Update paths in sh_test.sh * Update paths * Revert paths --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Fix word range (#334) * fix range and quote Signed-off-by: folivoramanh <palasek182@gmail.com> * fix quote in post process Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix quote and range Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Date time itn (#333) * improve numeric semiotic classes Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix Jenkinsfile for CI (#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> 
Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: folivoramanh <palasek182@gmail.com> * revert old codes Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert not inherit Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve date time Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix pynini union instead of union operator Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve measure, telephone, electronic Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change union operator to pynini union Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Staging vi tn signed off (#339) * Fix Jenkinsfile for CI (#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph 
<anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * PR: Add Vietnamese text normalization for cardinal semiotic class (#289) * Add Vietnamese text normalization for cardinal semiotic class Signed-off-by: folivoramanh <palasek182@gmail.com> * Add missing init file Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix Cardinal and optimize logic Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Ordinal and Decimal for Vietnamese TN (#290) * Add Vietnamese text normalization for ordinal and decimal semiotic classes Signed-off-by: folivoramanh <palasek182@gmail.com> * update sparrowhawk Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor decimal code and docstring Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Vietnamese TN - Fraction (#296) * Fraction class for Vietnamese TN 
Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove irrelavant test case Signed-off-by: folivoramanh <palasek182@gmail.com> * Remove irrelavant test case Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Date Semiotic Class for Vietnamese TN (#298) * Date for vietnamese TN Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add roman support and correct copyright header Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header to current year Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header time Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Time - semiotic class for Vietnamese TN (#302) * Time - semiotic class for Vietnamese TN Signed-off-by: folivoramanh <palasek182@gmail.com> * remove irrelevant import and comment Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add comment and refractor pattern Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. 
Signed-off-by: folivoramanh <palasek182@gmail.com> * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. - remove quote Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Add Vietnamese TN support for Money and Range semiotic classes (#304) * Add Vietnamese TN support for Money and Range semiotic classes - Add money.py tagger and verbalizer for Vietnamese currency handling - Add range.py tagger for numerical range processing - Add supporting data files for money (currency, currency_minor, per_unit) - Add quantity abbreviations and time units data - Update existing taggers and verbalizers for integration - Add comprehensive test cases for money and range functionality - Update tokenize_and_classify to include new semiotic classes Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * modify illogical test cases Signed-off-by: folivoramanh <palasek182@gmail.com> * refractor and simplify word and punctuation to avoid hardcoding Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor code money range Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Add Vietnamese measure text normalization support (#307) * Add Vietnamese measure text normalization support - Added measure tagger and verbalizer for Vietnamese TN - Updated money tagger and verbalizer to handle per-unit measurements - Added test cases for measure normalization - Updated fraction handling for better integration - Added data files for measurements, prefixes, and per-unit bases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: folivoramanh <palasek182@gmail.com> * add test case for range measure Signed-off-by: folivoramanh <palasek182@gmail.com> * additional support for cardinal and remove duplicate test case Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor cardinal and add test cases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate lines in run_eval file Signed-off-by: folivoramanh <palasek182@gmail.com> * refractor minor code Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add measure support for unit per unit cases Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Vietnamese MRC 1.0 fix case (#312) * fix and add cases Signed-off-by: folivoramanh <palasek182@gmail.com> * 
[pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix word range (#334) * fix range and quote Signed-off-by: folivoramanh <palasek182@gmail.com> * fix quote in post process Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix quote and range Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * Date time itn (#333) * improve numeric semiotic classes Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix Jenkinsfile for CI (#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: folivoramanh <palasek182@gmail.com> * revert old codes Signed-off-by: folivoramanh 
<palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert not inherit Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve date time Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix pynini union instead of union operator Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve measure, telephone, electronic Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change union operator to pynini union Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: folivoramanh <palasek182@gmail.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Comma bugfix for En electronics (#332) * fix bug with commas and electronics Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mariana 
Graterol Fuenmayor <marianag@nvidia.com> * update jenkins Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * remove unuse import (#340) Signed-off-by: folivoramanh <palasek182@gmail.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Update Jenkinsfile (#341) Only mount TestData from path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] pre-commit suggestions (#335) updates: - [github.com/pre-commit/pre-commit-hooks: v5.0.0 → v6.0.0](https://github.com/pre-commit/pre-commit-hooks/compare/v5.0.0...v6.0.0) - [github.com/PyCQA/flake8: 7.2.0 → 7.3.0](https://github.com/PyCQA/flake8/compare/7.2.0...7.3.0) - [github.com/PyCQA/isort: 6.0.1 → 6.1.0](https://github.com/PyCQA/isort/compare/6.0.1...6.1.0) - https://github.com/psf/black → https://github.com/psf/black-pre-commit-mirror - [github.com/psf/black-pre-commit-mirror: 25.1.0 → 25.9.0](https://github.com/psf/black-pre-commit-mirror/compare/25.1.0...25.9.0) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update jenkins cache Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * fill missing lang in arg run (#347) Signed-off-by: folivoramanh <palasek182@gmail.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Staging vi tn DCO fixed (#354) * PR: Add Vietnamese text normalization for cardinal semiotic class (#289) * Add Vietnamese text normalization for cardinal semiotic class Signed-off-by: folivoramanh <palasek182@gmail.com> * Add missing init file Signed-off-by: folivoramanh <palasek182@gmail.com> * Fix Cardinal and 
optimize logic Signed-off-by: folivoramanh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Ordinal and Decimal for Vietnamese TN (#290) * Add Vietnamese text normalization for ordinal and decimal semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * update sparrowhawk Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor decimal code and docstring Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Vietnamese TN - Fraction (#296) * Fraction class for Vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove irrelavant test case Signed-off-by: Mai Anh <palasek182@gmail.com> * Remove irrelavant test case Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Date Semiotic Class for Vietnamese TN (#298) * Date for vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci * Add roman support and correct copyright header Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header to current year Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header time Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Time - semiotic class for Vietnamese TN (#302) * Time - semiotic class for Vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * remove irrelevant import and comment Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add comment and refractor pattern Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: Mai Anh <palasek182@gmail.com> * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. 
- remove quote Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Add Vietnamese TN support for Money and Range semiotic classes (#304) * Add Vietnamese TN support for Money and Range semiotic classes - Add money.py tagger and verbalizer for Vietnamese currency handling - Add range.py tagger for numerical range processing - Add supporting data files for money (currency, currency_minor, per_unit) - Add quantity abbreviations and time units data - Update existing taggers and verbalizers for integration - Add comprehensive test cases for money and range functionality - Update tokenize_and_classify to include new semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * modify illogical test cases Signed-off-by: Mai Anh <palasek182@gmail.com> * refractor and simplify word and punctuation to avoid hardcoding Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor code money range Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Add Vietnamese measure text normalization support (#307) * Add Vietnamese measure text normalization support - Added measure tagger 
and verbalizer for Vietnamese TN - Updated money tagger and verbalizer to handle per-unit measurements - Added test cases for measure normalization - Updated fraction handling for better integration - Added data files for measurements, prefixes, and per-unit bases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mai Anh <palasek182@gmail.com> * add test case for range measure Signed-off-by: Mai Anh <palasek182@gmail.com> * additional support for cardinal and remove duplicate test case Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor cardinal and add test cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate lines in run_eval file Signed-off-by: Mai Anh <palasek182@gmail.com> * refractor minor code Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add measure support for unit per unit cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Vietnamese MRC 1.0 fix case (#312) * fix and add cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix Jenkinsfile for CI (#325) (#327) * Fix Jenkinsfile for CI * Fix requirements for test * Update paths and docker * Fix docker name * Fix click version * Change path of grammars for sparrowhawk tests * Update paths in sh_test.sh * Update paths * Revert paths --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix word range (#334) * fix range and quote Signed-off-by: Mai Anh <palasek182@gmail.com> * fix quote in post process Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix quote and range Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Date time itn (#333) * improve numeric semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix Jenkinsfile for CI (#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * 
Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mai Anh <palasek182@gmail.com> * revert old codes Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * revert not inherit Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve date time Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix pynini union instead of union operator Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve measure, telephone, electronic Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change union operator to pynini union Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Staging vi tn signed off (#339) * Fix Jenkinsfile for CI (#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix 
requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * PR: Add Vietnamese text normalization for cardinal semiotic class (#289) * Add Vietnamese text normalization for cardinal semiotic class Signed-off-by: Mai Anh <palasek182@gmail.com> * Add missing init file Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix Cardinal and optimize logic Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Ordinal and Decimal for Vietnamese TN (#290) * Add Vietnamese text normalization for ordinal and decimal semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * update sparrowhawk Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor decimal code and docstring Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Vietnamese TN - Fraction (#296) * Fraction class for Vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Remove irrelavant test case Signed-off-by: Mai Anh <palasek182@gmail.com> * Remove irrelavant test case Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Date Semiotic Class for Vietnamese TN (#298) * Date for vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add roman support and correct copyright header Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header to current year Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change header time Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Time - semiotic class for Vietnamese TN (#302) * Time - semiotic class for Vietnamese TN Signed-off-by: Mai Anh <palasek182@gmail.com> * remove irrelevant import and comment Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add comment and refractor pattern Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: Mai Anh <palasek182@gmail.com> * Change the spaces to NEMO_SPACE for maintenance. Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Change the spaces to NEMO_SPACE for maintenance. - remove quote Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Add Vietnamese TN support for Money and Range semiotic classes (#304) * Add Vietnamese TN support for Money and Range semiotic classes - Add money.py tagger and verbalizer for Vietnamese currency handling - Add range.py tagger for numerical range processing - Add supporting data files for money (currency, currency_minor, per_unit) - Add quantity abbreviations and time units data - Update existing taggers and verbalizers for integration - Add comprehensive test cases for money and range functionality - Update tokenize_and_classify to include new semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * modify illogical test cases Signed-off-by: Mai Anh <palasek182@gmail.com> * refractor and simplify word and punctuation to avoid hardcoding Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor code money range Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> 
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Add Vietnamese measure text normalization support (#307) * Add Vietnamese measure text normalization support - Added measure tagger and verbalizer for Vietnamese TN - Updated money tagger and verbalizer to handle per-unit measurements - Added test cases for measure normalization - Updated fraction handling for better integration - Added data files for measurements, prefixes, and per-unit bases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mai Anh <palasek182@gmail.com> * add test case for range measure Signed-off-by: Mai Anh <palasek182@gmail.com> * additional support for cardinal and remove duplicate test case Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refractor cardinal and add test cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * remove duplicate lines in run_eval file Signed-off-by: Mai Anh <palasek182@gmail.com> * refractor minor code Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * add measure support for unit per unit cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Vietnamese MRC 1.0 fix case (#312) * fix and add cases Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from 
pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix word range (#334) * fix range and quote Signed-off-by: Mai Anh <palasek182@gmail.com> * fix quote in post process Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix quote and range Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Date time itn (#333) * improve numeric semiotic classes Signed-off-by: Mai Anh <palasek182@gmail.com> * Fix Jenkinsfile for CI (#325) * Fix Jenkinsfile for CI Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix requirements for test Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths and docker Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix docker name Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Fix click version Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Change path of grammars for sparrowhawk tests Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths in sh_test.sh Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Update paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> * Revert paths Signed-off-by: Anand Joseph <anajoseph@nvidia.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mai Anh <palasek182@gmail.com> * revert old codes Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more 
information, see https://pre-commit.ci * revert not inherit Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve date time Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * fix pynini union instead of union operator Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * improve measure, telephone, electronic Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * change union operator to pynini union Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Comma bugfix for En electronics (#332) * fix bug with commas and electronics Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update jenkins Signed-off-by: Mariana 
Graterol Fuenmayor <marianag@nvidia.com> --------- Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * remove unuse import (#340) Signed-off-by: Mai Anh <palasek182@gmail.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * Update Jenkinsfile (#341) Only mount TestData from path Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] pre-commit suggestions (#335) updates: - [github.com/pre-commit/pre-commit-hooks: v5.0.0 → v6.0.0](https://github.com/pre-commit/pre-commit-hooks/compare/v5.0.0...v6.0.0) - [github.com/PyCQA/flake8: 7.2.0 → 7.3.0](https://github.com/PyCQA/flake8/compare/7.2.0...7.3.0) - [github.com/PyCQA/isort: 6.0.1 → 6.1.0](https://github.com/PyCQA/isort/compare/6.0.1...6.1.0) - https://github.com/psf/black → https://github.com/psf/black-pre-commit-mirror - [github.com/psf/black-pre-commit-mirror: 25.1.0 → 25.9.0](https://github.com/psf/black-pre-commit-mirror/compare/25.1.0...25.9.0) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * update jenkins cache Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> * fill missing lang in arg run (#347) Signed-off-by: Mai Anh <palasek182@gmail.com> Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mai Anh <palasek182@gmail.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] 
<66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * update vi cache date Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> * Refactor Vietnamese (#357) * Refactor Vietnamese ITN taggers: modularize date, add data files, improve naming - Modularize date.py year components for better readability - Add weights to prevent non-deterministic behavior in insert operations - Remove redundant YEAR_WEIGHT constant (use inline weights) - Create zero_prefix.tsv and digit_special.tsv data files - Rename delete_extra_space to delete_single_space in electronic.py for clarity - Add delete_single_space to graph_utils for reuse Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Refactor Vietnamese: PSA follow Signed-off-by: Mai Anh <palasek182@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Mai Anh <palasek182@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * delete unuse import (#358) Signed-off-by: Mai Anh <palasek182@gmail.com> --------- Signed-off-by: folivoramanh <palasek182@gmail.com> Signed-off-by: Mariana Graterol Fuenmayor <marianag@nvidia.com> Signed-off-by: Anand Joseph <anajoseph@nvidia.com> Signed-off-by: anand-nv <105917641+anand-nv@users.noreply.github.com> Signed-off-by: Mai Anh <palasek182@gmail.com> Signed-off-by: Mariana <47233618+mgrafu@users.noreply.github.com> Co-authored-by: Mai Anh <95898107+folivoramanh@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: 
anand-nv <105917641+anand-nv@users.noreply.github.com> Co-authored-by: Mai Anh <palasek182@gmail.com> Signed-off-by: RajanPutty <rputty@nvidia.com>
What does this PR do?
Add a one-line overview of what this PR aims to accomplish.
Before your PR is "Ready for review"
Pre checks:
- git commit -s to sign.
- pytest or (if your machine does not have GPU) pytest --cpu from the root folder (given you marked your test cases accordingly @pytest.mark.run_only_on('CPU')).
- bash tools/text_processing_deployment/export_grammars.sh --MODE=test ...
- pytest and Sparrowhawk here.
- __init__.py for every folder and subfolder, including data folder which has .TSV files?
- Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. to all newly added Python files?
- Copyright 2015 and onwards Google, Inc. See an example here.
- (try import: ... except: ...) if not already done.
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
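Two of the pre-checks above, the __init__.py item and the license-header item, lend themselves to a quick local script. What follows is only a minimal sketch under stated assumptions, not part of the repository's tooling: the helper names missing_init_files and missing_license_headers are hypothetical, the header is assumed to sit within the first five lines of a file, and any four-digit year is accepted in place of the 2023 quoted in the checklist.

```python
import re
from pathlib import Path

# Header wording taken from the checklist; accepting any year is an assumption.
LICENSE_RE = re.compile(
    r"Copyright \(c\) \d{4}, NVIDIA CORPORATION & AFFILIATES\. All rights reserved\."
)


def missing_init_files(root):
    """Return folders under root (root included) that lack an __init__.py."""
    root = Path(root)
    folders = [root] + [
        p for p in root.rglob("*") if p.is_dir() and p.name != "__pycache__"
    ]
    return sorted(
        f.relative_to(root).as_posix()
        for f in folders
        if not (f / "__init__.py").is_file()
    )


def missing_license_headers(root, max_lines=5):
    """Return .py files whose first max_lines lines do not match LICENSE_RE."""
    root = Path(root)
    missing = []
    for path in root.rglob("*.py"):
        head = path.read_text(encoding="utf-8").splitlines()[:max_lines]
        if not any(LICENSE_RE.search(line) for line in head):
            missing.append(path.relative_to(root).as_posix())
    return sorted(missing)
```

Calling both helpers on a language folder before opening the PR returns the relative paths that still need attention; two empty lists mean both items are satisfied under the assumptions above.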