Zh tn bug 240712 #187
Merged
98 commits
f3555a8
IT TN improvement on tests (#120)
mgrafu e67e17c
add single letter exception for roman numerals (#121)
mgrafu c2b9e0a
fix broken path for nondet whitelist (#124)
mgrafu 41b21e2
Increase weights for serial (en TN) (#128)
anand-nv 230b21e
add measures file for FR TN (#131)
mgrafu c4f4553
Sh jenkins (#127)
anand-nv 5561c48
update isort - fix precommit (#138)
ekmb d9f749e
Armenian itn (#136)
davidks13 7bc3654
Fix CI (#142)
ekmb bf43b19
Armenian TN (#137)
davidks13 462d551
Marathi ITN (#134)
ChinmayPatil11 28409b2
jenkins fix (#150)
tbartley94 0032b2b
r0.3.0 release (#151)
ekmb 5bbab9c
Fix text=line[text] to text=line[text_field] (#153)
ssh-meister 86b1904
use real string on docstring (#157)
kevsan4 76f415c
Sh postprocess (#147)
anand-nv dea0439
update run_evaluate script for cased itn (#164)
mgrafu 400c9fb
remove unused function from ar tn decimals (#165)
mgrafu 36fa3af
ZH sentence-level TN (#112)
BuyuanCui ec331da
preparing release, updating change log (#168)
tbartley94 61054ba
hotfix (#169)
ekmb 498781f
hotfix (#170)
tbartley94 d290748
DE TN Fixes (#177)
zoobereq df22a18
Tts en tech terms (#167)
mgrafu 1318648
Normalizes the '%' sign (#180)
zoobereq 7783a1c
FR TN Fixes (#181)
zoobereq 6c19ae5
Merge branch 'main' of https://github.com/NVIDIA/NeMo-text-processing…
BuyuanCui 85a771b
testing
BuyuanCui 5bb4872
removing test.txt
BuyuanCui 72341b1
fixing zh tn money curreny on l
BuyuanCui 14ff392
bug fix on money currency l
BuyuanCui 694c33b
updates for zh tn
BuyuanCui 3152b35
resolving failed ci tests for money grammar
BuyuanCui d619eca
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] f7416a1
updates for decimal maoney failure
BuyuanCui b5e6f33
removing comments
BuyuanCui ae47069
Merge branch 'zh_tn_bug_240712' of https://github.com/NVIDIA/NeMo-tex…
BuyuanCui 682988c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 840ae1f
updates on money grammar for failure cases
BuyuanCui e7284e6
adding test cases in the nvbug
BuyuanCui bff16e0
Merge branch 'zh_tn_bug_240712' of https://github.com/NVIDIA/NeMo-tex…
BuyuanCui 83ba1d7
updates for ci etst
BuyuanCui 9ebb5ad
updating date for rerun
BuyuanCui 50546b2
renaming final graphs
BuyuanCui 9648811
Merge branch 'main' into zh_tn_bug_240712
BuyuanCui 4219965
resolving conflicts
BuyuanCui 63fec6f
conflicts
BuyuanCui 17a3554
Merge branch 'zh_tn_bug_240712' of https://github.com/NVIDIA/NeMo-tex…
BuyuanCui f461b40
updating data
BuyuanCui 16cb041
attempt to resolve jenkins issue
BuyuanCui 5548a95
ci tests resolving
BuyuanCui 739ef30
testing
BuyuanCui 4431f6a
removing test.txt
BuyuanCui c1c7ef4
fixing zh tn money curreny on l
BuyuanCui f1d1d96
bug fix on money currency l
BuyuanCui f520f57
resolving failed ci tests for money grammar
BuyuanCui 818dca0
updates for decimal maoney failure
BuyuanCui 078fb7a
removing comments
BuyuanCui 5087cd0
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 66f2787
updates on money grammar for failure cases
BuyuanCui 85c6ed5
adding test cases in the nvbug
BuyuanCui bea17ed
renaming final graphs
BuyuanCui 5e26452
conflicts
BuyuanCui a7c8b6d
updating data
BuyuanCui 7a64e69
attempt to resolve jenkins issue
BuyuanCui 483f667
ci tests resolving
BuyuanCui 47016b6
resolving conflict for ci tests update
BuyuanCui af3f8f7
Increase weights for serial (en TN) (#128)
anand-nv 9a68cf9
add measures file for FR TN (#131)
mgrafu 78aadbe
Sh jenkins (#127)
anand-nv 4f9da16
update isort - fix precommit (#138)
ekmb 02fae02
Armenian itn (#136)
davidks13 e9f32a8
Fix CI (#142)
ekmb f0fd38a
Armenian TN (#137)
davidks13 b0d58dd
Marathi ITN (#134)
ChinmayPatil11 a514d80
jenkins fix (#150)
tbartley94 66cda82
ZH sentence-level TN (#112)
BuyuanCui 128753d
Tts en tech terms (#167)
mgrafu 1d37427
testing
BuyuanCui 150d864
removing test.txt
BuyuanCui 7d0b513
fixing zh tn money curreny on l
BuyuanCui a1272f4
resolving failed ci tests for money grammar
BuyuanCui 6314efb
updates for decimal maoney failure
BuyuanCui 61b892b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] f8fb143
updates on money grammar for failure cases
BuyuanCui 553d32e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 8002a50
renaming final graphs
BuyuanCui 7b938de
conflicts
BuyuanCui 0eca73d
updating data
BuyuanCui 8e3a793
attempt to resolve jenkins issue
BuyuanCui 2a48144
ci tests resolving
BuyuanCui 7f451bf
committing
BuyuanCui 6351b49
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 49d0058
resolving conflict
BuyuanCui ade0b91
Jenkins test not starting, copied form main branch
BuyuanCui 4e149fa
copied from Nemo main, esolving Jenkins isue
BuyuanCui 17ccdaa
copied from NeMo main, resolving Jenkins issue
BuyuanCui 4323398
Merge branch 'main' into zh_tn_bug_240712
BuyuanCui
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -476,4 +476,4 @@ pipeline { | ||
| cleanWs() | ||
| } | ||
| } | ||
| } | ||
| } | ||
64 changes: 32 additions & 32 deletions
nemo_text_processing/inverse_text_normalization/ja/verbalizers/word.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,32 +1,32 @@ | ||
| # Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| import pynini | ||
| from pynini.lib import pynutil | ||
| from nemo_text_processing.text_normalization.zh.graph_utils import NEMO_NOT_QUOTE, GraphFst, delete_space | ||
| class WordFst(GraphFst): | ||
| ''' | ||
| tokens { char: "一" } -> 一 | ||
| ''' | ||
| def __init__(self, deterministic: bool = True, lm: bool = False): | ||
| super().__init__(name="char", kind="verbalize", deterministic=deterministic) | ||
| graph = pynutil.delete("name: \"") + NEMO_NOT_QUOTE + pynutil.delete("\"") | ||
| graph = pynini.closure(delete_space) + graph + pynini.closure(delete_space) | ||
| self.fst = graph.optimize() | ||
| # Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| | ||
| | ||
| import pynini | ||
| from pynini.lib import pynutil | ||
| | ||
| from nemo_text_processing.text_normalization.zh.graph_utils import NEMO_NOT_QUOTE, GraphFst, delete_space | ||
| | ||
| | ||
| class WordFst(GraphFst): | ||
| ''' | ||
| tokens { char: "一" } -> 一 | ||
| ''' | ||
| | ||
| def __init__(self, deterministic: bool = True, lm: bool = False): | ||
| super().__init__(name="char", kind="verbalize", deterministic=deterministic) | ||
| | ||
| graph = pynutil.delete("name: \"") + NEMO_NOT_QUOTE + pynutil.delete("\"") | ||
| graph = pynini.closure(delete_space) + graph + pynini.closure(delete_space) | ||
| self.fst = graph.optimize() | ||
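
For readers unfamiliar with the verbalizer grammar above, the following is a rough plain-Python approximation of what `WordFst` appears to accept. It is a sketch, not the pynini implementation: `verbalize_word` and `_WORD_PATTERN` are hypothetical names, and it assumes that `NEMO_NOT_QUOTE` matches a single non-quote character and that `delete_space` removes optional whitespace. Note that the grammar deletes the prefix `name: "`, whereas the docstring example shows a `char:` field.

```python
import re
from typing import Optional

# Hypothetical stand-in for the FST (an assumption, not the library's code):
# optional spaces, the `name: "` prefix, one non-quote character,
# the closing quote, then optional spaces.
_WORD_PATTERN = re.compile(r'\s*name: "([^"])"\s*')


def verbalize_word(serialized: str) -> Optional[str]:
    """Return the bare character, or None if the input is not accepted."""
    match = _WORD_PATTERN.fullmatch(serialized)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(verbalize_word('name: "一"'))
```

Under these assumptions, input written with the docstring's `char:` label would be rejected by this sketch, since only the `name:` prefix is deleted.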
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -82,4 +82,4 @@ testITNWord() { | ||
| shift $# | ||
| | ||
| # Load shUnit2 | ||
| . /workspace/shunit2/shunit2 | ||
| . /workspace/shunit2/shunit2 | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -82,4 +82,4 @@ testITNWord() { | ||
| shift $# | ||
| | ||
| # Load shUnit2 | ||
| . /workspace/shunit2/shunit2 | ||
| . /workspace/shunit2/shunit2 | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -119,4 +119,4 @@ testTNMath() { | ||
| shift $# | ||
| | ||
| # Load shUnit2 | ||
| . /workspace/shunit2/shunit2 | ||
| . /workspace/shunit2/shunit2 | ||
2 changes: 1 addition & 1 deletion
tests/nemo_text_processing/zh/test_sparrowhawk_normalization.sh
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -107,4 +107,3 @@ else | ||
| echo "done mode: $MODE" | ||
| exit 0 | ||
| fi | ||
| | ||
Check notice
Code scanning / CodeQL
Unused import