diff --git a/tutorials/nlp/02_NLP_Tokenizers.ipynb b/tutorials/nlp/02_NLP_Tokenizers.ipynb
index 7199a4c67a14..f2e52a394f8c 100644
--- a/tutorials/nlp/02_NLP_Tokenizers.ipynb
+++ b/tutorials/nlp/02_NLP_Tokenizers.ipynb
@@ -98,7 +98,7 @@
     "\n",
     "Hugging Face and Megatron tokenizers (which uses Hugging Face underneath) can be automatically instantiated by only `tokenizer_name`, which downloads the corresponding `vocab_file` from the internet. \n",
     "\n",
-    "For SentencePieceTokenizer, WordTokenizer, and CharTokenizers `tokenizer_model` or/and `vocab_file` can be generated offline in advance using [`scripts/tokenizers/process_asr_text_tokenizer.py`](https://github.com/NVIDIA/NeMo/blob/stable/scripts/process_asr_text_tokenizer.py)\n",
+    "For SentencePieceTokenizer, WordTokenizer, and CharTokenizers `tokenizer_model` or/and `vocab_file` can be generated offline in advance using [`scripts/tokenizers/process_asr_text_tokenizer.py`](https://github.com/NVIDIA/NeMo/blob/stable/scripts/tokenizers/process_asr_text_tokenizer.py)\n",
     "\n",
     "The tokenizers in NeMo are designed to be used interchangeably, especially when\n",
     "used in combination with a BERT-based model.\n",
@@ -381,7 +381,7 @@
     "id": "ykwKmREuPQE-"
    },
    "source": [
-    "We use the [`scripts/tokenizers/process_asr_text_tokenizer.py`](https://github.com/NVIDIA/NeMo/blob/stable/scripts/process_asr_text_tokenizer.py) script to create a custom tokenizer model with its own vocabulary from an input file"
+    "We use the [`scripts/tokenizers/process_asr_text_tokenizer.py`](https://github.com/NVIDIA/NeMo/blob/stable/scripts/tokenizers/process_asr_text_tokenizer.py) script to create a custom tokenizer model with its own vocabulary from an input file"
    ]
   },
   {
@@ -585,4 +585,4 @@
  },
  "nbformat": 4,
  "nbformat_minor": 1
-}
\ No newline at end of file
+}
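
For context on what the corrected path points at: `process_asr_text_tokenizer.py` builds a tokenizer model and vocabulary offline from a plain-text corpus, which the notebook then loads into `SentencePieceTokenizer`. Below is a minimal sketch of how the script at the corrected path might be invoked; the flag names (`--data_file`, `--data_root`, `--vocab_size`, `--tokenizer`, `--spe_type`), the input file name, and the output directory are assumptions for illustration, not part of this diff, so verify them against the script's `--help`.

```python
# Hedged sketch, not part of this diff: offline tokenizer creation using the
# script at the corrected path. Flag names and paths are assumptions; confirm
# them against the script's --help before use.
import subprocess

subprocess.run(
    [
        "python", "scripts/tokenizers/process_asr_text_tokenizer.py",
        "--data_file=text_corpus.txt",   # assumed input: one training sentence per line
        "--data_root=tokenizer_out",     # assumed output directory for the tokenizer artifacts
        "--vocab_size=32000",
        "--tokenizer=spe",               # SentencePiece model, as used in the notebook
        "--spe_type=bpe",
    ],
    check=True,
)
```

The resulting SentencePiece model file would then be passed to NeMo's `SentencePieceTokenizer` as its `model_path`, matching the notebook's flow (again an assumption about the surrounding notebook code, not something this diff changes).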