A small experiment using both MeCab and TinySegmenter to create a tokenized list of Japanese sentences in JSON, taken from the Tatoeba corpus.
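A minimal sketch of what such a tokenized JSON list might look like. The token lists here are hypothetical, hand-written examples; in the actual experiment they would be produced by running MeCab or TinySegmenter over Tatoeba sentences.

```python
import json

# Hypothetical pre-tokenized data: in the real project, MeCab or
# TinySegmenter would generate the "tokens" lists from each sentence.
sentences = [
    {"sentence": "私は学生です。", "tokens": ["私", "は", "学生", "です", "。"]},
    {"sentence": "これはペンです。", "tokens": ["これ", "は", "ペン", "です", "。"]},
]

# ensure_ascii=False keeps the Japanese text readable in the output file.
print(json.dumps(sentences, ensure_ascii=False, indent=2))
```

Serializing with `ensure_ascii=False` stores the Japanese characters directly instead of `\uXXXX` escapes, which keeps the JSON file human-readable.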
Updated Mar 25, 2021 - Python