I wonder if there is a straightforward way to take the sentences and tokens produced by the tokenizer and add them to a new FoLiA doc.
It is not clear to me how to do that with the add method. Is one supposed to iterate over the Token objects that the tokenizer yields, work out the sentence boundaries, and build the annotations by hand (e.g. read each token's class and pass it on when creating a folia.Word annotation)? Or is there a direct way to add the tokenizer's output structure to the FoLiA doc?
Or is python-ucto not meant to be used for that, and should one instead first create a FoLiA doc with untokenized content and run the ucto CLI on it?
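To make the first option concrete, here is a rough sketch of the manual route I have in mind. It only covers the grouping step: it walks a token stream and splits it into sentences using `isendofsentence()`, which is the method python-ucto tokens expose for this as far as I can tell from the README. `FakeToken` and `group_sentences` are hypothetical names of my own, and `FakeToken` is just a stand-in so the snippet runs without ucto installed. I am assuming that each resulting inner list would then become one folia.Sentence with one folia.Word per token, but that part is not shown here.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FakeToken:
    """Hypothetical stand-in for a ucto token, so the sketch runs without ucto."""
    text: str
    eos: bool = False

    def __str__(self) -> str:
        return self.text

    def isendofsentence(self) -> bool:
        return self.eos


def group_sentences(tokens: Iterable) -> List[List[str]]:
    """Split a flat token stream into sentences (lists of token strings)."""
    sentences: List[List[str]] = []
    current: List[str] = []
    for token in tokens:
        current.append(str(token))
        if token.isendofsentence():
            sentences.append(current)
            current = []
    if current:
        # Keep trailing tokens that were never closed by an end-of-sentence marker.
        sentences.append(current)
    return sentences


tokens = [
    FakeToken("To"), FakeToken("be"), FakeToken(".", eos=True),
    FakeToken("Or"), FakeToken("not"), FakeToken("?", eos=True),
]
grouped = group_sentences(tokens)
print(grouped)
```

With a real tokenizer, the list of `FakeToken` objects would presumably be replaced by iterating over the tokenizer itself after calling `process()` on the text. If there is a built-in way that makes this loop unnecessary, that is what I am looking for.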