[FR] Split string by unit, pattern, chunk length #21

@michaelrog


In service to #3, #18, #20, and the chop filter in general, it'd be nice to have underlying methods for splitting text by various units — character, word, sentence, line, paragraph — and to expose those also as their own functions, e.g.:

split("This is a test.", unit='w')

...returns:

["This", "is", "a", "test."]

(Should we also offer a way to strip punctuation?)
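To make the request concrete, here is a rough sketch of what such a function might look like, written in Python purely for illustration. Everything beyond `split(..., unit='w')` is an assumption: the unit codes `'c'`, `'s'`, `'l'`, and `'p'`, the `strip_punctuation` flag, and the naive sentence rule are hypothetical choices, not a proposed final API.

```python
import re
import string

# Hypothetical unit codes; only 'w' appears in the issue text itself.
_UNIT_PATTERNS = {
    "w": r"\s+",             # words: split on runs of whitespace
    "s": r"(?<=[.!?])\s+",   # sentences: naive split after ., !, or ?
    "l": r"\r?\n",           # lines: split on each newline
    "p": r"(?:\r?\n){2,}",   # paragraphs: split on blank lines
}


def split(text, unit="w", strip_punctuation=False):
    """Split text into characters, words, sentences, lines, or paragraphs."""
    if unit == "c":
        parts = list(text)
    elif unit in _UNIT_PATTERNS:
        parts = re.split(_UNIT_PATTERNS[unit], text.strip())
    else:
        raise ValueError(f"unknown unit: {unit!r}")

    if strip_punctuation:
        # Assumed behavior: trim ASCII punctuation from both ends of each piece.
        parts = [p.strip(string.punctuation) for p in parts]

    # Drop empty pieces (e.g. from empty input or stripped punctuation).
    return [p for p in parts if p]


print(split("This is a test.", unit="w"))
print(split("This is a test.", unit="w", strip_punctuation=True))
```

Keeping punctuation stripping as an opt-in flag would leave the default output identical to the example above, while still covering the question of whether to offer it.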

Also, this function could include/extend functionality from:
