interested in UTR#30 support? #4

@jrochkind

UTR#30 was a Unicode technical report covering what is often called 'ASCII folding'. Although it has been withdrawn and is no longer part of the Unicode standard, many people still find it useful. For instance, Solr/Lucene still supports it through ICUFoldingFilterFactory.

Here are what I believe are the relevant Unicode ".txt" source files, with the mappings needed to implement UTR#30, taken from the Lucene source: https://github.com/apache/lucene-solr/tree/trunk/lucene/analysis/icu/src/data/utr30

I note that unicode_utils already uses the same kind of Unicode .txt mapping source files as definitions for the parts of Unicode it does implement.

That should make UTR#30 fairly feasible to add as well. Even though it is not part of Unicode, some people (myself included) still find it useful and need it.

Are you interested in unicode_utils supporting UTR#30? I could try to create a pull request, although it would take me a while to figure out how to fit the .txt mapping definition files into unicode_utils properly. If you were interested, you could possibly do it in only a few minutes.
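
To make the idea a bit more concrete, here is a rough Ruby sketch of loading such a mapping file into a lookup table and applying it to a string. It is only an illustration under assumptions: `Utr30Sketch`, `parse` and `fold` are hypothetical names, not part of unicode_utils, and the line format (`SOURCE>TARGET ...` in hex, with `#` comments) is my reading of the files rather than something I have verified against all of them. The sample data is made up, and lines using any other syntax (ranges, for example) are simply skipped.

```ruby
# Hypothetical helper, NOT part of unicode_utils.
# Assumes each data line looks like "00E9>0065  # comment":
# one source code point in hex, ">", then zero or more target code points.
module Utr30Sketch
  MAPPING_LINE = /\A(\h+)>((?:\h+(?:\s+\h+)*)?)\z/

  # Parse mapping text into a Hash of source character => replacement string.
  def self.parse(text)
    table = {}
    text.each_line do |line|
      line = line.sub(/#.*/, "").strip       # drop comments and whitespace
      match = MAPPING_LINE.match(line)
      next unless match                      # skip blanks and unhandled syntax
      source = [match[1].hex].pack("U")
      target = match[2].split.map(&:hex).pack("U*")
      table[source] = target
    end
    table
  end

  # Replace every character that has a mapping; leave the rest unchanged.
  def self.fold(string, table)
    string.each_char.map { |ch| table.fetch(ch, ch) }.join
  end
end

# Made-up excerpt for illustration only, not copied from the Lucene files.
sample = <<~DATA
  # comment line
  00E9>0065        # e with acute becomes e
  00DF>0073 0073   # one character becomes two
  00AD>            # soft hyphen is removed
DATA

table = Utr30Sketch.parse(sample)
puts Utr30Sketch.fold("caf\u00E9 stra\u00DFe", table)  # prints "cafe strasse"
```

A real implementation would presumably compile the tables ahead of time, the way unicode_utils handles its existing data files, instead of parsing text at runtime.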
