I see that you have a small list of homoglyphs in characters_safetext.py. Unicode publishes a reference text file for this kind of information that seems pretty comprehensive: confusables.txt (Technical Report)
Would it make sense to incorporate this dataset into your tool?
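
If it helps, here is a rough sketch of how the file could be loaded. It assumes the semicolon-separated layout confusables.txt uses (source code point, then target code points, then a type field, with `#` starting a comment). The `parse_confusables` name and the sample lines are mine for illustration only; I haven't looked at how characters_safetext.py stores its list, so the result would still need adapting to it.

```python
def parse_confusables(lines):
    """Build a {source: target} dict from lines in the confusables.txt layout.

    Hypothetical helper, not part of the existing tool.
    """
    mapping = {}
    for raw in lines:
        # Drop a leading byte-order mark and anything after '#' (comments).
        line = raw.lstrip("\ufeff").split("#", 1)[0].strip()
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) < 2:
            continue
        # Each field holds one or more space-separated hex code points.
        source = "".join(chr(int(cp, 16)) for cp in fields[0].split())
        target = "".join(chr(int(cp, 16)) for cp in fields[1].split())
        mapping[source] = target
    return mapping


# Sample lines written in the same layout, for illustration.
sample = [
    "# comment lines are skipped",
    "0441 ;\t0063 ;\tMA\t# CYRILLIC SMALL LETTER ES -> LATIN SMALL LETTER C",
    "0430 ;\t0061 ;\tMA\t# CYRILLIC SMALL LETTER A -> LATIN SMALL LETTER A",
]
table = parse_confusables(sample)
```

With the real file you would pass in an open file handle instead of `sample`, e.g. `parse_confusables(open("confusables.txt", encoding="utf-8"))`, assuming a local copy has been downloaded.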