Use mb_substr() for correct abbreviation of non-ASCII characters #651

xalt7x · 2024-05-21T11:41:42Z

When substr() or another byte-based method is used to cut a string down to, or by, a single byte, multibyte UTF-8 characters get split and are displayed as �. Switching to mb_substr() fixes this.

xalt7x · 2024-05-21T11:48:43Z

The problem is easy to reproduce with Cyrillic/Ukrainian characters (e.g., "Джон Дое" as the User/Owner name, or the string "Навички обслуговування клієнтів" for "Key Skills").
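A minimal sketch of the reproduction, assuming the mbstring extension is loaded and the file is saved as UTF-8. The variable names are hypothetical and this is not the code touched by the PR; it only contrasts the byte-based and character-based calls on the two sample strings.

```php
<?php
// Sample values from the report; variable names are illustrative only.
$name   = 'Джон Дое';
$skills = 'Навички обслуговування клієнтів';

// substr() counts bytes. 'Д' is two bytes in UTF-8 (0xD0 0x94),
// so keeping one byte leaves an incomplete sequence.
$initialBytes = substr($name, 0, 1);

// mb_substr() counts characters when given the UTF-8 encoding.
$initialChars = mb_substr($name, 0, 1, 'UTF-8');

// Dropping the last byte splits the final 'в'; dropping the last
// character removes it cleanly.
$trimmedBytes = substr($skills, 0, -1);
$trimmedChars = mb_substr($skills, 0, -1, 'UTF-8');

var_dump(mb_check_encoding($initialBytes, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($trimmedBytes, 'UTF-8')); // bool(false)
var_dump($initialChars); // string(2) "Д"
var_dump($trimmedChars === 'Навички обслуговування клієнті'); // bool(true)
```

The invalid byte sequences produced by substr() are what a browser renders as �.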

Additional information:

If you’re working with strings encoded as UTF-8, you may lose characters when you extract part of them with the PHP substr function. This happens because UTF-8 characters are not restricted to one byte: each Unicode character is encoded with a variable length of 1 to 4 bytes, and substr counts bytes rather than characters.
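To make the variable length concrete, the sketch below compares byte counts with character counts. The four sample characters are chosen here for illustration, one per UTF-8 length, and the mbstring extension and a UTF-8 source file are assumed.

```php
<?php
// Illustrative samples: Latin, Cyrillic, euro sign, emoji.
$samples = ['A', 'Д', '€', '😀'];

$byteLengths = [];
$charLengths = [];
foreach ($samples as $char) {
    // strlen() reports bytes; mb_strlen() reports characters.
    $byteLengths[] = strlen($char);
    $charLengths[] = mb_strlen($char, 'UTF-8');
}

echo implode(' ', $byteLengths), "\n"; // 1 2 3 4
echo implode(' ', $charLengths), "\n"; // 1 1 1 1
```

Any offset or length passed to substr() is measured on the first scale, which is why it can land in the middle of a character.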

RussH · 2024-09-16T13:04:03Z

Thanks @xalt7x !

Use mb_substr() for correct abbreviation of non-ASCII characters

7d31c45

RussH merged commit e7c1ab1 into opencats:master Sep 16, 2024
