Fix the handling of supplementary characters (characters > U+FFFF)#66

Merged
xerial merged 1 commit into xerial:master from mkauf:fix-utf8-supplementary-characters
Oct 26, 2015

Contributor

mkauf commented Oct 26, 2015

JNI uses a modified UTF-8 encoding. For supplementary characters, an invalid UTF-8 sequence was written to the database, which resulted in interoperability problems. The solution is to avoid UTF-8 in the native code and use the UTF-16 functions of SQLite (where possible). SQLite will then convert the UTF-16 to standards-compliant, unmodified UTF-8.
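To make the encoding difference concrete, here is a minimal Java sketch (not taken from this pull request) that prints both byte sequences for one supplementary character. It assumes that `DataOutputStream.writeUTF` is an acceptable stand-in for the modified UTF-8 that JNI string functions such as `GetStringUTFChars` produce; the class and helper names are hypothetical, and the pre-fix native code itself is not reproduced here.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, for illustration only.
class ModifiedUtf8Demo {

    // Standard UTF-8: one 4-byte sequence per supplementary character.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Modified UTF-8: each UTF-16 surrogate is encoded separately as 3 bytes.
    // writeUTF prefixes the data with a 2-byte length, which is stripped here.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] withLength = buf.toByteArray();
        return Arrays.copyOfRange(withLength, 2, withLength.length);
    }

    // Formats bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String grinningFace = "\uD83D\uDE00"; // U+1F600, stored as a surrogate pair
        System.out.println("standard: " + hex(standardUtf8(grinningFace))); // F0 9F 98 80
        System.out.println("modified: " + hex(modifiedUtf8(grinningFace))); // ED A0 BD ED B8 80
    }
}
```

The six-byte form is two encoded surrogates, which strict UTF-8 decoders reject. That matches the interoperability problem described above, and it is why passing the UTF-16 data to SQLite and letting it do the conversion sidesteps the issue.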

This also fixes related bugs in JDBC3PreparedStatement and improves the "out of memory" handling in the native code.

Fixed Issues:
- https://bitbucket.org/xerial/sqlite-jdbc/issues/200/wrong-utf-8-decoding-of-unicode-code , same as #61
- https://bitbucket.org/xerial/sqlite-jdbc/issues/144/nativedbexec-throws-an-exception-without
- https://bitbucket.org/xerial/sqlite-jdbc/issues/84/bug-in-nativedbc-bind_1blob
- https://bitbucket.org/xerial/sqlite-jdbc/issues/70/setting-a-blob-in-prepstmt

Contributor Author

mkauf commented Oct 26, 2015

I think that the continuous integration system has not compiled the native library, but used an old version of the native library instead. A new unit test that I have written fails because the old native library has been used.

Owner

xerial commented Oct 26, 2015

OK. I will prepare native binaries rebuilt with this fix.

xerial merged commit a4cf82d into xerial:master on Oct 26, 2015

Owner

xerial commented Oct 26, 2015

Thanks for the proper error handling and for improving query execution by using the new SQLite API.

Contributor Author

mkauf commented Nov 1, 2015

Thank you for merging!

mkauf deleted the fix-utf8-supplementary-characters branch on November 1, 2015 at 10:29

Contributor

jberkel commented Dec 8, 2015

Just ran into this bug, glad it's already been fixed. However, there doesn't seem to be a 3.9.1-SNAPSHOT version on Sonatype; I only found 3.9.0.
