Java binding throws an invalid UTF-8 sequence error when the path parameter contains emoji #3194

@G-XD

code:

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("root", "/tmp");
        // Use one path throughout so read and delete target the file that was written.
        String path = "❌😱.txt";
        try (Operator op = new Operator("fs", map)) {
            // The exception is thrown by this write call.
            op.write(path, "hello OpenDAL").join();
            System.out.println(op.read(path).join());
            op.delete(path).join();
        }
    }

error:

Exception in thread "main" org.apache.opendal.OpenDALException: Unexpected (permanent) at  => invalid utf-8 sequence of 1 bytes from index 3, source: invalid utf-8 sequence of 1 bytes from index 3
	at org.apache.opendal.Operator.write(Native Method)
	at org.apache.opendal.Operator.write(Operator.java:121)
	at org.apache.opendal.Operator.write(Operator.java:117)
	at org.gxd.Main.main(Main.java:12)

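I think the cause is that the string returned by env.get_string() is still in Java's modified UTF-8 format and is not decoded automatically. Modified UTF-8 encodes characters outside the Basic Multilingual Plane, such as 😱, as a surrogate pair of two 3-byte sequences, which is not valid standard UTF-8. That would explain why the error points at index 3, right after the 3-byte ❌. We can use .into() to convert the returned JavaStr into a Rust String, which does the decoding.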
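
To show the encoding difference from the Java side, here is a minimal sketch. It does not touch the binding's Rust code: it assumes DataOutputStream.writeUTF is a fair stand-in for the modified UTF-8 that JNI hands to native code, and the class and helper names (ModifiedUtf8Demo, modifiedUtf8, isStrictUtf8) are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ModifiedUtf8Demo {
    // Hypothetical helper: encode a string as modified UTF-8.
    // writeUTF emits a 2-byte length prefix first, which is stripped here.
    static byte[] modifiedUtf8(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        }
        byte[] all = buf.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    // Hypothetical helper: true only if the bytes decode as strict standard UTF-8.
    static boolean isStrictUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Index of the first byte at which the two arrays differ, or -1 if none.
    static int firstDifference(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return a.length == b.length ? -1 : n;
    }

    public static void main(String[] args) throws IOException {
        // Same path as in the report, written with escapes: U+274C, U+1F631, ".txt".
        String path = "\u274C\uD83D\uDE31.txt";
        byte[] standard = path.getBytes(StandardCharsets.UTF_8);
        byte[] modified = modifiedUtf8(path);

        System.out.println("standard UTF-8 length: " + standard.length);
        System.out.println("modified UTF-8 length: " + modified.length);
        System.out.println("first differing index: " + firstDifference(standard, modified));
        System.out.println("modified bytes are strict UTF-8: " + isStrictUtf8(modified));
    }
}
```

The two encodings agree on the first three bytes (the ❌) and diverge where the 😱 starts, which lines up with the "from index 3" in the error message.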
