Misleading or wrong error message in UTF-8 byte sequence verification #25874

@mike-ward

Describe the bug

Compiling the simple program below produces a misleading notice about the file's encoding.

Reproduction Steps

fn main() {
	println('\xF0\x81\xA0\x80')
}

Expected Behavior

It should issue an error indicating the UTF-8 sequence is wrong. The real issue is as follows:

F0 81 A0 80 is not a valid UTF-8 sequence because it is an overlong encoding, which UTF-8 does not allow. The four bytes carry the code point U+1800, whose shortest form is the three-byte sequence E1 A0 80. Valid UTF-8 must use the shortest form for each code point.
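To make the overlong claim concrete, here is a minimal sketch in Python (chosen so it does not depend on any V API). The helper names `decode_four_byte`, `shortest_length`, and `strict_decode_fails` are hypothetical and only illustrate the arithmetic; this is not how the V compiler performs its check.

```python
def decode_four_byte(seq: bytes) -> int:
    """Extract the 21 payload bits of a 4-byte UTF-8 bit pattern.

    Only the bit layout is checked here (11110xxx 10xxxxxx 10xxxxxx 10xxxxxx),
    not whether the sequence is well-formed UTF-8.
    """
    if len(seq) != 4 or (seq[0] & 0xF8) != 0xF0:
        raise ValueError("not a 4-byte lead pattern")
    if any((b & 0xC0) != 0x80 for b in seq[1:]):
        raise ValueError("bad continuation byte")
    return (
        ((seq[0] & 0x07) << 18)
        | ((seq[1] & 0x3F) << 12)
        | ((seq[2] & 0x3F) << 6)
        | (seq[3] & 0x3F)
    )


def shortest_length(code_point: int) -> int:
    """Number of bytes in the shortest UTF-8 form of a code point."""
    if code_point < 0x80:
        return 1
    if code_point < 0x800:
        return 2
    if code_point < 0x10000:
        return 3
    return 4


def strict_decode_fails(data: bytes) -> bool:
    """True if Python's strict UTF-8 decoder rejects the bytes."""
    try:
        data.decode("utf-8")
        return False
    except UnicodeDecodeError:
        return True


seq = bytes([0xF0, 0x81, 0xA0, 0x80])
cp = decode_four_byte(seq)                    # payload bits of the 4-byte pattern
shortest = chr(cp).encode("utf-8")            # the canonical encoding of that code point
is_overlong = len(seq) > shortest_length(cp)  # 4 bytes used where 3 suffice

print(f"U+{cp:04X}")                # U+1800
print(shortest.hex(" ").upper())    # E1 A0 80
print(is_overlong)                  # True
print(strict_decode_fails(seq))     # True
```

A strict decoder rejects the four-byte form even though its bit pattern looks like a normal 4-byte sequence, which is why the problem is the byte sequence and not the file.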

Current Behavior

main.v:2:22: notice: invalid utf8 string, please check your file's encoding is utf8
    1 | fn main() {
    2 |     println('\xF0\x81\xA0\x80')
      |                         ~~~~~~
    3 | }
�`

Possible Solution

It should be a matter of correcting the error message. The file itself is valid UTF-8; it is the byte sequence written with \x escapes in the string literal that is invalid.
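The distinction between the file and the literal can be checked directly. The following is a rough sketch in Python, assuming the source line is exactly the one from the reproduction; `is_valid_utf8` is a hypothetical helper, and the regular expression is only a stand-in for however the compiler actually expands `\x` escapes.

```python
import re


def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode under Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The source line as it is stored on disk: plain ASCII characters,
# including the backslashes and hex digits of the escapes.
source = r"println('\xF0\x81\xA0\x80')"
file_bytes = source.encode("utf-8")
file_is_valid_utf8 = is_valid_utf8(file_bytes)

# The bytes the string literal denotes once the \x escapes are expanded.
literal_bytes = bytes(
    int(h, 16) for h in re.findall(r"\\x([0-9A-Fa-f]{2})", source)
)
literal_is_valid_utf8 = is_valid_utf8(literal_bytes)

print(file_is_valid_utf8)      # True
print(literal_is_valid_utf8)   # False
```

Under these assumptions the file passes and only the expanded literal fails, so a message that points at the escape sequence rather than the file's encoding would describe the problem more accurately.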

Additional Information/Context

No response

V version

V 0.4.12 e07eb54

Environment details (OS name and version, etc.)

V full version V 0.4.12 e563c8f.e07eb54
OS macos, macOS, 26.1, 25B78
Processor 8 cpus, 64bit, little endian, Apple M2
Memory 0.12GB/8GB
V executable /Users/mike/Documents/github/v/v
V last modified time 2025-11-30 16:14:30
V home dir OK, value: /Users/mike/Documents/github/v
VMODULES OK, value: /Users/mike/.vmodules
VTMP OK, value: /tmp/v_501
Current working dir OK, value: /Users/mike/Documents/github/bug
env VFLAGS "-message-limit 5"
env LDFLAGS "-L/opt/homebrew/opt/ruby/lib"
Git version git version 2.52.0
V git status e07eb54
.git/config present true
cc version Apple clang version 17.0.0 (clang-1700.4.4.1)
gcc version Apple clang version 17.0.0 (clang-1700.4.4.1)
clang version Apple clang version 17.0.0 (clang-1700.4.4.1)
tcc version tcc version 0.9.28rc 2024-02-05 HEAD@105d70f7 (AArch64 Darwin)
tcc git status thirdparty-macos-arm64 1867108f
emcc version N/A
glibc version N/A

Labels

Bug
