Use wcwidth implementation from fish-shell#1289
Use wcwidth implementation from fish-shell#1289zhaofengli wants to merge 1 commit intomobile-shell:masterfrom
Conversation
Co-authored-by: Stanislaw Pusep <creaktive@gmail.com>
|
I don't think we want to take this PR, as we have not thought about the issues around mosh-client and mosh-server having out of sync wcwidth implementations. That's already possible today, but it's only going to get worse if we vendor wcwidth as this PR suggests. I'm in general super not into vendoring source. |
|
Understood, thanks for the review. I'll apply the patch locally in the meantime.
I agree that this is problematic. Currently it seems that the state diff is generated by the server as an terminal sequence (
If my understanding is correct, would it make sense to change the over-the-wire diff representation to be in terms of Row/Cell changes? This way the server controls how characters are laid out in Cells, avoiding (1). Without the escape sequences, the diff can also be more compact.
That's reasonable. We can require users and downstream packaging to provide |
With my widecharwidth hat on, since I was CC'd: widecharwidth is meant to be used as a single header that you generate and include in your source. It is not set up to be built as a shared library, so it's not amenable to downstream packaging. I don't believe we want to support a shared library either, because that means caring about downstream packages. And the generated header is a few long data tables together with a couple functions to read them. Overall not a lot of code to vendor.
I can tell you that we haven't had many reports of problems with being out-of-sync, at least not once you get over the big 8->9 hump. Tho that is shell (remote or local) and terminal being out of sync, y'all would have an additional step. Anyway, it's of course your decision. I'll unsubscribe from this issue now so if anyone wants to summon me feel free to @ me. |
Vendoring is how you avoid the server and the client being out of sync. Currently, you simply leave users with broken software. I, for example, had the display glitch in #1281, went through the trouble of researching the problem and compiling https://github.com/jdrouhard/mosh on both the server and the client, and then it works fine. Some other users would simply try mosh, find it broken, and never use it again. This is improvement left on the table. |
|
Coming to this from downstream — I maintain tabby, a tmux sidebar tool, and Mosh users hit this constantly: Unicode characters render with wrong widths causing off-by-one corruption. A couple of questions:
Happy to help test on macOS/iPadOS if that's useful. |
|
The wire protocol idea would be a breaking change, as such it is not in scope for 1.5.0. maybe a hypothetical 2.0, if it even made sense. I haven't thought about the idea since it was proposed, and have since forgotten what my objection was, but I recall thinking there were some downsides with it that would make it challenging. (At the very least, it means client side prediction would get funky) |
|
@achernya Do you have any thoughts on the up-to-date width tables making it into this release? Seems like there was some discourse on this earlier and I'm wondering how we can move forward |
|
I actively do not want vendor any width tables. If there is a library we can depend on that is packaged in every major distribution we care about, we can consider the dependency. |
|
@achernya I get the no-vendored-width-tables stance. One option that seems to fit that constraint: use a system library for wcwidth instead of libc — libutf8proc (utf8proc_charwidth), which is already packaged in major distros and Homebrew. For desync, we can keep mixed installs safe by warning if client/server disagree on backend or Unicode version, and if only one side has utf8proc, fall back to libc on both ends for that session so widths stay consistent. This won't fix grapheme-cluster edge cases, but it should address the common macOS/old-libc emoji-width corruption without changing the wire protocol. |
|
I spot-checked the distros we directly push to (fedora, debian) and those we rely on for .pkg builds (macports). All of them have utf8proc. utf8proc includes pkg-config, so it should be straightforward to incorporate into configure.ac. On the desync front, as I mentioned in this comment above, it's already possible today with libc. The primary thing I'd like us to figure out is if introducing utf8proc makes things noticeably worse. If the answer is "no", we can just integrate it directly and move on, no version checks or fallbacks necessary. If the answer is "yes", then we need to figure out how to do the signaling. There's no good way to do it; the underlying transport is per-packet, so it would be wasted overhead. Likewise, the hostinput/userinput sides aren't appropriate either, since it's just a byte stream. That leaves us with the mosh startup sequence. I'm not excited about modifying mosh.pl but it looks like we could probably add another line there. But, again, I'd much rather we keep forwards and backwards compatibility as much as possible rather than explore this alternative. If you or anyone else can think of any issues that don't exist with today's wcwidth implementation, that would be helpful in deciding the steps forward. |
This PR makes mosh use up-to-date wcwidth tables from fish-shell, so certain Unicode characters can be displayed correctly on platforms like macOS.
It revives #1143 with a simple wrapper. We now vendor an unmodified copy of
widechar_width.hfrom https://github.com/ridiculousfish/widecharwidth.Fixes #234.
cc @faho (#1143 (comment))