-
-
Notifications
You must be signed in to change notification settings - Fork 417
Remove redundant file dialog string conversions #1665
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
|
Just a little correction here, HXCPP (with smart strings enabled) never uses UTF-8 for strings. It uses ASCII and when any character in a string doesn't fit within ASCII range then the entire string is converted to UTF-16. HXCPP, however, provides a function to convert it's strings to UTF-8. |
|
Yes, that's correct. However, an ASCII string is valid utf-8, so when using hxs_utf8 on an hxcpp string which is ASCII encoded, it just returns a pointer to the ASCII string without any conversion being necessary. |
|
I was wondering if you point that out :) I thought you might know, my apologies for stating the obvious. |
|
No worries, it's worth making the distinction :) |
|
There are a few more places where wstrings are converted to utf8 still. These include most of the system functions, though these are windows only, where wchar is equal to char16_t so The bigger issue is |
No need to iterate through the whole string to find the length, just to see if it is empty.
|
I fixed conflicts in the PR and updated it to clean things up and reduce duplicate code. I've tested on linux and windows and it works as expected. Aside from cleaning things up and simplicity, it is also a good idea to make this change because the |
|
I was going to ask about memory leaks, since some Not that I don't trust you to check, I just wanted to understand it myself. Plus, I learned some basics of gdb, which is nice. |
This is something I found confusing too so perhaps a comment might be useful. I can't remember if tfd has some documentation explaining this or if i just had to check the source? Also, I suppose the static string might also not be thread safe, though that is regardless of this PR. Though, given it is file dialogs it might be unlikely to cause issues in practice.
I agree, we want code that can be easily verified/understood by anyone reading it, rather than just assuming it works based on trust in one contributor. |
I checked the source when I wrote that, but it turns out they also mention it in the readme: "String memory is preallocated statically for all the returned values."
Agreed on all counts. I doubt most OSes allow opening dialogs from non-UI threads, but if they did, both threads would read/write that same memory. |
Currently, there are a lot of redundant string conversions when using the file dialog api. This was mentioned briefly in #1622.
Here is a rough overview of the current situation:
Linux/macOS:
hxstring->std::wstring->std::string, passed into tinyfiledialogs utf-8 api, thenstd::string->std::wstring->hxstringhl_vstring->std::wstring->std::string, passed into tinyfiledialogs utf-8 api, thenstd::string->std::wstring->utf-8bytes -> nativeutf-16hashlink string (on Haxe side viaString.fromUtf8())Windows:
hxstring->std::wstring, passed into tinyfiledialogs utf-16 api, thenstd::wstring->hxstringhl_vstring->std::wstring, passed into tinyfiledialogs utf-16 api, thenstd::wstring->utf-8bytes -> nativeutf-16hashlink string (on Haxe side viaString.fromUtf8())This cleans things up to avoid unnecessary conversions, so now it looks like this in most cases:
Linux/macOS:
hxstring->utf-8[0], passed into tinyfiledialogs utf-8 api, thenutf-8->hxstringhl_vstring->utf-8, passed into tinyfiledialogs utf-8 api, thenutf-8->utf-16hashlink string (on Haxe side viaString.fromUtf8())Windows:
hxstring->utf-16[0], passed into tinyfiledialogs utf-16 api, thenutf-16->hxstringhl_vstring, passed into tinyfiledialogs utf-16 api, thenutf-16->utf8->utf-16hashlink string (on Haxe side viaString.fromUtf8())[0] hxcpp can use ASCII (which is compatible with utf-8) or utf-16 depending on the string, so these conversions are avoided in some cases.
We can improve Hashlink further (for Windows), by returning utf-16 strings from the apis, however, I'm not sure if this would be considered a breaking change (new lime.hdlls would be incompatible with old lime haxe files).
In #1622, we briefly discussed using the utf-8 tinyfiledialog functions on Windows to unify the api, however, I looked into this and on Windows these functions just convert back to utf-16, so there would still be extra conversions. So I think this is the cleanest solution.
I've tested on Linux and Windows with the code from #1622, and everything works well.