Skip to content

[PoC] Add version(NoAutoDecode) to allow opt-out of auto-decoding#5513

Closed
wilzbach wants to merge 1 commit intodlang:masterfrom
wilzbach:no-auto-decode
Closed

[PoC] Add version(NoAutoDecode) to allow opt-out of auto-decoding#5513
wilzbach wants to merge 1 commit intodlang:masterfrom
wilzbach:no-auto-decode

Conversation

@wilzbach
Copy link
Contributor

This is a Proof of Concept (PoC) idea to allow people to opt-in to use Phobos without auto-decoding.
If this works well, it could be improved (e.g. per module) to allow gradual migration.

It's an adaption of an KISS idea posted on the NG by @adamdruppe.

But I guess a code sample shows more than 1000 words:

void main(string[] args)
{
    import std.range;
    string cStr = "abc";
    wstring wStr = "abc";
    pragma(msg, typeof(cStr.front));
    pragma(msg, typeof(wStr.front));
}
> ~/dlang/dmd/generated/linux/release/64/dmd  -run test2.d
dchar
dchar
> ~/dlang/dmd/generated/linux/release/64/dmd  -version=NoAutoDecode -run test.d
immutable(char)
immutable(wchar)

This obviously needs more work and maybe a real world test.
But before investing more time into this, I am interested: what do you think about this approach? Worth looking into?

@wilzbach wilzbach added the Review:WIP Work In Progress - not ready for review or pulling label Jun 26, 2017
@dlang-bot
Copy link
Contributor

Thanks for your pull request, @wilzbach! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.

Some tips to help speed things up:

  • smaller, focused PRs are easier to review than big ones

  • try not to mix up refactoring or style changes with bug fixes or feature enhancements

  • provide helpful commit messages explaining the rationale behind each change

Bear in mind that large or tricky changes may require multiple rounds of review and revision.

Please see CONTRIBUTING.md for more information.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

@JackStouffer
Copy link
Contributor

A problem I can envision with this is the tons of auto decoding/narrow string special cases all over Phobos. This could lead to yet even more special cases.

@quickfur
Copy link
Member

I would love to see something like this merged into Phobos, however one potential problem that immediately came to mind with using version(NoAutoDecode) is that it would not work if there's some auto-decoding-dependent code in Phobos that's compiled into libphobos2.so, because then you might end up with libphobos2.so containing some code compiled with auto-decoding, but called from user code that links to the Phobos template compiled without auto-decoding, so you may get some unexpected results.

If we're sure everything will work as expected, though, this could be worth pushing through.

But we also have to make sure any code that actually depends on auto-decoding semantics will still work. Depending on what this code is, we might need to reimplement a few things.

@wilzbach
Copy link
Contributor Author

A problem I can envision with this is the tons of auto decoding/narrow string special cases all over Phobos. This could lead to yet even more special cases.

We could (try to) compile Phobos with version(NoAutoDecode) and start to eat our own dogfood?

@dukc
Copy link
Contributor

dukc commented Jul 17, 2017

I see a great problem here: While it would give a nice choice to give for application writers, it would be horrible for libraries, which would either have to be ready for both versions, or force the user to not autodecode nor use any autodecoding libraries.

If we only found a way to give that choice for each module, not for each compilation...

@CyberShadow
Copy link
Member

Unless said library is dealing with Unicode, I think most D code will not be affected by lack of autodecoding.

I suppose there could be problems with libraries assuming the type of elements' ranges, but that is easily fixed by proper application of ElementType etc.

@dukc
Copy link
Contributor

dukc commented Jul 18, 2017

...or defining the autodecoding elements locally. Yes, works.

But the point here is that autodecoding libraries need to change their code, and they need to do it immediately. For them it may be worse than if we just removed auto-decoding without any deprectation period. At least then they wouldn't have to deal with the still decoding version.

@dukc
Copy link
Contributor

dukc commented Jul 18, 2017

What if we moved everything except autodecoding from std.range to a new module, and made std.range to import it publicly? More or less an ugly solution, but no breakage.

@wilzbach
Copy link
Contributor Author

Okay it seems that this doesn't work that well. Hopefully someone comes up with a better idea to finally phase out auto-decoding by default ...

@wilzbach wilzbach closed this Nov 17, 2017
@wilzbach wilzbach deleted the no-auto-decode branch November 17, 2017 16:24
@quickfur
Copy link
Member

quickfur commented Mar 7, 2018

I wonder if a possible alternative approach is to start with a version to opt-in to no-autodecoding, say over the period of a release or two, then change it to opt-out the following release, wait another release or two, then kill off autodecoding forever.

During the transition things will be messy, but hopefully going through a deprecation-cycle-like process would make such a thing more acceptable to people? Just throwing out a wild idea here.

@JackStouffer
Copy link
Contributor

JackStouffer commented Mar 7, 2018

I'm still of the opinion that the right fix for this is Andrei's hypothesized RCStr library type. The main benefit from such a design (other than the ref counting) is the decision that was made to not offer RCStr as a range itself. To get a range over the underlying data, you would have to call one of three member functions: byCodeUnit, byCodePoint, byGrapheme. This forces the user to make an explicit decision about the behavior of their string code. This would also allow existing code which calls the already existing versions of these functions to work without modification.

I think phasing out/changing such a fundamental behavior of Phobos will lead to a lot more problems for D's image than we anticipate.

@adamdruppe
Copy link
Contributor

adamdruppe commented Mar 7, 2018 via email

@schveiguy
Copy link
Member

I'm still of the opinion that the right fix for this is Andrei's hypothesized RCStr library type.

I've brought up something like this before. What we need is for the compiler to allow something other than immutable(char)[] to be aliased to string. Then we have room to play with ideas.

I've said it before, the problem with auto-decoding for me is not the auto-decoding. It's that Phobos doesn't agree with the language. The language says char[] is an array. Phobos tries to pretend it's not. And it works horribly.

@JackStouffer
Copy link
Contributor

What we need is for the compiler to allow something other than immutable(char)[] to be aliased to string.

In general, aren't we trying to move away from strings and towards to character ranges?

@schveiguy
Copy link
Member

What I'm saying is, the literal "hello, world!" needs a type. Right now, the compiler says it's an immutable(char)[]. What I'd like is for it to say it's a string and let the library define that type to be something other than immutable(char)[]. Then we can figure out the right type to use (RCString?), and develop Phobos around it (with a possible deprecation path).

@adamdruppe
Copy link
Contributor

adamdruppe commented Mar 7, 2018 via email

@quickfur
Copy link
Member

quickfur commented Mar 7, 2018

@adamdruppe That won't help if you need to pass a string literal to a function that takes string, because D doesn't support implicit construction.

@schveiguy
Copy link
Member

Not to mention auto x = "abc";

@quickfur
Copy link
Member

quickfur commented Mar 7, 2018

Or if you want to construct a string[] with literal syntax like auto strs = [ "abc", "def", "ghi" ];. You'll end up with strs of the wrong type.

@adamdruppe
Copy link
Contributor

D should support implicit construction! There's a reason C++ allows it... and that reason is std::string lol. :) Though then there's still the billion string templates that wouldn't implicit construct anyway...

idk, I actually think D got strings right. Phobos mucked it up with the range... but even then, most of Phobos i think did an ok job, and the language itself did the right thing.

@jmdavis
Copy link
Member

jmdavis commented Mar 7, 2018

Having string be immutable(char)[] works just fine so long as Phobos actually treats it that way and range-based functions operate on arbitrary ranges of characters rather than specifically expecting ranges of dchar (some Phobos functions do work with arbitrary ranges of characters now, but some don't). The problem is how to transition from the current behavior to the correct behavior without breaking everything in the process and without effectively dividing the D ecosystem in two (e.g. code that works with auto-decoding versioned in and code that works with it versioned out). Either way, I think that the first step is to make as little of Phobos as possible care about whether strings are treated as ranges of dchar or not. We've talked about this before, and some work has been done towards that, but there's still more work to be done there. If Phobos generally didn't care, then switching would be far less painful, and it would be far more reasonable to talk about switching the behavior.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Review:Needs Decision Review:WIP Work In Progress - not ready for review or pulling

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants