[PoC] Add version(NoAutoDecode) to allow opt-out of auto-decoding by wilzbach · Pull Request #5513 · dlang/phobos

wilzbach · 2017-06-26T16:45:51Z

This is a Proof of Concept (PoC) idea to allow people to opt-in to use Phobos without auto-decoding.
If this works well, it could be improved (e.g. per module) to allow gradual migration.

It's an adaption of an KISS idea posted on the NG by @adamdruppe.

But I guess a code sample shows more than 1000 words:

void main(string[] args)
{
    import std.range;
    string cStr = "abc";
    wstring wStr = "abc";
    pragma(msg, typeof(cStr.front));
    pragma(msg, typeof(wStr.front));
}

> ~/dlang/dmd/generated/linux/release/64/dmd  -run test2.d
dchar
dchar

> ~/dlang/dmd/generated/linux/release/64/dmd  -version=NoAutoDecode -run test.d
immutable(char)
immutable(wchar)

This obviously needs more work and maybe a real world test.
But before investing more time into this, I am interested: what do you think about this approach? Worth looking into?

dlang-bot · 2017-06-26T16:45:52Z

Thanks for your pull request, @wilzbach! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.

Some tips to help speed things up:

smaller, focused PRs are easier to review than big ones
try not to mix up refactoring or style changes with bug fixes or feature enhancements
provide helpful commit messages explaining the rationale behind each change

Bear in mind that large or tricky changes may require multiple rounds of review and revision.

Please see CONTRIBUTING.md for more information.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

JackStouffer · 2017-06-26T16:55:36Z

A problem I can envision with this is the tons of auto decoding/narrow string special cases all over Phobos. This could lead to yet even more special cases.

quickfur · 2017-06-26T16:58:35Z

I would love to see something like this merged into Phobos, however one potential problem that immediately came to mind with using version(NoAutoDecode) is that it would not work if there's some auto-decoding-dependent code in Phobos that's compiled into libphobos2.so, because then you might end up with libphobos2.so containing some code compiled with auto-decoding, but called from user code that links to the Phobos template compiled without auto-decoding, so you may get some unexpected results.

If we're sure everything will work as expected, though, this could be worth pushing through.

But we also have to make sure any code that actually depends on auto-decoding semantics will still work. Depending on what this code is, we might need to reimplement a few things.

wilzbach · 2017-06-28T15:48:08Z

A problem I can envision with this is the tons of auto decoding/narrow string special cases all over Phobos. This could lead to yet even more special cases.

We could (try to) compile Phobos with version(NoAutoDecode) and start to eat our own dogfood?

dukc · 2017-07-17T16:10:26Z

I see a great problem here: While it would give a nice choice to give for application writers, it would be horrible for libraries, which would either have to be ready for both versions, or force the user to not autodecode nor use any autodecoding libraries.

If we only found a way to give that choice for each module, not for each compilation...

CyberShadow · 2017-07-18T03:07:41Z

Unless said library is dealing with Unicode, I think most D code will not be affected by lack of autodecoding.

I suppose there could be problems with libraries assuming the type of elements' ranges, but that is easily fixed by proper application of ElementType etc.

dukc · 2017-07-18T11:34:42Z

...or defining the autodecoding elements locally. Yes, works.

But the point here is that autodecoding libraries need to change their code, and they need to do it immediately. For them it may be worse than if we just removed auto-decoding without any deprectation period. At least then they wouldn't have to deal with the still decoding version.

dukc · 2017-07-18T11:45:42Z

What if we moved everything except autodecoding from std.range to a new module, and made std.range to import it publicly? More or less an ugly solution, but no breakage.

wilzbach · 2017-11-17T16:24:37Z

Okay it seems that this doesn't work that well. Hopefully someone comes up with a better idea to finally phase out auto-decoding by default ...

quickfur · 2018-03-07T17:30:00Z

I wonder if a possible alternative approach is to start with a version to opt-in to no-autodecoding, say over the period of a release or two, then change it to opt-out the following release, wait another release or two, then kill off autodecoding forever.

During the transition things will be messy, but hopefully going through a deprecation-cycle-like process would make such a thing more acceptable to people? Just throwing out a wild idea here.

JackStouffer · 2018-03-07T17:52:26Z

I'm still of the opinion that the right fix for this is Andrei's hypothesized RCStr library type. The main benefit from such a design (other than the ref counting) is the decision that was made to not offer RCStr as a range itself. To get a range over the underlying data, you would have to call one of three member functions: byCodeUnit, byCodePoint, byGrapheme. This forces the user to make an explicit decision about the behavior of their string code. This would also allow existing code which calls the already existing versions of these functions to work without modification.

I think phasing out/changing such a fundamental behavior of Phobos will lead to a lot more problems for D's image than we anticipate.

adamdruppe · 2018-03-07T17:54:32Z

On Wed, Mar 07, 2018 at 05:52:40PM +0000, Jack Stouffer wrote: three member functions: `byCodeUnit`, `byCodePoint`, `byGrapheme`. This forces the user to make an explicit decision about the behavior of their string code. This would also allow existing code which calls the already existing versions of these functions to work without modification.

It's easy to do that with regular `string` too. In fact, that's how I'd probably proceed as a transition path: deprecate the top-level string range functions and force people to just choose one at the call site.

schveiguy · 2018-03-07T18:47:31Z

I'm still of the opinion that the right fix for this is Andrei's hypothesized RCStr library type.

I've brought up something like this before. What we need is for the compiler to allow something other than immutable(char)[] to be aliased to string. Then we have room to play with ideas.

I've said it before, the problem with auto-decoding for me is not the auto-decoding. It's that Phobos doesn't agree with the language. The language says char[] is an array. Phobos tries to pretend it's not. And it works horribly.

JackStouffer · 2018-03-07T18:52:07Z

What we need is for the compiler to allow something other than immutable(char)[] to be aliased to string.

In general, aren't we trying to move away from strings and towards to character ranges?

schveiguy · 2018-03-07T18:56:30Z

What I'm saying is, the literal "hello, world!" needs a type. Right now, the compiler says it's an immutable(char)[]. What I'd like is for it to say it's a string and let the library define that type to be something other than immutable(char)[]. Then we can figure out the right type to use (RCString?), and develop Phobos around it (with a possible deprecation path).

adamdruppe · 2018-03-07T18:59:20Z

On Wed, Mar 07, 2018 at 06:47:39PM +0000, Steven Schveighoffer wrote: I've brought up something like this before. What we need is for the compiler to allow something other than `immutable(char)[]` to be aliased to string. Then we have room to play with ideas.

The compiler has always allowed that. In fact, you can redefined string on a per-module level:

…

--- struct MyString { this(immutable(char)[] s) {} } alias string = MyString; void main() { string s = "hi"; } --- The literal is still typed immutable(char)[], but you can have a constructor accept it like I did there.

quickfur · 2018-03-07T19:00:17Z

@adamdruppe That won't help if you need to pass a string literal to a function that takes string, because D doesn't support implicit construction.

schveiguy · 2018-03-07T19:01:07Z

Not to mention auto x = "abc";

quickfur · 2018-03-07T19:01:30Z

Or if you want to construct a string[] with literal syntax like auto strs = [ "abc", "def", "ghi" ];. You'll end up with strs of the wrong type.

adamdruppe · 2018-03-07T19:12:13Z

D should support implicit construction! There's a reason C++ allows it... and that reason is std::string lol. :) Though then there's still the billion string templates that wouldn't implicit construct anyway...

idk, I actually think D got strings right. Phobos mucked it up with the range... but even then, most of Phobos i think did an ok job, and the language itself did the right thing.

jmdavis · 2018-03-07T19:18:31Z

Having string be immutable(char)[] works just fine so long as Phobos actually treats it that way and range-based functions operate on arbitrary ranges of characters rather than specifically expecting ranges of dchar (some Phobos functions do work with arbitrary ranges of characters now, but some don't). The problem is how to transition from the current behavior to the correct behavior without breaking everything in the process and without effectively dividing the D ecosystem in two (e.g. code that works with auto-decoding versioned in and code that works with it versioned out). Either way, I think that the first step is to make as little of Phobos as possible care about whether strings are treated as ranges of dchar or not. We've talked about this before, and some work has been done towards that, but there's still more work to be done there. If Phobos generally didn't care, then switching would be far less painful, and it would be far more reasonable to talk about switching the behavior.

Add version(NoAutoDecode) to allow opt-out of auto-decoding

38e6603

wilzbach added the Review:WIP Work In Progress - not ready for review or pulling label Jun 26, 2017

wilzbach added the Review:Needs Decision label Jul 9, 2017

wilzbach mentioned this pull request Jul 9, 2017

Make more modules runnable #5582

Merged

wilzbach closed this Nov 17, 2017

wilzbach deleted the no-auto-decode branch November 17, 2017 16:24

WalterBright mentioned this pull request Mar 23, 2019

add std.range.primitives.Autodecoding #6930

Merged

Uh oh!

Conversation

wilzbach commented Jun 26, 2017

Uh oh!

dlang-bot commented Jun 26, 2017

Bugzilla references

Uh oh!

JackStouffer commented Jun 26, 2017

Uh oh!

quickfur commented Jun 26, 2017

Uh oh!

wilzbach commented Jun 28, 2017

Uh oh!

dukc commented Jul 17, 2017

Uh oh!

CyberShadow commented Jul 18, 2017

Uh oh!

dukc commented Jul 18, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dukc commented Jul 18, 2017

Uh oh!

wilzbach commented Nov 17, 2017

Uh oh!

quickfur commented Mar 7, 2018

Uh oh!

JackStouffer commented Mar 7, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

adamdruppe commented Mar 7, 2018 via email

Uh oh!

schveiguy commented Mar 7, 2018

Uh oh!

JackStouffer commented Mar 7, 2018

Uh oh!

schveiguy commented Mar 7, 2018

Uh oh!

adamdruppe commented Mar 7, 2018 via email

Uh oh!

quickfur commented Mar 7, 2018

Uh oh!

schveiguy commented Mar 7, 2018

Uh oh!

quickfur commented Mar 7, 2018

Uh oh!

adamdruppe commented Mar 7, 2018

Uh oh!

jmdavis commented Mar 7, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

dukc commented Jul 18, 2017 •

edited

Loading

JackStouffer commented Mar 7, 2018 •

edited

Loading