Skip to content

Comments

Fix issue 16291 -- phobosinit fails to register encodings on individual tests#4744

Closed
schveiguy wants to merge 2 commits intodlang:masterfrom
schveiguy:fixcycles5
Closed

Fix issue 16291 -- phobosinit fails to register encodings on individual tests#4744
schveiguy wants to merge 2 commits intodlang:masterfrom
schveiguy:fixcycles5

Conversation

@schveiguy
Copy link
Member

std.encoding and std.process depend on an extern(C) binding to call their standalone static ctors. Doing it this way avoids cycles in the modules for unit tests.

However, my previous scheme of having std.internal.phobosinit import std.encoding allows the linker to simply discard that entire module since nothing depends on or calls it. It must be imported the other way, and must not import anything (to avoid cycles). I also had removed the import from std.process thinking it was unnecessary (incorrectly).

@dlang-bot
Copy link
Contributor

dlang-bot commented Aug 23, 2016

Fix Bugzilla Description
16291 phobosinit never gets called (EncodingScheme.create fails)

@dlang-bot
Copy link
Contributor

@schveiguy, thanks for your PR! By analyzing the annotation information on this pull request, we identified @JackStouffer, @kyllingstad and @andralex to be potential reviewers. @JackStouffer: The PR was automatically assigned to you, please reassign it if you were identified mistakenly.

(The DLang Bot is under development. If you experience any issues, please open an issue at its repo.)

std/encoding.d Outdated
// cycles.
extern(C) void std_encoding_shared_static_this()
{
foreach(scheme; [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this literal allocate?

Copy link
Member Author

@schveiguy schveiguy Aug 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it does. Is it important for it not to? I'll fix it back to a slew of calls. Would be nice to have a way to make an array at compile-time mid-expression for things like this.

@schveiguy
Copy link
Member Author

Awesome. I've uncovered some sort of dmd heisenbug that works just fine on my system. @WalterBright how do I debug that signal 11 from here?

@codecov-io
Copy link

codecov-io commented Aug 23, 2016

Current coverage is 88.78% (diff: 100%)

Merging #4744 into master will decrease coverage by <.01%

@@             master      #4744   diff @@
==========================================
  Files           121        121          
  Lines         74159      74152     -7   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
- Hits          65845      65838     -7   
  Misses         8314       8314          
  Partials          0          0          

Powered by Codecov. Last update 0538d0c...78d7c3b

@schveiguy
Copy link
Member Author

I misread the call, I thought it was the build, but it's actually the checkwhitespace.d program. I can reproduce the crash locally.

@andralex
Copy link
Member

Auto-merge toggled on

@andralex
Copy link
Member

Whatever it is, I trust you'll get to the bottom of it.

@schveiguy
Copy link
Member Author

HAHA I think somehow the linker/compiler is trimming out all the data in this file. If I insert this line, then it works:

auto ascii = new EncodingSchemeASCII;

That has to be a bug... I'm going to just put in the line above, to get this done. This whole file needs a rewrite anyway.

@dlang-bot
Copy link
Contributor

@schveiguy, thanks for your PR! By analyzing the annotation information on this pull request, we identified @JackStouffer, @complexmath and @kyllingstad to be potential reviewers. @JackStouffer: The PR was automatically assigned to you, please reassign it if you were identified mistakenly.

(The DLang Bot is under development. If you experience any issues, please open an issue at its repo.)

@schveiguy
Copy link
Member Author

Auto-merge toggled off

@schveiguy
Copy link
Member Author

Something is really really wrong on OSX for this fix...

@schveiguy
Copy link
Member Author

OK, I guess the failure is from the exe size growing too much. WTF...

@schveiguy
Copy link
Member Author

I'm like 90% convinced that nobody uses std.encoding.EncodingScheme. I think none of the schemes were registered before this PR, and clearly the EXE size grows considerably when they are included.

@JackStouffer
Copy link
Contributor

Dear christ, this is such a dirty hack. I would really rather not have code like this in Phobos; if I wasn't following the development of this, maintaining this would royally suck.

Maybe I'm overreacting, but if we can't get normal shared static this to work the way it's supposed to in the standard library then something is definitely wrong outside of the design of std.encoding.

@schveiguy
Copy link
Member Author

@JackStouffer this is an old old module. At this point, changing it to be better designed is kind of a lost cause.

The hack for cycles is fine for now. That's not the issue with this PR. Until we can somehow mark a static ctor as being manually-verified as standalone, it's impossible to deal with the issue without doing something like this -- the runtime has to run static ctors in the right order, and the only way to do this is sort them at startup. The compiler doesn't give enough info to do a complete perfect dependency sort, so we are stuck with module-level sorting. When there are module cycles, you have to refactor like this.

The issue with this PR is that on OSX (at least), the module info isn't properly being stored and/or read. I'm setting up a bug report right now.

@schveiguy
Copy link
Member Author

@MartinNowak
Copy link
Member

MartinNowak commented Oct 1, 2016

This won't fully solve the problem (and all the trouble for some stupid false cycle detection).
Without a static ctor there no module info ends up in a binary using std.encoding.
Without a module info none of the classes are referenced (and thus no encoding scheme present, except for the ASCII one you referenced).
Because the stupid EncodingScheme.register just takes a string instead of a class alias, it'll fail (now silently), to register most of the encodings, b/c literally noone uses EncodingSchemeUTF8 in his code.

With the current situation std.net.curl becomes unusable (b/c it instantiates whatever Transfer-Encoding it sees). Please add a new EncodingScheme.register(alias klass) overload and properly register all the existing schemes to fix the regression, fix should address stable branch now.

@MartinNowak
Copy link
Member

MartinNowak commented Oct 1, 2016

I'm like 90% convinced that nobody uses std.encoding.EncodingScheme. I think none of the schemes were registered before this PR, and clearly the EXE size grows considerably when they are included.

Wrong, anyone importing std.encoding (e.g. indirectly through std.net.curl), got all the encodings (and their binary sizes), and they are necessary for a working curl transfer decoding.

{
version (OSX)
{
import std.internal.phobosinit; // needed to make sure the function gets called
Copy link
Member

@MartinNowak MartinNowak Oct 1, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't reference the encoding scheme init from std.process, merging all constructors in phobosinit is a really bad idea.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that's why default hello world goes up in space so much. Ugly as it is, we need an individual internal module for each module that has cycles.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this relates to https://issues.dlang.org/show_bug.cgi?id=16580 as well.

@schveiguy schveiguy mentioned this pull request Oct 2, 2016
@schveiguy
Copy link
Member Author

Without a static ctor there no module info ends up in a binary using std.encoding.

Why is that? I think moduleInfo should be present if classinfo is used (at the very least). Otherwise, runtime reflection initialization doesn't work (Object.factory).

I agree that std.encoding should use compile-time reflection for initialization.

@MartinNowak
Copy link
Member

MartinNowak commented Oct 2, 2016

Why is that? I think moduleInfo should be present if classinfo is used (at the very least). Otherwise, runtime reflection initialization doesn't work (Object.factory).

We're running in circles dlang/dmd#6076 (comment), Object.factory does not work w/ static libraries and unreferenced classes. It's just by chance that static ctors drag in ModuleInfo which drag in all class references.
I don't see any reason why EncodingScheme.create/register uses Object.factory.
In fact given how slow Object.factory is, it's a really bad idea to rely on that for EncodingScheme.create.
So unless I'm overlooking a good reason, let's please deprecate this nonsense, fix the existing EncodingSchemes, and move on.

@schveiguy
Copy link
Member Author

Fixed in #4840

@schveiguy schveiguy closed this Oct 5, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants