Skip to content

Comments

Fix issue 16423 - ModuleInfo missing when linking to static lib with classes#6076

Closed
jacob-carlborg wants to merge 3 commits intodlang:masterfrom
jacob-carlborg:issue-16423
Closed

Fix issue 16423 - ModuleInfo missing when linking to static lib with classes#6076
jacob-carlborg wants to merge 3 commits intodlang:masterfrom
jacob-carlborg:issue-16423

Conversation

@jacob-carlborg
Copy link
Contributor

No description provided.

@dlang-bot
Copy link
Contributor

dlang-bot commented Aug 24, 2016

Fix Bugzilla Description
16423 ModuleInfo missing when linking to static lib with classes

@dlang-bot
Copy link
Contributor

@jacob-carlborg, thanks for your PR! By analyzing the annotation information on this pull request, we identified @MartinNowak, @yebblies and @WalterBright to be potential reviewers. @MartinNowak: The PR was automatically assigned to you, please reassign it if you were identified mistakenly.

(The DLang Bot is under development. If you experience any issues, please open an issue at its repo.)

@schveiguy
Copy link
Member

Awesome! Thanks for doing this so quickly

@jacob-carlborg
Copy link
Contributor Author

Hmm, not really sure how to solve the Windows test failure.

@schveiguy
Copy link
Member

It's possible Windows doesn't have the same problem. I have found several disturbing things while adding back correct cycle detection, and one of them was that Windows would find cycles that the other OSes did not. See: dlang/druntime#1602 (comment)

@jacob-carlborg
Copy link
Contributor Author

I'm guessing the linkers behave differently. Perhaps the linkers on Posix remove/merge duplicated symbols.


if (m->doppelganger)
{
objc_Module_genmoduleinfo_classes();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm following the logic correctly, the only difference between the before and after for your patch (assuming betterC flag is not set) is that this call is now omitted. Is that right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. The change is to always call genModuleInfo. objc_Module_genmoduleinfo_classes is called by genModuleInfo, I removed some of the duplications as well. Seems like the bigger if statement above didn't catch all cases where the module info needed to be generated. m->doppelganger is true if the -lib flag is passed. Before my change the function would return, if the -lib flag was passed, before the module info had been generated.

@andralex
Copy link
Member

Auto-merge toggled on

@jacob-carlborg
Copy link
Contributor Author

@andralex seems this doesn't work on Windows.

@andralex
Copy link
Member

@jacob-carlborg if you can fix this then the pull will be merged, if it's unsolvable on Windows please version code appropriately

@jacob-carlborg
Copy link
Contributor Author

Ok, thanks.

@WalterBright
Copy link
Member

Auto-merge toggled off

@WalterBright
Copy link
Member

Turned autotester off until tests pass, so it does not consume testing resources.

@jacob-carlborg
Copy link
Contributor Author

Ok, I have no idea what to do with Win32, it doesn't work with the new behavior or the old behavior.

@schveiguy
Copy link
Member

So you are saying the test on the current compiler doesn't put the class info into the module info? Is there an issue here where we are expecting a design that doesn't exist? If so, I need to solve the original problem in a different way.

@jacob-carlborg
Copy link
Contributor Author

jacob-carlborg commented Sep 8, 2016

So you are saying the test on the current compiler doesn't put the class info into the module info? Is there an issue here where we are expecting a design that doesn't exist? If so, I need to solve the original problem in a different way.

The current compiler doesn't work for your test case for any platform. With my patch your test case works on all platforms except Windows. On Windows, with my patch, there's duplicated symbols when building Phobos.

I tried to keep the old behavior on Windows, but then the new (your) test case doesn't work on Windows.

@schveiguy
Copy link
Member

OK, got it. This sucks. @WalterBright, @MartinNowak what is supposed to happen in terms of storage of moduleinfo with static library when all instantiations of the class are done either via Object.factory (or by doing a lookup of the classinfo by name to instantiate), or via extern(C) functions? See attached bug report. Am I expecting something that just isn't valid? This is related to dlang/phobos#4744 and std.encoding.

@jacob-carlborg
Copy link
Contributor Author

My patch was kind of a guess. I was able to "fix" this so quickly because I had run into the same problem but for Objective-C module info. So I just guessed that always generating the module info would fix the problem. It did, but caused other problems on Windows.

I think your use case is valid but I'm not sure if my patch is the correct fix. There's an instance variable, needmoduleinfo, in Module in the DMD source code. But it looks like that is not set in all cases where it should be set.

@MartinNowak
Copy link
Member

Well, this goes against much that we're trying to achieve, might result in huge binary increases, and tries to hack one purpose of static libraries and dmd's multilib (not dragging in everything).

It's a shame that this didn't raise any red flags and that the PR description didn't contain any information about the PR's effect and trade-offs.
Are we really supposed to read through the low-level diffs and a big bug report to even get an abstract understanding of what this tries to achieve and what its consequences are?
I actually find this misguiding and irresponsible to propose such a change w/o the slightest warning.

I mentioned a couple of workarounds in the Bugzilla report Issue 16423 – ModuleInfo missing when linking to static lib with classes, and would like to leave it at that.

@jacob-carlborg
Copy link
Contributor Author

Well, this goes against much that we're trying to achieve

No, I'm trying to fix a bug.

might result in huge binary increases

You're overreacting.

It's a shame that this didn't raise any red flags and that the PR description didn't contain any information about the PR's effect and trade-offs.
Are we really supposed to read through the low-level diffs and a big bug report to even get an abstract understanding of what this tries to achieve and what its consequences are?
I actually find this misguiding and irresponsible to propose such a change w/o the slightest warning.

You mean like you did with #4638 which contains no description at all. I rather try to fix something instead of your approach of trying to sneak by another (possible) language breaking change without anyone noticing to remove a feature instead of fixing it.

@schveiguy
Copy link
Member

Are we really supposed to read through the low-level diffs and a big bug report to even get an abstract understanding of what this tries to achieve and what its consequences are?

Um yes. The bug report explains everything. What is difficult to understand? I'm using obj.classinfo, but it's not being included in moduleinfo (in fact, module info is not included at all).

I actually find this misguiding and irresponsible

I think this is way over the top, and does not foster any goodwill towards would-be contributors.

@MartinNowak
Copy link
Member

MartinNowak commented Sep 22, 2016

I actually find this misguiding and irresponsible

I think this is way over the top, and does not foster any goodwill towards would-be contributors.

Yes sorry, that was a bit harsh.
This still creates a lot of problems and knowing that binary sizes of D programs are a problem is a commonplace.

You mean like you did with #4638 which contains no description at all.

Sure, I haven't always been as stringent w/ good PRs in the past.

might result in huge binary increases

You're overreacting.

Well, the references ModuleInfo -> imported ModuleInfo -> classes -> data&functions are one of the main reasons for fat binaries.

@MartinNowak
Copy link
Member

Are we really supposed to read through the low-level diffs and a big bug report to even get an abstract understanding of what this tries to achieve and what its consequences are?
Um yes. The bug report explains everything. What is difficult to understand? I'm using obj.classinfo, but it's not being included in moduleinfo (in fact, module info is not included at all).

The example is fairly complex, which is OK for a bug report if necessary, still takes a couple of minutes to grasp.
But there is only vague information about the actual cause and how this PR fixes it, neither in the bug report nor in this PR.

@schveiguy
Copy link
Member

schveiguy commented Sep 22, 2016

Unfortunately, it is necessary. You need to use the -lib compiler flag, which means a simple setup is difficult.

The original test case is Phobos, which is compiled similarly.

I think the idea behind the PR is to include moduleinfo whenever classes are present. Yes, this can lead to fat binaries, but how does one expect Object.factory to work otherwise?

Note that the current std.encoding module requires this functionality to register 3rd party encodings (It doesn't really need to, but difficult to change now).

@jacob-carlborg
Copy link
Contributor Author

The example is fairly complex, which is OK for a bug report if necessary, still takes a couple of minutes to grasp.

It's quite simple. Compile a module with the -lib, then try to access the class info for a class from that module. Of course, it's not possible to run a library so an executable is necessary as well.

But there is only vague information about the actual cause

The cause is that the compiler doesn't output module info/class info when it should do so when the -lib flag is passed. i.e. it has different behavior when the -lib flag is passed and when it's not.

and how this PR fixes it, neither in the bug report nor in this PR.

The fix is to always output the module info.

@MartinNowak
Copy link
Member

I think the idea behind the PR is to include moduleinfo whenever classes are present. Yes, this can lead to fat binaries, but how does one expect Object.factory to work otherwise?

I've given a few examples on the Bugzilla page, you need to include the ModuleInfo yourself.
The simplest ways to achieve that are to not use multilibs (-lib) or the -whole-archive linker switch.
Not including unused modules/symbols is the purpose of static libraries in general (and -lib multilibs) in particular.

But there is only vague information about the actual cause
The cause is that the compiler doesn't output module info/class info when it should do so when the -lib flag is passed. i.e. it has different behavior when the -lib flag is passed and when it's not.

That's not correct, the ModuleInfo containing all localClasses is generated with the first object for that module. But because it's not referenced, it doesn't end up in the executable.

and how this PR fixes it, neither in the bug report nor in this PR.
The fix is to always output the module info.

But you're not outputting the ModuleInfo, you're generating additional fake ModuleInfos for multilib doppelgänger modules. Just try your patch, you'll find a new module library.1 containing a single entry in localClasses.
If you want to drag in the ModuleInfo with a class, then the multilib object containing the class definition should reference the ModuleInfo. This is done for modules w/ ctors/dtors ¹ and a major cause for bloated binaries, b/c it also drags in all other classes of that module (hence #4638).

Just to summarize the state.

  • ModuleInfos are only emitted for sorting constructors/destructors.
  • We could add a separate section to collect references to all classes that are part of an executable. Thus making Object.factory independent of module constructors. This is somwhat similar to what you're doing w/ this hacky PR that emits one fake ModuleInfo for each class.
  • There is no way the compiler can drag in separately compiled modules or classes. Also a sound implementation of static libraries (when compiling multiple modules) w/ -lib will only drag in what's actually used. This heavily limits the usefulness of Object.factory w/ static libraries, please accept this fact instead of trying to hack around it.
    If you want to use unreferenced symbols, you can either tell your linker, use plain object files, or shared libraries. DMD even allows you to compile a whole library to a single object file including everything.

I don't think that the fake ModuleInfos library.1 added by this PR are good for anything.
They also don't solve the problem as might have been intended (or maybe you didn't fully understand what your PR does).
While the fake ModuleInfos add the ability to find the class via iterating all ModuleInfos, it also allows to instantiate them via ClassInfo.find("library.1.C").
The fix for ClassInfo.find("library.C") issue is just a side-effect of the fake ModuleInfos referencing the actual ModuleInfo. As said above, this could be done directly, but drags in the whole module and all of it's classes, thus isn't acceptable for static libraries w/o #4638.

@andralex
Copy link
Member

andralex commented Oct 2, 2016

I think we should be fine with requiring people who want to use factory to use e.g. -Xlinker --whole-archive lib1.a lib2.a ... -Xlinker --no-whole-archive on Linux, -Xlinker -force_load lib1.a -Xlinker -force_load lib2.a etc on OSX, and the new /WHOLEARCHIVE on Visual Studio. These idioms are commonly required by C++ object factories, serialization engines etc. and there isn't pressure for integrating these solutions into the language.

@jacob-carlborg
Copy link
Contributor Author

I don't think that the fake ModuleInfos library.1 added by this PR are good for anything.

What are these fake module infos you're talking about?

@jacob-carlborg
Copy link
Contributor Author

@MartinNowak I think there's a similar problem in std.process [1] as well. It's broken on macOS because module constructors are not run if compiled with -lib and the module is not referenced.

[1] https://issues.dlang.org/show_bug.cgi?id=16580

@MartinNowak
Copy link
Member

What are these fake module infos you're talking about?

Just calling genModuleInfo for doppelgänger modules creates stripped down doppelgänger ModuleInfos, instead of referencing the original ones. Try foreach (m; ModuleInfo) writeln(m.name);, you'll see library.1, those don't fix Object.factory. I assume the real moduleinfo, which would need to be referenced to properly fix this, gets dragged in as a side effect.
This is already somewhat broken in your Objective-C "fix".

Thx for your good wrap-up Andrei.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants