[REG 2.067.0] Issue 14431 - huge slowdown of compilation speed by 9rnsr · Pull Request #4784 · dlang/dmd

9rnsr · 2015-06-28T14:53:15Z

https://issues.dlang.org/show_bug.cgi?id=14431

In the pull request #4384, all instance has been changed to invoke
semantic3(). It was for the link-failure issue in specific case, but it
was too excessive.

Semantic analysis strategy for template instances:
We cannot determine which instance does not need to be placed in object
file until semantic analysis completed. Therefore, for all templates
instantiated in root module, compiler should invoke their semantic3 --
regardless of whether those are also instantiated in non-root module. If
a template is only instantiated in non-root module, we can elide its
semantic3 (and for the compilation speed we should do that).
Code generation storategy for template instances:
If a template is instantiated in non-root module, compiler usually does
not have to put it in object file. But if a template is instantiated in
both of root and non-root modules which mutually import each other, it
needs to placed in objfile.

WalterBright · 2015-06-28T18:02:44Z

Does this replace #4626 ?

9rnsr · 2015-06-28T18:19:31Z

@WalterBright I'm not yet sure it's possible.

9rnsr · 2015-06-29T09:58:57Z

OK, auto-tester is now all green.

9rnsr · 2015-06-29T10:17:04Z

Does this replace #4626 ?

Now, this actually replaces it.

WalterBright · 2015-06-29T22:14:16Z

We cannot determine which instance does not need to be placed in object
file until semantic analysis completed.

Why? If the semantic analysis produces different results, there's a bug somewhere.

Therefore, for all templates instantiated in root module, compiler should invoke their semantic3 --
regardless of whether those are also instantiated in non-root module.

I'm very unhappy with legitimizing the idea that this could produce different instantiations. The whole point is they should be the same.

if a template is instantiated in both of root and non-root modules which mutually import each other, it
needs to placed in objfile.

This will simply kill compilation speed, as nobody seems to be able to resist having every module import every other module (Phobos is a prime example). I don't see how this PR fixes compile speed issues. I prefer the much more straightforward #4780

9rnsr · 2015-06-29T23:34:48Z

Why? If the semantic analysis produces different results, there's a bug somewhere.

I'm very unhappy with legitimizing the idea that this could produce different instantiations. The whole point is they should be the same.

In D2, we have attribute inference and it changes instantiated function mangled names. So, to get the exact lined symbol names, we need to run semantic3 on such the instances.

This will simply kill compilation speed, as nobody seems to be able to resist having every module import every other module (Phobos is a prime example).

??? It's not a problem with the normal Phobos usage. For example, when you compile a "hello world" program, your code (root module) actually imports Phobos, but Phobos code (non-root module) does not import your program.

I don't see how this PR fixes compile speed issues.

This PR stops excessive semantic3 for the templates instantiated in non-root module. It will restore the compilation speed up to the equivalent to the 2.065.

In my Windows 7 PC (64bit Core i7):
Compiled program: import std.stdio; void main() { writeln("Hello world"); }
Measurement by: PS C:\d> Measure-Command {dmd.exe test.d}

With 2.065: 800msec
With 2.066: 1.1sec
With 2.067: 1.4sec
With this PR: 800msec

9rnsr · 2015-06-29T23:40:40Z

And, the excessive semantic3 had caused:

redundant symbol emission (Issue 14508 - [REG2.067.0] compiling with -unittest instantiates templates in non-root modules)
unnecessary -property check in Phobos code (Issue 14564 - [REG2.067] dmd -property -unittest combination causes compiler error)

This PR will also fix those regression issues.

WalterBright · 2015-06-30T00:49:24Z

In D2, we have attribute inference and it changes instantiated function mangled names. So, to get the exact lined symbol names, we need to run semantic3 on such the instances.

There must never be a case where template function mangled names are the same but the instantiated bodies are different. This includes any attribute inference. If this happens, the whole compilation model will fall apart.

For example, when you compile a "hello world" program, your code (root module) actually imports Phobos, but Phobos code (non-root module) does not import your program.

True, but every Phobos module imports every other Phobos module, and a user's project's modules will also typically import every other project module. Using separate compilation and heavy template use (which is becoming the norm) will become impractical.

Measurement by:

Thank you for the good measurements. But I don't feel this is representative for projects outlined in the previous paragraph.

redundant symbol emission (Issue 14508 - [REG2.067.0] compiling with -unittest instantiates templates in non-root modules)

Corrected as well with my simple PR #4780

unnecessary -property check in Phobos code (Issue 14564 - [REG2.067] dmd -property -unittest combination causes compiler error)

Hmm, that might be better addressed with -property fixes.

JinShil · 2015-06-30T00:53:33Z

Hmm, that might be better addressed with -property fixes.

@WalterBright #2305 would benefit from your feedback, and it can be rebased or modified per your input.

9rnsr · 2015-06-30T11:18:29Z

There must never be a case where template function mangled names are the same but the instantiated bodies are different. ...

You're misunderstanding my argument. If a template is instantiated in root module, we cannot skip its semantic3 until found the same instance in non-root. And in worst case the determination will be deferred until the global semantic3 stage completed, because a function defined in root module may import other modules from the function body.

(I noticed that, if the non-root instantiation found quickly, that would be able to stop being unneeded semantic3 sometimes. I think it would be a possible future optimization for the compilation speed.)

True, but every Phobos module imports every other Phobos module, and a user's project's modules will also typically import every other project module. Using separate compilation and heavy template use (which is becoming the norm) will become impractical.

I noticed that #4780 would be disaster, Read my comment.

Corrected as well with my simple PR #4780

"simple solution" does not guarantee its correctness, especially for complex problems.

9rnsr · 2015-07-16T13:32:59Z

Refactor code: Add TemplateInstance::appendToModuleMember().

MartinNowak · 2015-07-17T06:06:52Z

Could someone draw a simple overview of all the cases to explain where the code of a template instance ends up.
It seems that we're adding more and more complexity, overfitting the code to every corner case instead of coming up with a general solution.
As the regression was already introduced in 2.067.0 we might take another release to solve this properly.

MartinNowak · 2015-07-17T06:09:34Z

src/template.c

What does that mean? If it has a name please abstract it to a function.

The first instantiation inst was not instantiated in root module (inst->minst == NULL means it was speculative, and inst->minst->isRoot() == false means it was in non-root module), but this is instantiated in root module.

A simple bool isSpeculative() { return minst == NULL; } would help.

So this selects the case where this is non-speculative and root, but the same instance inst was done in speculative or non-root context. Why would you want non-root instances here?

If inst was done in speculative, it's already inserted in root module members. So the second time insertion is necessary only for non-root instances.

A simple bool isSpeculative() { return minst == NULL; } would help.

Unfortunately we already have Dsymbol::isSpeculative() and its meaning is not same with the condition tested in here. I'd really want to determine good name, but it's not discovered yet.

If inst was done in speculative, it's already inserted in root module members.

Why is that???

If a template is instantiated in speculative context, the context should handle all errors during the instantiation - it's including the errors in function body.

void foo(T)(T t) { auto x = undefined_name; } static assert(!is(typeof(foo(1)))); // the undefined error should be captured in here.

For that, all of speculative instantiations will be handled as root instances, and then, their semantic3 pass will surely get called.

That's actually very simple and roughly means if (this->needsCodegen && !inst->needsCodegen).

Yes, that's right.

9rnsr · 2015-07-17T14:09:19Z

OK, I succeeded fixing Win64 failure. All green.

WalterBright · 2015-07-17T21:13:55Z

It seems that we're adding more and more complexity, overfitting the code to every corner case instead of coming up with a general solution. As the regression was already introduced in 2.067.0 we might take another release to solve this properly.

This is why I introduced the alternate fix, #4780

It is most appropriate for 2.068 at least, as it will enable us to get it out the door and has known behavior.

9rnsr · 2015-07-19T10:17:12Z

Could someone draw a simple overview of all the cases to explain where the code of a template instance ends up.

I wrote a wiki page for the explanation.
http://wiki.dlang.org/Template_Instantiation_Strategy

WalterBright · 2015-07-19T23:42:16Z

@9rnsr thank you for writing up the explanation. (It belongs in the source code, not the wiki.)

MartinNowak · 2015-07-21T10:30:13Z

src/template.c

Why swapping the contexts and appending inst instead of simply continuing with this which should already be member of a root module?

It's a bad idea for debugging and understanding to change minst/tinst, just like you wouldn't want to change the scope of a symbol.

We need the second insertion at most only once. By the swap, inst is changed to a root instance, then the condition at line 5961 no longer hit. This is for the calculation amount reduction.

And, minst and tinst do not affect semantic analysis results. So the swap is not dangerous.

I still don't get it.
The whole template compilation/codegen pipeline is already complicated enough. We should avoid additional tricks, even more so if they are undocumented.

I updated to make the comment detailed more. Is that enough?

A poor solution is adding one bool flag bool TemplateInstance::isAlreadyInstantiatedInRoot; and test&check it here. But it will increment the size of TemplateInstance object. As far as possible I'd like to save memory footprint of dmd.

I know the holes between bool variables, but it's unrelated refactoring, and should not go for the release.

Why you require such the redundant bool? Current "swap" is enough smart and efficient. I have no reason to degrade code quality.

This is a really good argument to split TemplateInstantiation and TemplateInstance. This double usage had always confused me and likely eats up a lot of memory as well.
#4780 (comment)

So the point of this is that all identical TemplateInstances delegate to the first TemplateInstance (findExistingInstance) to avoid duplicate semantic. But if the first (primary) instance was done in speculative context, then no code will be generated. Therefore tinst and minst are swapped to make the primary instance a normal root instance.
It's still confusing that the scopes of this and inst won't match their tinst and minst variables.
It would be nice to move this into a shallowSwap function and to swap all fields that deal with the instantiation but not the instance, e.g. swap scope but not members.

Will the fact that a parent was a speculative template instance affect the semantic of members?
It should only change the error reporting, but maybe I'm overlooking something here.

It's still confusing that the scopes of this and inst won't match their tinst and minst variables.

The scope of instantiated position, it's the Scope*sc parameter of TemplateInstance::semantic() function, is not saved anywhere. Therefore the mismatch of tinst/minst won't leak out of the function. And in TemplateInstance::semantic2() and semantic3(), the given Scope *sc parameter is just ignored.

Will the fact that a parent was a speculative template instance affect the semantic of members?

Never.

MartinNowak · 2015-07-21T10:34:45Z

In D2, we have attribute inference and it changes instantiated function mangled names. So, to get the exact lined symbol names, we need to run semantic3 on such the instances.

Since when does inference affect mangling? I thought we'd avoided that b/c of ABI stability and separate compilation concerns.

9rnsr · 2015-07-21T12:09:11Z

Since when does inference affect mangling? I thought we'd avoided that b/c of ABI stability and separate compilation concerns.

Since the first (inference modifies FuncDeclaration::type, then it will be a part of function symbol). In my recognition, typesafe linking is priority than the compilation speed currently.

MartinNowak · 2015-07-22T10:34:45Z

src/template.c

You should comment why you're doing something, what you're doing is fairly obvious.

9rnsr · 2015-07-25T03:01:14Z

@MartinNowak @WalterBright Do you have any other objections about this PR? Even if you don't want to go this change in 2.068 release, we can fix the issue in master branch...

WalterBright · 2015-07-28T04:00:15Z

The algorithm is very difficult to understand. I don't know how to explain it to someone else. I worry that if the only person it is understandable to is Kenji, then nobody else can maintain the compiler. This is not practical for a team organization.

Martin, do you understand the algorithm?
The wiki, which refers to compiler internals, does not belong on the wiki, it belongs in the compiler source code.
The wiki should have an explanation for how it works that is intended for the D user, i.e. not in terms of compiler internals.

9rnsr · 2015-07-28T06:05:50Z

@WalterBright I placed the text in wiki because it was convenient. I'll move it in dmd repo if you hope that (e.g. dmd/doc/note.txt?).

9rnsr · 2015-07-30T12:28:11Z

I worry that if the only person it is understandable to is Kenji, then nobody else can maintain the compiler.

I recall that David Nadlinger has fixed a regression issue 13478 in 2.066.1 devel, which was the bug in the template instantiation code.

@klickverbot Could you please join the review of this PR?

MartinNowak · 2015-08-12T07:22:33Z

src/template.c

Again why this change? It seems unrelated to the fix.

This is not necessary to fix the regression, but it will increase stability for the position where the instance will be inserted. For example:

a.d

import ...; template X() { ... } X!()

b.d

import a, ...; X!()

With the command line dmd -c a.d b.d:

If this adjustment does not exist, the instance X!() will be inserted to the member of either a or b. The actual position will depend on the semantic order.

If this adjustment exists, the instance X!() will be always inserted to the member of a.

By that, the X!() code will be always stored into a.obj, in that case. I think the stable algorithm result is better than unstable, if there's not so much performance difference.

MartinNowak · 2015-08-12T07:59:07Z

In D2, we have attribute inference and it changes instantiated function mangled names. So, to get the exact lined symbol names, we need to run semantic3 on such the instances.

I still don't understand how semantic3 for correct mangling is necessary to find existing instances.

9rnsr · 2015-08-12T08:33:56Z

I still don't understand how semantic3 for correct mangling is necessary to find existing instances.

Perhaps we can elide semantic3 call from some root instances. But currently semantic3 is always called on all root instances. We cannot improve it yet.

WalterBright · 2015-08-12T09:44:40Z

I still don't understand how semantic3 for correct mangling is necessary to find existing instances.

It isn't necessary. A template's arguments completely determine it. If the arguments match the arguments for existing instances, then it MUST MATCH and running semantic3 is not necessary.

In the pull request dlang#4384, all instance has been changed to invoke semantic3(). It was for the link-failure issue in specific case, but it was too excessive. 1. Semantic analysis strategy for template instances: We cannot determine which instance does not need to be placed in object file until semantic analysis completed. Therefore, for all templates instantiated in root module, compiler should invoke their semantic3 -- regardless of whether those are also instantiated in non-root module. If a template is _only_ instantiated in non-root module, we can elide its semantic3 (and for the compilation speed we should do that). 2. Code generation strategy for template instances: If a template is instantiated in non-root module, compiler usually does not have to put it in object file. But if a template is instantiated in both of root and non-root modules which mutually import each other, it needs to placed in objfile.

…th separate compilation

9rnsr · 2015-08-13T10:18:16Z

I added test case to show that the issue 14901 will be actually fixed.

MartinNowak · 2015-08-23T13:26:25Z

I know it's a complex change but all of the fixes here make sense and solve a few big regressions.
Let's please go on with this and work on cleaning up the template implementation later.

MartinNowak · 2015-08-23T13:28:40Z

It isn't necessary. A template's arguments completely determine it. If the arguments match the arguments for existing instances, then it MUST MATCH and running semantic3 is not necessary.

I agree, but changing this isn't actually part of this PR.

MartinNowak · 2015-08-23T13:29:47Z

Can you please rebase+reopen this against stable. Please do so soon, so that we can rigorously test this before 2.068.1.

9rnsr · 2015-08-24T03:40:56Z

Reopened against stable branch. See #4944.

9rnsr force-pushed the fix14431 branch from e1cae3b to 44c8f8a Compare June 28, 2015 17:56

9rnsr force-pushed the fix14431 branch from 44c8f8a to 03dd4d9 Compare June 28, 2015 18:18

9rnsr force-pushed the fix14431 branch 4 times, most recently from ad7e8b7 to 0a3e1c5 Compare June 29, 2015 09:50

9rnsr mentioned this pull request Jun 29, 2015

[REG2.067] Issue 14508 - compiling with -unittest instantiates templates in non-root modules #4626

Closed

9rnsr force-pushed the fix14431 branch from 0a3e1c5 to fdae687 Compare June 29, 2015 10:09

9rnsr force-pushed the fix14431 branch from fdae687 to f7632b9 Compare July 6, 2015 11:54

9rnsr force-pushed the fix14431 branch from f7632b9 to 1d49137 Compare July 16, 2015 13:21

MartinNowak reviewed Jul 17, 2015
View reviewed changes

9rnsr force-pushed the fix14431 branch from 1d49137 to 06b2392 Compare July 17, 2015 13:27

9rnsr force-pushed the fix14431 branch from 06b2392 to 039d45c Compare July 19, 2015 12:00

MartinNowak reviewed Jul 21, 2015
View reviewed changes

MartinNowak reviewed Jul 22, 2015
View reviewed changes

src/template.c Outdated

Copy link

Member

MartinNowak Jul 22, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should comment why you're doing something, what you're doing is fairly obvious.

9rnsr force-pushed the fix14431 branch 2 times, most recently from 469b816 to 77e8079 Compare July 22, 2015 13:29

This was referenced Aug 5, 2015

fix Issue 14828 - duplicate symbol __ModuleInfoZ #4858

Merged

Generate TypeInfo objects as in template instances #3958

Closed

9rnsr force-pushed the fix14431 branch from 77e8079 to 0a3823b Compare August 10, 2015 17:11

MartinNowak reviewed Aug 12, 2015
View reviewed changes

9rnsr force-pushed the fix14431 branch from 0a3823b to 739ba6b Compare August 12, 2015 09:08

9rnsr added 2 commits August 13, 2015 18:47

fix Issue 14901 - template static shared this() run multiple times wi…

7e6c685

…th separate compilation

9rnsr force-pushed the fix14431 branch from 739ba6b to 7e6c685 Compare August 13, 2015 10:06

9rnsr closed this Aug 24, 2015

9rnsr mentioned this pull request Aug 24, 2015

[REG 2.067.0] Issue 14431 - huge slowdown of compilation speed #4944

Merged

Uh oh!

Comments

Conversation

9rnsr commented Jun 28, 2015

Uh oh!

WalterBright commented Jun 28, 2015

Uh oh!

9rnsr commented Jun 28, 2015

Uh oh!

9rnsr commented Jun 29, 2015

Uh oh!

9rnsr commented Jun 29, 2015

Uh oh!

WalterBright commented Jun 29, 2015

Uh oh!

9rnsr commented Jun 29, 2015

Uh oh!

9rnsr commented Jun 29, 2015

Uh oh!

WalterBright commented Jun 30, 2015

Uh oh!

JinShil commented Jun 30, 2015

Uh oh!

9rnsr commented Jun 30, 2015

Uh oh!

9rnsr commented Jul 16, 2015

Uh oh!

MartinNowak commented Jul 17, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

9rnsr commented Jul 17, 2015

Uh oh!

WalterBright commented Jul 17, 2015

Uh oh!

9rnsr commented Jul 19, 2015

Uh oh!

WalterBright commented Jul 19, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MartinNowak commented Jul 21, 2015

Uh oh!

9rnsr commented Jul 21, 2015