Skip to content

Comments

[REG 2.067.0] Issue 14431 - huge slowdown of compilation speed#4784

Closed
9rnsr wants to merge 2 commits intodlang:masterfrom
9rnsr:fix14431
Closed

[REG 2.067.0] Issue 14431 - huge slowdown of compilation speed#4784
9rnsr wants to merge 2 commits intodlang:masterfrom
9rnsr:fix14431

Conversation

@9rnsr
Copy link
Contributor

@9rnsr 9rnsr commented Jun 28, 2015

https://issues.dlang.org/show_bug.cgi?id=14431

In the pull request #4384, all instance has been changed to invoke
semantic3(). It was for the link-failure issue in specific case, but it
was too excessive.

  1. Semantic analysis strategy for template instances:
    We cannot determine which instance does not need to be placed in object
    file until semantic analysis completed. Therefore, for all templates
    instantiated in root module, compiler should invoke their semantic3 --
    regardless of whether those are also instantiated in non-root module. If
    a template is only instantiated in non-root module, we can elide its
    semantic3 (and for the compilation speed we should do that).
  2. Code generation storategy for template instances:
    If a template is instantiated in non-root module, compiler usually does
    not have to put it in object file. But if a template is instantiated in
    both of root and non-root modules which mutually import each other, it
    needs to placed in objfile.

@WalterBright
Copy link
Member

Does this replace #4626 ?

@9rnsr
Copy link
Contributor Author

9rnsr commented Jun 28, 2015

@WalterBright I'm not yet sure it's possible.

@9rnsr
Copy link
Contributor Author

9rnsr commented Jun 29, 2015

OK, auto-tester is now all green.

@9rnsr
Copy link
Contributor Author

9rnsr commented Jun 29, 2015

Does this replace #4626 ?

Now, this actually replaces it.

@WalterBright
Copy link
Member

We cannot determine which instance does not need to be placed in object
file until semantic analysis completed.

Why? If the semantic analysis produces different results, there's a bug somewhere.

Therefore, for all templates instantiated in root module, compiler should invoke their semantic3 --
regardless of whether those are also instantiated in non-root module.

I'm very unhappy with legitimizing the idea that this could produce different instantiations. The whole point is they should be the same.

if a template is instantiated in both of root and non-root modules which mutually import each other, it
needs to placed in objfile.

This will simply kill compilation speed, as nobody seems to be able to resist having every module import every other module (Phobos is a prime example). I don't see how this PR fixes compile speed issues. I prefer the much more straightforward #4780

@9rnsr
Copy link
Contributor Author

9rnsr commented Jun 29, 2015

Why? If the semantic analysis produces different results, there's a bug somewhere.

I'm very unhappy with legitimizing the idea that this could produce different instantiations. The whole point is they should be the same.

In D2, we have attribute inference and it changes instantiated function mangled names. So, to get the exact lined symbol names, we need to run semantic3 on such the instances.

This will simply kill compilation speed, as nobody seems to be able to resist having every module import every other module (Phobos is a prime example).

??? It's not a problem with the normal Phobos usage. For example, when you compile a "hello world" program, your code (root module) actually imports Phobos, but Phobos code (non-root module) does not import your program.

I don't see how this PR fixes compile speed issues.

This PR stops excessive semantic3 for the templates instantiated in non-root module. It will restore the compilation speed up to the equivalent to the 2.065.


In my Windows 7 PC (64bit Core i7):
Compiled program: import std.stdio; void main() { writeln("Hello world"); }
Measurement by: PS C:\d> Measure-Command {dmd.exe test.d}

With 2.065: 800msec
With 2.066: 1.1sec
With 2.067: 1.4sec
With this PR: 800msec

@9rnsr
Copy link
Contributor Author

9rnsr commented Jun 29, 2015

And, the excessive semantic3 had caused:

  • redundant symbol emission (Issue 14508 - [REG2.067.0] compiling with -unittest instantiates templates in non-root modules)
  • unnecessary -property check in Phobos code (Issue 14564 - [REG2.067] dmd -property -unittest combination causes compiler error)

This PR will also fix those regression issues.

@WalterBright
Copy link
Member

In D2, we have attribute inference and it changes instantiated function mangled names. So, to get the exact lined symbol names, we need to run semantic3 on such the instances.

There must never be a case where template function mangled names are the same but the instantiated bodies are different. This includes any attribute inference. If this happens, the whole compilation model will fall apart.

For example, when you compile a "hello world" program, your code (root module) actually imports Phobos, but Phobos code (non-root module) does not import your program.

True, but every Phobos module imports every other Phobos module, and a user's project's modules will also typically import every other project module. Using separate compilation and heavy template use (which is becoming the norm) will become impractical.

Measurement by:

Thank you for the good measurements. But I don't feel this is representative for projects outlined in the previous paragraph.

redundant symbol emission (Issue 14508 - [REG2.067.0] compiling with -unittest instantiates templates in non-root modules)

Corrected as well with my simple PR #4780

unnecessary -property check in Phobos code (Issue 14564 - [REG2.067] dmd -property -unittest combination causes compiler error)

Hmm, that might be better addressed with -property fixes.

@JinShil
Copy link
Contributor

JinShil commented Jun 30, 2015

Hmm, that might be better addressed with -property fixes.

@WalterBright #2305 would benefit from your feedback, and it can be rebased or modified per your input.

@9rnsr
Copy link
Contributor Author

9rnsr commented Jun 30, 2015

There must never be a case where template function mangled names are the same but the instantiated bodies are different. ...

You're misunderstanding my argument. If a template is instantiated in root module, we cannot skip its semantic3 until found the same instance in non-root. And in worst case the determination will be deferred until the global semantic3 stage completed, because a function defined in root module may import other modules from the function body.

(I noticed that, if the non-root instantiation found quickly, that would be able to stop being unneeded semantic3 sometimes. I think it would be a possible future optimization for the compilation speed.)

True, but every Phobos module imports every other Phobos module, and a user's project's modules will also typically import every other project module. Using separate compilation and heavy template use (which is becoming the norm) will become impractical.

I noticed that #4780 would be disaster, Read my comment.

Corrected as well with my simple PR #4780

"simple solution" does not guarantee its correctness, especially for complex problems.

@9rnsr
Copy link
Contributor Author

9rnsr commented Jul 16, 2015

Refactor code: Add TemplateInstance::appendToModuleMember().

@MartinNowak
Copy link
Member

Could someone draw a simple overview of all the cases to explain where the code of a template instance ends up.
It seems that we're adding more and more complexity, overfitting the code to every corner case instead of coming up with a general solution.
As the regression was already introduced in 2.067.0 we might take another release to solve this properly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does that mean? If it has a name please abstract it to a function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first instantiation inst was not instantiated in root module (inst->minst == NULL means it was speculative, and inst->minst->isRoot() == false means it was in non-root module), but this is instantiated in root module.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simple bool isSpeculative() { return minst == NULL; } would help.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this selects the case where this is non-speculative and root, but the same instance inst was done in speculative or non-root context. Why would you want non-root instances here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If inst was done in speculative, it's already inserted in root module members. So the second time insertion is necessary only for non-root instances.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simple bool isSpeculative() { return minst == NULL; } would help.

Unfortunately we already have Dsymbol::isSpeculative() and its meaning is not same with the condition tested in here. I'd really want to determine good name, but it's not discovered yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If inst was done in speculative, it's already inserted in root module members.

Why is that???

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a template is instantiated in speculative context, the context should handle all errors during the instantiation - it's including the errors in function body.

void foo(T)(T t) { auto x = undefined_name; }
static assert(!is(typeof(foo(1))));   // the undefined error should be captured in here.

For that, all of speculative instantiations will be handled as root instances, and then, their semantic3 pass will surely get called.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's actually very simple and roughly means if (this->needsCodegen && !inst->needsCodegen).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right.

@9rnsr
Copy link
Contributor Author

9rnsr commented Jul 17, 2015

OK, I succeeded fixing Win64 failure. All green.

@WalterBright
Copy link
Member

It seems that we're adding more and more complexity, overfitting the code to every corner case instead of coming up with a general solution. As the regression was already introduced in 2.067.0 we might take another release to solve this properly.

This is why I introduced the alternate fix, #4780

It is most appropriate for 2.068 at least, as it will enable us to get it out the door and has known behavior.

@9rnsr
Copy link
Contributor Author

9rnsr commented Jul 19, 2015

Could someone draw a simple overview of all the cases to explain where the code of a template instance ends up.

I wrote a wiki page for the explanation.
http://wiki.dlang.org/Template_Instantiation_Strategy

@WalterBright
Copy link
Member

@9rnsr thank you for writing up the explanation. (It belongs in the source code, not the wiki.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why swapping the contexts and appending inst instead of simply continuing with this which should already be member of a root module?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bad idea for debugging and understanding to change minst/tinst, just like you wouldn't want to change the scope of a symbol.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the second insertion at most only once. By the swap, inst is changed to a root instance, then the condition at line 5961 no longer hit. This is for the calculation amount reduction.

And, minst and tinst do not affect semantic analysis results. So the swap is not dangerous.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't get it.
The whole template compilation/codegen pipeline is already complicated enough. We should avoid additional tricks, even more so if they are undocumented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated to make the comment detailed more. Is that enough?

A poor solution is adding one bool flag bool TemplateInstance::isAlreadyInstantiatedInRoot; and test&check it here. But it will increment the size of TemplateInstance object. As far as possible I'd like to save memory footprint of dmd.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the holes between bool variables, but it's unrelated refactoring, and should not go for the release.

Why you require such the redundant bool? Current "swap" is enough smart and efficient. I have no reason to degrade code quality.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really good argument to split TemplateInstantiation and TemplateInstance. This double usage had always confused me and likely eats up a lot of memory as well.
#4780 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the point of this is that all identical TemplateInstances delegate to the first TemplateInstance (findExistingInstance) to avoid duplicate semantic. But if the first (primary) instance was done in speculative context, then no code will be generated. Therefore tinst and minst are swapped to make the primary instance a normal root instance.
It's still confusing that the scopes of this and inst won't match their tinst and minst variables.
It would be nice to move this into a shallowSwap function and to swap all fields that deal with the instantiation but not the instance, e.g. swap scope but not members.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the fact that a parent was a speculative template instance affect the semantic of members?
It should only change the error reporting, but maybe I'm overlooking something here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still confusing that the scopes of this and inst won't match their tinst and minst variables.

The scope of instantiated position, it's the Scope*sc parameter of TemplateInstance::semantic() function, is not saved anywhere. Therefore the mismatch of tinst/minst won't leak out of the function. And in TemplateInstance::semantic2() and semantic3(), the given Scope *sc parameter is just ignored.

Will the fact that a parent was a speculative template instance affect the semantic of members?

Never.

@MartinNowak
Copy link
Member

In D2, we have attribute inference and it changes instantiated function mangled names. So, to get the exact lined symbol names, we need to run semantic3 on such the instances.

Since when does inference affect mangling? I thought we'd avoided that b/c of ABI stability and separate compilation concerns.

@9rnsr
Copy link
Contributor Author

9rnsr commented Jul 21, 2015

Since when does inference affect mangling? I thought we'd avoided that b/c of ABI stability and separate compilation concerns.

Since the first (inference modifies FuncDeclaration::type, then it will be a part of function symbol). In my recognition, typesafe linking is priority than the compilation speed currently.

src/template.c Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should comment why you're doing something, what you're doing is fairly obvious.

@9rnsr 9rnsr force-pushed the fix14431 branch 2 times, most recently from 469b816 to 77e8079 Compare July 22, 2015 13:29
@9rnsr
Copy link
Contributor Author

9rnsr commented Jul 25, 2015

@MartinNowak @WalterBright Do you have any other objections about this PR? Even if you don't want to go this change in 2.068 release, we can fix the issue in master branch...

@WalterBright
Copy link
Member

The algorithm is very difficult to understand. I don't know how to explain it to someone else. I worry that if the only person it is understandable to is Kenji, then nobody else can maintain the compiler. This is not practical for a team organization.

  1. Martin, do you understand the algorithm?
  2. The wiki, which refers to compiler internals, does not belong on the wiki, it belongs in the compiler source code.
  3. The wiki should have an explanation for how it works that is intended for the D user, i.e. not in terms of compiler internals.

@9rnsr
Copy link
Contributor Author

9rnsr commented Jul 28, 2015

@WalterBright I placed the text in wiki because it was convenient. I'll move it in dmd repo if you hope that (e.g. dmd/doc/note.txt?).

@9rnsr
Copy link
Contributor Author

9rnsr commented Jul 30, 2015

I worry that if the only person it is understandable to is Kenji, then nobody else can maintain the compiler.

I recall that David Nadlinger has fixed a regression issue 13478 in 2.066.1 devel, which was the bug in the template instantiation code.

@klickverbot Could you please join the review of this PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again why this change? It seems unrelated to the fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not necessary to fix the regression, but it will increase stability for the position where the instance will be inserted. For example:

a.d

import ...;

template X() { ... }

X!()

b.d

import a, ...;

X!()

With the command line dmd -c a.d b.d:

  • If this adjustment does not exist, the instance X!() will be inserted to the member of either a or b. The actual position will depend on the semantic order.
  • If this adjustment exists, the instance X!() will be always inserted to the member of a.

By that, the X!() code will be always stored into a.obj, in that case. I think the stable algorithm result is better than unstable, if there's not so much performance difference.

@MartinNowak
Copy link
Member

In D2, we have attribute inference and it changes instantiated function mangled names. So, to get the exact lined symbol names, we need to run semantic3 on such the instances.

I still don't understand how semantic3 for correct mangling is necessary to find existing instances.

@9rnsr
Copy link
Contributor Author

9rnsr commented Aug 12, 2015

I still don't understand how semantic3 for correct mangling is necessary to find existing instances.

Perhaps we can elide semantic3 call from some root instances. But currently semantic3 is always called on all root instances. We cannot improve it yet.

@WalterBright
Copy link
Member

I still don't understand how semantic3 for correct mangling is necessary to find existing instances.

It isn't necessary. A template's arguments completely determine it. If the arguments match the arguments for existing instances, then it MUST MATCH and running semantic3 is not necessary.

9rnsr added 2 commits August 13, 2015 18:47
In the pull request dlang#4384, all instance has been changed to invoke
semantic3(). It was for the link-failure issue in specific case, but it
was too excessive.

1. Semantic analysis strategy for template instances:
  We cannot determine which instance does not need to be placed in object
file until semantic analysis completed. Therefore, for all templates
instantiated in root module, compiler should invoke their semantic3 --
regardless of whether those are also instantiated in non-root module. If
a template is _only_ instantiated in non-root module, we can elide its
semantic3 (and for the compilation speed we should do that).

2. Code generation strategy for template instances:
  If a template is instantiated in non-root module, compiler usually does
not have to put it in object file. But if a template is instantiated in
both of root and non-root modules which mutually import each other, it
needs to placed in objfile.
@9rnsr
Copy link
Contributor Author

9rnsr commented Aug 13, 2015

I added test case to show that the issue 14901 will be actually fixed.

@MartinNowak
Copy link
Member

I know it's a complex change but all of the fixes here make sense and solve a few big regressions.
Let's please go on with this and work on cleaning up the template implementation later.

@MartinNowak
Copy link
Member

It isn't necessary. A template's arguments completely determine it. If the arguments match the arguments for existing instances, then it MUST MATCH and running semantic3 is not necessary.

I agree, but changing this isn't actually part of this PR.

@MartinNowak
Copy link
Member

Can you please rebase+reopen this against stable. Please do so soon, so that we can rigorously test this before 2.068.1.

@9rnsr
Copy link
Contributor Author

9rnsr commented Aug 24, 2015

Reopened against stable branch. See #4944.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants