Skip to content

Disable dead code when an if/else if condition is constant#3134

Merged
kinke merged 30 commits intoldc-developers:masterfrom
baziotis:if_false2
Nov 5, 2019
Merged

Disable dead code when an if/else if condition is constant#3134
kinke merged 30 commits intoldc-developers:masterfrom
baziotis:if_false2

Conversation

@baziotis
Copy link
Contributor

@baziotis baziotis commented Aug 15, 2019

A more specific PR, result of a previous: #3133
Issue: #2344

Although @JohanEngelen you proposed that I focus on if (0) on this PR, when I wrote this, if (1) had exactly the same logic and seemed like it fit quite nicely. I can remove it though.

-- Doesn't take into consideration labels.

The comments in the previous PR are worth a look. The tl;dr is:

  • A discussion about labels.
  • @JohanEngelen proposed that I take a look at how other compilers (DMD, GDC and Clang) handle this.

Regarding the last one, I had no hope with DMD. I did spend some time and I could not even target where an if statement (or such) is handled.

Lastly, I don't quite get insertBB and irs->DBuilder. I know what a basic block is and that the DBuilder has to do with debug info. But I don't understand exactly how they are used, when etc.

Edit: I should probably add a delete cond_e as it is done in e.g. (WhileStatement](https://github.com/ldc-developers/ldc/blob/master/gen/statements.cpp#L410). Probably was missed by accident here.

@baziotis
Copy link
Contributor Author

baziotis commented Aug 15, 2019

Just saw GDC. I have some problems to compile from source and verify the path but 99% it's the following.
So, an IfStatement is handled here: toir.cpp :: visit(IfStatement)
You can see it is quite simple (and doesn't handle the issue). Generating IR from GCC though (i.e. fdump-tree-original) has eliminated the code.
And that certainly happens with the call to build_vcondition. Which if you follow it, it eventually goes here fold_build3_loc, which goes here fold_ternary_loc.
Of course it's quite a complicated it path, I'm not even close to understanding it fully, but a couple important points:
a) It's GCC tree transformations that do the work, not GDC.
b) At some point, this function calls contains_label_p which checks if a sub-tree contains any label. Which seems to follow what I assumed in a previous comment that we seem to care only for whether a block contains any label and not the labels per se. Which is good as it is I think quite easy info to have from the front-end.
Btw, note that this function seems to rather walk the tree to find if it has any labels.

@baziotis
Copy link
Contributor Author

baziotis commented Aug 15, 2019

Just saw Clang as well. I did not expect to have so clear code. And it's pretty similar to LDC to.
I was trying for about an hour to install clang without re-installing LLVM with no success. So, this is too without running.
An IfStmt is emitted here: CodeGenFunction::EmitIfStmt. I'm glad that Clang seems to do it similarly. It's clever this Executed - Skipped usage. Although I see an incrementProfileCounter there which I don't do when I eliminate so that might be wrong.

Anyway, here there is again a call to ContainsLabel.
Which is a very simple function. Simple to implement in LDC too it seems. And it walks the tree.

@baziotis
Copy link
Contributor Author

@kinke , @JohanEngelen what do you think? I'm pinging you as you may haven't seen it because it's marked as WIP and also fails.
The tl;dr is:

  • It fails because it does not handle labels.
  • To handle labels, we need to know if a block has any label inside. This info can be provided either by DMD or we have to walk the tree.
  • The 2 comments above describe what GDC and Clang do. For GDC, GCC does the job while on Clang, Clang. Both walk the tree every time. For DMD, I could not find what it does.

I have discussed this on Slack and I don't see us getting this info, so I guess the only option is to walk the tree.
You can see that Clang has a very simple implementation of ContainsLabel because of ->children method. The closest I know to this in DMD is flatten but this is not generic enough (e.g. IfStatement does not implement it). So, I guess the only option is to implement our own.

@kinke
Copy link
Member

kinke commented Aug 19, 2019

Thx for the analysis. I don't find this -O0 optimization very useful, that's why I haven't posted earlier. I'm not sure what the goal is - reducing compile times for countless runtime-ifs with static condition (alternative: static if), reducing -O0 code size, ...?

If the compilation speed isn't the primary concern, we could make our lives easy and leverage some LLVM optimization pass to get rid of these superfluous branches and basic blocks. At -O1, one of the first passes is SimplifyCFG, which should do this, but it might also optimize some more which could be unexpected for debugging.

Checking for labels in the dead branch during codegen shouldn't be very hard; there are Visitor base classes in DMD which can be used to walk the subtree.

@baziotis
Copy link
Contributor Author

Checking for labels in the dead branch during codegen shouldn't be very hard; there are Visitor base classes in DMD which can be used to walk the subtree.

Ok, I was not aware. Yes the implication with "implement our own" is that it can't be as simple as Clang's. There has to be some sort of visitor.

Thx for the analysis. I don't find this -O0 optimization very useful, that's why I haven't posted earlier. I'm not sure what the goal is - reducing compile times for countless runtime-ifs with static condition (alternative: static if), reducing -O0 code size, ...?

TBH, I don't find it very useful either. I mean, it's good to have it, but if the expense is to have a whole visitor for it, probably not worth it. Even more in D, where as you said, there is static if. I guess the probabilities of having something like that are very low, except for very deep constant propagation where the fact that the condition is constant is not obvious to the programmer.

Here are though some maybe more subtle things:
a) Clang mentions:

This avoids emitting dead code and simplifies the CFG substantially.

One does not care to simplify the CFG in -O0. Which makes me think that Clang has added it to reduce compilation times but in optimized builds.

b) Using a simple grep, I had seen ContainsLabel been used in a couple of other places. If we have more use cases, that could make it worth it (I can't think of any other).

If the compilation speed isn't the primary concern, we could make our lives easy and leverage some LLVM optimization pass to get rid of these superfluous branches and basic blocks. At -O1, one of the first passes is SimplifyCFG, which should do this, but it might also optimize some more which could be unexpected for debugging.

I think that this is a bad option. Both for debugging but also, correct me if I'm wrong, LDC would like to be faster.

@JohanEngelen
Copy link
Member

I believe this helps in compilation speed for large if statements. The benefit is probably not so big, but the cost isn't either. (you should only traverse the tree when the condition is known at compile-time)

Some extra pointers:

  • for traversing the AST, you can use RecursiveVisitor and RecursiveWalker. Upon visiting a LabelStatement, you can stop recursion by setting stop to true.
  • emitCounterIncrement is our version of Clang's incrementProfileCounter (looks like Clang chose a better function name than I did). It should only be executed for the if(true) branch. This triggers me: you should add tests for correct profile counting aswell as correct coverage (-cov) counting.

@baziotis
Copy link
Contributor Author

you should only traverse the tree when the condition is known at compile-time
Yes of course.

  • for traversing the AST, you can use RecursiveVisitor and RecursiveWalker. Upon visiting a LabelStatement, you can stop recursion by setting stop to true.

Thanks I planned to implement my own visitor. I'm comfortable with the solution, it's the DMD (?) visitors that I'm not sure about. I'll search about them.

  • emitCounterIncrement is our version of Clang's incrementProfileCounter (looks like Clang chose a better function name than I did). It should only be executed for the if(true) branch. This triggers me: you should add tests for correct profile counting aswell as correct coverage (-cov) counting.

Clang seems to have both emitCounterIncrement, incrementProfileCounter.

I'm not at all familiar with PGO. It seems that it tracks branches (or basic blocks..) that are taken, so it makes sense that we want it only for if (true).
For the tests that you referenced, are there example tests?

@JohanEngelen
Copy link
Member

I meant that you should create your own visitor subclassed from RecursiveVisitor and then use RecursiveWalker.

  • coverage test: dmd-testsuite/runnable/cov2.d

  • profile counting tests: tests/PGO/... This will probably be hard to understand...

@baziotis
Copy link
Contributor Author

baziotis commented Aug 22, 2019

  • coverage test: dmd-testsuite/runnable/cov2.d

I had some problems last time I tried to run dmd testsuite. I'll see what I can do, but till then, in this
case, isn't coverage handled by the call to emitCoverageLinecountInc? This is above the test for elimination, which means it shouldn't change the current behavior.
Now, as I said I'm not familiar with PGO, but this seems wrong. i.e. maybe this should not be called if we have if (false) ?

  • profile counting tests: tests/PGO/... This will probably be hard to understand...

It's weird yes. But I think it starts to make sense. For a simple code like this:
if (...) { ... } or if (...) { } else if (...) { ... } it seems that for every basic block that a taken branch targets (i.e. in the first example, only the if:, not the else) there is a pattern:

  • get PGO counter from an array
  • add 1 to it (i.e. mark taken)
  • save back
    and the array seems to have as many counters as target basic blocks.
    Oh, and there's a counter for entering the function as well.

Is it kind of what I described? And the PGO tests should try to verify that every block has this pattern?
Although, the tests themselves are quite complicated, I don't yet understand them.

@JohanEngelen
Copy link
Member

For coverage: although the code changes may clearly show no bugginess, it's good to test that

   if(false)  // <--- this should still show a line count increase even though completely elided.
   {

   }

For PGO: indeed, there is only a counter for the if-branch, and not for the else-branch. (assuming we know the execution count before the if-statement, else_count = before_if - if_branch_count. I wrote the tests and also have to study them a bit ;-) I think things will make more sense if you study the IR output of the PGO tests a little.

@baziotis
Copy link
Contributor Author

baziotis commented Aug 22, 2019

For coverage: although the code changes may clearly show no bugginess, it's good to test that

   if(false)  // <--- this should still show a line count increase even though completely elided.
   {

   }

I get the idea but not how to do the test. cov2 or dmd_coverDestPath doc don't help either.

For PGO: indeed, there is only a counter for the if-branch, and not for the else-branch. (assuming we know the execution count before the if-statement, else_count = before_if - if_branch_count.

Smart!

I wrote the tests and also have to study them a bit ;-) I think things will make more sense if you study the IR output of the PGO tests a little.

They start to make sense yes. I'll write some tests tomorrow and you can review the way forward. :)
2 points:
a) Currently, when we have code if (true) { } we handle 2 PGO counters (one for the function, the other for the if), as you said above. Do we actually want that? I'm asking because in such a case, the second counter will always be equal to the first.
b) Most of PGO/* tests are unsupported for me. Can I do something?

Ah, unrelated but just crossed my mind:
In this:

if (false) {
	L1:
	int a = 1;
} else {
	goto L1;
}

Currently we generate the false branch as we should but we also generate a branch instruction.
This can be avoided of course. Meaning, we should just generate the if code without branch, then the else. Now that I'm thinking of that, I also have to answer the question whether the order matters. A quick thought is not but I'll come back to it when I implement it.

@JohanEngelen
Copy link
Member

For coverage: although the code changes may clearly show no bugginess, it's good to test that

   if(false)  // <--- this should still show a line count increase even though completely elided.
   {

   }

I get the idea but not how to do the test. cov2 or dmd_coverDestPath doc don't help either.

Note the // POST_SCRIPT: runnable/extra-files/coverage-postscript.sh in cov2.d.

For PGO: indeed, there is only a counter for the if-branch, and not for the else-branch. (assuming we know the execution count before the if-statement, else_count = before_if - if_branch_count.

Smart!

I wrote the tests and also have to study them a bit ;-) I think things will make more sense if you study the IR output of the PGO tests a little.

They start to make sense yes. I'll write some tests tomorrow and you can review the way forward. :)
2 points:
a) Currently, when we have code if (true) { } we handle 2 PGO counters (one for the function, the other for the if), as you said above. Do we actually want that? I'm asking because in such a case, the second counter will always be equal to the first.

Yeah that's fine. People and tooling may query the counters and expect there to be the counter with the correct count.

b) Most of PGO/* tests are unsupported for me. Can I do something?

What platform are you on? When building LLVM, you also need to build LLVM's compiler-rt for the functionality. (should be part of LDC's LLVM repo)

Ah, unrelated but just crossed my mind:
In this:

if (false) {
	L1:
	int a = 1;
} else {
	goto L1;
}

Currently we generate the false branch as we should but we also generate a branch instruction.
This can be avoided of course. Meaning, we should just generate the if code without branch, then the else.

I would leave this optimization out, it's -O0 after all ;) Basically treat this as a normal case where you cannot elide.

@baziotis
Copy link
Contributor Author

Note the // POST_SCRIPT: runnable/extra-files/coverage-postscript.sh in cov2.d.

Thanks, I'll study it tomorrow.

Yeah that's fine. People and tooling may query the counters and expect there to be the counter with the correct count.

I see, alright.

What platform are you on? When building LLVM, you also need to build LLVM's compiler-rt for the functionality. (should be part of LDC's LLVM repo)

I'm on ubuntu linux. When I ran CMake, I used -DCOMPILER_RT_INCLUDE_TESTS=OFF as pointed in the Wiki. Maybe this is the problem.

I would leave this optimization out, it's -O0 after all ;) Basically treat this as a normal case where you cannot elide.

Indeed. Ok, as we're doing right now.

@baziotis
Copy link
Contributor Author

You forgot the crucial .gitmodules change,

Thanks. I actually did the change but of course, because it's me, I forgot to git add it.

that seems to work now (at least for Azure). Your dmd-testsuite branch is still based on current ldc, i.e., includes an unrelated cppmangle test change, which will most likely fail.

If I understood correctly, now it should be ok. I rebased back to the commit that the ldc dmd-testsuite submodule is and added my commit on top.

@kinke
Copy link
Member

kinke commented Sep 28, 2019

A little trick to quickly locate the dmd-testsuite failures for Azure: click on the dmd-testsuite step to open the log, then press Ctrl+F to open the VS-Code-search, and then search for failed. (yes, including the .).

irs->DBuilder.EmitBlockStart(executed->loc);
}
// True condition, the branch is taken so emit counter increment.
if (!const_val->isZeroValue()) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it ever happen to have !zeroValue && !executed ? Basically, this boils down to can ever the stmt->ifbody be null ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so, but it's probably better to assume the front-end might spit out a null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good, just double-checking.

@baziotis
Copy link
Contributor Author

Conflicting files
tests/d2/dmd-testsuite

Oh really ? What now ? :P

@kinke
Copy link
Member

kinke commented Sep 29, 2019

Yeah sry about that :] - you'll have to rebase this and your dmd-testsuite branch again. As a bonus, the unrelated Win64 Azure failures will be fixed.

@baziotis
Copy link
Contributor Author

Yeah sry about that :] - you'll have to rebase this and your dmd-testsuite branch again. As a bonus, the unrelated Win64 Azure failures will be fixed.

Ok ok, thanks. :)

@baziotis
Copy link
Contributor Author

I think that something's wrong with the 2 failed tests.

@kinke
Copy link
Member

kinke commented Sep 30, 2019

Yeah don't worry about those 2 CI failures for now, they generally fail at the moment, so this PR is virtually green now.

@baziotis
Copy link
Contributor Author

Great!

@baziotis baziotis changed the title [WIP] Disable dead code when an if/else if condition is constant Disable dead code when an if/else if condition is constant Oct 23, 2019
Conflicts:
	tests/d2/dmd-testsuite
@JohanEngelen
Copy link
Member

@baziotis Rather than merging master into your branch, you should rebase the changes onto master. That way history stays much cleaner.

@baziotis
Copy link
Contributor Author

baziotis commented Nov 3, 2019

@baziotis Rather than merging master into your branch, you should rebase the changes onto master. That way history stays much cleaner.

Could you specify where I did the opposite ? I might have missed it, AFAIK, I did what you propose.
Note: The last merge of master into my branch (the branch of this PR) was not done by me.

@kinke
Copy link
Member

kinke commented Nov 3, 2019

I did, because I also created a corresponding dmd-testsuite branch in the official repo (rebased and both previous commits squashed into one), and that doesn't play with nice with multiple commits in here touching dmd-testsuite. I was going to squash these 30 commits anyway - are there other preferences?

@kinke
Copy link
Member

kinke commented Nov 4, 2019

@JohanEngelen: Do you prefer it unsquashed then? Otherwise I'll squash-merge later this evening and release the first 1.19 beta.

@kinke kinke merged commit 94743b9 into ldc-developers:master Nov 5, 2019
@JohanEngelen
Copy link
Member

@kinke sry for late response, I was out traveling. Indeed, with squash merge it doesn't matter

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants