Disable dead code when an if/else if condition is constant by baziotis · Pull Request #3134 · ldc-developers/ldc

baziotis · 2019-08-15T21:03:31Z

A more specific PR, result of a previous: #3133
Issue: #2344

Although @JohanEngelen you proposed that I focus on if (0) on this PR, when I wrote this, if (1) had exactly the same logic and seemed like it fit quite nicely. I can remove it though.

-- Doesn't take into consideration labels.

The comments in the previous PR are worth a look. The tl;dr is:

A discussion about labels.
@JohanEngelen proposed that I take a look at how other compilers (DMD, GDC and Clang) handle this.

Regarding the last one, I had no hope with DMD. I did spend some time and I could not even target where an if statement (or such) is handled.

Lastly, I don't quite get insertBB and irs->DBuilder. I know what a basic block is and that the DBuilder has to do with debug info. But I don't understand exactly how they are used, when etc.

Edit: I should probably add a delete cond_e as it is done in e.g. (WhileStatement](https://github.com/ldc-developers/ldc/blob/master/gen/statements.cpp#L410). Probably was missed by accident here.

baziotis · 2019-08-15T21:26:02Z

Just saw GDC. I have some problems to compile from source and verify the path but 99% it's the following.
So, an IfStatement is handled here: toir.cpp :: visit(IfStatement)
You can see it is quite simple (and doesn't handle the issue). Generating IR from GCC though (i.e. fdump-tree-original) has eliminated the code.
And that certainly happens with the call to build_vcondition. Which if you follow it, it eventually goes here fold_build3_loc, which goes here fold_ternary_loc.
Of course it's quite a complicated it path, I'm not even close to understanding it fully, but a couple important points:
a) It's GCC tree transformations that do the work, not GDC.
b) At some point, this function calls contains_label_p which checks if a sub-tree contains any label. Which seems to follow what I assumed in a previous comment that we seem to care only for whether a block contains any label and not the labels per se. Which is good as it is I think quite easy info to have from the front-end.
Btw, note that this function seems to rather walk the tree to find if it has any labels.

baziotis · 2019-08-15T21:45:09Z

Just saw Clang as well. I did not expect to have so clear code. And it's pretty similar to LDC to.
I was trying for about an hour to install clang without re-installing LLVM with no success. So, this is too without running.
An IfStmt is emitted here: CodeGenFunction::EmitIfStmt. I'm glad that Clang seems to do it similarly. It's clever this Executed - Skipped usage. Although I see an incrementProfileCounter there which I don't do when I eliminate so that might be wrong.

Anyway, here there is again a call to ContainsLabel.
Which is a very simple function. Simple to implement in LDC too it seems. And it walks the tree.

baziotis · 2019-08-19T19:20:26Z

@kinke , @JohanEngelen what do you think? I'm pinging you as you may haven't seen it because it's marked as WIP and also fails.
The tl;dr is:

It fails because it does not handle labels.
To handle labels, we need to know if a block has any label inside. This info can be provided either by DMD or we have to walk the tree.
The 2 comments above describe what GDC and Clang do. For GDC, GCC does the job while on Clang, Clang. Both walk the tree every time. For DMD, I could not find what it does.

I have discussed this on Slack and I don't see us getting this info, so I guess the only option is to walk the tree.
You can see that Clang has a very simple implementation of ContainsLabel because of ->children method. The closest I know to this in DMD is flatten but this is not generic enough (e.g. IfStatement does not implement it). So, I guess the only option is to implement our own.

kinke · 2019-08-19T19:55:51Z

Thx for the analysis. I don't find this -O0 optimization very useful, that's why I haven't posted earlier. I'm not sure what the goal is - reducing compile times for countless runtime-ifs with static condition (alternative: static if), reducing -O0 code size, ...?

If the compilation speed isn't the primary concern, we could make our lives easy and leverage some LLVM optimization pass to get rid of these superfluous branches and basic blocks. At -O1, one of the first passes is SimplifyCFG, which should do this, but it might also optimize some more which could be unexpected for debugging.

Checking for labels in the dead branch during codegen shouldn't be very hard; there are Visitor base classes in DMD which can be used to walk the subtree.

baziotis · 2019-08-19T21:01:02Z

Checking for labels in the dead branch during codegen shouldn't be very hard; there are Visitor base classes in DMD which can be used to walk the subtree.

Ok, I was not aware. Yes the implication with "implement our own" is that it can't be as simple as Clang's. There has to be some sort of visitor.

Thx for the analysis. I don't find this -O0 optimization very useful, that's why I haven't posted earlier. I'm not sure what the goal is - reducing compile times for countless runtime-ifs with static condition (alternative: static if), reducing -O0 code size, ...?

TBH, I don't find it very useful either. I mean, it's good to have it, but if the expense is to have a whole visitor for it, probably not worth it. Even more in D, where as you said, there is static if. I guess the probabilities of having something like that are very low, except for very deep constant propagation where the fact that the condition is constant is not obvious to the programmer.

Here are though some maybe more subtle things:
a) Clang mentions:

This avoids emitting dead code and simplifies the CFG substantially.

One does not care to simplify the CFG in -O0. Which makes me think that Clang has added it to reduce compilation times but in optimized builds.

b) Using a simple grep, I had seen ContainsLabel been used in a couple of other places. If we have more use cases, that could make it worth it (I can't think of any other).

If the compilation speed isn't the primary concern, we could make our lives easy and leverage some LLVM optimization pass to get rid of these superfluous branches and basic blocks. At -O1, one of the first passes is SimplifyCFG, which should do this, but it might also optimize some more which could be unexpected for debugging.

I think that this is a bad option. Both for debugging but also, correct me if I'm wrong, LDC would like to be faster.

JohanEngelen · 2019-08-21T19:09:12Z

I believe this helps in compilation speed for large if statements. The benefit is probably not so big, but the cost isn't either. (you should only traverse the tree when the condition is known at compile-time)

Some extra pointers:

for traversing the AST, you can use RecursiveVisitor and RecursiveWalker. Upon visiting a LabelStatement, you can stop recursion by setting stop to true.
emitCounterIncrement is our version of Clang's incrementProfileCounter (looks like Clang chose a better function name than I did). It should only be executed for the if(true) branch. This triggers me: you should add tests for correct profile counting aswell as correct coverage (-cov) counting.

baziotis · 2019-08-21T22:33:08Z

you should only traverse the tree when the condition is known at compile-time
Yes of course.

for traversing the AST, you can use RecursiveVisitor and RecursiveWalker. Upon visiting a LabelStatement, you can stop recursion by setting stop to true.

Thanks I planned to implement my own visitor. I'm comfortable with the solution, it's the DMD (?) visitors that I'm not sure about. I'll search about them.

emitCounterIncrement is our version of Clang's incrementProfileCounter (looks like Clang chose a better function name than I did). It should only be executed for the if(true) branch. This triggers me: you should add tests for correct profile counting aswell as correct coverage (-cov) counting.

Clang seems to have both emitCounterIncrement, incrementProfileCounter.

I'm not at all familiar with PGO. It seems that it tracks branches (or basic blocks..) that are taken, so it makes sense that we want it only for if (true).
For the tests that you referenced, are there example tests?

JohanEngelen · 2019-08-22T05:50:55Z

I meant that you should create your own visitor subclassed from RecursiveVisitor and then use RecursiveWalker.

coverage test: dmd-testsuite/runnable/cov2.d
profile counting tests: tests/PGO/... This will probably be hard to understand...

baziotis · 2019-08-22T13:27:10Z

coverage test: dmd-testsuite/runnable/cov2.d

I had some problems last time I tried to run dmd testsuite. I'll see what I can do, but till then, in this
case, isn't coverage handled by the call to emitCoverageLinecountInc? This is above the test for elimination, which means it shouldn't change the current behavior.
Now, as I said I'm not familiar with PGO, but this seems wrong. i.e. maybe this should not be called if we have if (false) ?

profile counting tests: tests/PGO/... This will probably be hard to understand...

It's weird yes. But I think it starts to make sense. For a simple code like this:
if (...) { ... } or if (...) { } else if (...) { ... } it seems that for every basic block that a taken branch targets (i.e. in the first example, only the if:, not the else) there is a pattern:

get PGO counter from an array
add 1 to it (i.e. mark taken)
save back
and the array seems to have as many counters as target basic blocks.
Oh, and there's a counter for entering the function as well.

Is it kind of what I described? And the PGO tests should try to verify that every block has this pattern?
Although, the tests themselves are quite complicated, I don't yet understand them.

gen/statements.cpp

tests/codegen/false_cond_labels.d

gen/toir.cpp

JohanEngelen · 2019-08-22T18:56:42Z

For coverage: although the code changes may clearly show no bugginess, it's good to test that

   if(false)  // <--- this should still show a line count increase even though completely elided.
   {

   }

For PGO: indeed, there is only a counter for the if-branch, and not for the else-branch. (assuming we know the execution count before the if-statement, else_count = before_if - if_branch_count. I wrote the tests and also have to study them a bit ;-) I think things will make more sense if you study the IR output of the PGO tests a little.

gen/statements.cpp

baziotis · 2019-08-22T23:35:19Z

For coverage: although the code changes may clearly show no bugginess, it's good to test that
   if(false)  // <--- this should still show a line count increase even though completely elided.
   {

   }

I get the idea but not how to do the test. cov2 or dmd_coverDestPath doc don't help either.

For PGO: indeed, there is only a counter for the if-branch, and not for the else-branch. (assuming we know the execution count before the if-statement, else_count = before_if - if_branch_count.

Smart!

I wrote the tests and also have to study them a bit ;-) I think things will make more sense if you study the IR output of the PGO tests a little.

They start to make sense yes. I'll write some tests tomorrow and you can review the way forward. :)
2 points:
a) Currently, when we have code if (true) { } we handle 2 PGO counters (one for the function, the other for the if), as you said above. Do we actually want that? I'm asking because in such a case, the second counter will always be equal to the first.
b) Most of PGO/* tests are unsupported for me. Can I do something?

Ah, unrelated but just crossed my mind:
In this:

if (false) {
	L1:
	int a = 1;
} else {
	goto L1;
}

Currently we generate the false branch as we should but we also generate a branch instruction.
This can be avoided of course. Meaning, we should just generate the if code without branch, then the else. Now that I'm thinking of that, I also have to answer the question whether the order matters. A quick thought is not but I'll come back to it when I implement it.

JohanEngelen · 2019-08-23T17:38:03Z

For coverage: although the code changes may clearly show no bugginess, it's good to test that
   if(false)  // <--- this should still show a line count increase even though completely elided.
   {

   }
I get the idea but not how to do the test. cov2 or dmd_coverDestPath doc don't help either.

Note the // POST_SCRIPT: runnable/extra-files/coverage-postscript.sh in cov2.d.

For PGO: indeed, there is only a counter for the if-branch, and not for the else-branch. (assuming we know the execution count before the if-statement, else_count = before_if - if_branch_count.

Smart!

I wrote the tests and also have to study them a bit ;-) I think things will make more sense if you study the IR output of the PGO tests a little.

They start to make sense yes. I'll write some tests tomorrow and you can review the way forward. :)
2 points:
a) Currently, when we have code if (true) { } we handle 2 PGO counters (one for the function, the other for the if), as you said above. Do we actually want that? I'm asking because in such a case, the second counter will always be equal to the first.

Yeah that's fine. People and tooling may query the counters and expect there to be the counter with the correct count.

b) Most of PGO/* tests are unsupported for me. Can I do something?

What platform are you on? When building LLVM, you also need to build LLVM's compiler-rt for the functionality. (should be part of LDC's LLVM repo)

Ah, unrelated but just crossed my mind:
In this:
if (false) {
	L1:
	int a = 1;
} else {
	goto L1;
}
Currently we generate the false branch as we should but we also generate a branch instruction.
This can be avoided of course. Meaning, we should just generate the if code without branch, then the else.

I would leave this optimization out, it's -O0 after all ;) Basically treat this as a normal case where you cannot elide.

gen/statements.cpp

tests/codegen/const_cond.d

tests/codegen/const_cond_labels.d

baziotis · 2019-08-23T23:58:37Z

Note the // POST_SCRIPT: runnable/extra-files/coverage-postscript.sh in cov2.d.

Thanks, I'll study it tomorrow.

Yeah that's fine. People and tooling may query the counters and expect there to be the counter with the correct count.

I see, alright.

What platform are you on? When building LLVM, you also need to build LLVM's compiler-rt for the functionality. (should be part of LDC's LLVM repo)

I'm on ubuntu linux. When I ran CMake, I used -DCOMPILER_RT_INCLUDE_TESTS=OFF as pointed in the Wiki. Maybe this is the problem.

I would leave this optimization out, it's -O0 after all ;) Basically treat this as a normal case where you cannot elide.

Indeed. Ok, as we're doing right now.

baziotis · 2019-09-28T21:04:37Z

You forgot the crucial .gitmodules change,

Thanks. I actually did the change but of course, because it's me, I forgot to git add it.

that seems to work now (at least for Azure). Your dmd-testsuite branch is still based on current ldc, i.e., includes an unrelated cppmangle test change, which will most likely fail.

If I understood correctly, now it should be ok. I rebased back to the commit that the ldc dmd-testsuite submodule is and added my commit on top.

gen/recursivevisitor.h

gen/statements.cpp

kinke · 2019-09-28T21:38:33Z

A little trick to quickly locate the dmd-testsuite failures for Azure: click on the dmd-testsuite step to open the log, then press Ctrl+F to open the VS-Code-search, and then search for failed. (yes, including the .).

gen/statements.cpp

baziotis · 2019-09-29T13:41:07Z

gen/statements.cpp

+          irs->DBuilder.EmitBlockStart(executed->loc);
+        }
+        // True condition, the branch is taken so emit counter increment.
+        if (!const_val->isZeroValue()) {


Can it ever happen to have !zeroValue && !executed ? Basically, this boils down to can ever the stmt->ifbody be null ?

I don't think so, but it's probably better to assume the front-end might spit out a null.

Good, just double-checking.

baziotis · 2019-09-29T13:49:39Z

Conflicting files
tests/d2/dmd-testsuite

Oh really ? What now ? :P

kinke · 2019-09-29T13:51:06Z

Yeah sry about that :] - you'll have to rebase this and your dmd-testsuite branch again. As a bonus, the unrelated Win64 Azure failures will be fixed.

baziotis · 2019-09-29T13:56:31Z

Yeah sry about that :] - you'll have to rebase this and your dmd-testsuite branch again. As a bonus, the unrelated Win64 Azure failures will be fixed.

Ok ok, thanks. :)

baziotis · 2019-09-30T11:45:29Z

I think that something's wrong with the 2 failed tests.

kinke · 2019-09-30T12:03:11Z

Yeah don't worry about those 2 CI failures for now, they generally fail at the moment, so this PR is virtually green now.

baziotis · 2019-09-30T12:08:06Z

Great!

Conflicts: tests/d2/dmd-testsuite

JohanEngelen · 2019-11-03T17:15:10Z

@baziotis Rather than merging master into your branch, you should rebase the changes onto master. That way history stays much cleaner.

baziotis · 2019-11-03T17:20:20Z

@baziotis Rather than merging master into your branch, you should rebase the changes onto master. That way history stays much cleaner.

Could you specify where I did the opposite ? I might have missed it, AFAIK, I did what you propose.
Note: The last merge of master into my branch (the branch of this PR) was not done by me.

kinke · 2019-11-03T17:33:25Z

I did, because I also created a corresponding dmd-testsuite branch in the official repo (rebased and both previous commits squashed into one), and that doesn't play with nice with multiple commits in here touching dmd-testsuite. I was going to squash these 30 commits anyway - are there other preferences?

kinke · 2019-11-04T17:28:32Z

@JohanEngelen: Do you prefer it unsquashed then? Otherwise I'll squash-merge later this evening and release the first 1.19 beta.

JohanEngelen · 2019-11-08T15:12:31Z

@kinke sry for late response, I was out traveling. Indeed, with squash merge it doesn't matter

Disable dead code when an if/else if condition is constant

cb8d4ca

baziotis mentioned this pull request Aug 15, 2019

[WIP] Eliminate branch when the condition is false #3133

Closed

Correct documentation on the test case

969c320

JohanEngelen reviewed Aug 22, 2019

View reviewed changes

gen/statements.cpp Outdated Show resolved Hide resolved

JohanEngelen reviewed Aug 22, 2019

View reviewed changes

gen/statements.cpp Outdated Show resolved Hide resolved

JohanEngelen reviewed Aug 22, 2019

View reviewed changes

tests/codegen/false_cond_labels.d Outdated Show resolved Hide resolved

JohanEngelen reviewed Aug 22, 2019

View reviewed changes

gen/toir.cpp Show resolved Hide resolved

JohanEngelen reviewed Aug 22, 2019

View reviewed changes

gen/statements.cpp Outdated Show resolved Hide resolved

baziotis added 3 commits August 22, 2019 23:04

Added label handling and PGO increment

12b9fbb

Remove RecursiveVisitor

0d53253

Add more complicated tests

e2536b1

Better named tests

1505fef

JohanEngelen reviewed Aug 23, 2019

View reviewed changes

gen/statements.cpp Outdated Show resolved Hide resolved

JohanEngelen reviewed Aug 23, 2019

View reviewed changes

gen/statements.cpp Outdated Show resolved Hide resolved

JohanEngelen reviewed Aug 23, 2019

View reviewed changes

tests/codegen/const_cond.d Outdated Show resolved Hide resolved

JohanEngelen reviewed Aug 23, 2019

View reviewed changes

tests/codegen/const_cond.d Outdated Show resolved Hide resolved

JohanEngelen reviewed Aug 23, 2019

View reviewed changes

tests/codegen/const_cond_labels.d Show resolved Hide resolved