Skip to content

Conversation

@rdmarsh2
Copy link
Contributor

This is the first part of a port of the Java range analysis library using the IR. It's based off #197, but I wanted to get it out for a round of review now. deeeb25 is the first commit that isn't part of #197.

I'm expecting the main review to come from @jbj, but I'd also like a look from @dave-bartolomeo for the IR changes and from @aschackmull since he wrote the Java library.

@rdmarsh2 rdmarsh2 requested a review from jbj September 28, 2018 17:48
@rdmarsh2 rdmarsh2 force-pushed the rdmarsh/cpp/sign-analysis branch from 1151b6e to a03ac4f Compare September 28, 2018 19:16
@aschackmull
Copy link
Contributor

Have you tried some tests for completeness on some complex snapshots? I.e. testing that every integral-typed expression has sign information. I seem to remember needing to close some holes in various corner cases when doing the java version.

@aschackmull
Copy link
Contributor

The Java guards library includes a set of "implies" predicates to handle
short-circuiting conditionals. C++ handles those in IR generation, so
dominance on the IR produces correct results for controlling blocks.

This is not entirely correct. The java CFG also handles short-circuiting conditionals, so dominance does "the right thing". The reason why these are also included in the implies family of predicates is to support when such conditions occur in non-boolean contexts. E.g.:

boolean b = cond1 && cond2;
if (b) {
  foo();
}

In this case we'd like both cond1 and cond2 to act as guards for foo(), which isn't directly given from dominance.

Copy link
Contributor

@jbj jbj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are some fundamental questions here we should answer about the relationship between Instruction, ValueNumber and a possible Operand type. This doesn't have to hold back this PR, but I think it should hold back further work on porting range analysis. Let's discuss over Hangouts when @dave-bartolomeo is back next week.

opcode instanceof Opcode::Negate
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have to change all three copies of Instruction.qll. Check with buildutils-internal/scripts/pr-checks/sync-identical-files.py if it worked.

* Gets a sign that `operand` may have at `pos`, taking guards into account.
*/
cached
private Sign operandSign(Instruction pos, Instruction operand) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are now two cached predicates in this file. Please group them in a cached module like we've done elsewhere to make sure the engine will put them in the same stage both now and in the future. Otherwise we risk evaluating most of the range analysis twice.



/** Holds if `e` can be negative and cannot be positive. */
predicate strictlyNegative(Instruction i, Instruction pos) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The QLDoc for both the unary and binary versions of these predicates needs updating. It mentions e, but there's only i and pos.

)
or
// use hasGuard here?
result = operandSign(i, i.(PhiInstruction).getAnOperand())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use hasGuard here?

I'm having trouble answering that question too. Maybe that's because this port sometimes uses Instruction in place of Java's Expr (like in this predicate) and sometimes uses Instruction i, Instruction pos. We can get away with using Instruction for Expr in cases like addition and such because the IR does not apply common subexpression elimination and therefore guarantees that each such Instruction is consumed only once, but that's not the case for instructions like LoadInstruction or PhiInstruction.

To expose this problem in its full generality, one option would be to replace our use of Instruction with ValueNumber. Then this library can see that x - 2 is positive under if (x - 2 > 0), but we'll have to be consistent in using ValueNumber val, Instruction pos. Or we could create a SsaReadPosition class like in Java to use for pos.

jbj
jbj previously approved these changes Oct 5, 2018
Copy link
Contributor

@jbj jbj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @rdmarsh2. What do you say to @aschackmull's comments? Are they best addressed now or in a follow-up PR?

@jbj
Copy link
Contributor

jbj commented Oct 5, 2018

The test is failing due to a semantic merge conflict with #254.

@rdmarsh2
Copy link
Contributor Author

I'll fix the test failures and run on some real snapshots this afternoon. I'll try and get the implies predicates done tomorrow.

@jbj jbj mentioned this pull request Oct 24, 2018
@rdmarsh2 rdmarsh2 force-pushed the rdmarsh/cpp/sign-analysis branch from a6fe2b4 to 1d7e802 Compare November 8, 2018 00:08
@rdmarsh2 rdmarsh2 requested a review from a team as a code owner November 8, 2018 00:08

/**
* Holds if `lowerbound` is a lower bound for `bounded` at `pos`. This is restricted
* Holds if `lowerbound` is a lower bound for `bounded``. This is restricted
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double backtick

* to only include bounds for which we might determine a sign.
*/
private predicate lowerBound(IRGuardCondition comp, Instruction lowerbound, Instruction bounded, Instruction pos, boolean isStrict) {
private predicate lowerBound(IRGuardCondition comp, Operand lowerbound, Operand bounded, boolean isStrict) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To port this library to Operand, I thought you'd just change pairs of (Instruction instr, Instruction pos) into Operand op, but this commit also changes occurrences like Instruction lowerbound to have type Operand without calling its .getInstruction(). That will increase the size of each predicate like this by effectively joining with all the places where lowerbound is used. Is there a good reason to do that?

Copy link
Contributor Author

@rdmarsh2 rdmarsh2 Nov 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows us to be more precise when lowerbound has a lower bound, as in the following (contrived) example:

int f(int x, int y) {
  if (x < 0) {
    return x;
  }
  iif (x < y) {
   return y; // y is strictly positive because of the bound on x above
  }
  return 0;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a test demonstrating this for lower bounds

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh ,there actually is a cartesian product there - I'll see if I can add it in directly or if I need to do a rewrite of IRGuards.qll to make it work

this = TNeg() and result = TNeg() or
this = TNeg() and result = TZero() or
this = TZero() and result = TPos() or
this = TPos() and result = TPos()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code assumes that overflow can't occur, but we often get FP reports for the existing range analysis library because overflow is relied on in practice. See, for example, https://discuss.lgtm.com/t/false-positive-c-comparison-result-is-always-the-same/1428.

I expect that we'll need to model overflow for unsigned integers, but we can continue to ignore it for signed integers because it's undefined there. For a case like inc, this means adding that this = TPos() and result = TZero() in the unsigned case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the corresponding query in java, FPs for overflow checks are instead excluded in the query. This works because overflow checks are pretty idiomatic and easy to recognise, and thus allows the rangeanalysis to stay (mostly) oblivious to such overflows.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, let's go with the Java approach for now and evaluate whether to change it later, when we have queries and results to look at.

@jbj
Copy link
Contributor

jbj commented Nov 9, 2018

I'm willing to merge this when the tests pass if you can vouch for the performance of the modified IRGuards.qll. I wonder if IRGuardCondition should extend Operand instead of Instruction, but we can discuss that outside this PR.

@rdmarsh2
Copy link
Contributor Author

rdmarsh2 commented Nov 9, 2018

From testing on LuaJit, it doesn't look like there's a significant impact on IRGuards performance; both runs spent approximately 17 seconds evaluating the IRGuards stages.

@rdmarsh2
Copy link
Contributor Author

rdmarsh2 commented Nov 9, 2018

I wonder if IRGuardCondition should extend Operand instead of Instruction, but we can discuss that outside this PR.

That's an interesting point... It might allow us to infer some additional bounds when a variable is compared to multiple other variables in a short-circuiting conditional. I agree we should leave it for another PR.

@jbj jbj merged commit 0cb09b1 into github:master Nov 12, 2018
@rdmarsh2 rdmarsh2 deleted the rdmarsh/cpp/sign-analysis branch November 12, 2018 18:00
aibaars pushed a commit that referenced this pull request Oct 14, 2021
MathiasVP pushed a commit to MathiasVP/ql that referenced this pull request Aug 10, 2025
MathiasVP pushed a commit to MathiasVP/ql that referenced this pull request Aug 10, 2025
…slip"

This reverts commit 4dfa5d2, reversing
changes made to 8cd58aa.
MathiasVP pushed a commit to MathiasVP/ql that referenced this pull request Aug 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants