Skip to content

Conversation

@geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Oct 2, 2018

Improve data flow handling in Buffer.qll. This should resolve the false positives that came to light after support for negative indices was added to OverflowBuffer.ql (#194).

@geoffw0 geoffw0 added the C++ label Oct 2, 2018
forex(Expr def |
DataFlow::localFlowStep(DataFlow::exprNode(def), DataFlow::exprNode(bufferExpr)) |
result = getBufferSize(def, _)
) and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This use or forex reminds me of the code that was removed in https://git.semmle.com/Semmle/code/commit/90a05cd6fafcd2e179c49f17a5b67bb01de2b57d because it had bad performance on certain pathological snapshots. I'm guessing the same transformation could be needed here.

As I understand it, the performance is quadratic in the number of def: for each def the implicit exists will find one or more result, and then each of those result will be checked against every other def in the implicit forall.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, as I understand it the issue is that forex(x | y) is not a primitive language feature but syntax sugar for forall(x | y) and exists(x | y). Which is usually fine, but if y contains a variable from outside the forex it's possible to produce a cartesian product. I haven't been able to characterize exactly when this happens.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevertheless I've made the change, and it has no noticeable effect on performance (as we'd expect outside of extreme cases).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One way to see the the problem is that if you wrote forall instead of forex here, the compiler would rightfully complain that result was not bound: for a bufferExpr with no associated def, any value of result would satisfy the forall. This would make result take infinitely many values in the sense that the type checker considers int to be infinite.

In other words, the forall can only check whether a candidate for result has the desired properties -- it cannot generate a candidate. When it's joined with an exists, the candidates can be generated by exists, and then each candidate is checked by the forall. If there are n different candidates, and each candidate needs to be compared for equality with the n other candidates, then the run time is n^2.

As I'm writing this out, I can see that it would take some quite contrived code to trigger the quadratic behaviour.

@jbj jbj added the depends on internal PR This PR should only be merged in sync with an internal Semmle PR label Oct 2, 2018
forall(MyInt y | rel(x, y))
and
exists(MyInt y | rel(x, y))
select x
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You accidentally committed your test code here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot, thanks, fixed!

Copy link
Contributor

@jbj jbj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The internal PR passed its tests, so this should be safe to merge.

@jbj jbj merged commit 3e022ad into github:master Oct 10, 2018
smowton added a commit to smowton/codeql that referenced this pull request Apr 16, 2022
…lace-operators

Extract simple in-place operators
MathiasVP pushed a commit to MathiasVP/ql that referenced this pull request Aug 21, 2025
PS: Better usability when working with nested classes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C++ depends on internal PR This PR should only be merged in sync with an internal Semmle PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants