-
Notifications
You must be signed in to change notification settings - Fork 1.9k
JS: Captured Nodes, type inference + a query #946
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
asger-semmle
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you run a benchmark with definitions.ql included so the cost of the new query is more obvious?
It's a shame the type inference doesn't give more results. Non-escaping object literals probably aren't all that common outside of compiled code.
| w = s.getAPropertyWrite() and | ||
| fun.flowsTo(w.getRhs()) and | ||
| ( | ||
| not exists(w.getPropertyName()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any dynamically stored property is assumed to be a call target? Please explain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just standard soundness: we can not exclude any callees.
| * A source for which the flow is entirely captured by the dataflow library. | ||
| * All uses of the node is represented by `this.flowsTo(_)` and friends. | ||
| */ | ||
| class CapturedSource extends DataFlow::SourceNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we please call this something else? Capture usually has to do with closures and captured variables, where it means something completely different. The name CapturedSource seems to suggest it is a source node referenced by an inner function, like:
let obj = {};
return function() {
return obj.x;
};There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I agree that we should call it something different, but I am not satisfied with my other names. Do you have a suggestion?
For reference, some of my dismissed prefixes:
- Local
- this is my favorite, and we have
LocalFunctionalready. I feel thatLocalNodeclashes with the concept of global variables though.LocallyUsedNodeis a slightly longer alternative, with a clearer relation to its semantic.
- this is my favorite, and we have
- Trackable
- clashes with
TrackedNode
- clashes with
- Restricted
- NonEscaping
- Sound
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, veto on "captured" from me as well. Since it only handles object literals, would "LocalObject" perhaps be an option?
|
Clearer evaluation with a comparison table, it is remarkably fast. |
xiemaisi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite through with this yet, but here are a few comments to consider. Looking very nice overall!
| * A source for which the flow is entirely captured by the dataflow library. | ||
| * All uses of the node is represented by `this.flowsTo(_)` and friends. | ||
| */ | ||
| class CapturedSource extends DataFlow::SourceNode { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, veto on "captured" from me as well. Since it only handles object literals, would "LocalObject" perhaps be an option?
xiemaisi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more thoughts.
| s.hasOwnProperty(name) and | ||
| hasDefiniteReceiver(this, s) and | ||
| w = s.getAPropertyWrite() and | ||
| fun.flowsTo(w.getRhs()) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume we guard somewhere else against the case that the RHS is a function that we cannot track?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, a LocalObject object can not have an unknown method, because that method could expose the object through this.
javascript/ql/src/semmle/javascript/dataflow/internal/InterProceduralTypeInference.qll
Outdated
Show resolved
Hide resolved
|
All comments addressed |
|
LGTM; handing over to @Semmle/doc for query help review. |
|
Reminder to @Semmle/doc: this is a 1.20 PR that still needs a query help review. |
|
Yes sorry @xiemaisi you used to mention me but I don't always pick up mentions of @semmle/doc, especially in busy times like this. Will try to do the review for this shortly. |
mchammer01
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@esben-semmle and @xiemaisi - apologies for the late review, this passed me by.
This LGTM.
I've made a very minor comment that you can ignore if you want. I leave it up to you (I approved this PR as the issue I raised is minor).
| <example> | ||
|
|
||
| <p> | ||
| In this code, the function <code>f</code> initializes a property <code>prop_a</code> with a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggesting adding a comma between but later on and this property, to aid comprehension
|
I think we should merge this now then, to get it into 1.20 as easily as possible. I noticed that the name 'captured' still is used in variables and tests, so I will fix that while adding mc's suggested comma. |
This PR introduce the concept of a "captured node", a captured node is a node whose that the dataflow library can reason about soundly. As a first example, I have introduced a
CapturedSourcefor the object literals that never flow into a heap location or another site that we can not track with the dataflow library.I believe the analysis for when a captured node escapes is sound (except for some very unlikely prototype fiddling, see the comments in the code), but I invite you to think hard if I have missed a case.
I have used the captured nodes for two separate things:
The main feature of this PR is an improvement for the type inference: we can now track types out of some method calls. We could also use captured nodes to improve the type inference for plain property reads, but then we will get a bunch of false positives to due lack of refinement nodes for properties (requires a separate CFG improvement).
The secondary feature of this PR is a new query that detects unused properties (similar to
js/unused-variable). The false positives from this query ve been enourmously useful for identifying unsoundness in the characterization of a captured node.The new alerts look good, and the performance overhead is negligible.
The evaluation of the Type inference improvement shows surprisinlyg few new results. Maybe a better propagation of information from the global type inference layer could help.
The evaluation of the
js/unused-propertyquery is surprisingly full of new results (10k+!), which is a great indicator for the potential of additional use of the captured node analysis. I have looked at a lot of the results, and I relatively feel confident in them now. The query is just arecommendation, so any remaining unsoundness will not be a disaster. (the timeout for the closure library is not related to this improvement, it only happens on azure)