Skip to content

Conversation

@xiemaisi
Copy link

This adds a query that would have spotted the event-stream incident. The idea is to look for hard-coded hexadecimal data that is transformed and then interpreted as code or as an import path. There are very few good reasons to do this; the only somewhat acceptable use-case I've seen so far is using eval to check whether the runtime supports bigints. For now, I'm willing to accept the very occasional false positives (in test code) this causes.

The first two commits introduce two unrelated library improvements we need to catch the event-stream backdoor: a more lenient definition of what constitutes a require call (accounting for cases where our syntactic heuristics fail to identify a Node.js module as such), and a taint propagation step through Buffer.from. These two could affect many other queries, so I'm running a full-scale dist-compare to find out. So far, I have not seen any performance regressions, and occasional false-positive fixes due to the improved require handling.

As discussed with @pavgust and @sjvs, I'm targetting this at 1.19 due to customer interest.

@xiemaisi xiemaisi added the JS label Nov 29, 2018
@xiemaisi xiemaisi added this to the 1.19 milestone Nov 29, 2018
@xiemaisi xiemaisi requested a review from a team as a code owner November 29, 2018 09:54
@xiemaisi
Copy link
Author

Ping @mc-semmle for doc review; this should hopefully be the last new JavaScript query for 1.19, apologies for squeezing it in so late.

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM.
I have a few thoughts on the sources and sanitizers, but those are improvements that we can address later.

* A constant string consisting of eight or more hexadecimal characters, viewed
* as a source of hard-coded data that should not be interpreted as code.
*/
private class DefaultSource extends Source, DataFlow::ValueNode {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried treating Buffer.from(_, "hex") or and other decoders as sources? Maybe base-64 decoding should not be a source since it could be a compression strategy.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting idea. There are a few other avenues for further improvement which I'd like to explore after the release.

Copy link
Contributor

@mchammer01 mchammer01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiemaisi - this looks good to me. One minor suggestion that you can ignore if you wish.

## General improvements

* Modelling of taint flow through array operations has been improved. This may give additional results for the security queries.
* Modelling of taint flow through array and buffer operations has been improved. This may give additional results for the security queries.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use "Modeling" as I believe it's the most common used form in the US. However, I leave this up to you as this spelling may be used elsewhere

ghost
ghost previously approved these changes Nov 29, 2018
Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good. Do you want another evaluation before we merge?

@xiemaisi
Copy link
Author

The changes don't look like they will affect performance, but of course I can do a quick run on a few projects if you prefer. The larger-scale evaluation is still ongoing (based on the old version of the query before the most recent changes), I'll report back once it's done.

At this point, there is no extreme rush to get this query merged, we've missed this week's dist upgrade anyway.

@asger-semmle asger-semmle dismissed mchammer01’s stale review November 29, 2018 17:04

The review suggestion has been implemented

@asger-semmle
Copy link
Contributor

@mc-semmle when you select "Request changes" in GitHub's interface it blocks merging until you change it again. I've "dismissed" your review for now since it seems to have been addressed.

@asger-semmle asger-semmle merged commit f85e30a into github:rc/1.19 Nov 29, 2018
@mchammer01
Copy link
Contributor

Yes my comment seems to have been addressed but I had already left the office so I didn't get a chance to approve the review.

@xiemaisi xiemaisi deleted the js/numeric-constant-interpreted-as-code branch January 10, 2019 16:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants