new-audit(unminified-javascript): detect savings from minification #3950
I was surprised how close the margin of error was using your simplified lexer for minified savings here. Nice work, @patrickhulce!
      tokenLengthWithMangling += token.type === 'Identifier' ? 1 : token.value.length;
    }

    if (1 - tokenLength / contentLength < IGNORE_THRESHOLD_IN_PERCENT) return null;
let's add a comment to indicate this is for handling pre-minified code. \o/
paulirish left a comment
nice. The change in fd547cd was the big tweak I think this needed, making it more conservative and reducing our chance of false positives.
Naturally this does increase our bundle size? bundlesize isn't describing how much though.. do you know what the delta is? (Just want to have a record of it)
    };
    }

    /**
can you add a comment here explaining the basic approach? using the description from this PR works for me.
let's also call out that inline scripts are not evaluated. (i dont think it matters in practice, but since other audits of ours include inline scripts we should just be explicit)
yeah sounds good 👍
done
    .catch(_ => null)
    .then(content => {
      if (!content) return;
      scriptContentMap.set(record.url, content);
since we're definitely dealing with networkRecords i'd rather be using requestIds here as the key.
over in the audit we could then use WebInspector.NetworkLog.requestForId() to grab the request and pull the URL off that. wdyt?
i suppose this'll make testing slightly harder, so curious what you think.
yeah that's fair, if same URL was requested multiple times with different content we should surface that 👍
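A minimal sketch of the requestId-keyed map discussed above. The record shape (`requestId`, `url`) and the helper name `buildScriptContentMap` are assumptions for illustration, not the actual Lighthouse gatherer code.

```javascript
// Hypothetical input: one entry per network record, with the fetched script
// body (or null when fetching the content failed).
function buildScriptContentMap(entries) {
  const scriptContentByRequestId = new Map();
  for (const {record, content} of entries) {
    if (!content) continue; // failed fetches resolve to null and are skipped
    // Keying by requestId keeps both copies when one URL is requested twice.
    scriptContentByRequestId.set(record.requestId, content);
  }
  return scriptContentByRequestId;
}

const entries = [
  {record: {requestId: '1.1', url: 'https://example.com/app.js'}, content: 'var a = 1;'},
  {record: {requestId: '1.2', url: 'https://example.com/app.js'}, content: 'var a = 2;'},
  {record: {requestId: '1.3', url: 'https://example.com/gone.js'}, content: null},
];
const byRequestId = buildScriptContentMap(entries);
```

With a URL key the second `app.js` response would overwrite the first; with the requestId key both survive and the audit can look the URL up from the request afterwards.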
9c65ab0 to 25508c7
      const esprima = require('esprima');

    - const IGNORE_THRESHOLD_IN_PERCENT = .1;
    + const IGNORE_THRESHOLD_IN_PERCENT = 10;
lol, yeah I ended up switching because the wastedPercent value is x/100, it's the wastedRatio thats x/1
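To make the unit distinction concrete, here is a small sketch assuming the threshold is compared against a percent (x/100) derived from a ratio (x/1). The helper names `toWastedPercent` and `isBelowIgnoreThreshold` are made up for illustration; how the audit performs the conversion is not shown in this thread.

```javascript
const IGNORE_THRESHOLD_IN_PERCENT = 10;

// wastedRatio is x/1 (0.3 means 30% wasted); wastedPercent is x/100.
function toWastedPercent(wastedRatio) {
  return 100 * wastedRatio;
}

// Convert before comparing so both sides of the comparison are percents.
function isBelowIgnoreThreshold(wastedRatio) {
  return toWastedPercent(wastedRatio) < IGNORE_THRESHOLD_IN_PERCENT;
}
```

Comparing a raw ratio against a threshold of 10 would always be true, so keeping the unit in the constant's name and converting explicitly avoids that mix-up.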




addresses js part of #3459 using the 3rd approach outlined
strategy: estimate minification savings by determining the ratio of the length of js tokens to overall string length
this was surprisingly accurate at identifying if a script was already minified, but is only ~6 lines using esprima and takes ~30ms/MB to tokenize compared to ~2000ms/MB for uglify and even longer for babel-minify
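The strategy can be sketched roughly as below. This is an illustration under stated assumptions, not the audit's actual code: `tokens` is assumed to have the shape that `esprima.tokenize(content)` returns (an array of `{type, value}` objects), the function name and the `0.1` ratio threshold are hypothetical, and the sketch returns both the plain and the mangled estimate because the thread does not show which one the final audit reports.

```javascript
// Hypothetical threshold, expressed as a ratio (x/1).
const IGNORE_THRESHOLD_RATIO = 0.1;

// content: the script source as a string.
// tokens: array of {type, value}, e.g. from esprima.tokenize(content).
function estimateMinificationSavings(content, tokens) {
  const contentLength = content.length;
  if (contentLength === 0) return null;

  let tokenLength = 0;
  let tokenLengthWithMangling = 0;
  for (const token of tokens) {
    tokenLength += token.value.length;
    // Assume a minifier could shorten every identifier to a single character.
    tokenLengthWithMangling += token.type === 'Identifier' ? 1 : token.value.length;
  }

  // Already-minified code is almost entirely tokens, so the share of
  // whitespace and comments is tiny and there is nothing worth reporting.
  if (1 - tokenLength / contentLength < IGNORE_THRESHOLD_RATIO) return null;

  return {
    wastedRatio: 1 - tokenLength / contentLength,
    wastedRatioWithMangling: 1 - tokenLengthWithMangling / contentLength,
  };
}
```

Everything that is not a token (whitespace, comments) counts as removable, which is why a single tokenize pass is enough and no real minifier has to run.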
below is a table outlining the observed savings,
w/gzip denotes the % savings after accounting for gzip, which is usually lower because minification tends to remove things that compress well
[table flattened in extraction; surviving cells: (74.5 + 88.8) / 2, (31.3 + 63.1) / 2, (49.4% + 75.2%) / 2, (73.3% + 87.9%) / 2, (49.1% + 74.4%) / 2, (67.9% + 82.1%) / 2]
identifying the already minified scripts is as easy as checking if the savings is low as no production minified script had more than 5% savings
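For reference, a "w/gzip" style figure could be computed along these lines. This is a sketch using Node's built-in `zlib`, not the script that produced the numbers above; `savingsPercent` and `compareSavings` are hypothetical names.

```javascript
const zlib = require('zlib');

// Percent of bytes saved going from the original size to the minified size.
function savingsPercent(originalBytes, minifiedBytes) {
  return 100 * (1 - minifiedBytes / originalBytes);
}

// original, minified: script sources as strings.
function compareSavings(original, minified) {
  const raw = savingsPercent(Buffer.byteLength(original), Buffer.byteLength(minified));
  // Gzip both versions first, then compare the compressed sizes.
  const gzipped = savingsPercent(
    zlib.gzipSync(original).length,
    zlib.gzipSync(minified).length
  );
  return {raw, gzipped};
}
```

The `gzipped` number is the one that reflects bytes saved over the wire when the server already compresses responses.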
if this all looks good, I'll go ahead and add tests, remove WIP 👍
relevant code section for estimating minification is
lighthouse/lighthouse-core/audits/byte-efficiency/unminified-javascript.js
Lines 34 to 42 in 321df67