new_audit: mixed-content https upgradeability by christhompson · Pull Request #3953 · GoogleChrome/lighthouse

christhompson · 2017-11-30T01:11:38Z

This introduces a new gatherer and audit that tests which requests a page makes can be directly upgraded to use HTTPS, and which fail when upgraded. The report gives a list of resources which were loaded insecurely in the default pass, their initiating document, and whether they are upgradeable. This implements the audit proposed in #3164.

It is currently implemented as a separate custom config, as it requires an additional pass in which to do the upgrading. The new pass uses the mixed-content gatherer, which intercepts each HTTP request, and sends an HTTP 302 Found redirect to upgrade it to HTTPS. The audit compares the network records for the default pass with the upgraded requests form the mixed-content pass.

I've also included some basic tests for the functionality of both the gatherer and the audit.

This can currently be run using the following invocation:
lighthouse-cli/index.js --chrome-flags="--headless" --config-path=./lighthouse-core/config/mixed-content.js http://www.bbc.com/

The gatherer relies on the Network request interception commands, which may require Canary at the moment since they're currently Experimental.

The audit report looks like this (the "Full URL" column uses Lighthouse's URL type, which reveals the full URL on hover):

christhompson · 2017-11-30T01:14:44Z

One note is that the audit currently has some potential false-negatives, mostly around requests that involve redirects, where it will say a resource is non-upgradeable when it may in fact be upgradeable to HTTPS.

These tend to be most prevalent in what I sometimes refer to as "second-order requests" -- requests made by a subresource loaded by the main page (often advertising libraries). One approach I have tried is to only check and report resources loaded by the main frame (where the initiator is the same as the URL loaded by the gatherer). It's a bit of a tradeoff of complete details about the resources requested and accuracy -- if anyone feels strongly about false-negatives, I can make that change.

adrifelt · 2017-11-30T13:17:09Z

Two ideas about false negatives:

You could put together a whitelist of common 3P inclusions with known HTTPS support and shortcut the check for those inclusions. We'd save the time for doing the second pass for that resource, and also would be able to avoid a false positive.
We could change the UI so that we emphasize the "Yes" resources and never clearly say "No." The feature would be positioned as helping you find the things that are easy to switch and point out the things that are potentially problematic. For the "No" cases, we could say "Maybe," "Needs Investigation," or something smarter that a UX person could think of. :)

estark37 · 2017-12-07T17:34:26Z

@christhompson is this waiting on review/advice from a Lighthouse person? @paulirish could you suggest someone to review? Thanks!

christhompson · 2017-12-07T18:13:33Z

Yes, this should be ready for review.

Also pinging @vinamratasingal who may be able to give more feedback about the UI of the audit report.

Re: False negatives, the whitelist would be a nice addition, but there are still some cases that fall outside that (where the current implementation fails to detect a successful upgrade). We still need to load the entire page a second time, so I don't think we'd be able to short-circuit. I think the "Needs Investigation" change may be the best, especially since it frames the audit more positively overall (maybe combined with a tweak to the main audit explanation text).

vinamratasingal-zz · 2017-12-08T03:28:40Z

Hi, just got the chance to look at this. Some quick feedback (also apologies if this had been decided/discussed on a previous thread, I'm just jumping in):

My understanding that this was going to show up in the Best Practices section of the report. Is that still the plan or is the plan to put in its own section of the report?
It would be great if we could condense the top level description of the category and add some documentation explaining this. Usually we have documentation on developers.google.com for all the audits, not sure if this is something your team has already done/is planning on doing. Suggestion on wording for the description: "These audits check whether mixed content warnings are blocking your page to upgrade to HTTPS. [Learn more]"
Can we separate out the upgradable/insecure requests? I believe we actually show insecure requests (see screenshot below) and the specific URLs if the website is not on HTTPS, so perhaps we just list the requests that we know for sure are upgradable.
Is there a way that we can help users prioritize/organize the different URLs that we show (perhaps ones that are more important to upgrade than others?)
Also, is this audit only applicable for websites already supporting HTTPS? (i.e. if I as a dev haven't even begun migrating my site to HTTPS, this audit isn't really applicable since I need to do that before handling mixed-content warnings?)

christhompson · 2017-12-12T00:26:36Z

My understanding that this was going to show up in the Best Practices section of the report. Is that still the plan or is the plan to put in its own section of the report?

I'd be happy if we could include it in the Best Practices section. I'd thought it was not going to be included in the default config at the moment since it causes an extra page load pass.

It would be great if we could condense the top level description of the category and add some documentation explaining this. Usually we have documentation on developers.google.com for all the audits, not sure if this is something your team has already done/is planning on doing. Suggestion on wording for the description: "These audits check whether mixed content warnings are blocking your page to upgrade to HTTPS. [Learn more]"

I can reduce the text at the top of the audit. I am planning on writing documentation that can be linked to, but was unsure of the process/order for doing that.

Can we separate out the upgradable/insecure requests? I believe we actually show insecure requests (see screenshot below) and the specific URLs if the website is not on HTTPS, so perhaps we just list the requests that we know for sure are upgradable.

This may also help resolve the false-negatives issue. The audit could be framed as "there is no harm in upgrading these resources to HTTPS right now, to maximize your readiness to switch to HTTPS in the future".

Is there a way that we can help users prioritize/organize the different URLs that we show (perhaps ones that are more important to upgrade than others?)

I included the Initiator column to try to add a priority level of some sort -- things initiated by the main page are more under the control of the developer, and so they would be higher priority. Maybe it would make sense to only include the resources that were initiated by the main page, especially given the discussion on the previous bullet.

Also, is this audit only applicable for websites already supporting HTTPS? (i.e. if I as a dev haven't even begun migrating my site to HTTPS, this audit isn't really applicable since I need to do that before handling mixed-content warnings?)

This audit is intended for sites that have not yet migrated to HTTPS. The idea is that they can check whether they can just "flip the switch" to HTTPS by simply changing the relevant URLs to HTTPS, or if they need to contact specific third-parties about how to upgrade. If all URLs are upgradeable, then the developer knows they should have minimal problems when making the switch.

vinamratasingal-zz · 2017-12-13T02:37:51Z

re: Best Practices- ah, I got confused. No, the plan currently is only to expose it via the full config on the CLI, not via Best Practices.

re: documentation- we have a tech writer (Kayce B), and it would be good to reach out to him to get the documentation written.

re: changes- awesome. Do you mind uploading an updated screenshot so I can take another look?

patrickhulce

sorry for the delay here and thanks for the contribution!!

as you note, we'll keep this in a separate config and out of the default until there's a way to do it without adding an additional pass, but overall approach seems reasonable to me.

patrickhulce · 2017-12-12T00:31:09Z

lighthouse-core/audits/mixed-content.js

+      insecureUrls.forEach(record => {
+        const resource = {
+          host: new URL(record.url).hostname,
+          full: record.url,


nit: how about fullUrl for clarity

patrickhulce · 2017-12-12T00:34:30Z

lighthouse-core/audits/mixed-content.js

+      const upgradePassSecureHosts = new Set();
+      upgradedRecords.forEach(record => {
+        upgradePassHosts.add(new URL(record.url).hostname);
+        if (MixedContent.isSecureRecord(record) && record.finished) {


is .finished want we want without checking .failed?

from devtools it seems as though finished will be true even if the request failed, but perhaps in practice we observe something different? :)

Good call. I added another check for !record.failed.

patrickhulce · 2017-12-12T00:35:17Z

lighthouse-core/audits/mixed-content.js

+
+      const totalRecords = defaultRecords.length;
+      const score = 100 *
+          (secureRecords.length + 0.5 * upgradeableResources.length)


nice! I like this scoring rubric :)

patrickhulce · 2017-12-12T00:37:45Z

lighthouse-core/gather/gatherers/mixed-content.js

+ * @return {string}
+ */
+const btoa = function(str) {
+  return Buffer.from(str, 'utf8').toString('base64');


to be honest I find this line more readable than good ol' btoa and since we only use it once I'd vote to nix

at the least maybe we can say encodeAsBase64 or something

Yeah, that makes sense. I've pulled this inline here.

patrickhulce · 2017-12-12T00:39:18Z

lighthouse-core/test/gather/gatherers/mixed-content-test.js

+  return Buffer.from(str, 'base64').toString('utf8');
+};
+
+describe('btoa', () =>{


feels a bit weird to test a function that gets exercised only in the tests, did you mean to expose the btoa from the gatherer file?

Now that I just inlined the b64 function in the other file, I did the same here and got rid of the test. In hindsight don't think it's a particularly useful path to test (and some of the other tests should break if something strange happens to it).

christhompson · 2017-12-14T00:12:35Z

Thanks for the comments. I've updated the PR to address the comments so far.

I've simplified the audit report quite a bit by restricting it to list resources that are upgradeable that are initiated by the main page only. Here's what it looks like currently:

paulirish

What you've made here looks great. Really like the original UX and the latest revised stuff.

Since this currently relies on interception, we are a little blocked on getting https://bugs.chromium.org/p/chromium/issues/detail?id=764505 sorted out. But since this isn't landing in the default config, we may have some options.

Really nice job putting this one together.

vinamratasingal-zz · 2017-12-20T18:20:19Z

This looks great, thanks for working through with us Chris! One last question before giving the LGTM on the UX:

Why are we talking about the number of insecure requests total? Should we just focus on the number of upgradable requests instead?

christhompson · 2017-12-20T19:40:20Z

@vinamratasingal Good catch. With the latest changes I didn't really update the main string. The scoring of the audit will still cause failures on any insecure resources being loaded, but given the new output of the audit, I think something like the following would be good:

Failure text: "Some resources loaded are insecure: XX requests could be upgraded to HTTPS."
Passing text: "All resources loaded are secure."

I can push a quick update making that change.

@paulirish I was a little worried the Experimental nature of request interception would be a pain. I'd been testing the custom config using --headless so I guess I got lucky and never ran into the issue because of that. Let me know what you think the best path going forward is for adding the audit or working around this bug.

This updates the audit failure text to only reference the number of upgradeable resources. It also simplifies how resources are tracked now that we don't output details about non-upgradeable resources. It also only counts upgradeable resources that were initiated by the main page for the count and score, so that the count of upgradeable resources in the text matches the number of items in the audit report table.

vinamratasingal-zz · 2017-12-20T23:18:39Z

New strings LGTM. Mind publishing an updated screenshot so I can take a final pass and LGTM?

christhompson · 2017-12-20T23:48:56Z

Here's a screenshot of the latest version of the report:

vinamratasingal-zz · 2017-12-20T23:50:56Z

LGTM

christhompson · 2018-01-11T17:46:32Z

Pinging this thread. @paulirish As you mentioned, this currently only works with the Chrome headless flag. Would it still be possible to land this since it isn't included in any default config/category? This would then mostly be a complication to the documentation for how to run the audit.

paulirish · 2018-01-12T02:29:18Z

@christhompson yup we can definitely land this. I was playing with this audit today and had a few ideas. Instead of leaving review comments I went ahead and made the changes: christhompson#2
WDYT?

Also now that the Table and URL renderers suck less, I don't think we need the extra Hostname column in the report. Does that work for you? Or would you rather keep it explicit so they're easier to scan. I could go either way.

If this all works, for you, lets land it. Then we can write docs on 1) how to run this audit and 2) what the audit is about.

Cheers and thanks for your patience on this one! really appreciate it.

Review edits

googlebot · 2018-01-16T00:49:53Z

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this State. It's up to you to confirm consent of the commit author(s) and merge this pull request when appropriate.

christhompson · 2018-01-16T00:52:13Z

I like the cleanliness of the new column style in your changes -- it also fixes the minor annoyance of the left/right aligned columns in the old table. Adding the is-on-http audit to the config is also a good idea -- thanks for adding that.

paulirish

lgtm!

Thanks again for your patience on this PR. Apologies for the delay.

paulirish · 2018-01-16T19:18:52Z

Landed! 🎉 🎉

Some next steps:

readme docs on how to run this audit
audit docs for what this audit is about
an npm script to make running this easier? e.g. yarn run mixedcontent http://www.bbc.com

vinamratasingal-zz · 2018-01-16T22:24:28Z

Chris- are you going to take on the work of the docs? Or do we need help from @kaycebasques here? (Kayce does all of our tech docs on developers.google.com)

christhompson · 2018-01-16T22:27:23Z

I have a draft of some docs for this -- I'd be happy to coordinate with Kayce to get them in proper shape.

vinamratasingal-zz · 2018-01-16T22:28:05Z

Sounds good- let me know if you need help with this. Thanks for your patience Chris! :)

christhompson added 7 commits November 28, 2017 10:18

Mixed content upgrade check using interception

a7bbe08

Functional version of audit

61d0426

Fix FP from caching, fix insecure record count

c1c3b2a

Cleanup and add tests

05db0db

Linter fixes

834f871

More cleanup and test fixes

a0597ca

Fix request count

52610f8

christhompson requested review from brendankenny, patrickhulce and paulirish as code owners November 30, 2017 01:11

christhompson changed the title ~~new-audit: mixed-content https upgradeability~~ new_audit: mixed-content https upgradeability Nov 30, 2017

vinamratasingal-zz added this to the Sprint Seis: December 11 - 22 milestone Dec 11, 2017

patrickhulce reviewed Dec 13, 2017

View reviewed changes

patrickhulce added the waiting4committer label Dec 13, 2017

christhompson added 2 commits December 13, 2017 15:01

Simplify audit text and columns

48eaa72

Address review comments

8e8788e

patrickhulce added waiting4reviewer and removed waiting4committer labels Dec 14, 2017

paulirish reviewed Dec 20, 2017

View reviewed changes

paulirish modified the milestones: Sprint Seis: December 11 - 22, Sprint Siete: January 8-19 Jan 8, 2018

paulirish added 3 commits January 11, 2018 18:22

refactor loop

d2a0cfd

Also run the HTTPS audit in this config

0ad0577

Wording tweaks. Only show URL column

2cd3678

paulirish mentioned this pull request Jan 12, 2018

A few edits christhompson/lighthouse#2

Merged

paulirish and others added 2 commits January 12, 2018 13:44

shorter displayValue, given the new audit title

9fff186

Merge pull request #2 from paulirish/mixedcontent-edits

69a60be

Review edits

paulirish approved these changes Jan 16, 2018

View reviewed changes

paulirish merged commit 1df4565 into GoogleChrome:master Jan 16, 2018

christhompson mentioned this pull request Jan 24, 2018

cli(yarn): Add mixed-content yarn script #4344

Merged

christhompson mentioned this pull request Feb 7, 2018

core(help-text-fix): Add link to docs for mixed-content audit #4444

Merged

paulirish removed the waiting4reviewer label Mar 6, 2018

connorjclark mentioned this pull request Apr 20, 2020

Alignment with new DevTools Mixed Content issues #10615

Closed

Conversation

christhompson commented Nov 30, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

christhompson commented Nov 30, 2017

Uh oh!

adrifelt commented Nov 30, 2017

Uh oh!

estark37 commented Dec 7, 2017

Uh oh!

christhompson commented Dec 7, 2017

Uh oh!

vinamratasingal-zz commented Dec 8, 2017

Uh oh!

christhompson commented Dec 12, 2017

Uh oh!

vinamratasingal-zz commented Dec 13, 2017

Uh oh!

patrickhulce left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

christhompson commented Dec 14, 2017

Uh oh!

paulirish left a comment

Choose a reason for hiding this comment

Uh oh!

vinamratasingal-zz commented Dec 20, 2017

Uh oh!

christhompson commented Dec 20, 2017

Uh oh!

vinamratasingal-zz commented Dec 20, 2017

Uh oh!

christhompson commented Dec 20, 2017

Uh oh!

vinamratasingal-zz commented Dec 20, 2017

Uh oh!

christhompson commented Jan 11, 2018

Uh oh!

paulirish commented Jan 12, 2018

Uh oh!

googlebot commented Jan 16, 2018

Uh oh!

christhompson commented Jan 16, 2018

Uh oh!

paulirish left a comment

Choose a reason for hiding this comment

Uh oh!

paulirish commented Jan 16, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vinamratasingal-zz commented Jan 16, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

christhompson commented Jan 16, 2018

Uh oh!

vinamratasingal-zz commented Jan 16, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

christhompson commented Nov 30, 2017 •

edited

Loading

paulirish commented Jan 16, 2018 •

edited

Loading

vinamratasingal-zz commented Jan 16, 2018 •

edited

Loading