@crazy-max
Member

While working on #15151, I found that search engine crawling was enabled for staging and Netlify previews (pull requests): https://www.google.com/search?q=site%3Adocs-stage.docker.com

Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>

@netlify

netlify bot commented Jul 19, 2022

Deploy Preview for docsdocker ready!

Built without sensitive environment variables

Name Link
🔨 Latest commit ef40cff
🔍 Latest deploy log https://app.netlify.com/sites/docsdocker/deploys/62d69f89e4d14100080d3f83
😎 Deploy Preview https://deploy-preview-15153--docsdocker.netlify.app

@thaJeztah
Member

I think we set a noindex in the page's head on staging and preview:

Screenshot 2022-07-19 at 09 21 41

Looks like we picked the sitemap for that (maybe we should have something more explicit?) https://github.com/docker/docker.github.io/blob/38fec0d159134a9af7e8a3c226057a114b0622be/_includes/head.html#L33-L35

{%- if jekyll.environment == 'production' -%}
User-agent: *

# Docker Engine archives
Member

Once we have proper redirects, we can probably remove all of these. I think they were added when we were still hosting the archives, but they also helped prevent Google from indexing archived URLs it found elsewhere, where the "old" URL now serves "new" content (not 100% sure though).

robots.txt Outdated
Comment on lines 5 to 7
This liquid template is necessary to only allow search engine crawling in
production environment. We don't want staging or long-lived previews on Netlify
to be indexed by search engines.
Member

Should we output this as a regular comment in robots.txt if indexing is disabled? For example, something like:

# Disable all indexing on staging websites and Netlify previews to prevent
# them showing up in search results.

robots.txt Outdated
Comment on lines 39 to 42
{%- else -%}
User-agent: *
Disallow: /
{%- endif -%}
Member

Not sure if anything checks for that, but it looks like having {%- and -%} on both sides leaves no newline at the end of the file; perhaps one of them needs to be removed.
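
One way to sanity-check the non-production branch above is a small sketch in Python using the standard library's urllib.robotparser. The render_robots_txt helper, the /archive-example/ path, and the sample URLs are hypothetical stand-ins for the Jekyll template, not the site's actual rules. The check only reflects how Python's parser reads the file, not how any particular search engine does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical stand-in for the Liquid template: production gets the normal
# rules (the archive path is made up), everything else gets a blanket
# Disallow preceded by a plain robots.txt comment.
PRODUCTION_RULES = """\
User-agent: *
Disallow: /archive-example/
"""

NON_PRODUCTION_RULES = """\
# Disable all indexing on staging websites and Netlify previews to prevent
# them showing up in search results.
User-agent: *
Disallow: /
"""


def render_robots_txt(environment: str) -> str:
    """Return the robots.txt body for the given (assumed) environment name."""
    if environment == "production":
        return PRODUCTION_RULES
    return NON_PRODUCTION_RULES


def is_crawlable(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    staging = render_robots_txt("development")
    url = "https://docs-stage.docker.com/engine/"
    print(is_crawlable(staging, url))
    # Dropping the final newline does not change what this parser sees.
    print(is_crawlable(staging.rstrip("\n"), url))
```

The "#" comment lines are skipped by this parser, so emitting the explanation as a regular robots.txt comment does not affect the rules that follow it.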

@crazy-max
Member Author

Looks like we picked the sitemap for that (maybe we should have something more explicit?)

Hmm yeah, I think we should use {%- if jekyll.environment == 'production' -%} as the condition.

<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
-{%- if page.sitemap == false or site.GH_ENV == "gh_pages" %}
+{%- if jekyll.environment != 'production' %}
Member

We probably need to keep the page.sitemap check as well; it's used for pages that have not been removed but are no longer present in the TOC (and/or shouldn't be indexed, such as pages about deprecated things).

Member Author

Oh yes indeed, fixed.
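
The condition under discussion can be read as a small truth table. A sketch in Python, assuming the fix ORs the per-page sitemap opt-out with the environment check; the exact final condition is not shown in this thread, and should_noindex is a hypothetical name.

```python
from typing import Optional


def should_noindex(page_sitemap: Optional[bool], environment: str) -> bool:
    """Mirror the assumed head.html condition:
    page.sitemap == false or jekyll.environment != 'production'.

    page_sitemap is None when the page does not set `sitemap` at all.
    """
    return page_sitemap is False or environment != "production"


if __name__ == "__main__":
    # Print every combination of page setting and build environment.
    for sitemap in (None, True, False):
        for env in ("production", "development"):
            result = should_noindex(sitemap, env)
            print(f"sitemap={sitemap!s:5} env={env:11} noindex={result}")
```

Under this assumption, only pages that keep the default sitemap setting and are built for production remain indexable.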

Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
Member

@thaJeztah left a comment

LGTM, thanks!

@thaJeztah merged commit f3be551 into docker:master on Jul 19, 2022
@crazy-max deleted the fix-robots-txt branch on July 19, 2022 16:52