Allow search engines to crawl only on production #15153
✅ Deploy Preview for docsdocker ready! Built without sensitive environment variables
I think we set a Looks like we picked the |
{%- if jekyll.environment == 'production' -%}
User-agent: *

# Docker Engine archives
Once we have proper redirects, we can probably remove all of these (I think these were added when we were still hosting archives, but they also helped to prevent Google indexing archived URLs it found elsewhere with "new" content, but using the "old" URL - not 100% sure though).
robots.txt
Outdated
This liquid template is necessary to only allow search engine crawling in
production environment. We don't want staging or long-lived previews on Netlify
to be indexed by search engines.
Should we output this as a regular comment in robots.txt if indexing is disabled? e.g. something like;
# Disable all indexing on staging websites and Netlify previews to prevent
# them showing up in search results.
robots.txt
Outdated
{%- else -%}
User-agent: *
Disallow: /
{%- endif -%}
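To illustrate what the two branches mean for a crawler, here is a minimal sketch using Python's standard `urllib.robotparser`. The `can_crawl` helper, the `docs.docker.com` production host, and the `/hypothetical-archive/` path are assumptions made for the example; the real production rules are not shown in this thread.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# What the else-branch renders on staging and Netlify previews.
STAGING_ROBOTS = "User-agent: *\nDisallow: /\n"

# Hypothetical production output: only an illustrative archive path is blocked.
PRODUCTION_ROBOTS = "User-agent: *\nDisallow: /hypothetical-archive/\n"

print(can_crawl(STAGING_ROBOTS, "https://docs-stage.docker.com/engine/"))
print(can_crawl(PRODUCTION_ROBOTS, "https://docs.docker.com/engine/"))
print(can_crawl(PRODUCTION_ROBOTS, "https://docs.docker.com/hypothetical-archive/v1/"))
```

Under these assumptions the staging rules reject every path, while the production rules reject only the listed prefix.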
Not sure if it's checking for that, but it looks like the {%- and -%} on both sides cause no newline to be present at the end; perhaps one of them needs to be removed.
Hum yeah I think we should use |
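The trailing-newline concern can be sketched with a toy model of Liquid's trim markers. This is not a Liquid renderer: `apply_trim_markers` is a hypothetical helper that only mimics the documented behaviour that `{%-` strips whitespace before a tag and `-%}` strips whitespace after it. The `relaxed` variant is one possible fix in the spirit of the suggestion above, not necessarily the change that was committed.

```python
import re


def apply_trim_markers(template: str) -> str:
    """Toy model of Liquid trim markers; this is not a Liquid renderer."""
    # "{%-" also removes the whitespace (newlines included) before the tag.
    template = re.sub(r"\s*\{%-", "{%", template)
    # "-%}" also removes the whitespace (newlines included) after the tag.
    template = re.sub(r"-%\}\s*", "%}", template)
    # Drop the tags themselves so only the literal text remains.
    return re.sub(r"\{%.*?%\}", "", template)


# Tail of the template as quoted in the review: both tags trim on both sides.
trimmed = "{%- else -%}\nUser-agent: *\nDisallow: /\n{%- endif -%}\n"

# Hypothetical variant: the closing tag no longer trims to its left.
relaxed = "{%- else -%}\nUser-agent: *\nDisallow: /\n{% endif -%}\n"

print(repr(apply_trim_markers(trimmed)))
print(repr(apply_trim_markers(relaxed)))
```

In this model the first result ends directly after `Disallow: /`, while the second keeps the newline that precedes the closing tag.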
469e035 to bda0fcf
bda0fcf to 006b24c
_includes/head.html
Outdated
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
{%- if page.sitemap == false or site.GH_ENV == "gh_pages" %}
{%- if jekyll.environment != 'production' %}
We probably need to keep the page.sitemap as well; it's used for pages that are not removed, but no longer present in the TOC (and/or shouldn't be indexed, such as pages about deprecated things)
Oh yes indeed, fixed.
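As a rough sketch of the combined check discussed here, the following Python function models a condition that keeps the `page.sitemap` test alongside the new environment test. The exact expression that was committed is not shown in this thread, so the function name and the way the two tests are joined are assumptions; `page_sitemap` is modeled as `None` when a page's front matter does not set it.

```python
from typing import Optional


def should_block_indexing(page_sitemap: Optional[bool], environment: str) -> bool:
    """Assumed condition: block indexing for opted-out pages or non-production builds."""
    # An explicit `sitemap: false` opts a single page out, even in production.
    opted_out = page_sitemap is False
    # Any build that is not production (staging, previews) is never indexed.
    non_production = environment != "production"
    return opted_out or non_production


print(should_block_indexing(None, "production"))
print(should_block_indexing(False, "production"))
print(should_block_indexing(None, "development"))
```

Only a production build of a page that has not opted out passes the check; `development` is Jekyll's default environment name and stands in for any non-production build.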
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
006b24c to ef40cff
thaJeztah left a comment
LGTM, thanks!

While working on #15151, I found that search engine crawling was enabled for staging and Netlify previews (pull requests): https://www.google.com/search?q=site%3Adocs-stage.docker.com
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>