Walkthrough

Rewrote robots.txt to explicitly allow Googlebot, AdsBot-Google, and Google-InspectionTool while disallowing all other user agents. Removed the previous global block and an admonition comment. Added a catch-all disallow rule at the end.

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 0
🧹 Nitpick comments (5)
public/robots.txt (5)
3-6: Use Allow: / for clarity; an empty Disallow is valid but less explicit.

An empty Disallow is interpreted as “allow everything,” but being explicit improves readability, avoids lint warnings, and is consistent with common examples.
Apply this small edit:

```diff
-User-agent: Googlebot
-Disallow:
+User-agent: Googlebot
+Allow: /
```
7-10: Cover the AdsBot variants that Google actually uses.

Google Ads landing page checks hit your site with multiple user-agent tokens (e.g., AdsBot-Google, AdsBot-Google-Mobile, AdsBot-Google-Mobile-Apps). If you intend to fully allow AdsBot, include these variants. Consider keeping AdsBot rules separate from general Googlebot rules so you can change them independently later.
Proposed adjustment:

```diff
-# Allow AdsBot (for Google Ads landing page checks)
-User-agent: AdsBot-Google
-Disallow:
+# Allow AdsBot (for Google Ads landing page checks)
+User-agent: AdsBot-Google
+Allow: /
+
+# Also allow AdsBot mobile variants
+User-agent: AdsBot-Google-Mobile
+Allow: /
+
+User-agent: AdsBot-Google-Mobile-Apps
+Allow: /
```

Optional (only if you run AdSense): also allow the AdSense crawler:

```diff
+User-agent: Mediapartners-Google
+Allow: /
```
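As a quick sanity check, Python's standard-library urllib.robotparser can evaluate which user agents a ruleset admits before you deploy it. The sketch below runs against a hypothetical inline ruleset combining the suggestions above; the ROBOTS string and is_allowed helper are illustrative, not part of the PR:

```python
from urllib import robotparser

# Hypothetical ruleset mirroring the suggestions in this review.
ROBOTS = """\
User-agent: Googlebot
Allow: /

User-agent: AdsBot-Google
Allow: /

User-agent: AdsBot-Google-Mobile
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(agent: str, path: str = "/") -> bool:
    # parse() accepts the file as a list of lines; can_fetch() then
    # answers "may this user agent fetch this path?" per the rules.
    parser = robotparser.RobotFileParser()
    parser.parse(ROBOTS.splitlines())
    return parser.can_fetch(agent, path)


print(is_allowed("Googlebot"))             # allowed by its own group
print(is_allowed("AdsBot-Google-Mobile"))  # matches the AdsBot token
print(is_allowed("bingbot"))               # falls through to the * block
```

Note that robotparser matches user-agent tokens as substrings, so the "AdsBot-Google" group also covers "AdsBot-Google-Mobile"; keeping the explicit variants still documents intent for crawlers with stricter matching.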
11-14: LGTM; consider grouping with Googlebot and making the rule explicit.

Allowing Google-InspectionTool is helpful for Search Console tests. You can either keep it separate or group multiple Google UAs that share the same rule to reduce duplication. Also prefer “Allow: /” for explicitness.
Minimal tweak:

```diff
-User-agent: Google-InspectionTool
-Disallow:
+User-agent: Google-InspectionTool
+Allow: /
```

Optional grouping (single group for identical rules):

```diff
-User-agent: Googlebot
-Allow: /
-
-User-agent: Google-InspectionTool
-Allow: /
+User-agent: Googlebot
+User-agent: Google-InspectionTool
+Allow: /
```

If you want to future-proof for other Google-operated crawlers, you may also allow “GoogleOther”:

```diff
+User-agent: GoogleOther
+Allow: /
```
15-17: Catch‑all Disallow will block non‑Google bots, including social/link preview crawlers; verify intent (especially on staging).

With this rule, Bing, DuckDuckBot, Applebot, and social preview fetchers (facebookexternalhit/Facebot, Twitterbot, LinkedInBot, Slackbot, etc.) won’t fetch your pages. That can:
- Break/unfurl link previews in social apps and chat.
- Prevent non-Google search engines from indexing/crawling.
- Interfere with uptime monitors and analytics crawlers that respect robots.
If “Google-only” is truly intended, keep it. If not, allow the common preview bots while still blocking everyone else.
Alternative that preserves “Google-only” crawling for search but enables previews:
```diff
+# Allow common social/link preview bots
+User-agent: facebookexternalhit
+User-agent: Facebot
+User-agent: Twitterbot
+User-agent: LinkedInBot
+User-agent: Slackbot
+User-agent: TelegramBot
+User-agent: WhatsApp
+Allow: /
+
 # Block everyone
 User-agent: *
 Disallow: /
```

Environment note: This PR targets the “staging” branch. If staging is publicly reachable, are we OK with Google crawling it? Many teams block staging (to avoid duplicate content) and allow only production. If needed, I can help set up environment-specific robots.txt (e.g., a dynamic robots endpoint or separate artifacts per deploy target).
3-17: Add a Sitemap directive to help crawlers discover content.

Including a Sitemap line is harmless even when access is restricted and helps Google find URLs more reliably.
Add at the top (replace with the correct absolute URL):

```diff
+# Sitemap
+Sitemap: https://<your-domain>/sitemap.xml
+
 # Allow Googlebot
 User-agent: Googlebot
 Allow: /
```

If you don’t have a sitemap, I can generate one or wire this to your build pipeline.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
public/robots.txt (1 hunks)