feat: return 410 status code on missing JS assets #2542

Merged
barjin merged 3 commits into master from docs/410-on-stale-js-chunks
May 18, 2026

Conversation

@barjin
Member

@barjin barjin commented May 18, 2026

As per the findings of the SEO team:

The 404 JS files (e.g. docs.apify.com/assets/js/3b91d6c4.c7552f0b.js) are auto-generated during deployment. Every time the site is updated and published, these JS files are generated with new random names and the old ones are deleted. Googlebot had cached the old filenames and kept trying to fetch them after they were already gone, causing the 404s.
Why we should solve this: Googlebot is wasting its crawl budget on docs.apify.com by requesting JS files that no longer exist, which leads to an 8% 404 rate, instead of crawling actual content pages.

Returning 410 Gone instead of 404 Not Found for JS assets that were replaced during a redeploy should send the consumer a stronger signal that the asset path is no longer valid.

@apify-service-account
Contributor

apify-service-account commented May 18, 2026

🗑️ Preview for this PR was deleted.

@github-actions github-actions Bot added this to the 141st sprint - Tooling team milestone May 18, 2026
@github-actions github-actions Bot added the t-tooling Issues with this label are in the ownership of the tooling team. label May 18, 2026
@barjin barjin added the adhoc Ad-hoc unplanned task added during the sprint. label May 18, 2026
Contributor

Copilot AI left a comment

Pull request overview

Updates the Nginx reverse-proxy configuration for the Apify docs sites to return 410 Gone (instead of 404 Not Found) when hashed Docusaurus JS/CSS assets are missing, helping crawlers and caches stop requesting stale bundles after deployments.

Changes:

  • Adds a shared named location that returns HTTP 410 and maps upstream 404s for hashed assets to it.
  • Introduces dedicated regex location blocks for /assets/js/*.js and /assets/css/*.css in the main docs site and each proxied sub-site (/sdk/*, /api/client/*, /cli).
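
The two changes above could look roughly like the following nginx fragment. This is a minimal sketch, not the merged configuration: the server name, the upstream hosts, the exact regexes, and the `/cli/assets/...` path layout are all assumptions made for illustration. It relies only on standard nginx directives (`proxy_intercept_errors`, `error_page` with a named location, and `return`).

```nginx
# Sketch only: host names, upstreams and regexes are hypothetical,
# not taken from the actual PR diff.
server {
    listen 80;
    server_name docs.example.com;  # hypothetical host

    # Shared named location: any request routed here is answered with 410 Gone.
    location @gone {
        return 410;
    }

    # Hashed JS bundles of the main docs site.
    location ~ ^/assets/js/[^/]+\.js$ {
        proxy_pass https://docs-upstream.example.com;  # hypothetical upstream
        proxy_intercept_errors on;  # let nginx process upstream error statuses
        error_page 404 = @gone;     # upstream 404 -> @gone -> 410
    }

    # Hashed CSS bundles of the main docs site.
    location ~ ^/assets/css/[^/]+\.css$ {
        proxy_pass https://docs-upstream.example.com;
        proxy_intercept_errors on;
        error_page 404 = @gone;
    }

    # One proxied sub-site (assumed asset layout); /sdk/* and /api/client/*
    # would get equivalent blocks pointing at their own upstreams.
    location ~ ^/cli/assets/js/[^/]+\.js$ {
        proxy_pass https://cli-upstream.example.com;  # hypothetical upstream
        proxy_intercept_errors on;
        error_page 404 = @gone;
    }

    # Everything else is proxied unchanged, so missing pages still return 404.
    location / {
        proxy_pass https://docs-upstream.example.com;
    }
}
```

Scoping the interception to the hashed asset paths keeps ordinary missing pages on 404, so only stale bundles are reported as permanently gone.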

@barjin barjin requested review from B4nan, TC-MO, marcel-rbro and vladfrangu and removed request for vladfrangu May 18, 2026 10:02
@barjin barjin merged commit 4abc10a into master May 18, 2026
20 checks passed
@barjin barjin deleted the docs/410-on-stale-js-chunks branch May 18, 2026 10:49
@barjin
Member Author

barjin commented May 18, 2026

For the record, this doesn't seem to have helped (see, e.g., the response to https://docs.apify.com/assets/js/abc.js).

@TC-MO
Contributor

TC-MO commented May 18, 2026

I think nginx takes a while to start working properly (or at least it did for the regular 301s I was doing), so maybe it's the same case here? I'll observe this for the next day or two and see how it behaves.

@B4nan
Member

B4nan commented May 18, 2026

nginx is fast, but there is a CDN in front of it, which takes a bit. last time I was battling something here, I tried 7 different variants before surrendering, only to see the other day that it actually worked.

@barjin
Member Author

barjin commented May 20, 2026

Okay, confirmed now.

image

Labels

  • adhoc: Ad-hoc unplanned task added during the sprint.
  • t-tooling: Issues with this label are in the ownership of the tooling team.

5 participants