chore: prepare Algolia migration #5632

Merged
slorber merged 2 commits into main from slorber/prepare-algolia-migration
Oct 1, 2021

Conversation

@slorber
Collaborator

@slorber slorber commented Oct 1, 2021

Motivation

Algolia is migrating to a new system where each app has its own appId, instead of all apps sharing the same one.

This lets them provide a dashboard for each app, with access to the index content, a query UI, the ability to trigger crawls manually, and more. This is very helpful for users debugging their search issues.

A progressive batched rollout has started, with the Docusaurus site as one of the first migrated sites.

This PR is just some initial doc/config changes we can do today.
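
For illustration, a per-app appId would typically show up in the Algolia section of the site's theme config. The sketch below is a minimal one and not the actual diff of this PR: the interface is a local stand-in for the Docusaurus types, and every value is a placeholder to be replaced with the ones from your own Algolia application.

```typescript
// Local stand-in for the `themeConfig.algolia` fields used here (assumption:
// a real project would rely on the types shipped with Docusaurus instead).
interface AlgoliaThemeConfig {
  appId: string; // now specific to each app instead of shared
  apiKey: string; // public search-only API key
  indexName: string;
  contextualSearch?: boolean;
}

// Placeholder values only; none of these come from the PR.
const algolia: AlgoliaThemeConfig = {
  appId: "YOUR_APP_ID",
  apiKey: "YOUR_SEARCH_API_KEY",
  indexName: "YOUR_INDEX_NAME",
  contextualSearch: true,
};

// In a Docusaurus site this object would sit under `themeConfig` in the site config.
const themeConfig = { algolia };
```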

We'll write a blog post and update the doc later.

Have you read the Contributing Guidelines on pull requests?

yes

Test Plan

preview

@slorber slorber added the pr: maintenance This PR does not produce any behavior differences to end users when upgrading. label Oct 1, 2021
@slorber slorber requested a review from lex111 as a code owner October 1, 2021 13:03
@netlify

netlify bot commented Oct 1, 2021

✔️ [V2]

🔨 Explore the source changes: d66454f

🔍 Inspect the deploy log: https://app.netlify.com/sites/docusaurus-2/deploys/61571059f54ad100080f29f5

😎 Browse the preview: https://deploy-preview-5632--docusaurus-2.netlify.app

@github-actions

github-actions bot commented Oct 1, 2021

⚡️ Lighthouse report for the changes in this PR:

Category Score
🟠 Performance 83
🟢 Accessibility 98
🟢 Best practices 100
🟢 SEO 100
🟢 PWA 95

Lighthouse ran on https://deploy-preview-5632--docusaurus-2.netlify.app/

@github-actions

github-actions bot commented Oct 1, 2021

Size Change: +2 B (0%)

Total Size: 835 kB

ℹ️ View Unchanged
Filename Size Change
website/.docusaurus/globalData.json 38.5 kB 0 B
website/build/assets/css/styles.********.css 94 kB 0 B
website/build/assets/js/main.********.js 421 kB 0 B
website/build/blog/2017/12/14/introducing-docusaurus/index.html 67 kB 0 B
website/build/blog/index.html 38.1 kB 0 B
website/build/docs/index.html 44.6 kB +1 B (0%)
website/build/docs/installation/index.html 52.8 kB +1 B (0%)
website/build/index.html 30.8 kB 0 B
website/build/tests/docs/index.html 25.5 kB 0 B
website/build/tests/docs/standalone/index.html 22.9 kB 0 B

compressed-size-action

@shortcuts
Contributor

Love it! Don't hesitate to invite @lex111 in your new app :D

Co-authored-by: Clément Vannicatte <20689156+shortcuts@users.noreply.github.com>
@facebook-github-bot facebook-github-bot added the CLA Signed Signed Facebook CLA label Oct 1, 2021
@slorber slorber merged commit afff053 into main Oct 1, 2021
@slorber slorber deleted the slorber/prepare-algolia-migration branch October 1, 2021 13:58
@lex111
Contributor

lex111 commented Oct 1, 2021

I like this idea, just accepted the invitation, thanks!

@shortcuts
Contributor

While the migration has not officially started yet, you can already browse some docs for the new infrastructure here: https://deploy-preview-1048--eloquent-haibt-041563.netlify.app/

@krillboi

Hi @slorber,

I am currently trying to migrate to the new Algolia configuration but am having some trouble getting the crawler to work.

It looks like you have successfully migrated. Is it possible to get a copy of your new crawler configuration such that I can see if mine looks similar to yours?

I'm currently on beta-4 if that has any importance.

@slorber
Collaborator Author

slorber commented Oct 27, 2021

@krillboi the best option is to get in touch with @shortcuts. I think they had some infra issues recently on large docs sites, and it may be affecting yours?

@krillboi

> @krillboi the best option is to get in touch with @shortcuts. I think they had some infra issues recently on large docs sites, and it may be affecting yours?

I contacted Algolia support and they said something about configuring canonical URLs but I am really unsure what that actually entails as we haven't really changed anything, which is why I thought having a nearly identical config to yours could help, if you already successfully converted your old config to the new setup.

@shortcuts
Contributor

> I contacted Algolia support and they said something about configuring canonical URLs but I am really unsure what that actually entails as we haven't really changed anything, which is why I thought having a nearly identical config to yours could help, if you already successfully converted your old config to the new setup.

Hey, we provide templates for website generators, so if your previous config was using the Docusaurus v2 one, it should still be the case.

You can find our templates here and make sure yours matches: https://docsearch.algolia.com/docs/templates/

@krillboi

> > I contacted Algolia support and they said something about configuring canonical URLs but I am really unsure what that actually entails as we haven't really changed anything, which is why I thought having a nearly identical config to yours could help, if you already successfully converted your old config to the new setup.
>
> Hey, we provide templates for website generators, so if your previous config was using the Docusaurus v2 one, it should still be the case.
>
> You can find our templates here and make sure yours matches: https://docsearch.algolia.com/docs/templates/

Thanks I'll check this out!

@krillboi

krillboi commented Oct 27, 2021

@shortcuts

Bringing my config a lot closer to the template seems to have helped me get through, though I am seeing a lot of "ignored because of redirects" entries in Monitoring. Not sure why. Also it seems to get stuck at 425 URLs for me, which might be the infra issue @slorber was referring to? We do have a very large site.

Anyways, I guess I should direct this at Algolia support now instead :)

@shortcuts
Contributor

(DocSearch support requests get redirected to me anyway, so it's all good :D)

> though I am seeing a lot of "ignored because of redirects" entries in Monitoring

It's usually because our crawler found links both with and without trailing slashes; only one of them will be handled anyway!
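
To make the trailing-slash point concrete, here is a small sketch for spotting such pairs in a list of links, for example one exported from a sitemap. The helper names and example URLs are hypothetical, and the normalization is deliberately naive (it does not handle query strings or fragments); it is not how the crawler itself detects redirects.

```typescript
// Naive normalization: drop trailing slashes from the end of the URL string.
function stripTrailingSlash(url: string): string {
  return url.replace(/\/+$/, "");
}

// Group links by normalized form and keep only groups with several variants.
function findSlashVariants(urls: string[]): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const url of urls) {
    const key = stripTrailingSlash(url);
    const variants = groups[key] || (groups[key] = []);
    if (variants.indexOf(url) === -1) variants.push(url);
  }
  const duplicates: Record<string, string[]> = {};
  for (const key of Object.keys(groups)) {
    if (groups[key].length > 1) duplicates[key] = groups[key];
  }
  return duplicates;
}

// Hypothetical links: the first two differ only by the trailing slash.
const links = [
  "https://example.com/docs/intro",
  "https://example.com/docs/intro/",
  "https://example.com/docs/install/",
];
const slashVariants = findSlashVariants(links);
```

If pairs like these show up, making the site's links consistent may reduce the redirect entries; Docusaurus exposes a trailingSlash site config option that could help with that.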

> Also it seems to get stuck at 425 URLs for me, which might be the infra issue @slorber was referring to? We do have a very large site.

Indeed!

The issue is related to really large pages timing out and retrying, which slows the crawl process a lot (it can last a few hours). If you know which URLs are affected, we recommend excluding them from the crawl (add them to exclusionPatterns) and restarting the crawl job.
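
As a rough sketch of that workaround, the helper below adds problem pages to the exclusionPatterns of a crawler config. The interface is a partial, local stand-in for the real crawler configuration, the helper name and URLs are hypothetical, and the wildcard syntax in the second pattern is an assumption to check against the crawler documentation.

```typescript
// Partial, local stand-in for the crawler config; the real one has more fields.
interface CrawlerConfigSketch {
  startUrls: string[];
  exclusionPatterns: string[];
}

// Return a copy of the config with extra patterns excluded, skipping duplicates.
function excludeFromCrawl(
  config: CrawlerConfigSketch,
  patterns: string[],
): CrawlerConfigSketch {
  const merged = config.exclusionPatterns.slice();
  for (const pattern of patterns) {
    if (merged.indexOf(pattern) === -1) merged.push(pattern);
  }
  return { startUrls: config.startUrls, exclusionPatterns: merged };
}

const baseConfig: CrawlerConfigSketch = {
  startUrls: ["https://example.com/"],
  exclusionPatterns: [],
};

// Hypothetical URLs of very large pages that time out during the crawl.
const updatedConfig = excludeFromCrawl(baseConfig, [
  "https://example.com/docs/api/huge-reference",
  "https://example.com/docs/changelog/**", // wildcard syntax is an assumption
]);
```

The updated exclusionPatterns would then be pasted into the crawler configuration before restarting the crawl job.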

We are deploying a fix in our dev env to run more tests and will deploy the real fix in production by tomorrow.

@krillboi

> It's usually because our crawler found links both with and without trailing slashes; only one of them will be handled anyway!

Okay, thanks for clarifying.

> We are deploying a fix in our dev env to run more tests and will deploy the real fix in production by tomorrow.

Cool, I'd like to know when it's in production so I can test again. I can report that the crawl I left running at work yesterday afternoon had not finished when I came in this morning.

@shortcuts
Contributor

> Cool, I'd like to know when it's in production so I can test again. I can report that the crawl I left running at work yesterday afternoon had not finished when I came in this morning.

We retry a few times before timing out, so it's totally possible. I'll let you know once it's deployed.

@shortcuts
Contributor

Hey @krillboi, fix is in prod and seems to work well!


Labels

CLA Signed Signed Facebook CLA pr: maintenance This PR does not produce any behavior differences to end users when upgrading.

5 participants