Skip to content

Doctests: fetch all files in the diff that contain doctests#29716

Closed
gante wants to merge 4 commits intohuggingface:mainfrom
gante:dummy
Closed

Doctests: fetch all files in the diff that contain doctests#29716
gante wants to merge 4 commits intohuggingface:mainfrom
gante:dummy

Conversation

@gante
Copy link
Copy Markdown
Contributor

@gante gante commented Mar 18, 2024

What does this PR do?

The first commit in this PR (incorrectly) changes a logits processor with a doctest. With the change, the doctests are incorrect and should fail. However, that doesn't happen (check its CI status -- no ci/circleci: tests_pr_documentation_tests) 😭

👉 The current doctest fetcher only consider files where the doctest itself was changed. For instance, in some generate-related doctests, I noticed that things got stale without any red CI (especially now, that we don't have the daily doctest runner)

This PR changes the doctest fetcher to add the files in the diff to the doctest CI if they have any doctest, to ensure the doctest stays consistent with the code. We can see in the 2nd commit that the doctest CI is now triggered, and it fails because of the change in the first commit 🙌

The 3rd commit removes the change in the first commit.

🚨 Merging this change may cause future commits to fail, if the corresponding doctests are stale!

@HuggingFaceDocBuilderDev
Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@gante gante changed the title (DONT MERGE) doctest fetcher dummy commit Doctests: push tests fetch all files in the diff that contain doctests Mar 18, 2024
@gante gante marked this pull request as ready for review March 18, 2024 18:36
@gante gante requested review from amyeroberts and ydshieh March 18, 2024 18:36
@gante
Copy link
Copy Markdown
Contributor Author

gante commented Mar 18, 2024

cc @zucchini-nlp (as we've talked about this today)

@gante gante changed the title Doctests: push tests fetch all files in the diff that contain doctests Doctests: fetch all files in the diff that contain doctests Mar 18, 2024
@amyeroberts
Copy link
Copy Markdown
Contributor

Thanks for working on this!

I don't think we want to do this for two reasons:

  • This is going to end up running loads of doc tests which don't need to be run --> slower PR cycles & more pricey CI
  • We still suffer from doctests being tied to e.g. a specific file, which doesn't capture all the dependencies (issue with all our fetchers!)

I'd rather we just had a single, nightly run of doctests, which we can fix the following day. It's important to keep examples up-to-date, but not critical in the same way as failing tests code (❌ on main) or breaking code. Especially as most people only view the stable version release of the docs. cc @ydshieh

@ydshieh
Copy link
Copy Markdown
Collaborator

ydshieh commented Mar 21, 2024

I agree the 2 points @amyeroberts mentioned. Not ideal, but it's the trade-off.

However, thanks a lot for the willingness to improve things.

@gante
Copy link
Copy Markdown
Contributor Author

gante commented Mar 21, 2024

I'd argue that the doctests in a given file are a subset of the tests that should be run when a certain file is changed, and not a superset [although it would need much better filtering, to avoid running on things like changing comments in the code]. Moreover, they are even more important than integration tests: they are highlighted use cases of the public interfaces, sitting closer to the users than other tests. It would also give us extra motivation to write good examples 🤗

However, I do understand that we often write examples with large models, and we don't have a @slow decorator. And that we have many other test-related challenges to tackle. In any case, I really believe that doctests are extremely powerful (docs with correctness checks!), so let me know if I can help making doctests on commit a reality :D

@amyeroberts @ydshieh can I ask you to then add some extra priority in our doctest daily runner? I recently came across generation documentation tests that were broken for months 😬

@gante gante closed this Mar 21, 2024
@ydshieh
Copy link
Copy Markdown
Collaborator

ydshieh commented Mar 22, 2024

add some extra priority in our doctest daily runner

Yeah, sorry for this. Not having the bandwidth on it but I should really work on this. Will try my best.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants