ci: cache rust dependencies, enable parallelism (1.9x to 5.5x faster)#5236
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #5236 +/- ##
==========================================
- Coverage 82.23% 82.23% -0.01%
==========================================
Files 344 344
Lines 144765 144765
Branches 144765 144765
==========================================
- Hits 119050 119048 -2
Misses 21797 21797
- Partials 3918 3920 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| - uses: ./.github/workflows/build_linux_wheel | ||
| with: | ||
| args: "--profile ci" |
There was a problem hiding this comment.
Pass required python-minor-version to build wheel actions
.github/workflows/build_linux_wheel declares inputs.python-minor-version as required. After this change the linux test job (lines 113‑115) invokes the action without that input, and the same omission occurs in the linux-arm (lines 184‑188), windows (lines 238‑240), and aws-integtest (lines 262‑264) jobs. GitHub Actions aborts a composite action when a required input is missing (Input required and not supplied: python-minor-version), so all of these jobs now fail before any wheels are built. Please continue passing the matrix’ python minor version (or a fixed value) to every invocation.
Useful? React with 👍 / 👎.
fa43c1e to
f68c5d1
Compare
* Add `ci` profile for smaller build caches. This had a meaningful impact in Lance, and I expect a similar impact here. lance-format/lance#5236 * Get caching working in Rust. Previously was not working due to `workspaces: rust`. * Get caching working in NodeJs lint job. Previously wasn't working because we installed the toolchain **after** we called `- uses: Swatinem/rust-cache@v2`, which invalidates the cache locally. * Fix broken pytest from async io transition (`pytest.PytestRemovedIn9Warning`) * Altered `get_num_sub_vectors` to handle bug in case of 4-bit PQ. This was cause of `rust future panicked: unknown error`. Raised an issue upstream to change panic to error: lance-format/lance#5257 * Call `npm run docs` to fix doc issue. * Disable flakey Windows test for consistency. It's just an OS-specific timer issue, not our fault. * Fix Windows absolute path handling in namespaces. Was causing CI failure `OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: `
…lance-format#5236) In lance-format#4139, we stopped caching built dependencies. Since then, we've had to wait 20 minutes for the test to build, before we even run any tests. This PR re-enables the build cache and implements a different mitigation to avoid large caches: we disable debug info in dependencies and only keep it in our crates. This is inspired by DataFusion's CI: https://github.com/apache/datafusion/blob/377c0fce481d561d58e3a6fad2dca18cb1d58384/Cargo.toml#L247-L259 Below are the reductions in size of the target directory. Previously, Rust jobs were running with `debuginfo=1`, so the target directory was about 16GB. With these changes, it's down to 7.6GB. If we decided later we wanted it even smaller, we could enable `-C opt-level=s`, which gets us down to 2.1GB. | Profile | Flags | `target/` size | % reduction | |--------|--------|-------:|------:| | dev | | 25.0GB | - | | dev | `-C debuginfo=1` | 16.0GB | 36% | | dev | `-C debuginfo=line-tables-only` | 12.0GB | 52% | | ci | | 7.6GB | 70% | | ci | `-C opt-level=s` | 2.1GB | 92% | Other optimizations added: * Removed `CARGO_BUILD_JOBS`, which was preventing parallelism in builds. We can add this back to specific runners if we find it's necessary. This will speed up cold builds. * Align profile across all builds in CI. Thanks to reduction in disk space, closes lance-format#5218. ### Performance improvement Most of the speed improvement probably comes from removing `CARGO_BUILD_JOBS`. Jobs that run tests aren't much faster since the tests take a while. | Case | Before | After (cold) | mul | After (cached) | mul | |--------|-------:|---:|---:|---:|---:| | linux-arm | 38m 58s | 20m 5s | **1.9x** | 18m 33s | **2.1x** | | MSRV Check | 11m 46s | 5m 17s | **2.3x** | 2m 8s | **5.5x** | | Clippy | 13m 16s | 5m 26s | **2.4x** | 2m 25s | **5.4x** |
In #4139, we stopped caching built dependencies. Since then, we've had to wait 20 minutes for the test to build, before we even run any tests.
This PR re-enables the build cache and implements a different mitigation to avoid large caches: we disable debug info in dependencies and only keep it in our crates. This is inspired by DataFusion's CI: https://github.com/apache/datafusion/blob/377c0fce481d561d58e3a6fad2dca18cb1d58384/Cargo.toml#L247-L259
Below are the reductions in size of the target directory. Previously, Rust jobs were running with
debuginfo=1, so the target directory was about 16GB. With these changes, it's down to 7.6GB. If we decided later we wanted it even smaller, we could enable-C opt-level=s, which gets us down to 2.1GB.target/size-C debuginfo=1-C debuginfo=line-tables-only-C opt-level=sOther optimizations added:
CARGO_BUILD_JOBS, which was preventing parallelism in builds. We can add this back to specific runners if we find it's necessary. This will speed up cold builds.Thanks to reduction in disk space, closes #5218.
Performance improvement
Most of the speed improvement probably comes from removing
CARGO_BUILD_JOBS. Jobs that run tests aren't much faster since the tests take a while.