
ci: create a new library compatibility test suite #5178

Merged
wjones127 merged 19 commits into lance-format:main from wjones127:compat-testing
Nov 14, 2025

Conversation

Contributor

@wjones127 wjones127 commented Nov 7, 2025

This adds a new compat test suite that replaces the forwards_compat suite. Key changes:

  1. It tests both forwards and backwards compatibility.
  2. It adds the most recent stable and beta releases as comparison targets. Individual tests can also choose which versions to test against.
  3. It can be run locally with a single command: make compattest.

Closes #4416

How it works

Tests are written like:

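from pathlib import Path

# compat_test and UpgradeDowngradeTest are provided by the compat suite's infrastructure.
@compat_test()
class MyTest(UpgradeDowngradeTest):
    def __init__(self, path: Path):
        self.path = path

    def create(self):
        # write initial data
        ...

    def check_read(self):
        # test reading data
        ...

    def check_write(self):
        # test writing data
        ...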

Then two tests will be generated:

  1. downgrade: We call create in the current version, then check_read and check_write in the old version.
  2. upgrade_downgrade: We call create in the old version, then check_read and check_write in the current version, then go back and call check_read and check_write in the old version again.
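
The two generated sequences can be written out as plain data. This is only an illustrative sketch: `downgrade_steps`, `upgrade_downgrade_steps`, and the `"current"` label are hypothetical names chosen for this example, not the suite's actual API.

```python
from typing import List, Tuple

# A step is (library version to run under, method name to call).
Step = Tuple[str, str]


def downgrade_steps(old: str, current: str = "current") -> List[Step]:
    """Data written by the current version must be usable by the old one."""
    return [
        (current, "create"),
        (old, "check_read"),
        (old, "check_write"),
    ]


def upgrade_downgrade_steps(old: str, current: str = "current") -> List[Step]:
    """Data written by the old version is used by the current one, then by the old one again."""
    return [
        (old, "create"),
        (current, "check_read"),
        (current, "check_write"),
        (old, "check_read"),
        (old, "check_write"),
    ]
```

Writing the plan as a list of (version, method) pairs makes it easy to see that the second test exercises the old version twice: once to create the data and once after the current version has written to it.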

To run code on old versions of the library, we use a persistent executor process inside a dedicated virtual environment. We create one virtual env per old version and one executor process per virtual env. The executor process receives a tuple (TestObject, method_name) (for example, (MyTest, 'create')), serialized via pickle. It runs the method and returns a status indicating whether there were any errors.
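
A minimal sketch of that request/response shape follows. It runs the "executor" in the same process for brevity, whereas the suite runs it in a separate process inside the old version's virtual env. `DemoTest`, `handle_request`, `call`, and the status dictionary layout are assumptions made for this example, not the code in venv_runner.py.

```python
import pickle
import traceback
from pathlib import Path


class DemoTest:
    """Stand-in for a compat test: every parameter lives on the instance."""

    def __init__(self, path: Path):
        self.path = path

    def create(self) -> None:
        self.path.write_text("hello")

    def check_read(self) -> None:
        assert self.path.read_text() == "hello"


def handle_request(payload: bytes) -> dict:
    """Executor side: unpickle (obj, method_name), run the method, report a status."""
    obj, method_name = pickle.loads(payload)
    try:
        getattr(obj, method_name)()
    except Exception:
        # Return the traceback as text so the caller can surface it.
        return {"ok": False, "error": traceback.format_exc()}
    return {"ok": True, "error": None}


def call(obj, method_name: str) -> dict:
    """Caller side: serialize the request the way the executor would receive it."""
    return handle_request(pickle.dumps((obj, method_name)))
```

Because the object carries its own parameters, the request needs nothing beyond the pickled instance and a method name, and the reply can be a small status value rather than arbitrary return data.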

wjones127 and others added 8 commits November 6, 2025 14:04
Previously all compatibility tests lived in a monolithic 751-line
index_tests.py file mixed with infrastructure code. This made tests
hard to find and maintain.

Split into focused modules:
- compat_decorator.py: infrastructure and @compat_test decorator
- test_file_formats.py: file format compatibility tests
- test_scalar_indices.py: scalar index compatibility tests
- test_vector_indices.py: vector index compatibility tests

Removed deprecated datagen.py and test_compat.py.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

@github-actions github-actions Bot added the python and ci (Github Action or Test issues) labels Nov 7, 2025
wjones127 and others added 9 commits November 6, 2025 16:56
Performance improvements:
- Removed pip upgrade step (saves ~1s per version, 7% faster)
- Added --quiet flag to pip install for cleaner output

Instrumentation:
- Added detailed timing instrumentation for performance analysis
- Timing output controlled by DEBUG=1 environment variable
- Tracks venv creation, package install, Lance import, and execution time
- Added PERFORMANCE.md documenting bottlenecks and optimization strategies

Key findings from analysis:
- Package installation: 4.9s (29% of total time)
- First Lance import: ~5.0s (29% of total time)
- Venv creation: 2.2s (13% of total time)
- Persistent subprocess provides 500x speedup on subsequent calls

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Virtual environments are now persistent by default, stored in
~/.cache/lance-compat-venvs/. This provides a 5x speedup for
interactive development after the first run.

Changes:
- Venvs persist across pytest sessions by default
- Validation checks ensure correct Lance version is installed
- Set COMPAT_TEMP_VENV=1 to use temporary venvs (old behavior)
- Added cleanup instructions to PERFORMANCE.md

Performance impact:
- First run: ~13-16s per version (creates venv)
- Subsequent runs: ~2-6s per version (reuses venv)
- Example: 2 tests that took 26s now take 6s

This makes iterative test development much more pleasant!
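
As a rough sketch of the behavior described in this commit message, the choice between a cached and a temporary venv could look like the following. The cache root and the COMPAT_TEMP_VENV variable come from the message above; `venv_location`, `needs_rebuild`, and the `venv-<version>` directory naming are hypothetical and only illustrate the idea.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional, Tuple

CACHE_ROOT = Path.home() / ".cache" / "lance-compat-venvs"


def venv_location(
    version: str,
    environ: Optional[Mapping[str, str]] = None,
    cache_root: Path = CACHE_ROOT,
) -> Tuple[Path, bool]:
    """Return (directory, persistent) for the venv of one old version."""
    environ = os.environ if environ is None else environ
    if environ.get("COMPAT_TEMP_VENV") == "1":
        # Old behavior: a fresh throwaway directory on every run.
        return Path(tempfile.mkdtemp(prefix=f"lance-compat-{version}-")), False
    # Default: one stable directory per version (naming is an assumption).
    return cache_root / f"venv-{version}", True


def needs_rebuild(installed: Optional[str], expected: str) -> bool:
    """A cached venv is only reusable if it holds exactly the expected version."""
    return installed != expected
```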

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Adds a new CI job that runs compatibility tests across multiple Lance
versions to verify forward/backward compatibility. The job:
- Runs on Ubuntu 24.04 with Python 3.13
- Uses temporary venvs (COMPAT_TEMP_VENV=1) for clean CI environments
- Has 60-minute timeout to account for venv creation
- Tests compatibility with versions: 0.16.0, 0.30.0, 0.36.0, latest
  stable, and latest beta

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

Adds compatibility tests for two additional vector index types:
- IVF_HNSW_PQ: Hierarchical Navigable Small World with Product Quantization
- IVF_HNSW_SQ: Hierarchical Navigable Small World with Scalar Quantization

These tests only run against versions >= 0.39.0 because earlier versions do
not support remapping for HNSW indices, which is required for optimize
operations like compact_files().

Adds 12 new test cases (6 per index type: 3 versions × 2 test types).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

@wjones127 wjones127 marked this pull request as ready for review November 7, 2025 23:51
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Comment thread python/python/tests/compat/compat_decorator.py
Member

@westonpace westonpace left a comment

A few minor nits. I suppose one question would be why is this approach better than the old CI-based one?

Comment thread python/python/tests/compat/venv_runner.py
flush=True,
)

method = getattr(obj, method_name)
Member

Why wrap the method in an object? Why not just pickle the method itself? Then you could use __name__ to get the method name for logging.

Contributor Author

I suppose there are two ways you could go here:

  1. (What I did): Put all the parameters (such as the temp directory) in the class attributes. Then you can just pickle the object and call its method without having to pass them.
  2. Pickle the function and the parameters separately, and pass the parameters in on the other side.

It doesn't seem like pickle works with closures, so I couldn't just create a closure with the parameters passed in, pickle it, and run it on the other side.

I felt like option (1) was a simpler way to implement this. Plus I kind of liked using the class to organize the methods into a logical grouping.
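
To illustrate the trade-off, here is a small sketch comparing the three shapes discussed: a closure, an object with its parameters as attributes (option 1), and a module-level function pickled alongside its arguments (option 2). All names (`make_closure`, `CreateTask`, `describe_create`, `can_pickle`) are invented for this example.

```python
import pickle


def make_closure(path: str):
    """Capture `path` in a nested function; pickle cannot serialize this."""
    def create() -> str:
        return f"writing to {path}"
    return create


class CreateTask:
    """Option 1: parameters live on the instance, so the whole object pickles."""

    def __init__(self, path: str):
        self.path = path

    def create(self) -> str:
        return f"writing to {self.path}"


def describe_create(path: str) -> str:
    """Option 2: a module-level function, pickled by reference next to its args."""
    return f"writing to {path}"


def can_pickle(value) -> bool:
    try:
        pickle.dumps(value)
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type for local functions varies by Python version.
        return False
    return True
```

Under these assumptions, `can_pickle(make_closure(...))` is false while both `CreateTask(...)` and the pair `(describe_create, (path,))` round-trip. Pickle stores functions and classes by qualified name, which is why a function defined inside another function cannot be found again on the receiving side.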

Member

Makes sense to me

Comment thread python/python/tests/compat/venv_manager.py Outdated
Comment on lines +28 to +30
"0.29.1.beta2",
"0.30.0",
"0.36.0",
Member

Looks like we still have a hard-coded list of versions. At one point we had discussed using "the latest X major releases". Is that still a follow-up, or do we want this approach for some reason?

Contributor Author

We do the last stable and last beta release. We could expand that further.

Separately, there's the issue of whether we want to hard-code versions. Some of the hard-coded ones here just represent the first version where compatibility works for that feature. I think at least that version is worth hard-coding.

We might be able to take some of these versions out in the middle. I think some are there for no particular reason. But it's possible we might want to record some specific version in there if there was a risky compatibility change that happened in that version.

Contributor Author

Thinking about it, I think what I'd like to do is have every test run against:

  • Last stable release
  • Last preview release
  • Last 2 major releases

I can make that change.
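
One possible reading of that selection rule, sketched over a list of released version strings. This is an assumption-heavy illustration, not the planned change itself: `select_targets` is a hypothetical helper, "major release" is read here as the newest stable release within each of the two most recent major version numbers, and beta versions are assumed to look like `X.Y.Z.betaN`.

```python
import re
from typing import Dict, List, Optional, Tuple

_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:[.-]beta\.?(\d+))?$")


def parse(version: str) -> Tuple[int, int, int, Optional[int]]:
    m = _VERSION_RE.match(version)
    if m is None:
        raise ValueError(f"unrecognized version: {version!r}")
    major, minor, patch, beta = m.groups()
    return int(major), int(minor), int(patch), None if beta is None else int(beta)


def sort_key(version: str) -> Tuple[int, int, int, int, int]:
    major, minor, patch, beta = parse(version)
    # A stable release sorts after every beta of the same X.Y.Z.
    return (major, minor, patch, 1, 0) if beta is None else (major, minor, patch, 0, beta)


def select_targets(released: List[str]) -> List[str]:
    ordered = sorted(released, key=sort_key)
    stable = [v for v in ordered if parse(v)[3] is None]
    previews = [v for v in ordered if parse(v)[3] is not None]

    picks: List[str] = []
    if stable:
        picks.append(stable[-1])      # last stable release
    if previews:
        picks.append(previews[-1])    # last preview release

    newest_per_major: Dict[int, str] = {}
    for v in stable:                  # ascending order, so later entries win
        newest_per_major[parse(v)[0]] = v
    for major in sorted(newest_per_major, reverse=True)[:2]:
        picks.append(newest_per_major[major])

    return list(dict.fromkeys(picks))  # drop duplicates, keep order
```

Under this reading the last stable release always coincides with the newest major's entry, so the rule effectively adds one extra target from the previous major series. A different definition of "major release" would change only the final loop.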

@wjones127
Contributor Author

I suppose one question would be why is this approach better than the old CI-based one?

The main advantage is that anyone can run this locally with one command: make compattest. Before, they had to run a sequence of steps.

@wjones127 wjones127 merged commit 1ab35d0 into lance-format:main Nov 14, 2025
12 checks passed
@wjones127 wjones127 deleted the compat-testing branch November 14, 2025 00:09
jackye1995 pushed a commit to jackye1995/lance that referenced this pull request Jan 21, 2026

Labels

ci Github Action or Test issues python

Development

Successfully merging this pull request may close these issues.

Broad backwards- & forwards-compatability test

2 participants