14 changes: 11 additions & 3 deletions CHANGELOG.md
@@ -5,7 +5,17 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/)
and this project adheres to [Semantic Versioning](http://semver.org/).

## ## [0.3.5] - 2024-12-13
## [0.3.6] - 2025-01-10

### Added

- Add expected scores in ColPali E2E test

### Changed

- Loosen package dependencies

## [0.3.5] - 2024-12-13

### Added

@@ -22,7 +32,6 @@ and this project adheres to [Semantic Versioning](http://semver.org/).

- General `CorpusQueryCollator` for BEIR-style dataset training or hard-negative training. This deprecates `HardNegCollator`, but all changes to the training loop are made for a seamless update.


### Changed

- Updates BiPali config files
@@ -31,7 +40,6 @@ and this project adheres to [Semantic Versioning](http://semver.org/).
- Removed `add_suffix` in the VisualRetrieverCollator and let the `suffix` be added in the individual processors.
- Changed the incorrect `<pad>` token to `<|endoftext|>` for query augmentation in `ColQwen2Processor`. Note that previous models were trained with `<|endoftext|>`, so this is simply a non-breaking inference upgrade patch.


## [0.3.3] - 2024-10-29

### Added
18 changes: 12 additions & 6 deletions pyproject.toml
@@ -34,9 +34,9 @@ classifiers = [

dependencies = [
"GPUtil",
"numpy<2.0.0",
"peft>=0.11.0,<0.12.0",
"pillow>=9.2.0,<11.0.0",
"numpy",
"peft>=0.11.0",
**Collaborator:** Bump to 12, but in this case still bind to 13 (or latest). And test!!

**Contributor Author:** Is bumping to 12 really necessary?
"pillow>=9.2.0",
*tonywu71 marked this conversation as resolved.*
"requests",
"torch>=2.2.0",
"transformers>=4.46.1,<4.47.0",
@@ -49,7 +49,9 @@ train = [
"configue>=5.0.0",
"datasets>=2.19.1",
"mteb>=1.16.3,<1.17.0",
"typer>=0.12.3, <1.0.0",
"peft>=0.11.0,<0.12.0",
**Collaborator:** What's the idea of having double requirements here and above? Is there something I don't know about?

**Contributor Author:** The point, which was raised by @galleon, is that our users might want to work with loosened dependencies, e.g. what if someone wants to use colpali-engine with the latest version of `transformers`?

My proposition is:

- when users want to train their models, they should install `colpali-engine[train]`, where we can make sure that training behaves as planned, i.e. with a strict dependency config;
- when users want to use our models, they should install `colpali-engine`, where we have loosened the deps.

While there might be some minor discrepancies in the results, I think it's inevitable and that my proposition achieves a good happy medium. But happy to hear your opinion and/or your replacement solutions on this!

**Collaborator:** Agree that it would be nice to have less strict restrictions on inference-only usage than on training. Idk how the override behavior works in pyproject depending on the optional dependencies.

I still think we need to upper-bound the transformers version. People who want to use it with newer versions can always install colpali-engine with --no-deps, but otherwise, over time, this will just lead to bugs, I fear...

**Contributor Author:** Sounds like a good compromise, thanks!

So following your propositions, I have added the upper bound for transformers in the default package configuration and restored the correct upper bounds in the train config. Does the PR look ready to be merged now?

FYI, I've tested the pyproject override behavior and it works just as planned: each dep takes the intersection of the lower/upper bounds defined in the different groups. So installing `colpali-engine[train]` takes the intersection of the default deps and the ones defined in the train group.
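To illustrate the intersection semantics, a small sketch using the `packaging` library (which pip builds on); the pillow bounds are taken from this PR, and the exact printed order of specifiers may differ:

```python
from packaging.specifiers import SpecifierSet

base = SpecifierSet(">=9.2.0")           # default dependency: pillow>=9.2.0
train = SpecifierSet(">=9.2.0,<11.0.0")  # train extra: pillow>=9.2.0,<11.0.0

# Installing colpali-engine[train] must satisfy both groups at once,
# i.e. the intersection of the two specifier sets.
combined = base & train

print(combined)               # >=9.2.0,<11.0.0
print("10.4.0" in combined)   # True: satisfies both bounds
print("11.0.0" in combined)   # False: excluded by the train upper bound
```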

"pillow>=9.2.0,<11.0.0",
"typer>=0.15.1",
]

interpretability = [
@@ -58,9 +60,13 @@ interpretability = [
"seaborn>=0.13.2,<1.0.0",
]

dev = ["pytest>=8.0.0", "ruff>=0.4.0"]
dev = ["datasets>=2.19.1", "pytest>=8.0.0", "ruff>=0.4.0"]

all = ["colpali-engine[dev]", "colpali-engine[train]"]
all = [
"colpali-engine[dev]",
"colpali-engine[interpretability]",
"colpali-engine[train]",
]

[project.urls]
homepage = "https://github.com/illuin-tech/colpali"
67 changes: 38 additions & 29 deletions tests/models/paligemma/colpali/test_colpali_e2e.py
@@ -2,7 +2,7 @@

import pytest
import torch
from PIL import Image
from datasets import load_dataset

from colpali_engine.models import ColPali, ColPaliProcessor
from colpali_engine.utils.torch_utils import get_torch_device
@@ -15,6 +15,7 @@ def model_name() -> str:

@pytest.mark.slow
def test_e2e_retrieval_and_scoring(model_name: str):
# Load the model and processor
model = cast(
ColPali,
ColPali.from_pretrained(
@@ -23,31 +24,39 @@ def test_e2e_retrieval_and_scoring(model_name: str):
device_map=get_torch_device("auto"),
),
).eval()

try:
processor = cast(ColPaliProcessor, ColPaliProcessor.from_pretrained(model_name))

# Your inputs
images = [
Image.new("RGB", (480, 480), color="white"),
Image.new("RGB", (250, 250), color="black"),
]
queries = [
"Is attention really all you need?",
"Are Benjamin, Antoine, Merve, and Jo best friends?",
]

# Process the inputs
batch_images = processor.process_images(images).to(model.device)
batch_queries = processor.process_queries(queries).to(model.device)

# Forward pass
with torch.no_grad():
image_embeddings = model(**batch_images)
query_embeddings = model(**batch_queries)

scores = processor.score_multi_vector(query_embeddings, image_embeddings)
assert isinstance(scores, torch.Tensor)

except Exception as e:
pytest.fail(f"Code raised an exception: {e}")
processor = cast(ColPaliProcessor, ColPaliProcessor.from_pretrained(model_name))

# Load the test dataset
ds = load_dataset("hf-internal-testing/document-visual-retrieval-test", split="test")

# Preprocess the examples
batch_images = processor.process_images(images=ds["image"]).to(model.device)
batch_queries = processor.process_queries(queries=ds["query"]).to(model.device)

# Run inference
with torch.inference_mode():
image_embeddings = model(**batch_images)
query_embeddings = model(**batch_queries)

# Compute retrieval scores
scores = processor.score_multi_vector(
qs=query_embeddings,
ps=image_embeddings,
) # (len(qs), len(ps))

assert scores.ndim == 2, f"Expected 2D tensor, got {scores.ndim}"
assert scores.shape == (len(ds), len(ds)), f"Expected shape {(len(ds), len(ds))}, got {scores.shape}"

# Check that the maximum score in each row lies on the diagonal of the score matrix
assert (scores.argmax(dim=1) == torch.arange(len(ds), device=scores.device)).all()

# Further validation: fine-grained check, with a hardcoded score from the original implementation
expected_scores = torch.tensor(
[
[16.5000, 7.5938, 15.6875],
[12.0625, 16.2500, 11.1250],
[15.2500, 12.6250, 21.0000],
],
dtype=scores.dtype,
)
assert torch.allclose(scores, expected_scores, atol=1), f"Expected scores {expected_scores}, got {scores}"
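
For context on what the assertions above exercise: `score_multi_vector` computes late-interaction (MaxSim) scores between multi-vector embeddings. A minimal standalone sketch of that computation, assuming padded same-length token sequences (not the library's exact implementation):

```python
import torch

def maxsim_scores(qs: torch.Tensor, ps: torch.Tensor) -> torch.Tensor:
    """Late-interaction (MaxSim) scoring.

    qs: query embeddings, shape (num_queries, query_tokens, dim)
    ps: passage embeddings, shape (num_passages, passage_tokens, dim)
    Returns a (num_queries, num_passages) score matrix.
    """
    # Token-level dot products for every query/passage pair:
    # shape (num_queries, num_passages, query_tokens, passage_tokens).
    sim = torch.einsum("qnd,pmd->qpnm", qs, ps)
    # Each query token keeps its best-matching passage token,
    # then scores are summed over query tokens.
    return sim.max(dim=3).values.sum(dim=2)
```

Under this scoring, each query should score highest against its own document, which is what the diagonal `argmax` assertion checks.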