ci: enables ollama integration tests #23
Conversation
|
Lemme know if we are brave enough to run this in CI. One reason not to is that it could be flaky. |
|
always brave enough for CI. |
|
Made the tests conditional. Note: docker/testcontainers is not a viable option, as the models are slow to start (especially in Docker, which disallows GPU access) and also eat a lot of memory. I think if we want CI, we should just start ollama in a GitHub action. If that sounds good, 👍 and I'll give it a go. |
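As an illustration only, a minimal sketch of what starting ollama directly in a GitHub Actions job might look like; the step names and the log redirect are assumptions, not the final workflow:

```yaml
      - name: Install Ollama
        run: curl -fsSL https://ollama.com/install.sh | sh
      - name: Start Ollama
        # nohup keeps the server alive after this step exits; the log file name is an assumption
        run: nohup ollama serve > ollama.log 2>&1 &
```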
|
oops, sorry about the red; I'll sort it later. |
|
🤞 on CI. I am only running it on the latest Python, after the normal unit tests. |
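Roughly, that gating could look like the following in the workflow; the `test` job id and the Python version are assumptions:

```yaml
  ollama:
    needs: test                     # assumed id of the normal unit test job
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.12"]    # latest Python only, per the comment above
```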
|
last try on CI today 🤞 |
|
notes for tomorrow:
|
|
Put back to draft until I sort out what's going on in CI. Feel free to edit this PR if someone wants to finish it before I respond again. |
|
@k33g has some interesting options in https://github.com/parakeet-nest/awesome-slms, so trying again with testcontainers may not be a terrible idea. I'm out of time for the moment, so anyone can give it a go if they like. |
58b3bcb to a6e043c
e175077 to d599f66
|
@anuraaga if you want to do another drive-by review, more than welcome! |
key: ollama-${{ hashFiles('./src/exchange/providers/ollama.py') }}
|
- name: Install Ollama
  run: curl -fsSL https://ollama.com/install.sh | sh
The confusing part is that the log here ends up saying "hey I'm running" when, well, it isn't ;) I used act locally to figure it out.
act pull_request -P ubuntu-latest=ghcr.io/catthehacker/ubuntu:act-latest --container-architecture linux/amd64
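One way to avoid trusting that "running" log line is an explicit readiness check before the tests; the endpoint and timeout below are assumptions:

```yaml
      - name: Wait for Ollama
        run: |
          # fail fast if the server never comes up, rather than trusting its startup log
          timeout 60 bash -c 'until curl -sf http://localhost:11434; do sleep 1; done'
```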
.github/workflows/ci.yaml
Outdated
# this prior to running tests. This also reduces the chance of flakiness.
- name: Pull and Test Ollama model
  run: | # get the OLLAMA_MODEL from ./src/exchange/providers/ollama.py
    OLLAMA_MODEL=$(uv run python -c "from src.exchange.providers.ollama import OLLAMA_MODEL; print(OLLAMA_MODEL)")
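The rest of that run block might look roughly like this; the smoke-test prompt is an assumption:

```sh
ollama pull "$OLLAMA_MODEL"
# one tiny prompt up front, so later test failures aren't first-load timeouts
ollama run "$OLLAMA_MODEL" hello
```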
|
Thanks again for the feedback @anuraaga, this is better. |
|
LGTM (seems it's possible for a repo to disable grey-check drive-by approvals, just learned that 😀) |
michaelneale left a comment
I like this as it's focused and self-contained (and it means we can use the check as a matrix for ollama support)
3bb1540 to 0ea3af4
|
PTAL, I added a commit (which I can revert) that makes the test a lot faster by using qwen2.5's smallest model. I did this in a way that leaves the default as mistral, the idea being that we don't spend 5 minutes on the few tests we have now. Let's see. |
tests/providers/test_ollama.py
Outdated
def test_ollama_integration():
    provider = OllamaProvider.from_env()
    model = OLLAMA_MODEL
    model = os.getenv("OLLAMA_MODEL", OLLAMA_MODEL)
I'm also happy to make a constant in ollama.py, OLLAMA_TEST_MODEL, re-used for these two tests. That way we don't need to define it in CI. OTOH, I haven't done that so far because I think ad-hoc runs can still use the large model, and that's a good thing.
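For reference, the CI-side alternative being weighed here is just an env override on the job, something like the following; the exact model tag is an assumption:

```yaml
    env:
      OLLAMA_MODEL: "qwen2.5:0.5b"   # small model for CI; ad-hoc local runs keep the mistral default
```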
Signed-off-by: Adrian Cole <adrian.cole@elastic.co>
Signed-off-by: Adrian Cole <adrian.cole@elastic.co>
26e48dc to 7cb44d1


This adds CI for the existing ollama tests.
The model we use, currently mistral-nemo, is somewhat large, so it isn't helpful to run it in Docker for a couple of reasons. Instead, we nohup ollama directly.
To reduce problems, pulling and smoke testing the model are done in a separate step, logging on failure. This gives us high confidence that when the tests actually run, failures are related to the tests, not slow first-time model startup hitting some timeout. All that said, the tests themselves do take time, as they now use real inference.
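A sketch of the "logging on failure" part, assuming the server log was redirected to ollama.log in an earlier step:

```yaml
      - name: Show Ollama server log
        if: failure()      # only runs when an earlier step in the job failed
        run: cat ollama.log
```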