ci: enables ollama integration tests #23
Conversation
|
Lemme know if we are brave enough to run this in CI. One reason not to is that it could be flaky. |
|
always brave enough for CI. |
|
Made the tests conditional. Note: docker/testcontainers is not a viable option, as the models are slow to start (especially in Docker, which disallows GPU access) and also eat a lot of memory. I think if we want CI, we should just start ollama in a GitHub action. If that sounds good, 👍 and I'll give it a go. |
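As an illustration only, a minimal sketch of what starting ollama directly in a GitHub Actions job might look like; the step names and the log redirect are assumptions, not the final workflow:

```yaml
      - name: Install Ollama
        run: curl -fsSL https://ollama.com/install.sh | sh
      - name: Start Ollama
        # nohup keeps the server alive after this step exits; the log file name is an assumption
        run: nohup ollama serve > ollama.log 2>&1 &
```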
|
oops, sorry about the red; I'll sort it later. |
|
🤞 on CI. I am only running it on the latest Python, after the normal unit tests. |
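Roughly, that gating could look like the following in the workflow; the `test` job id and the Python version are assumptions:

```yaml
  ollama:
    needs: test                     # assumed id of the normal unit test job
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.12"]    # latest Python only, per the comment above
```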
|
last try on CI today 🤞 |
|
notes for tomorrow:
|
|
Put back to draft until I sort out what's going on in CI. Feel free to edit this PR if someone wants to finish it before I respond again. |
|
@k33g has some interesting options in https://github.com/parakeet-nest/awesome-slms, so trying again with testcontainers may not be a terrible idea. I'm out of time for the moment, so anyone can give it a go if they like. |
58b3bcb to a6e043c
e175077 to d599f66
|
@anuraaga if you want to do another drive-by review, more than welcome! |
key: ollama-${{ hashFiles('./src/exchange/providers/ollama.py') }}
|
- name: Install Ollama
  run: curl -fsSL https://ollama.com/install.sh | sh
The confusing part is that the log here ends up saying "hey I'm running" when, well, it isn't ;) I used act locally to figure it out.
act pull_request -P ubuntu-latest=ghcr.io/catthehacker/ubuntu:act-latest --container-architecture linux/amd64
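One way to avoid trusting that "running" log line is an explicit readiness check before the tests; the endpoint and timeout below are assumptions:

```yaml
      - name: Wait for Ollama
        run: |
          # fail fast if the server never comes up, rather than trusting its startup log
          timeout 60 bash -c 'until curl -sf http://localhost:11434; do sleep 1; done'
```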
.github/workflows/ci.yaml
Outdated
# this prior to running tests. This also reduces the chance of flakiness.
- name: Pull and Test Ollama model
  run: | # get the OLLAMA_MODEL from ./src/exchange/providers/ollama.py
    OLLAMA_MODEL=$(uv run python -c "from src.exchange.providers.ollama import OLLAMA_MODEL; print(OLLAMA_MODEL)")
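The rest of that run block might look roughly like this; the smoke-test prompt is an assumption:

```sh
ollama pull "$OLLAMA_MODEL"
# one tiny prompt up front, so later test failures aren't first-load timeouts
ollama run "$OLLAMA_MODEL" hello
```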
|
Thanks again for the feedback @anuraaga, this is better. |
|
LGTM (seems it's possible for a repo to disable grey-check drive-by approvals, just learned that 😀) |
michaelneale left a comment
I like this as it's focused and self-contained (and it means we can use the check as a matrix for ollama support)
3bb1540 to 0ea3af4
|
PTAL, I added a commit (which I can revert) that makes the test a lot faster by using qwen2.5's smallest model. I did this in a way that leaves the default as mistral, the idea being that we don't spend 5 minutes on the few tests we have now. Let's see. |
tests/providers/test_ollama.py
Outdated
def test_ollama_integration():
    provider = OllamaProvider.from_env()
    model = OLLAMA_MODEL
    model = os.getenv("OLLAMA_MODEL", OLLAMA_MODEL)
I'm also happy to make a constant in ollama.py, OLLAMA_TEST_MODEL, re-used for these two tests. That way we don't need to define it in CI. OTOH, I haven't done that so far because I think ad-hoc runs can still use the large model, and that's a good thing.
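For reference, the CI-side alternative being weighed here is just an env override on the job, something like the following; the exact model tag is an assumption:

```yaml
    env:
      OLLAMA_MODEL: "qwen2.5:0.5b"   # small model for CI; ad-hoc local runs keep the mistral default
```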
Signed-off-by: Adrian Cole <adrian.cole@elastic.co>
Signed-off-by: Adrian Cole <adrian.cole@elastic.co>
26e48dc to 7cb44d1


This adds CI for the existing ollama tests.
The model we use, currently mistral-nemo, is somewhat large, so it isn't helpful to run it in Docker for a couple of reasons. Instead, we nohup ollama directly.
To reduce problems, pulling and smoke testing the model are done in a separate step, logging on failure. This gives us high confidence that when the tests actually run, failures are related to the tests, not slow first-time model startup hitting some timeout. All that said, the tests themselves do take time, as they now use real inference.
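A sketch of the "logging on failure" part, assuming the server log was redirected to ollama.log in an earlier step:

```yaml
      - name: Show Ollama server log
        if: failure()      # only runs when an earlier step in the job failed
        run: cat ollama.log
```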