Route token counting via gateway #60

Merged
rowan-stein merged 8 commits into main from noa/issue-59-2
Apr 25, 2026

@casey-brooks
Contributor

Summary

  • route AGN token counting calls through the gateway RPC path
  • update token counting test stubs (unit + e2e) to use the gateway service
  • switch the e2e workflow to the centralized agynio/e2e runner with a built agn binary

Notes

Testing

  • go vet ./...
  • go test ./...
  • go test -tags e2e ./test/e2e

Issue

Closes #59

@casey-brooks
Contributor Author

Summary

  • route token counting client calls through the gateway RPC path
  • update unit and e2e token counting servers to expose the gateway service
  • switch the e2e workflow to the centralized agynio/e2e runner with a built agn binary
  • note: depends on e2e#25 ("feat(go-agn-cli): add e2e suite") for the agn-binary input

Testing

  • go vet ./...
  • go test ./...
  • go test -tags e2e ./test/e2e

@casey-brooks
Contributor Author

Test & Lint Summary

  • go vet ./... (lint: no errors)
  • go test ./... (passed: 10 packages, failed: 0, skipped: 2 no test files)
  • go test -tags e2e ./test/e2e (passed: 1 package, failed: 0, skipped: 0)


@noa-lucent noa-lucent left a comment


Requesting changes.

Main blockers:

  • CI workflow: ensure the build step creates dist/ before go build; pin agynio/e2e action (no @main) and pass a pinned ref into agynio/bootstrap provision for determinism.
  • grpc gateway stubs: the TokenCountingGateway ServiceDesc/handler boilerplate is duplicated across unit + e2e tests (and the client hardcodes the method separately). Please centralize into a single typed stub and reuse.

Non-blocking:

  • README still mentions Go 1.24 while the repo/workflow are on 1.25.

Comment thread .github/workflows/e2e.yml Outdated
Comment thread .github/workflows/e2e.yml Outdated
Comment thread .github/workflows/e2e.yml
Comment thread .github/workflows/e2e.yml Outdated
Comment thread internal/tokencounting/client_test.go Outdated
@casey-brooks
Contributor Author

Summary

  • centralized the TokenCountingGateway stub/constant in internal/tokencounting and reused it in unit + e2e tests
  • pinned the e2e workflow dependencies and created dist/ before building the agn binary

Testing

  • go vet ./...
  • go test ./...
  • go test -tags e2e ./test/e2e

@casey-brooks
Contributor Author

Test & Lint Summary

  • go vet ./... (lint: no errors)
  • go test ./... (passed: 10 packages, failed: 0, skipped: 2 no test files)
  • go test -tags e2e ./test/e2e (passed: 1 package, failed: 0, skipped: 0)

noa-lucent previously approved these changes Apr 24, 2026

@noa-lucent noa-lucent left a comment


Re-review complete.

  • E2E workflow is now deterministic (pinned action SHAs + pinned bootstrap/e2e refs) and the dist/agn build step creates dist/.
  • TokenCountingGateway gRPC stub is centralized (gateway_stub.go) and reused by unit + e2e tests; client uses the shared method const.
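
A minimal sketch of the "shared method const" idea, for illustration only. The service and method names below are hypothetical placeholders (the thread does not show the real identifiers in gateway_stub.go); the only fixed fact is that a gRPC full method name has the form /<package>.<Service>/<Method>.

```go
package main

import "fmt"

// Hypothetical names: the thread does not show the real service or method
// identifiers, so these stand in for whatever gateway_stub.go defines.
const (
	gatewayServiceName = "example.gateway.v1.TokenCountingGateway"
	countTokensMethod  = "CountTokens"
)

// FullMethod builds a gRPC full method name, which always has the form
// "/<package>.<Service>/<Method>".
func FullMethod(service, method string) string {
	return "/" + service + "/" + method
}

// CountTokensFullMethod is the single value that the client, the unit-test
// server, and the e2e server would all reference instead of each
// hardcoding its own copy of the string.
var CountTokensFullMethod = FullMethod(gatewayServiceName, countTokensMethod)

func main() {
	fmt.Println(CountTokensFullMethod)
}
```

With one exported value, a rename of the service or method becomes a one-line change and a mismatch between client and test servers cannot compile in silently.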

Non-blocking: README still says Go 1.24.

@casey-brooks
Contributor Author

Summary

  • disabled smoke tags in the centralized e2e runner invocation
  • bumped the pinned run-tests action ref to 026c6071b9fe798c2b1a5bb421c3d100cf268c4d

Testing

  • go vet ./...
  • go test ./...
  • go test -tags e2e ./test/e2e

@casey-brooks
Contributor Author

Test & Lint Summary

  • go vet ./... (lint: no errors)
  • go test ./... (passed: 10 packages, failed: 0, skipped: 2 no test files)
  • go test -tags e2e ./test/e2e (passed: 1 package, failed: 0, skipped: 0)

noa-lucent previously approved these changes Apr 25, 2026

@noa-lucent noa-lucent left a comment


Re-review complete.

  • e2e runner pin updated and still fully deterministic (action SHA + matching ref).
  • include-smoke: false is now set as per issue #59 to run only the svc_agn_cli suite.

Non-blocking: README still mentions Go 1.24.

@casey-brooks
Contributor Author

Summary:

  • Switched token counting to in-process tokenizer (tiktoken/pdfcpu) and removed gateway/mock server paths.
  • Deprecated legacy token_counting config validation and updated config/e2e tests accordingly.
  • Added tokenizer dependencies and wired local counter into cmd/agn.

Tests/Lint:

  • go vet ./...
  • go test ./...
  • go test -tags e2e ./test/e2e


@noa-lucent noa-lucent left a comment


Summary

Thanks for the substantial rework to in-process token counting — this is directionally aligned with issue #59 and agynio/architecture.

Blockers

  1. BPE merges must not be downloaded at runtime. As implemented, tiktoken-go will fetch o200k_base.tiktoken from openaipublic unless env overrides are set. Please vendor/embed the merges into the agn binary and load them deterministically (no runtime config), ideally preloading at startup.

  2. image_url should be HTTPS-only per the architecture spec.

  3. Audio is currently silently under-counted (0 tokens) via input_audio and audio file extensions. Either implement correct audio counting or fail explicitly as unsupported.

  4. Add unit tests for the new tokenizer package (parse shapes, BPE token counts, image tiling math, PDF path). E2E coverage alone won’t protect this new low-level logic.
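
To illustrate blocker 1, here is a rough sketch of loading BPE ranks from data compiled into the binary and failing fast at startup. It is not the loader this PR uses. It assumes the .tiktoken text layout of one "<base64 token> <rank>" pair per line, and the three-entry sampleRanks constant is a made-up stand-in for a real embedded file (which would typically come from a //go:embed directive).

```go
package main

import (
	"bufio"
	"encoding/base64"
	"fmt"
	"strconv"
	"strings"
)

// sampleRanks stands in for an embedded rank file. The three entries are
// illustrative only ("a", "b", "ab"); they are not the o200k_base table.
const sampleRanks = "YQ== 0\nYg== 1\nYWI= 2\n"

// parseRanks reads "<base64 token> <rank>" lines into a token-to-rank map.
// Any malformed line is an error, so a corrupt table is caught at load time.
func parseRanks(data string) (map[string]int, error) {
	ranks := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(data))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue // tolerate blank lines, e.g. a trailing newline
		}
		fields := strings.Fields(line)
		if len(fields) != 2 {
			return nil, fmt.Errorf("malformed rank line %q", line)
		}
		token, err := base64.StdEncoding.DecodeString(fields[0])
		if err != nil {
			return nil, fmt.Errorf("decode token: %w", err)
		}
		rank, err := strconv.Atoi(fields[1])
		if err != nil {
			return nil, fmt.Errorf("parse rank: %w", err)
		}
		ranks[string(token)] = rank
	}
	return ranks, sc.Err()
}

func main() {
	// Preload once at startup: no network access, no environment overrides.
	ranks, err := parseRanks(sampleRanks)
	if err != nil {
		panic(err)
	}
	fmt.Println("loaded ranks:", len(ranks))
}
```

Because the table is part of the binary, the token counts depend only on the build, not on what the host can download at run time.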

Non-blocking

README still references Go 1.24 while CI/go.mod are on 1.25.

Comment thread internal/tokencounting/tokenizer/bpe_loader.go
Comment thread internal/tokencounting/tokenizer/image.go Outdated
Comment thread internal/tokencounting/tokenizer/count.go
Comment thread internal/tokencounting/tokenizer/tokenizer.go
@casey-brooks
Contributor Author

Summary:

  • Switched tokenizer BPE loading to offline embedded loader (no runtime download).
  • Enforced HTTPS-only image_url fetching and now error on audio inputs/files.
  • Added tokenizer unit coverage for parsing, BPE counts, image tiling, and PDF flow.
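
For readers following along, the HTTPS-only and audio checks could look roughly like the sketch below. Function names, the error value, and the list of audio extensions are all hypothetical (the thread does not show the PR's actual code), and this version would also reject data: URLs, which may or may not match the real behaviour.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"path/filepath"
	"strings"
)

// ErrAudioUnsupported is a hypothetical sentinel error for audio inputs.
var ErrAudioUnsupported = errors.New("audio token counting is not supported")

// audioExts is an illustrative set; the PR's actual list is not shown.
var audioExts = map[string]bool{
	".mp3": true, ".wav": true, ".flac": true, ".ogg": true, ".m4a": true,
}

// validateImageURL accepts only absolute https URLs. Anything else,
// including http and data URLs, is rejected before any fetch happens.
func validateImageURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse image_url: %w", err)
	}
	if u.Scheme != "https" || u.Host == "" {
		return fmt.Errorf("image_url must be an absolute https URL, got %q", raw)
	}
	return nil
}

// rejectAudio returns an error for input_audio parts and for files whose
// extension marks them as audio, instead of silently counting 0 tokens.
func rejectAudio(partType, filename string) error {
	if partType == "input_audio" {
		return ErrAudioUnsupported
	}
	if audioExts[strings.ToLower(filepath.Ext(filename))] {
		return fmt.Errorf("%w: %s", ErrAudioUnsupported, filename)
	}
	return nil
}

func main() {
	fmt.Println(validateImageURL("http://example.com/cat.png"))
	fmt.Println(rejectAudio("input_file", "clip.MP3"))
}
```

Returning an explicit error keeps an unsupported input from being reported as a valid count of zero.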

Tests/Lint:

  • go vet ./...
  • go test ./...
  • go test -tags e2e ./test/e2e

noa-lucent previously approved these changes Apr 25, 2026

@noa-lucent noa-lucent left a comment


Re-review complete.

Verified the previously requested blockers are addressed:

  • BPE merges now load offline via tiktoken-go-loader (no runtime download).
  • image_url is https-only.
  • audio (input_audio + audio file extensions) now fails explicitly instead of silently counting 0.
  • Added focused unit tests covering parse shapes, BPE token counts, image tiling, and PDF extraction.
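
As a reference point for the "image tiling" tests, here is a sketch of one tiling rule. It assumes the publicly documented OpenAI high-detail calculation (fit within 2048x2048, scale the shorter side down to 768, then 170 tokens per 512px tile plus 85 base tokens); the thread does not say which formula image.go implements, so treat the constants and the function name as assumptions.

```go
package main

import (
	"fmt"
	"math"
)

// Constants follow the assumed high-detail rule described above.
const (
	baseTokens    = 85
	tokensPerTile = 170
	tileSize      = 512
	maxSide       = 2048
	shortSide     = 768
)

// imageTokens estimates the token cost of an image from its pixel size.
func imageTokens(width, height int) int {
	w, h := float64(width), float64(height)

	// Step 1: shrink to fit inside a maxSide x maxSide square.
	if m := math.Max(w, h); m > maxSide {
		s := maxSide / m
		w, h = w*s, h*s
	}
	// Step 2: shrink so the shorter side is at most shortSide.
	// This sketch never upscales smaller images.
	if m := math.Min(w, h); m > shortSide {
		s := shortSide / m
		w, h = w*s, h*s
	}
	// Step 3: count the tileSize x tileSize tiles needed to cover it.
	tilesW := int(math.Ceil(w / tileSize))
	tilesH := int(math.Ceil(h / tileSize))
	return baseTokens + tokensPerTile*tilesW*tilesH
}

func main() {
	fmt.Println(imageTokens(1024, 1024)) // 4 tiles
	fmt.Println(imageTokens(2048, 4096)) // 6 tiles
}
```

Table-driven unit tests over a handful of sizes like these are cheap and pin down the rounding behaviour at tile boundaries.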

Non-blocking: PR description and README still reference the older gateway-based approach / Go 1.24.

@casey-brooks
Contributor Author

Summary

  • update agynio/e2e run-tests action/ref to dbe74e80d6bbf8dde6cd4cac89dd15e8c72247ab for the no-mocks suite

Tests

  • go test ./...
  • go test -v -count=1 -tags e2e ./test/e2e

Lint

  • go vet ./...

@casey-brooks
Contributor Author

Summary

  • update agynio/e2e run-tests action/ref to dbe74e80d6bbf8dde6cd4cac89dd15e8c72247ab for the no-mocks suite

Tests

  • go test ./... (passed)
  • go test -v -count=1 -tags e2e ./test/e2e (passed)

Lint

  • go vet ./... (no issues)

noa-lucent previously approved these changes Apr 25, 2026

@noa-lucent noa-lucent left a comment


Re-review complete.

Verified the workflow now pins agynio/e2e run-tests action to dbe74e80d6bbf8dde6cd4cac89dd15e8c72247ab and passes the matching ref input, keeping CI deterministic while using the no-mocks runner.

@casey-brooks
Contributor Author

Summary

  • update agynio/e2e run-tests action/ref to 0a4f88cc9d2392234dd2faa88427d3051e2722d0 (main merge commit)

Tests

  • go test ./... (passed)
  • go test -v -count=1 -tags e2e ./test/e2e (passed)

Lint

  • go vet ./... (no issues)

noa-lucent previously approved these changes Apr 25, 2026

@noa-lucent noa-lucent left a comment


Re-review complete.

Workflow-only change verified: agynio/e2e run-tests action is now pinned to 0a4f88cc9d2392234dd2faa88427d3051e2722d0 and the ref: input matches.

@casey-brooks
Contributor Author

Summary

  • fix run-tests input to use include_smoke=false so smoke suite stays disabled

Tests

  • go test ./... (passed)
  • go test -v -count=1 -tags e2e ./test/e2e (passed)

Lint

  • go vet ./... (no issues)


@noa-lucent noa-lucent left a comment


Re-review complete.

Workflow-only change verified: the run-tests action input is include_smoke (underscore) in agynio/e2e, and the workflow now passes include_smoke: false accordingly.

@rowan-stein rowan-stein merged commit a79218d into main Apr 25, 2026
2 checks passed
Development

Successfully merging this pull request may close these issues.

Switch agn-cli CI E2E to centralized agynio/e2e suite

3 participants