Follow-up to #2 (fixed in bb31e8f).
The fix shipped a hardcoded 24,000-rune cap in two layers:
- `MaxEmbeddingTextRunes` in `internal/store/embedding_tasks.go` (primary; participates in the content hash, so future cap changes force a re-embed).
- `maxEmbeddingInputRunes` in `internal/openai/client.go` (defensive cap applied before each request).
Sized for OpenAI's 8192-token limit at a ~3 chars/token floor. Works for every current OpenAI embedding model (`text-embedding-3-small`, `text-embedding-3-large`, and `text-embedding-ada-002` all share the 8192-token limit).
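For reference, a minimal sketch of the two-layer cap. The constant names come from the fix; the `truncateRunes` helper and its wiring are illustrative, not the actual code:

```go
package main

import "fmt"

// Rune caps from the fix: 8192 tokens × ~3 chars/token ≈ 24,576,
// rounded down to 24,000 runes. The store-layer constant feeds the
// content hash; the client-layer constant is a last-resort guard.
const (
	MaxEmbeddingTextRunes  = 24000 // internal/store: part of the content hash
	maxEmbeddingInputRunes = 24000 // internal/openai: defensive cap per request
)

// truncateRunes cuts s to at most max runes. Counting runes rather than
// bytes avoids splitting a multi-byte UTF-8 sequence mid-character.
func truncateRunes(s string, max int) string {
	r := []rune(s)
	if len(r) <= max {
		return s
	}
	return string(r[:max])
}

func main() {
	text := "some very long document..."
	fmt.Println(len([]rune(truncateRunes(text, MaxEmbeddingTextRunes))))
}
```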
Limitations of the constant approach:
- New models with different limits need a code change.
- Non-OpenAI providers behind `GITCRAWL_OPENAI_BASE_URL` may have different caps.
- The conservative ratio wastes ~25% of the available context on typical English (~4 chars/token): 8192 tokens × 4 chars ≈ 32,768 chars, of which the cap admits only 24,000.
Options to consider:
- Static `model → token_limit` table. Cheapest; one line per new model.
- `tiktoken-go` for exact pre-flight token counting (still needs a model → limit table); +1.5 MB BPE table. A combined sketch of these first two options appears below.
- Probe-and-cache from 400 error responses (`maximum input length is N tokens`). Adaptive, but relies on OpenAI's error-string format; a parsing sketch appears at the end of this note.
- Opportunistically read `context_length` from `/v1/models` for compatible providers (LiteLLM, vLLM); fall back to the table.
OpenAI itself does not expose token limits via API.
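A minimal sketch combining the first two options, assuming the `github.com/pkoukk/tiktoken-go` package; the `tokenLimitFor` helper, fallback value, and table layout are illustrative:

```go
package main

import (
	"fmt"

	"github.com/pkoukk/tiktoken-go"
)

// tokenLimits is the static model → token_limit table: one line per new model.
var tokenLimits = map[string]int{
	"text-embedding-3-small": 8192,
	"text-embedding-3-large": 8192,
	"text-embedding-ada-002": 8192,
}

// defaultTokenLimit is a conservative fallback for models not in the table.
const defaultTokenLimit = 8192

func tokenLimitFor(model string) int {
	if n, ok := tokenLimits[model]; ok {
		return n
	}
	return defaultTokenLimit
}

// fitsLimit reports whether text fits the model's token limit using an
// exact pre-flight count instead of the ~3 chars/token heuristic.
func fitsLimit(model, text string) (bool, error) {
	enc, err := tiktoken.EncodingForModel(model) // pulls in the ~1.5 MB BPE table
	if err != nil {
		return false, err
	}
	return len(enc.Encode(text, nil, nil)) <= tokenLimitFor(model), nil
}

func main() {
	ok, err := fitsLimit("text-embedding-3-small", "hello embedding world")
	fmt.Println(ok, err)
}
```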
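And a sketch of the probe-and-cache option, assuming the 400 body carries the `maximum input length is N tokens` wording quoted above; the regex, cache, and function names are illustrative, and the error format is not contractual:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"sync"
)

// maxTokensRe matches the "maximum input length is N tokens" phrase seen
// in OpenAI 400 responses. Brittle by design: if the wording changes,
// the caller keeps its static fallback limit.
var maxTokensRe = regexp.MustCompile(`maximum input length is (\d+) tokens`)

// limitCache remembers probed limits per model so each model pays the
// 400-probe cost at most once per process.
var limitCache sync.Map // model → int

// learnLimitFrom parses a 400 response body and caches the limit it reports.
func learnLimitFrom(model, body string) (int, bool) {
	m := maxTokensRe.FindStringSubmatch(body)
	if m == nil {
		return 0, false // unrecognized format: fall back to the static cap
	}
	n, err := strconv.Atoi(m[1])
	if err != nil {
		return 0, false
	}
	limitCache.Store(model, n)
	return n, true
}

func main() {
	body := `{"error":{"message":"... maximum input length is 8192 tokens ..."}}`
	fmt.Println(learnLimitFrom("text-embedding-3-small", body))
}
```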