- Use `@xenova/transformers` to load the `Xenova/grok-1-tokenizer` from Hugging Face.
- For Grok models:
  - Grok-1: SentencePiece-based tokenizer with a 131 072-entry vocabulary.
  - Grok-2 / Grok-2-Vision: Confirm when available; assume the same tokenizer, or adjust if xAI releases a separate package.
- Implement dynamic import:
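  ```js
  // Inside an async factory: load the library lazily, only when Grok is selected.
  const { AutoTokenizer } = await import('@xenova/transformers');
  // Transformers.js exposes the snake_case `from_pretrained`, not `fromPretrained`.
  const tok = await AutoTokenizer.from_pretrained('Xenova/grok-1-tokenizer');
  return (txt) => tok.encode(txt).length;
  ```
- Determine Grok’s model context limits and save them in config (e.g., 8,192 tokens for the open-weights Grok-1 release; confirm the figure for each hosted model against xAI’s documentation).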
- In `tokenizers/index.ts`, when `tenant === 'grok'`, return `(text) => tok.encode(text).length`.
- Add a fallback approximation (`text.length / 4`) if loading fails, and prefix the meter with `≈`.
- Create a small unit test: load a 1 000-token Grok prompt and confirm the count aligns with xAI’s published numbers within ±5 %.
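
A minimal sketch of how the loader, the fallback, and the `≈` prefix could fit together. It is not the project's actual code: `TokenizerLike`, `TokenCounter`, `LoadTokenizer`, `createGrokCounter`, `formatMeter`, and `withinTolerance` are hypothetical names, the meter format `n / limit` is an assumption, and rounding the fallback up with `Math.ceil` is a choice the issue does not specify. The loader is injected so the fallback path can be tested without network access.

```typescript
// Sketch only: every name below is hypothetical, not taken from the repo.

/** The only tokenizer behaviour relied on: encode() returns token ids. */
interface TokenizerLike {
  encode(text: string): number[];
}

/** A counter plus a flag saying whether its counts are exact or estimated. */
interface TokenCounter {
  count(text: string): number;
  approximate: boolean;
}

type LoadTokenizer = () => Promise<TokenizerLike>;

/** Default loader: dynamic import, so the package is only loaded for Grok. */
const loadGrokTokenizer: LoadTokenizer = async () => {
  // Specifier kept in a variable so this sketch type-checks even when the
  // package is not installed; a literal string works in the real project.
  const pkg = '@xenova/transformers';
  const { AutoTokenizer } = await import(pkg);
  return AutoTokenizer.from_pretrained('Xenova/grok-1-tokenizer');
};

/** Fallback from the issue: about four characters per token (rounded up here). */
const approximateCount = (text: string): number => Math.ceil(text.length / 4);

async function createGrokCounter(
  load: LoadTokenizer = loadGrokTokenizer,
): Promise<TokenCounter> {
  try {
    const tok = await load();
    return { count: (text) => tok.encode(text).length, approximate: false };
  } catch {
    // Loading failed (offline, blocked download, ...): degrade to the estimate.
    return { count: approximateCount, approximate: true };
  }
}

/** Render the meter, prefixed with "≈" when the count is only an estimate. */
function formatMeter(counter: TokenCounter, text: string, limit: number): string {
  const n = counter.count(text);
  return `${counter.approximate ? '≈' : ''}${n} / ${limit}`;
}

/** True when `actual` lies within ±tolerance (default 5 %) of `expected`. */
function withinTolerance(actual: number, expected: number, tolerance = 0.05): boolean {
  return Math.abs(actual - expected) <= expected * tolerance;
}
```

In `tokenizers/index.ts`, the `tenant === 'grok'` branch could then await `createGrokCounter()` once and pass the result to `formatMeter`; the unit test would compare the real count for the reference prompt against the published figure with `withinTolerance`.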