
Add xAI Grok Tokenizer Integration #10

@ivelin-web

Description

  • Use @xenova/transformers to load the Xenova/grok-1-tokenizer from Hugging Face.

    • For Grok models:

      • Grok-1: SentencePiece-based tokenizer with a 131,072-token vocabulary.
      • Grok-2 / Grok-2-Vision: confirm when available; assume the same tokenizer, or adjust if xAI releases a separate package.
  • Implement dynamic import:

    ```js
    // Lazy-load the tokenizer so @xenova/transformers is only pulled in when needed.
    const { AutoTokenizer } = await import('@xenova/transformers');
    const tok = await AutoTokenizer.from_pretrained('Xenova/grok-1-tokenizer');
    // encode() returns an array of token IDs; its length is the token count.
    return (txt) => tok.encode(txt).length;
    ```
  • Determine each Grok model’s context limit and save it in config (note: Grok-1’s open release uses an 8,192-token context window; 131,072 is its vocabulary size). See the config sketch after this list.
  • In tokenizers/index.ts, when tenant === 'grok', return (text) => tok.encode(text).length (see the integration sketch below).
  • Add a fallback approximation (text.length / 4) if loading fails, and prefix the meter with .
  • Create a small unit test: load a 1,000-token Grok prompt and confirm the count falls within ±5% of xAI’s published numbers (see the test sketch below).
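A minimal sketch of the context-limit config. The file name, constant name, and the commented-out Grok-2 entries are placeholders, not from this issue; values should be verified against xAI’s published specs:

```ts
// Hypothetical config module, e.g. src/config/grok.ts (name is an assumption).
export const GROK_CONTEXT_LIMITS: Record<string, number> = {
  'grok-1': 8_192,          // open-weights Grok-1 release
  // 'grok-2': TODO,        // confirm once xAI publishes the limit
  // 'grok-2-vision': TODO, // confirm once xAI publishes the limit
};
```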
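A sketch of the tokenizers/index.ts integration with the fallback path. The getTokenCounter entry point, the TokenCounter type, and the module-level cache are assumptions for illustration; only the @xenova/transformers calls come from the snippet above:

```ts
type TokenCounter = (text: string) => number;

let grokCounter: TokenCounter | null = null;

export async function getTokenCounter(tenant: string): Promise<TokenCounter> {
  if (tenant === 'grok') {
    if (grokCounter) return grokCounter; // reuse the already-loaded tokenizer

    try {
      // Dynamic import keeps @xenova/transformers out of the main bundle.
      const { AutoTokenizer } = await import('@xenova/transformers');
      const tok = await AutoTokenizer.from_pretrained('Xenova/grok-1-tokenizer');
      grokCounter = (text) => tok.encode(text).length;
    } catch {
      // Fallback approximation when the tokenizer cannot be loaded;
      // callers can detect this case to prefix the meter accordingly.
      grokCounter = (text) => Math.ceil(text.length / 4);
    }
    return grokCounter;
  }

  throw new Error(`Unknown tenant: ${tenant}`);
}
```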
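A sketch of the accuracy test, assuming Vitest as the test runner; the fixture path and the expected count are placeholders for a real prompt with a known xAI-published token count:

```ts
import { describe, expect, it } from 'vitest';
import { readFileSync } from 'node:fs';
import { getTokenCounter } from '../src/tokenizers'; // path is an assumption

describe('grok tokenizer', () => {
  it('counts a ~1,000-token prompt within ±5% of the reference figure', async () => {
    const prompt = readFileSync('test/fixtures/grok-1000-token-prompt.txt', 'utf8');
    const expected = 1000; // replace with the count published by xAI for this prompt

    const count = (await getTokenCounter('grok'))(prompt);

    expect(count).toBeGreaterThanOrEqual(expected * 0.95);
    expect(count).toBeLessThanOrEqual(expected * 1.05);
  });
});
```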
