feat: integrate fastokens BPE tokenizer backend by biswapanda · Pull Request #7387 · ai-dynamo/dynamo

biswapanda · 2026-03-14T21:52:07Z

Overview:

Add the fastokens crate (v0.1.0 from github.com/Atero-ai/fastokens) as an always-on workspace dependency for high-performance BPE encoding.

Related PR: #7388

Details:

Core integration:

lib/llm/src/tokenizers/fast.rs: hybrid FastTokenizer that encodes with fastokens and decodes with HuggingFace, with 4 unit tests
lib/llm/src/model_card.rs: tokenizer() checks DYN_TOKENIZER=fastokens env var, falls back to HuggingFace on load failure
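
The hybrid encode/decode split and the load-failure fallback described above can be illustrated with a small Python sketch. It is not the Rust code from this PR: `ByteTokenizer`, `HybridTokenizer`, and `load_tokenizer` are hypothetical names, and a toy byte-level tokenizer stands in for HuggingFace.

```python
from typing import Callable, List


class ByteTokenizer:
    """Toy stand-in for the reference tokenizer: one id per UTF-8 byte."""

    def encode(self, text: str) -> List[int]:
        return list(text.encode("utf-8"))

    def decode(self, ids: List[int]) -> str:
        return bytes(ids).decode("utf-8")


class HybridTokenizer:
    """Encode with a fast encoder, decode with the reference tokenizer."""

    def __init__(self, fast_encode: Callable[[str], List[int]], reference: ByteTokenizer):
        self._fast_encode = fast_encode
        self._reference = reference

    def encode(self, text: str) -> List[int]:
        return self._fast_encode(text)

    def decode(self, ids: List[int]) -> str:
        return self._reference.decode(ids)


def load_tokenizer(backend: str, make_fast, make_reference):
    """Return the hybrid tokenizer when requested, else the reference one.

    make_fast is a hypothetical factory that returns an encode callable and
    raises if the fast backend cannot be loaded.
    """
    reference = make_reference()
    if backend != "fastokens":
        return reference
    try:
        return HybridTokenizer(make_fast(), reference)
    except Exception:
        # Load failure: fall back to the reference tokenizer.
        return reference
```

The fallback keeps startup working when the fast backend cannot load a given tokenizer file, at the cost of hiding the failure unless it is logged.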

Frontend CLI:

--tokenizer flag / DYN_TOKENIZER env var with values "default" (HuggingFace) or "fastokens"
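
A flag that defaults to an environment variable could look roughly like the sketch below. It uses plain `argparse` rather than the project's own argument helpers, and `parse_args` plus the `frontend-sketch` program name are assumptions made for illustration. Because `argparse` does not check a string default against `choices`, the value is validated again after parsing.

```python
import argparse
import os

TOKENIZER_CHOICES = ("default", "fastokens")


def parse_args(argv, env=None):
    """Parse --tokenizer, using DYN_TOKENIZER as the default when set."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(prog="frontend-sketch")
    parser.add_argument(
        "--tokenizer",
        choices=TOKENIZER_CHOICES,
        default=env.get("DYN_TOKENIZER", "default"),
        help='Tokenizer backend: "default" (HuggingFace) or "fastokens".',
    )
    args = parser.parse_args(argv)
    # choices= only guards values given on the command line, so reject
    # an unsupported value that arrived through the environment.
    if args.tokenizer not in TOKENIZER_CHOICES:
        parser.error(f"invalid DYN_TOKENIZER value: {args.tokenizer!r}")
    return args
```

An explicit command-line value takes precedence over the environment variable in this sketch.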

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

closes GitHub issue: DIS-1569, DIS-1570

Summary by CodeRabbit

New Features
- Added --dyn-tokenizer-backend command-line option to select different tokenizer backends.
- New high-performance tokenizer implementation now available.
Tests
- Added tokenizer test suite and validation data.

Add the fastokens crate (v0.1.0 from github.com/Atero-ai/fastokens) as an always-on workspace dependency for high-performance BPE encoding.

Core integration:
- lib/llm/src/tokenizers/fast.rs: hybrid FastTokenizer that encodes with fastokens and decodes with HuggingFace, with 4 unit tests
- lib/llm/src/model_card.rs: tokenizer() checks DYN_TOKENIZER_BACKEND=fasttokens env var, falls back to HuggingFace on load failure

Frontend CLI:
- --dyn-tokenizer-backend flag / DYN_TOKENIZER_BACKEND env var with values "default" (HuggingFace) or "fasttokens"

coderabbitai · 2026-03-14T22:06:19Z

Walkthrough

This pull request introduces support for a new "fasttokens" tokenizer backend. Changes include: adding the fastokens workspace dependency from a Git repository, exposing CLI configuration for backend selection, implementing a hybrid FastTokenizer struct that uses fastokens for encoding and HuggingFaceTokenizer for decoding, and providing test data for validation.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Workspace Dependencies: `Cargo.toml`, `lib/llm/Cargo.toml` | Added `fastokens` workspace dependency from GitHub repository for high-performance tokenization. |
| CLI Configuration: `components/src/dynamo/frontend/frontend_args.py` | Added `--dyn-tokenizer-backend` CLI argument and `tokenizer_backend` field to FrontendConfig with environment variable support. |
| Environment Setup: `components/src/dynamo/frontend/main.py` | Added environment variable propagation for DYN_TOKENIZER_BACKEND when backend is set to "fasttokens". |
| Tokenizer Abstraction: `lib/llm/src/tokenizers.rs` | Exposed new `fast` module and `FastTokenizer` type in public API. |
| Tokenizer Implementation: `lib/llm/src/tokenizers/fast.rs` | Implemented hybrid `FastTokenizer` using fastokens for encoding with batch processing via rayon and HuggingFaceTokenizer for decoding, including comprehensive unit tests. |
| Dynamic Backend Selection: `lib/llm/src/model_card.rs` | Added environment-driven tokenizer selection logic, attempting FastTokenizer when DYN_TOKENIZER_BACKEND=fasttokens before falling back to HuggingFace. |
| Test Data: `lib/llm/tests/data/sample-models/minimal-bpe/tokenizer.json` | Added minimal BPE tokenizer configuration file for testing round-trip encode/decode workflows. |
| Project Metadata: `lib/bindings/python/pyproject.toml` | Commented out explicit license and license-files metadata declarations. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Hop, hop, the tokens fly so fast,
With fastokens here at last!
Encoding swift with fuzzy cheer,
New backends blooming far and near,
The code now hops with speed so bright! 🚀✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 72.73%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat: integrate fastokens BPE tokenizer backend' accurately summarizes the main change: integrating a new high-performance BPE tokenizer backend throughout the codebase. |
| Description check | ✅ Passed | The PR description covers the required sections (Overview, Details, Related Issues) and provides clear context for the changes, though the related issues reference uses a placeholder format. |

coderabbitai

Actionable comments posted: 6

🧹 Nitpick comments (1)

lib/llm/src/tokenizers/fast.rs (1)

125-143: Make the decode-stream test assert emitted text, not just absence of errors.

Right now this only proves step() doesn't fail. Empty or duplicated chunks would still pass. Please accumulate the returned chunks and compare them with a reference decode for the continuation.
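
The accumulate-and-compare pattern the reviewer asks for can be sketched as follows. This is a Python illustration with a toy byte-level stream, not the Rust test in `fast.rs`; `ByteDecodeStream` and `collect_stream` are hypothetical names.

```python
import codecs
from typing import Iterable, List, Optional


class ByteDecodeStream:
    """Toy incremental decoder: each id is one UTF-8 byte.

    step() returns newly completed text, or None while a multi-byte
    character is still incomplete.
    """

    def __init__(self, prompt_ids: List[int]):
        self._decoder = codecs.getincrementaldecoder("utf-8")()
        # Prime the decoder with the prompt; its text is not emitted.
        self._decoder.decode(bytes(prompt_ids))

    def step(self, token_id: int) -> Optional[str]:
        chunk = self._decoder.decode(bytes([token_id]))
        return chunk or None


def collect_stream(stream: ByteDecodeStream, ids: Iterable[int]) -> str:
    """Feed ids one at a time and concatenate the non-empty chunks."""
    parts = []
    for token_id in ids:
        chunk = stream.step(token_id)
        if chunk:
            parts.append(chunk)
    return "".join(parts)


prompt_ids = list("Hello, ".encode("utf-8"))
continuation = "wörld ✓"
cont_ids = list(continuation.encode("utf-8"))

# Compare the streamed text with a reference decode of the continuation.
streamed = collect_stream(ByteDecodeStream(prompt_ids), cont_ids)
assert streamed == bytes(cont_ids).decode("utf-8")
```

Asserting on the accumulated text catches empty or duplicated chunks, which a loop that only checks for errors would miss.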

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/llm/src/tokenizers/fast.rs` around lines 125 - 143, The test
test_fast_with_decode_stream currently only checks that stream.step() doesn't
error; change it to accumulate the emitted chunks from
wrapper.decode_stream(&prompt_ids, true) by collecting each step(...) return
(concatenate non-empty chunks) into a single string, then obtain the expected
text by decoding the continuation (e.g. via wrapper.decode(cont_ids) or
wrapper.decode(continuation)) and assert equality (or assert that the
accumulated string contains the expected continuation); update references in the
test around FastTokenizer::from_file, TokenizerWrapper::from,
wrapper.decode_stream, and stream.step to perform this accumulation and
comparison instead of a no-op loop.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Cargo.toml`:
- Line 49: The fastokens Git dependency is floating and needs an immutable ref;
update the Cargo.toml dependency line for fastokens (the entry fastokens = { git
= "https://github.com/Atero-ai/fastokens", version = "0.1.0" }) to include a rev
(or tag) field with the specific commit hash or tag you want pinned (e.g., rev =
"COMMIT_SHA") so the manifest is reproducible; ensure you use the exact commit
SHA or tag from the fastokens repo and run cargo update -p fastokens if needed
to regenerate the lockfile.

In `@components/src/dynamo/frontend/frontend_args.py`:
- Around line 429-440: The CLI currently allows any string for the
--dyn-tokenizer-backend argument; update the add_argument call that defines
flag_name="--dyn-tokenizer-backend" (dest="tokenizer_backend") to restrict
values to the supported set by adding an explicit choices constraint (e.g.,
["default","fasttokens"]) so parsing fails fast for invalid CLI/env values and
documents the accepted options in the help text.

In `@components/src/dynamo/frontend/main.py`:
- Around line 168-169: The current logic only sets
os.environ["DYN_TOKENIZER_BACKEND"] = "fasttokens" when config.tokenizer_backend
== "fasttokens" and never clears it; update the branch around
config.tokenizer_backend in main.py so that when config.tokenizer_backend ==
"default" you remove/unset DYN_TOKENIZER_BACKEND (e.g., pop or del from
os.environ if present), keep setting it for "fasttokens" as before, and ensure
no stale environment value remains when the default backend is chosen.

In `@lib/bindings/python/pyproject.toml`:
- Around line 25-26: Uncomment and replace the outdated dict license entries in
pyproject.toml by adding explicit SPDX and license-files fields: remove the
commented lines and add license = "Apache-2.0" and license-files = ["LICENSE"]
so the project uses the PEP 639-compatible string expression and explicit
license file declaration (update the existing commented/old keys related to
license and license-files).

In `@lib/llm/src/model_card.rs`:
- Around line 384-386: The current logic silently treats any non-"fasttokens"
DYN_TOKENIZER_BACKEND as the default behavior; change the handling to explicitly
match allowed values: if the env var is "fasttokens" set use_fast = true, if it
is "default" (or an explicit "slow"/"rust" token value you support) set use_fast
= false, and for any other value emit a clear warning or return an error so
misconfiguration is visible; update the code that reads DYN_TOKENIZER_BACKEND
(the use_fast assignment) to perform a match on the string and log/propagate an
error on unsupported values rather than silently falling back.
- Around line 394-399: When attempting the fast tokenizer, don't call
p.to_str().ok_or_else(...) which returns early on non-UTF-8 paths and prevents
the HuggingFace fallback; instead check p.to_str() with an if let/ match and
only call crate::tokenizers::FastTokenizer::from_file(path_str) when to_str()
returns Some. If to_str() is None, skip the fast-tokenizer attempt (do not
return an error) so the existing HF loader/fallback logic can run; also when
FastTokenizer::from_file fails, allow the code to continue to the HuggingFace
fallback rather than short-circuiting.
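
The two environment-variable findings above (a stale value left behind in `main.py`, and unsupported values silently treated as the default in `model_card.rs`) share one idea: make every state explicit. A Python sketch under stated assumptions follows. `propagate_backend` and `resolve_use_fast` are hypothetical helpers, the second only mirrors the logic suggested for the Rust side, and the value is spelled `fastokens`, the spelling the PR later settled on.

```python
from typing import Mapping, MutableMapping

ENV_VAR = "DYN_TOKENIZER_BACKEND"
FAST = "fastokens"
DEFAULT = "default"


def propagate_backend(backend: str, env: MutableMapping[str, str]) -> None:
    """Set the variable for the fast backend and clear it otherwise."""
    if backend == FAST:
        env[ENV_VAR] = FAST
    else:
        # Remove any stale value inherited from the parent environment.
        env.pop(ENV_VAR, None)


def resolve_use_fast(env: Mapping[str, str]) -> bool:
    """Map the variable to a boolean, rejecting unsupported values."""
    value = env.get(ENV_VAR)
    if value is None or value == DEFAULT:
        return False
    if value == FAST:
        return True
    raise ValueError(f"unsupported {ENV_VAR} value: {value!r}")
```

Raising on an unknown value is one option; logging a warning and using the default, as the review also allows, would be the more lenient choice.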

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 14f36a69-69af-4531-bb2b-0dfeeb99d059

📥 Commits

Reviewing files that changed from the base of the PR and between 0b66515 and 13eccdd.

⛔ Files ignored due to path filters (2)

Cargo.lock is excluded by !**/*.lock
lib/bindings/python/Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (9)

Cargo.toml
components/src/dynamo/frontend/frontend_args.py
components/src/dynamo/frontend/main.py
lib/bindings/python/pyproject.toml
lib/llm/Cargo.toml
lib/llm/src/model_card.rs
lib/llm/src/tokenizers.rs
lib/llm/src/tokenizers/fast.rs
lib/llm/tests/data/sample-models/minimal-bpe/tokenizer.json

biswapanda · 2026-03-14T23:17:08Z

cargo-deny fails in CI because of upstream transitive dependencies. A fix for fastokens is under review by the Crusoe team: crusoecloud/fastokens#5

Issue: The fastokens crate (v0.1.0) declares hf-hub = "0.4.3" with default features, which pulls in native-tls and openssl-sys. Dynamo's deny.toml explicitly bans both crates (lines 63-64), causing cargo-deny to fail in CI.

The dependency chain is:

fastokens -> hf-hub (default features) -> ureq -> native-tls -> openssl-sys

biswapanda self-assigned this Mar 14, 2026

biswapanda requested a review from a team as a code owner March 14, 2026 21:52

biswapanda requested a review from a team March 14, 2026 21:52

biswapanda requested a review from a team as a code owner March 14, 2026 21:52

pull-request-size Bot added the size/L label Mar 14, 2026

github-actions Bot added feat frontend `python -m dynamo.frontend` and `dynamo-run in=http|text|grpc` labels Mar 14, 2026

biswapanda enabled auto-merge (squash) March 14, 2026 21:57

biswapanda changed the title ~~feat: integrate fasttokens high-performance BPE tokenizer backend~~ feat: integrate fasttokens BPE tokenizer backend Mar 14, 2026

coderabbitai Bot reviewed Mar 14, 2026

nnshah1 reviewed Mar 14, 2026

Comment thread components/src/dynamo/frontend/frontend_args.py Outdated

update

ec22387

biswapanda added 3 commits March 14, 2026 16:58

fix

07c9f3c

rename fasttokens to fastokens

7af421e

fix CI issue. point to HEAD of crusoecloud/fastokens#5 until PR is me…

70d9065

…rged

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 00:19 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 00:22 Inactive

biswapanda added 3 commits March 14, 2026 18:09

address comment

45c1362

address comment

516cc2e

update

076f94d

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 01:11 Inactive

update

ae96da0

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 01:20 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 01:21 Inactive

biswapanda mentioned this pull request Mar 15, 2026

deps: reduce hf-hub's transitive dependencies crusoecloud/fastokens#5

Merged

biswapanda requested a review from nnshah1 March 15, 2026 01:24

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 04:46 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 04:47 Inactive

cargo toml/lock update

dbc6a55

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 05:05 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 05:06 Inactive

rename to fastokens

9e21cf5

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 05:29 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 05:30 Inactive

Merge branch 'main' into bis/fast-tokens-dynamo

6b419ac

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 05:47 Inactive

biswapanda disabled auto-merge March 15, 2026 10:48

update reference to https://github.com/Atero-ai/fastokens

b2b1c6b

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 17:35 Inactive

static linking lib-pcre2 for fastokens

ddef6f2

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 19:03 Inactive

fatokens src: from github to crates.io

fea3e82

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 20:52 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB March 15, 2026 20:53 Inactive

biswapanda requested a review from hutm March 15, 2026 20:56

biswapanda enabled auto-merge (squash) March 15, 2026 22:13

nv-anants reviewed Mar 15, 2026

Comment thread Cargo.toml

nv-anants reviewed Mar 15, 2026

Comment thread .cargo/config.toml

nv-anants approved these changes Mar 15, 2026

biswapanda merged commit da810a2 into main Mar 15, 2026
150 checks passed

biswapanda deleted the bis/fast-tokens-dynamo branch March 15, 2026 23:49

ShounakRay pushed a commit to ShounakRay/fuzzy-dynamo that referenced this pull request Mar 20, 2026

feat: integrate fastokens BPE tokenizer backend (ai-dynamo#7387)

f393c02

yao531441 pushed a commit to yao531441/dynamo that referenced this pull request May 13, 2026

feat: integrate fastokens BPE tokenizer backend (ai-dynamo#7387)

15939b3

biswapanda merged 18 commits into main from bis/fast-tokens-dynamo

5 participants
