feat: Add basic text generation support with native models, initially supporting Gemma3 #12392
comfyanonymous merged 39 commits into Comfy-Org:master from
Conversation
Previously, with long prompts the outputs would start repeating.
Should fix the corruption with long prompts.
```python
return self.transformer.load_state_dict(sd, strict=False, assign=getattr(self, "can_assign_sd", False))
```
```python
def generate(self, tokens, do_sample, max_length, temperature, top_k, top_p, min_p, repetition_penalty, seed, stop_tokens=[]):
    if isinstance(tokens, dict):
```
Why do you need to handle dicts?
```python
return {}
```
```python
def decode(self, token_ids, skip_special_tokens=True):
    if torch.is_tensor(token_ids):
```
To make things consistent and easier, the token_ids should always be a single data type. If they can be either lists or tensors, it makes things less maintainable.
True, it can always stay as a list of ints; that's cleaner. Fixed.
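A minimal sketch of the convention settled on here (hypothetical `self.tokenizer` with an HF-style `decode`, not the PR's final code): `token_ids` is always a plain list of ints, and any caller holding a tensor converts it with `.tolist()` before calling in.

```python
def decode(self, token_ids, skip_special_tokens=True):
    # token_ids is always a list of ints; tensor callers do decode(ids.tolist())
    return self.tokenizer.decode(token_ids, skip_special_tokens=skip_special_tokens)
```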
```python
comfy.ops.uncast_bias_weight(module, weight, None, offload_stream)
return x
```
```python
def generate(self, embeds=None, do_sample=True, max_length=256, temperature=1.0, top_k=50, top_p=0.9, min_p=0.0, repetition_penalty=1.0, seed=42, stop_tokens=[], initial_tokens=[], execution_dtype=None, min_tokens=0):
```
Where did you get these default numbers?
They are just placeholders; the actual defaults come from the node that calls this.
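For readers unfamiliar with these knobs, here is a minimal sketch (not the PR's actual sampling code) of how temperature, top_k, top_p, and min_p are conventionally applied to a logits vector before sampling:

```python
import torch

def filter_logits(logits, temperature=1.0, top_k=50, top_p=0.9, min_p=0.0):
    # Temperature rescales the distribution; higher = more random.
    logits = logits / max(temperature, 1e-5)
    # top_k keeps only the k highest-scoring tokens.
    if top_k > 0:
        kth = torch.topk(logits, top_k).values[..., -1, None]
        logits = logits.masked_fill(logits < kth, float("-inf"))
    probs = torch.softmax(logits, dim=-1)
    # top_p drops the low-probability tail once cumulative probability exceeds p.
    if top_p < 1.0:
        sorted_probs, sorted_idx = torch.sort(probs, descending=True)
        remove = torch.cumsum(sorted_probs, dim=-1) - sorted_probs > top_p
        remove = torch.zeros_like(remove).scatter(-1, sorted_idx, remove)
        logits = logits.masked_fill(remove, float("-inf"))
    # min_p drops tokens whose probability is below min_p times the max probability.
    if min_p > 0.0:
        logits = logits.masked_fill(probs < min_p * probs.max(-1, keepdim=True).values, float("-inf"))
    return logits
```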
```python
    images = []
else:
    samples = image.movedim(-1, 1)
    total = int(896 * 896)
```
It's the default for Gemma3, as stated on their model page: "Images, normalized to 896 x 896 resolution and encoded to 256 tokens each"
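For context, the arithmetic behind those two numbers, under the assumption (from the Gemma3 technical report) that the SigLIP vision tower uses 14-pixel patches followed by 4x4 average pooling:

```python
patches = (896 // 14) ** 2    # 64 x 64 = 4096 vision patches per image
image_tokens = patches // 16  # 4x4 average pooling -> 256 tokens per image
```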
```python
embed_count = 0
for r in text_tokens:
    for i, token in enumerate(r):
        if token[0] == 262144 and embed_count < len(images):
```
This is the token id for `<image_soft_token>`.
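A minimal sketch of the splicing idea in the loop above (hypothetical names, not the PR's code): every position whose token id is 262144 has its text embedding overwritten by the next row of precomputed image embeddings.

```python
IMAGE_SOFT_TOKEN_ID = 262144  # <image_soft_token> in the Gemma3 vocabulary

def splice_image_embeds(embeds, token_ids, image_embeds):
    # embeds: (seq_len, dim) text embeddings; image_embeds: (n_image_tokens, dim)
    positions = [i for i, t in enumerate(token_ids) if t == IMAGE_SOFT_TOKEN_ID]
    for row, pos in enumerate(positions[: image_embeds.shape[0]]):
        embeds[pos] = image_embeds[row]
    return embeds
```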
```python
def generate(self, tokens, do_sample, max_length, temperature, top_k, top_p, min_p, repetition_penalty, seed):
    tokens_only = [[t[0] for t in b] for b in tokens]
    embeds, _, _, embeds_info = self.process_tokens(tokens_only, self.execution_device)
    embeds = comfy.utils.normalize_image_embeddings(embeds, embeds_info, target_std=0.0156)
```
Hmm, this one could be done better; changed it to the proper calculation.
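For intuition, a minimal sketch of what a target-std normalization like the call above does (hypothetical helper; per the comment, the hardcoded 0.0156 was later replaced by a proper calculation): rescale the image embeddings so their standard deviation matches the target.

```python
def normalize_to_std(image_embeds, target_std):
    # Rescale so image_embeds.std() becomes target_std (clamped to avoid /0).
    std = image_embeds.float().std()
    return image_embeds * (target_std / std.clamp_min(1e-8))
```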
Great PR, very useful; we don't need to load and call a VL model separately anymore. What about Qwen VL 2.5 or 3 with video as input, to describe a video?
This adds generic text generation support; it is currently tested and works with Gemma3.
Generation itself also works with at least Qwen VL 2.5, but the model loading part needs figuring out how to handle the lm_head weight so that it's not loaded if text generation isn't used. This isn't an issue with Gemma3, as it doesn't have a separate lm_head.
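One way the lm_head question can be sidestepped for models like Gemma3 (a sketch under the assumption stated above that there is no separate lm_head, i.e. the output projection reuses the token embedding matrix):

```python
import torch.nn.functional as F

def logits_from_tied_embeddings(hidden_states, embed_tokens):
    # embed_tokens.weight: (vocab_size, dim); hidden_states: (..., dim)
    # Reusing the embedding matrix as the output projection means no
    # separate lm_head weight ever needs to be loaded.
    return F.linear(hidden_states, embed_tokens.weight)
```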
For example, with LTX2 the same Gemma3 12B model can be used as both the text encoder and a prompt enhancer.