Multi Model Decoder Modification#1

Merged
prudhvi-qti merged 3 commits into prudhvi-qti/multimodel-decoding-pipeline from
chilukam/multimodel-decoding-pipeline
Feb 10, 2026
Conversation

@chilukam-qti
Collaborator

@chilukam-qti chilukam-qti commented Feb 10, 2026

Contains the following modifications:

  1. KV cache updated after inference of a model that contains KV cache as output
    1. KV cache indices added to pipelines containing KV cache as inputs or outputs (previously done only for KV cache as input)
    2. KV cache updated after inference of a model containing KV cache as output
  2. Memcpy of embeddings considers float and uint16 datatypes for the count in copy_n
  3. Generalised pixel_values tensor name based on genai_config
  4. Optimized sliding-window-based KV cache copy from present to past by copying only the cache for seqlen instead of the entire context length

chilukam-qti added 2 commits February 10, 2026 12:44
1. KV cache updated after inference of a model that contains KV cache as output
	a. KV cache indices added to pipelines containing KV cache as inputs
	or outputs (previously done only for KV cache as input)
	b. KV cache updated after inference of a model containing KV cache
	as output
2. Memcpy of embeddings considers float and uint16 datatypes
   for the count in copy_n
Optimized Sliding Window based KVCache copy from present to past by
copying only cache for seqlen instead of entire context length
@prudhvi-qti prudhvi-qti merged commit 96dfb53 into prudhvi-qti/multimodel-decoding-pipeline Feb 10, 2026
1 of 10 checks passed