Multi Model Decoder Modification#1

Merged
prudhvi-qti merged 3 commits into prudhvi-qti/multimodel-decoding-pipeline from
chilukam/multimodel-decoding-pipeline
Feb 10, 2026
Conversation

@chilukam-qti
Collaborator

@chilukam-qti chilukam-qti commented Feb 10, 2026

Contains the following modifications:

  1. KV cache updated after inference of a model that contains KV cache as output
    1. KV cache indices added to pipelines containing KV cache as inputs or outputs (previously done only for KV cache as input)
    2. KV cache updated after inference of a model containing KV cache as output
  2. Memcpy of embeddings considers float and uint16 datatypes for the count in copy_n
  3. Generalised pixel_values tensor name based on genai_config
  4. Optimized sliding-window-based KV cache copy from present to past by copying only the cache for seqlen instead of the entire context length

chilukam-qti added 2 commits February 10, 2026 12:44
1. KV cache updated after inference of a model that contains KV cache as output
	a. KV cache indices added to pipelines containing KV cache as inputs
	or outputs (previously done only for KV cache as input)
	b. KV cache updated after inference of a model containing KV cache
	as output
2. Memcpy of embeddings considers float and uint16 datatypes
   for the count in copy_n
Optimized Sliding Window based KVCache copy from present to past by
copying only cache for seqlen instead of entire context length
@prudhvi-qti prudhvi-qti merged commit 96dfb53 into prudhvi-qti/multimodel-decoding-pipeline Feb 10, 2026
1 of 10 checks passed