Added MDP generation to QEff Compile #930

Open
quic-mohmeh wants to merge 3 commits into quic:main from quic-mohmeh:mdp

@quic-mohmeh

This PR adds the MDP generation required for Prefill in disaggregated serving. It supports Pipeline Prefill + Tensor Slicing, and it also allows custom cores to be passed to the MDP generator.
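
For illustration only, and not the code in this PR: a minimal sketch of how a partition config that combines pipeline stages with tensor slicing might be assembled. The function name `build_mdp_config`, its parameters, and the JSON field names are all assumptions made for this example. "Custom cores" is interpreted here as a per-device core count, which may differ from what the MDP generator actually accepts.

```python
import json
from typing import Dict, List, Optional


def build_mdp_config(
    num_devices: int,
    num_stages: int = 1,
    cores_per_device: int = 16,
    custom_cores: Optional[List[int]] = None,
) -> Dict:
    """Hypothetical helper: split devices into pipeline stages.

    Each stage becomes one partition, and the devices inside a stage are
    the ones a tensor-sliced layer would be spread across. The schema
    below is illustrative and is not taken from the PR.
    """
    if num_devices < 1 or num_stages < 1:
        raise ValueError("num_devices and num_stages must be positive")
    if num_devices % num_stages != 0:
        raise ValueError("num_devices must be divisible by num_stages")
    if custom_cores is not None and len(custom_cores) != num_devices:
        raise ValueError("custom_cores needs one entry per device")

    # Fall back to a uniform core count when no custom cores are given.
    cores = custom_cores if custom_cores is not None else [cores_per_device] * num_devices
    devices_per_stage = num_devices // num_stages

    partitions = []
    for stage in range(num_stages):
        first = stage * devices_per_stage
        device_ids = range(first, first + devices_per_stage)
        partitions.append(
            {
                "name": f"Partition{stage}",
                "devices": [
                    {"deviceId": d, "numCores": cores[d]} for d in device_ids
                ],
            }
        )
    return {"partitions": partitions}


if __name__ == "__main__":
    # Two pipeline stages, two tensor-sliced devices each, custom core counts.
    config = build_mdp_config(4, num_stages=2, custom_cores=[16, 16, 8, 8])
    print(json.dumps(config, indent=2))
```

In this sketch, `num_stages=1` yields a single partition spanning all devices (tensor slicing only), while `num_stages == num_devices` yields one device per partition (pipeline only).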

Signed-off-by: Mohit Mehta <mohmeh@qti.qualcomm.com>
@quic-mohmeh
Author

Tested and working with the following models:

  • CodeLlama-7b-Instruct
  • falcon-7b-instruct
  • gemma-2-9b-it
  • gpt-oss-20b
  • granite-3.1-8b-instruct
  • Llama-3.2-1B-Instruct
  • Llama-3.2-3B
  • Phi-3-mini-4k-instruct

@quic-rishinr
Contributor

@mamtsing @ochougul please review the PR
