What feature would you like to request?
It would be nice to have an optional parameter for custom models that specifies which model output to use.
Right now I'm using fastembed to test qwen3. To make the model fastembed-compatible, I have to remove the first model output from the graph, because there is no way to specify which output fastembed should read.
This could probably give small performance gains too: the secondary output sometimes already has pooling and normalization applied inside the optimized ONNX graph, and I assume that is slightly faster than doing those steps afterwards.
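To illustrate the request, here is a minimal sketch of how an output selector could work. It is not fastembed's API: the helper names and the `output` parameter are hypothetical. It only assumes a session object with the `get_outputs()` and `run(output_names, input_feed)` methods that `onnxruntime.InferenceSession` provides.

```python
from typing import Any, Mapping, Union


def resolve_output_name(session: Any, output: Union[int, str] = 0) -> str:
    """Map an output index or name to a name declared by the model graph.

    `output` is a hypothetical parameter; the default of 0 keeps the
    first-output behaviour described in this issue.
    """
    names = [node.name for node in session.get_outputs()]
    if isinstance(output, int):
        if not -len(names) <= output < len(names):
            raise ValueError(f"output index {output} out of range for {names}")
        return names[output]
    if output not in names:
        raise ValueError(f"unknown output {output!r}; available: {names}")
    return output


def run_selected_output(
    session: Any,
    input_feed: Mapping[str, Any],
    output: Union[int, str] = 0,
) -> Any:
    """Run the model and return only the requested output."""
    name = resolve_output_name(session, output)
    # Passing a list of names asks the runtime to return just those outputs.
    return session.run([name], input_feed)[0]
```

With something like this, a custom model could be registered with `output=1` (or the output's name) instead of editing the graph to drop the first output.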
Is there any additional information you would like to provide?
No response