Description
What would you like to happen?
The current implementation of RunInference provides model handlers for PyTorch and Sklearn models. These handlers assume that the method to call for inference is fixed:
- PyTorch: do a forward pass by calling the `__call__` method -> `output = torch_model(input)`
- Sklearn: call the model's `predict` method -> `output = sklearn_model.predict(input)`
However, in some cases we want to provide a custom method for RunInference to call.
Two examples:
- A number of pretrained models loaded with the Huggingface transformers library recommend using the `generate()` method. From the Huggingface docs on the T5 model:

  > At inference time, it is recommended to use generate(). This method takes care of encoding the input and feeding the encoded hidden states via cross-attention layers to the decoder and auto-regressively generates the decoder output.

  ```python
  tokenizer = T5Tokenizer.from_pretrained("t5-small")
  model = T5ForConditionalGeneration.from_pretrained("t5-small")
  input_ids = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt").input_ids
  outputs = model.generate(input_ids)
  print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  # Das Haus ist wunderbar.
  ```

- Using OpenAI's CLIP model, which is implemented as a torch model, we might not want to execute the normal forward pass that encodes both images and text (`image_embedding, text_embedding = clip_model(image, text)`), but instead only compute the image embeddings: `image_embedding = clip_model.encode_image(image)`.
Solution: Allowing the user to specify an `inference_fn` when creating a `ModelHandler` would enable this usage.
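A minimal sketch of what this could look like, assuming a new `inference_fn` keyword argument on the existing `PytorchModelHandlerTensor`; the handler and its other arguments follow the current Beam Python SDK, while `inference_fn`, the `generate_fn` helper, its signature, and the paths below are hypothetical:

```python
import torch
from transformers import T5ForConditionalGeneration
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor


def generate_fn(batch, model, device):
    # Hypothetical custom inference function: stack the batched input_ids and
    # call model.generate() instead of the default forward pass.
    return model.generate(torch.stack(batch).to(device))


model_handler = PytorchModelHandlerTensor(
    state_dict_path="gs://my-bucket/t5-small-state-dict.pth",  # hypothetical path
    model_class=T5ForConditionalGeneration,
    model_params={},
    inference_fn=generate_fn,  # proposed new parameter
)

# Usage inside a pipeline (sketch):
# predictions = input_ids_pcoll | RunInference(model_handler)
```

The same mechanism would cover the CLIP case above by passing an `inference_fn` that calls `clip_model.encode_image` instead of the forward pass.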
Issue Priority
Priority: 2
Issue Component
Component: sdk-py-core