Conversation

@pluflou (Collaborator) commented Jun 9, 2025

What this PR solves

TorchModel's evaluate function expects a dictionary input, where each value is an input tensor of shape [n_batch, n_samples], or [n_samples] if the batch size is 1.

If a user creates a TorchModel from an nn.Module that contains, for example, convolutional or recurrent layers, the expected input shape for that model will be different.

This PR adds a method that checks the model architecture and adjusts the input shape accordingly, so that the supported architectures run correctly.

This adjustment is needed for running the CoAD NN.
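To illustrate the shape mismatch this PR addresses, here is a minimal sketch in plain PyTorch (the feature names are made up, and this does not call TorchModel directly):

```python
import torch
import torch.nn as nn

# Dictionary input in the form TorchModel's evaluate expects: one tensor per
# feature, each of shape [n_batch, n_samples] (or [n_samples] if n_batch == 1).
input_dict = {
    "feature_a": torch.rand(4, 10),  # n_batch=4, n_samples=10
    "feature_b": torch.rand(4, 10),
}

# Stacking the features gives [n_batch, n_samples, n_dim], which a
# linear model accepts directly...
x = torch.stack(list(input_dict.values()), dim=-1)
assert x.shape == (4, 10, 2)
linear = nn.Linear(2, 1)
assert linear(x).shape == (4, 10, 1)

# ...but nn.Conv1d expects [n_batch, n_dim, n_samples], hence the permute.
conv = nn.Conv1d(in_channels=2, out_channels=1, kernel_size=1)
assert conv(x.permute(0, 2, 1)).shape == (4, 1, 10)
```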

Implementation

  • Models with linear or recurrent (RNN, LSTM, GRU) layers are currently supported. Models with 2D or 3D convolutional layers, or with Transformer layers, are not supported at this time.

  • The TorchModule now has a method that permutes the input shape as needed to build a dictionary satisfying the requirements of TorchModel's evaluate function, and that reconstructs the output tensor from the output dict so that the expected shape is returned for each architecture. The expected shape depends on the model type and is inherited from the original nn.Module:

    • For non-conv/non-rnn models: [n_batch, n_samples, n_dim]
    • For 1D conv models: [n_batch, n_dim, n_samples]
    • For rnn models: [n_batch, n_samples, n_dim], or [n_samples, n_batch, n_dim] if the RNN's batch_first attribute is False (the PyTorch default).
  • For TorchModel, the input is still expected to be a dictionary, where each value is an input tensor of shape [n_batch, n_samples], or [n_samples] if the batch size is 1. A new method permutes the tensor to the shape the model type requires (see the list above), both when constructing the torch tensor before passing the input to the underlying nn.Module and after obtaining the tensor from the model output.

  • I also fixed a bug in the TorchModel._arrange_inputs method that was adding an extra dimension to the final tensor, which caused issues when permuting dimensions.

  • I have not implemented any tests yet, but will add them at some point.
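The per-architecture permutation described above could be sketched roughly as follows. This is an illustrative approximation, not the actual TorchModule method, and the helper name permute_for_model is hypothetical:

```python
import torch
import torch.nn as nn

def permute_for_model(x: torch.Tensor, model: nn.Module) -> torch.Tensor:
    """Permute a [n_batch, n_samples, n_dim] tensor into the shape
    each supported architecture expects (illustrative sketch only)."""
    layers = list(model.modules())
    if any(isinstance(m, nn.Conv1d) for m in layers):
        # 1D conv models expect [n_batch, n_dim, n_samples].
        return x.permute(0, 2, 1)
    for m in layers:
        if isinstance(m, (nn.RNN, nn.LSTM, nn.GRU)) and not m.batch_first:
            # batch_first=False (the PyTorch default) expects
            # [n_samples, n_batch, n_dim].
            return x.permute(1, 0, 2)
    # Linear (and batch_first recurrent) models keep [n_batch, n_samples, n_dim].
    return x

x = torch.rand(4, 10, 3)  # n_batch=4, n_samples=10, n_dim=3
conv_model = nn.Sequential(nn.Conv1d(3, 2, kernel_size=1))
rnn_model = nn.RNN(input_size=3, hidden_size=5)  # batch_first=False by default
lin_model = nn.Linear(3, 1)
```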

@pluflou pluflou requested a review from roussel-ryan June 13, 2025 20:30
This method ensures that the input tensor has the correct shape for the model, especially for CNN
and RNN architectures.
- For non-conv/non-rnn models: [n_batch, n_samples, n_dim]
- For 1D conv models: [n_batch, n_dim, n_samples]
Collaborator

[n_batch, n_samples, n_dim, L]

Collaborator Author

I'm taking a look at this now, and I'm confused. nn.Conv1d expects a 3-dimensional input tensor of shape (batch_size, in_channels, sequence_length). Here we are calling those (n_batch, n_dim, n_samples), which may be confusing (I can adjust the naming). But I don't understand how we'd pass the 4-D input you're suggesting to nn.Conv1d?
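For reference, a quick check (a minimal sketch using plain PyTorch) confirms that nn.Conv1d accepts the 3-D batched shape but rejects a 4-D input:

```python
import torch
import torch.nn as nn

# nn.Conv1d takes a 3-D batched input of shape
# (batch_size, in_channels, sequence_length).
conv = nn.Conv1d(in_channels=3, out_channels=2, kernel_size=1)
out = conv(torch.rand(4, 3, 10))  # (batch, channels, length) works
assert out.shape == (4, 2, 10)

# A 4-D tensor such as [n_batch, n_samples, n_dim, L] raises an error.
try:
    conv(torch.rand(4, 10, 3, 8))
except RuntimeError as err:
    print(f"4-D input rejected: {err}")
```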

