encoder_hidden_states ( `torch.LongTensor` of shape `(batch size, encoder_hidden_states dim)`, *optional*):
    Conditional embeddings for cross attention layer. If not given, cross-attention defaults to self-attention.
Shouldn't the shape be `(batch_size, sequence_length, embedding_dim)`? Also, this should be a float tensor rather than a `torch.LongTensor`, or does it accept both long and float?