Dear authors,
Thank you so much for your great work. May I ask what the attention weights really mean? Do they refer to the feature maps produced by the different attention layers, or to the query/key/value weight parameters? I am confused about what exactly I should visualize: should I save the outputs of the different attention layers and then draw heat maps for them? That is what the figure in your paper appears to show. I look forward to your reply. Thank you so much.
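To make the question concrete, here is a minimal NumPy sketch of my current understanding (a toy single-head self-attention, not your actual model): `W_q`/`W_k`/`W_v` are the learned *parameters*, while the matrix `A = softmax(QK^T / sqrt(d))` is the per-layer attention map that I assume papers visualize as a heat map. Please correct me if this is not what you mean by "attention weights".

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

d, seq_len = 8, 6
x = rng.standard_normal((seq_len, d))          # toy token embeddings

# Learned parameters (here just random placeholders):
W_q = rng.standard_normal((d, d))
W_k = rng.standard_normal((d, d))

Q, K = x @ W_q, x @ W_k

# The "attention map" for this layer: one (seq_len, seq_len) matrix,
# each row a probability distribution over the input positions.
A = softmax(Q @ K.T / np.sqrt(d))

print(A.shape)        # (6, 6)
print(A.sum(axis=1))  # each row sums to 1.0
```

If this is right, visualizing would just be `plt.imshow(A)` (matplotlib) for each layer, possibly averaged over heads.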