Hi @chijames, thanks so much for this wonderful project!
After digging into the code, I have two questions:
- Is there any special reason why masking is not implemented in this section?
Lines 72 to 78 in e5299e3
```python
def dot_attention(self, q, k, v):
    # q: [bs, poly_m, dim] or [bs, res_cnt, dim]
    # k = v: [bs, length, dim] or [bs, poly_m, dim]
    attn_weights = torch.matmul(q, k.transpose(2, 1))  # [bs, poly_m, length]
    attn_weights = F.softmax(attn_weights, -1)
    output = torch.matmul(attn_weights, v)  # [bs, poly_m, dim]
    return output
```
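For reference, here is one way masking could be added to this attention. This is only a sketch: the `mask` argument and its `[bs, length]` shape (1 for real tokens, 0 for padding) are assumptions, not part of the repository's code.

```python
import torch
import torch.nn.functional as F

def dot_attention(q, k, v, mask=None):
    # q: [bs, poly_m, dim]; k = v: [bs, length, dim]
    # mask: [bs, length] with 1 for real tokens, 0 for padding (assumed shape)
    attn_weights = torch.matmul(q, k.transpose(2, 1))  # [bs, poly_m, length]
    if mask is not None:
        # Set scores on padded positions to -inf so softmax assigns them zero weight.
        attn_weights = attn_weights.masked_fill(mask[:, None, :] == 0, float('-inf'))
    attn_weights = F.softmax(attn_weights, dim=-1)
    return torch.matmul(attn_weights, v)  # [bs, poly_m, dim]
```

With this change, values at padded positions cannot leak into the output, since their softmax weights are exactly zero.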
- Can we speed up the construction of `poly_code_embeddings` by using `nn.Parameter`? That way, we wouldn't need to create `poly_ids` and move it to the GPU in every batch.
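A minimal sketch of this idea, assuming the poly codes are currently built via an `nn.Embedding` indexed by a `poly_ids` tensor: store them directly as an `nn.Parameter`, which moves to the GPU together with the rest of the module. The class name and init scale below are hypothetical, not from the repository.

```python
import torch
import torch.nn as nn

class PolyCodes(nn.Module):
    """Learnable poly codes held as a Parameter, with no per-batch index tensor."""

    def __init__(self, poly_m, dim):
        super().__init__()
        # Registered as a Parameter, so .cuda()/.to(device) on the module moves it too.
        self.poly_codes = nn.Parameter(torch.randn(poly_m, dim) * dim ** -0.5)

    def forward(self, batch_size):
        # Broadcast to [bs, poly_m, dim]; expand returns a view, so no copy is made.
        return self.poly_codes.unsqueeze(0).expand(batch_size, -1, -1)
```

Usage: `codes = PolyCodes(poly_m=16, dim=768); out = codes(batch_size)` gives the same `[bs, poly_m, dim]` tensor shape as the embedding-lookup approach, without constructing `poly_ids` each batch.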
Thanks for your reply!