About the implementation of Poly Encoder #13

@Hannibal046

Description

Hi @chijames, thanks so much for this wonderful project!
After digging into the code, I have two questions:

  • Is there any special reason why masking is not implemented in this section? (A sketch of what I have in mind follows after this list.)

    Poly-Encoder/encoder.py, lines 72 to 78 at commit e5299e3:

    def dot_attention(self, q, k, v):
        # q: [bs, poly_m, dim] or [bs, res_cnt, dim]
        # k=v: [bs, length, dim] or [bs, poly_m, dim]
        attn_weights = torch.matmul(q, k.transpose(2, 1)) # [bs, poly_m, length]
        attn_weights = F.softmax(attn_weights, -1)
        output = torch.matmul(attn_weights, v) # [bs, poly_m, dim]
        return output

  • Can we speed up the construction of poly_code_embeddings by using nn.Parameter? That way we wouldn't need to create poly_ids and move it to the GPU for every batch. (A sketch of this idea also follows after the list.)
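
To make the first question concrete, here is a minimal sketch of how a mask could be threaded into dot_attention: fill the padded key positions with -inf before the softmax so they receive zero attention weight. The attn_mask argument and its [bs, length] shape (1 for real tokens, 0 for padding) are my assumptions for illustration, not something taken from the repository:

    import torch
    import torch.nn.functional as F

    def dot_attention(self, q, k, v, attn_mask=None):
        # drop-in variant of the method above
        # q: [bs, poly_m, dim] or [bs, res_cnt, dim]
        # k=v: [bs, length, dim] or [bs, poly_m, dim]
        # attn_mask (assumed shape): [bs, length], 1 = real token, 0 = padding
        attn_weights = torch.matmul(q, k.transpose(2, 1))  # [bs, poly_m, length]
        if attn_mask is not None:
            # broadcast to [bs, 1, length]; padded keys get zero weight after softmax
            attn_weights = attn_weights.masked_fill(attn_mask.unsqueeze(1) == 0, float("-inf"))
        attn_weights = F.softmax(attn_weights, -1)
        output = torch.matmul(attn_weights, v)  # [bs, poly_m, dim]
        return output

For the second question, this is roughly what I mean by using nn.Parameter: store the poly codes directly as a learnable tensor, so they move to the GPU once with the module's .to(device) instead of rebuilding poly_ids each batch. The class and method names here (PolyEncoderHead, get_poly_codes) are hypothetical, just to show the idea:

    import torch
    import torch.nn as nn

    class PolyEncoderHead(nn.Module):  # hypothetical module name for illustration
        def __init__(self, poly_m, dim):
            super().__init__()
            # learned poly codes held directly as a parameter: [poly_m, dim]
            self.poly_code_embeddings = nn.Parameter(torch.empty(poly_m, dim))
            nn.init.normal_(self.poly_code_embeddings, std=dim ** -0.5)

        def get_poly_codes(self, bs):
            # expand (no copy) to [bs, poly_m, dim]; no poly_ids, no per-batch .to(device)
            return self.poly_code_embeddings.unsqueeze(0).expand(bs, -1, -1)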

Thanks in advance for your reply!
