Hi @chijames, thanks so much for this wonderful project!
After digging into the code, I have two questions:
- Is there any special reason why masking is not implemented in this section?
Lines 72 to 78 in e5299e3
```python
def dot_attention(self, q, k, v):
    # q: [bs, poly_m, dim] or [bs, res_cnt, dim]
    # k = v: [bs, length, dim] or [bs, poly_m, dim]
    attn_weights = torch.matmul(q, k.transpose(2, 1))  # [bs, poly_m, length]
    attn_weights = F.softmax(attn_weights, -1)
    output = torch.matmul(attn_weights, v)  # [bs, poly_m, dim]
    return output
```
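For reference, here is one way masking could be added to this attention. This is only a sketch: the `mask` argument and its `[bs, length]` shape (1 for real tokens, 0 for padding) are assumptions, not part of the repository's code.

```python
import torch
import torch.nn.functional as F

def dot_attention(q, k, v, mask=None):
    # q: [bs, poly_m, dim]; k = v: [bs, length, dim]
    # mask: [bs, length] with 1 for real tokens, 0 for padding (assumed shape)
    attn_weights = torch.matmul(q, k.transpose(2, 1))  # [bs, poly_m, length]
    if mask is not None:
        # Set scores on padded positions to -inf so softmax assigns them zero weight.
        attn_weights = attn_weights.masked_fill(mask[:, None, :] == 0, float('-inf'))
    attn_weights = F.softmax(attn_weights, dim=-1)
    return torch.matmul(attn_weights, v)  # [bs, poly_m, dim]
```

With this change, values at padded positions cannot leak into the output, since their softmax weights are exactly zero.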
- Can we speed up the construction of `poly_code_embeddings` by using `nn.Parameter`? That way, we wouldn't need to create `poly_ids` and move it to the GPU in every batch.
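A minimal sketch of this idea, assuming the poly codes are currently built via an `nn.Embedding` indexed by a `poly_ids` tensor: store them directly as an `nn.Parameter`, which moves to the GPU together with the rest of the module. The class name and init scale below are hypothetical, not from the repository.

```python
import torch
import torch.nn as nn

class PolyCodes(nn.Module):
    """Learnable poly codes held as a Parameter, with no per-batch index tensor."""

    def __init__(self, poly_m, dim):
        super().__init__()
        # Registered as a Parameter, so .cuda()/.to(device) on the module moves it too.
        self.poly_codes = nn.Parameter(torch.randn(poly_m, dim) * dim ** -0.5)

    def forward(self, batch_size):
        # Broadcast to [bs, poly_m, dim]; expand returns a view, so no copy is made.
        return self.poly_codes.unsqueeze(0).expand(batch_size, -1, -1)
```

Usage: `codes = PolyCodes(poly_m=16, dim=768); out = codes(batch_size)` gives the same `[bs, poly_m, dim]` tensor shape as the embedding-lookup approach, without constructing `poly_ids` each batch.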
Thanks for your reply!