Are the transformers of bi-encoder trained separately? #4

@kaisugi

Description

(To be honest, I'm not very experienced with deep-learning frameworks (PyTorch, Hugging Face, etc.), so this might be a silly question. Please keep in mind that I'm a beginner.)

The original paper says that the context encoder and the candidate encoder are trained separately.

[Screenshots of the relevant passages from the paper, taken 2020-10-24]

However, I see in your code that both inputs are passed through the same module, self.bert().

https://github.com/chijames/Poly-Encoder/blob/master/encoder.py#L20-L27


Is this intentional? If both encoders share one module, I doubt they can end up with different weights after training.

FYI: the official implementation of the BLINK paper (https://arxiv.org/pdf/1911.03814.pdf) instantiates two separate encoder modules: https://github.com/facebookresearch/BLINK/blob/master/blink/biencoder/biencoder.py#L37-L48
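To make the distinction concrete, here is a minimal PyTorch sketch of the two designs. This is not the repository's actual code: `SharedBiEncoder`, `SeparateBiEncoder`, and the use of `nn.Linear` as a stand-in for a full BERT encoder are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class SharedBiEncoder(nn.Module):
    """Both inputs go through the same module, so there is one set of weights."""

    def __init__(self, dim=8):
        super().__init__()
        # nn.Linear is a stand-in for a transformer encoder such as BERT
        self.bert = nn.Linear(dim, dim)

    def forward(self, context, candidate):
        return self.bert(context), self.bert(candidate)


class SeparateBiEncoder(nn.Module):
    """Two independently initialized modules, so the weights can diverge."""

    def __init__(self, dim=8):
        super().__init__()
        self.context_encoder = nn.Linear(dim, dim)
        self.candidate_encoder = nn.Linear(dim, dim)

    def forward(self, context, candidate):
        return self.context_encoder(context), self.candidate_encoder(candidate)


shared = SharedBiEncoder()
x = torch.ones(1, 8)
ctx_vec, cand_vec = shared(x, x)
# with shared weights, identical inputs always give identical outputs
print(torch.allclose(ctx_vec, cand_vec))  # True
```

With the shared design, gradients from both the context and candidate sides accumulate into the same parameters, so the two "encoders" are guaranteed to stay identical; with the separate design, each side keeps its own parameters, matching the paper's description.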
