Relation Extraction task on DDI 2013 Bio dataset
- DDI (Drug-Drug Interaction) 2013 dataset (link)
- Relation extraction task in the biomedical domain
- 175 MEDLINE abstracts and 730 DrugBank documents
- 5 DDI types (Negative, Mechanism, Effect, Advice, Int)
- Use the preprocessed dataset from this repo
- Unlike other studies, drug names are not replaced with DRUG0, DRUG1, or DRUGN
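
For context, the drug-name replacement that this setup skips can be sketched as below. This is only an illustration, not code from this repo: the helper names `span_of` and `blind_drug_names` are hypothetical, and it assumes that DRUG0 and DRUG1 mark the candidate pair while DRUGN marks any other drug in the sentence.

```python
def span_of(text, mention):
    """Return the (start, end) character span of the first occurrence of mention."""
    start = text.index(mention)
    return (start, start + len(mention))


def blind_drug_names(sentence, pair_spans, other_spans=()):
    """Replace drug mentions with placeholder tokens (hypothetical helper).

    pair_spans:  character spans of the two candidate drugs -> DRUG0, DRUG1
    other_spans: character spans of any other drugs         -> DRUGN
    """
    replacements = [
        (start, end, "DRUG{}".format(i))
        for i, (start, end) in enumerate(pair_spans)
    ]
    replacements += [(start, end, "DRUGN") for start, end in other_spans]
    # Replace from right to left so the earlier offsets stay valid.
    for start, end, token in sorted(replacements, reverse=True):
        sentence = sentence[:start] + token + sentence[end:]
    return sentence


text = "Aspirin may increase the effect of warfarin but not heparin."
blinded = blind_drug_names(
    text,
    pair_spans=[span_of(text, "Aspirin"), span_of(text, "warfarin")],
    other_spans=[span_of(text, "heparin")],
)
print(blinded)
```

Here the model is instead trained on the original sentence, so the pretrained vocabulary still sees the actual drug names.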
>>> from transformers import BertModel, BertTokenizer
>>> model = BertModel.from_pretrained('monologg/biobert_v1.1_pubmed')
>>> tokenizer = BertTokenizer.from_pretrained('monologg/biobert_v1.1_pubmed')

- python>=3.5
- torch==1.1.0
- transformers>=2.2.2
- scikit-learn>=0.20.0
$ pip3 install -r requirements.txt

You must pass the --do_lower_case option if the pretrained model is uncased.

$ python3 main.py --do_train --do_eval

F1 micro score on the 4 positive types (Mechanism, Effect, Advice, Int)
| Model | F1 micro (%) |
|---|---|
| CNN | 69.75 |
| AB-LSTM | 69.39 |
| MCCNN | 70.21 |
| GCNN | 72.55 |
| Recursive NN | 73.50 |
| RHCNN | 75.48 |
| SMGCN | 76.64 |
| BIO-R-BERT | 82.66 |
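
The metric in the table, micro-averaged F1 over the four positive types only, could be computed with scikit-learn roughly as follows. This is a minimal sketch: the label ids (0 for Negative, 1 to 4 for the positive types) and the helper name `positive_micro_f1` are assumptions for illustration, not taken from this repo's evaluation code.

```python
from sklearn.metrics import f1_score

# Assumed label ids for illustration; index 0 is the Negative class.
LABELS = ["Negative", "Mechanism", "Effect", "Advice", "Int"]
POSITIVE_IDS = [1, 2, 3, 4]


def positive_micro_f1(y_true, y_pred):
    """Micro-averaged F1 restricted to the four positive DDI types."""
    return f1_score(y_true, y_pred, labels=POSITIVE_IDS, average="micro")


# Toy example: 4 true positives, 1 false positive, 1 false negative.
y_true = [0, 1, 2, 3, 4, 0, 2]
y_pred = [0, 1, 2, 0, 4, 2, 2]
score = positive_micro_f1(y_true, y_pred)
print("F1 micro (positive types): {:.2f}".format(100 * score))
```

Passing `labels=POSITIVE_IDS` leaves correct Negative predictions out of the score, while a positive pair predicted as Negative still counts as a false negative and a Negative pair predicted as positive still counts as a false positive.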
Using R-BERT architecture, with different pretrained weights
| Pretrained weights | F1 micro (%) |
|---|---|
| Random Init | 47.04 |
| bert-base-cased | 80.62 |
| scibert-scivocab-uncased | 81.30 |
| biobert_v1.0_pubmed_pmc | 82.30 |
| biobert_v1.1_pubmed | 82.66 |
