Counter-GAP

Counter-GAP is a coreference resolution-based bias diagnostic dataset. data/C-GAP.tsv is the dataset used in our experiments where dots after titles (e.g., "." in "Mr." and "Mrs.") are removed; data/C-GAP-withdot.tsv is the original dataset with dots in titles.
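The relationship between the two files can be illustrated with a small sketch. It assumes the only difference is the dot directly after a title; the `TITLES` tuple and the `strip_title_dots` helper are hypothetical names for illustration and are not taken from the repository.

```python
import re

# Assumed title list for illustration; the titles actually present in the
# dataset may differ.
TITLES = ("Mr", "Mrs", "Ms", "Dr")

# Match a known title at a word boundary, followed by a literal dot.
_TITLE_DOT = re.compile(r"\b(" + "|".join(TITLES) + r")\.")


def strip_title_dots(text: str) -> str:
    """Remove the dot that directly follows a known title."""
    return _TITLE_DOT.sub(r"\1", text)


if __name__ == "__main__":
    print(strip_title_dots("Mr. Smith met Mrs. Jones."))
```

Sentence-final dots are left untouched, since only dots that follow one of the listed titles are matched.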

The implementation of the coreference resolution models in coref/ is copied directly from BERT and SpanBERT for Coreference Resolution, with minor changes in coref/gap_to_jsonlines.py.

Bias Mitigation

First, download the OntoNotes datasets (following the instructions in coref/) and generate the aCDA/nCDA version of the training set (assuming the training set train.english.v4_gold_conll is in data/):

cd cda
python -c "from name import *;m=NameMapping()"
python word_swapper.py > acda-train.english.v4_gold_conll
python word_swapper.py --name > ncda-train.english.v4_gold_conll

Next, replace the original training set (train.english.v4_gold_conll) with acda-train.english.v4_gold_conll or ncda-train.english.v4_gold_conll, and follow the instructions in coref/ to fine-tune a pre-trained BERT/SpanBERT model on it to obtain the debiased checkpoints.
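For intuition about what a word-swapping augmentation step does in general, here is a toy sketch. It is not the logic of cda/word_swapper.py: the `PAIRS` list and the `swap_tokens` helper are assumptions made for illustration, and the real script reads CoNLL-formatted files rather than plain token lists.

```python
# Toy word pairs, assumed for illustration only; not the list used in cda/.
PAIRS = [("he", "she"), ("man", "woman"), ("father", "mother"), ("king", "queen")]

# Build a symmetric lookup so each word maps to its counterpart.
SWAP = {}
for a, b in PAIRS:
    SWAP[a] = b
    SWAP[b] = a


def swap_tokens(tokens):
    """Return a copy of tokens with listed words replaced by their counterpart."""
    out = []
    for tok in tokens:
        repl = SWAP.get(tok.lower())
        if repl is None:
            out.append(tok)  # Words outside the toy list are kept as-is.
        elif tok[:1].isupper():
            out.append(repl.capitalize())  # Preserve initial capitalization.
        else:
            out.append(repl)
    return out


if __name__ == "__main__":
    print(swap_tokens("The king said he missed his mother".split()))
```

Words that are not in the toy list (such as "his" in the example) pass through unchanged, which is one reason a real implementation needs a more careful word list.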

Evaluation on Counter-GAP

For each debiasing_type in ("none", "acda", "ncda"), download the checkpoints for each model_name in ("bert_base", "bert_large", "spanbert_base", "spanbert_large") to the corresponding dirs in data/, and run the following commands:

cd coref
python gap_to_jsonlines.py ../data/C-GAP.tsv ../data/${model_name}/vocab.txt
CUDA_VISIBLE_DEVICES=0 python predict.py ${model_name} ../data/C-GAP.jsonlines ../data/${model_name}_output.jsonlines
python to_gap_tsv.py ../data/${model_name}_output.jsonlines
mv ../data/${model_name}_output.tsv ../results/${debiasing_type}/${model_name}_output.tsv
mv ../data/C-GAP.tsv ../results/${debiasing_type}/C-GAP.tsv
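When repeating these steps for several models, it can help to generate the command list per model first and inspect it. The following is a dry-run sketch that only builds and prints the commands shown above; the `evaluation_commands` helper is a hypothetical name, and nothing is executed.

```python
MODEL_NAMES = ("bert_base", "bert_large", "spanbert_base", "spanbert_large")


def evaluation_commands(model_name, debiasing_type):
    """Build the shell commands listed above for one model and debiasing type."""
    out = f"../data/{model_name}_output"
    return [
        f"python gap_to_jsonlines.py ../data/C-GAP.tsv ../data/{model_name}/vocab.txt",
        f"CUDA_VISIBLE_DEVICES=0 python predict.py {model_name} "
        f"../data/C-GAP.jsonlines {out}.jsonlines",
        f"python to_gap_tsv.py {out}.jsonlines",
        f"mv {out}.tsv ../results/{debiasing_type}/{model_name}_output.tsv",
        f"mv ../data/C-GAP.tsv ../results/{debiasing_type}/C-GAP.tsv",
    ]


if __name__ == "__main__":
    # Dry run: print the commands for every model with one debiasing type.
    for name in MODEL_NAMES:
        print("\n".join(evaluation_commands(name, "none")))
        print()
```

Note that the last command moves C-GAP.tsv out of data/, so the file would have to be put back there before the same steps are run for another model.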

Next, calculate the evaluation metrics:

cd ..
python scorer.py --model ${model_name} --debias ${debiasing_type}
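To score every combination in one pass, the scorer invocations can be enumerated as below. This is a sketch that only prints the commands, using the flags and values given above; the `scorer_commands` helper is a hypothetical name.

```python
from itertools import product

DEBIASING_TYPES = ("none", "acda", "ncda")
MODEL_NAMES = ("bert_base", "bert_large", "spanbert_base", "spanbert_large")


def scorer_commands():
    """Return one scorer.py command per (debiasing type, model) pair."""
    return [
        f"python scorer.py --model {model} --debias {debias}"
        for debias, model in product(DEBIASING_TYPES, MODEL_NAMES)
    ]


if __name__ == "__main__":
    for cmd in scorer_commands():
        print(cmd)
```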
