Conversation

@lintangsutawika

Added a template_config_name arg so that the dataset and the prompt-template source can be different, for example: prompts from XNLI En but data to eval from XNLI Fr.

@lintangsutawika (Author)

Example to run eval on XNLI

CHECKPOINT_PATH="bigscience/bloom-350m"
OUTPUT_DIR="bloom-xnli"
dataset_name="xnli"
template_config_name="en"
dataset_config_name="fr"

python t-zero/evaluation/run_eval.py \
        --dataset_name $dataset_name \
        --dataset_config_name $dataset_config_name \
        --template_config_name $template_config_name \
        --model_name_or_path $CHECKPOINT_PATH \
        --output_dir $OUTPUT_DIR \
        --template_name 'GPT-3 style'
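
Roughly, the flag is meant to work along these lines. This is only a minimal sketch of the idea, not the actual run_eval.py code; it assumes promptsource's DatasetTemplates and the datasets library's load_dataset, with the template name taken from the command above:

# Minimal sketch: prompts are looked up under one XNLI config ("en")
# while the examples come from another ("fr").
from datasets import load_dataset
from promptsource.templates import DatasetTemplates

dataset_name = "xnli"
dataset_config_name = "fr"      # data to evaluate on
template_config_name = "en"     # where the prompt templates come from

# Fall back to the data config when no separate template config is given.
template_source = template_config_name or dataset_config_name

prompts = DatasetTemplates(dataset_name, template_source)
template = prompts["GPT-3 style"]

eval_set = load_dataset(dataset_name, dataset_config_name, split="validation")

# template.apply renders one example into its prompted input/target strings.
rendered = template.apply(eval_set[0])
print(rendered)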

@VictorSanh left a comment

thank you @lintangsutawika! lgtm! feel free to merge!

"--template_config_name",
type=str,
default=None,
help="The name of the dataset_config_name of the template we want to use, example: use XNLI En prompts for XNLI Fr",

@thomasw21 (Member)

Is using English prompts on the French XNLI something good? Like, I understand it's some sort of measure of multilinguality, but I would have expected a bunch of French prompts for the XNLI fr ...

@lintangsutawika (Author)

I thought using English prompts for XNLI is indeed what we wanted to accomplish?

Anyway, evaluating XNLI with prompts and dataset in the same language only requires the Promptsource part to be updated. So --template_config_name adds more flexibility while not requiring much code change to eval in multiple multilingual settings.
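
For example, sweeping the evaluation language while keeping the English prompts fixed could look like the sketch below; the language list is illustrative and only the flags from the example command above are used:

# Rough sketch of the "multiple multilingual settings" use case: the English
# XNLI prompts are reused while only the data config changes per run.
import subprocess

checkpoint = "bigscience/bloom-350m"
for lang in ["fr", "es", "de"]:  # illustrative subset of XNLI configs
    subprocess.run(
        [
            "python", "t-zero/evaluation/run_eval.py",
            "--dataset_name", "xnli",
            "--dataset_config_name", lang,
            "--template_config_name", "en",
            "--template_name", "GPT-3 style",
            "--model_name_or_path", checkpoint,
            "--output_dir", f"bloom-xnli-{lang}",
        ],
        check=True,
    )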

@thomasw21 (Member)

Looks good to me, besides the fact that the prompts are English prompts (I expected to have them in their own languages).

@VictorSanh (Member)

> Looks good to me, besides the fact that the prompts are English prompts (I expected to have them in their own languages).

I think this is fine. A bunch of the prompts coming from the eval hackathon are code-switching.

@thomasw21 (Member)

Thanks @lintangsutawika

@thomasw21 merged commit 50c27b5 into bigscience-workshop:thomas/support_new_accelerate_api on Jul 15, 2022