How to adjust the train batch size #109

@Fu-Fu-Fu-Fu

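The SFT command you provided sets two parameters: --per_device_train_batch_size 1 --gradient_accumulation_steps 2
So the actual batch size is 1 × 2 × (number of GPUs). I tried to increase the training batch size by raising per_device_train_batch_size, but that produces an error, and from what I found by searching, it seems Open R1 does not recommend changing this parameter. If I increase gradient_accumulation_steps instead, will that affect performance?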

I am using 4 GPUs. With the original parameters the batch size is only 8. Is that too small?
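
For reference, the arithmetic behind these numbers can be sketched as below. This is only a minimal illustration: the function name effective_batch_size is hypothetical (not part of Open R1 or any library), and the second call uses an example accumulation value that is not taken from the original command.

```python
def effective_batch_size(per_device_train_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_gpus: int) -> int:
    """Number of samples contributing to one optimizer step.

    Hypothetical helper for illustration: per-device batch size,
    times accumulation steps, times the number of GPUs.
    """
    return per_device_train_batch_size * gradient_accumulation_steps * num_gpus


# Values from the post: batch size 1 per device, 2 accumulation steps, 4 GPUs.
current = effective_batch_size(1, 2, 4)
print(current)  # 8

# Assumed example: keep the per-device batch size at 1 and raise
# gradient_accumulation_steps to 8 to reach a larger effective batch.
larger = effective_batch_size(1, 8, 4)
print(larger)  # 32
```

Raising gradient_accumulation_steps grows the effective batch without increasing the per-device batch held in memory at once, but each optimizer step then needs proportionally more forward/backward passes.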
