🐛 Describe the bug

I'm trying to reproduce ColossalChat; here is the command I used:

```shell
torchrun --standalone --nproc_per_node=2 train_sft.py \
    --pretrain "/workspace/LLaMA/7B-bin" \
    --model 'llama' \
    --strategy ddp \
    --log_interval 10 \
    --save_path /workspace/ColossalAI/applications/Chat/coati-7b \
    --dataset /workspace/ColossalAI/applications/Chat/data/instinwild_ch.json \
    --batch_size 4 \
    --accimulation_steps 8 \
    --lr 2e-5 \
    --max_datasets_size 512 \
    --max_epochs 1 \
    --lora_rank 4
```

log:

```
  File "/workspace/ColossalAI/applications/Chat/examples/train_sft.py", line 184, in <module>
    train(args)
  File "/workspace/ColossalAI/applications/Chat/examples/train_sft.py", line 158, in train
    trainer.save_model(path=args.save_path, only_rank0=True, tokenizer=tokenizer)
  File "/opt/conda/lib/python3.9/site-packages/coati/trainer/sft.py", line 158, in save_model
    self.strategy.save_model(model=self.model, path=path, only_rank0=only_rank0, tokenizer=tokenizer)
TypeError: save_model() got an unexpected keyword argument 'tokenizer'
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /workspace/ColossalAI/applications/Chat/examples/train_sft.py:184 in <module>                    │
│                                                                                                  │
│   181 │   parser.add_argument('--lr', type=float, default=5e-6)                                  │
│   182 │   parser.add_argument('--accimulation_steps', type=int, default=8)                       │
│   183 │   args = parser.parse_args()                                                             │
│ ❱ 184 │   train(args)                                                                            │
│   185                                                                                            │
│                                                                                                  │
│ /workspace/ColossalAI/applications/Chat/examples/train_sft.py:158 in train                       │
│                                                                                                  │
│   155 │   trainer.fit(logger=logger, log_interval=args.log_interval)                             │
│   156 │                                                                                          │
│   157 │   # save model checkpoint after fitting on only rank0                                    │
│ ❱ 158 │   trainer.save_model(path=args.save_path, only_rank0=True, tokenizer=tokenizer)          │
│   159 │   # save optimizer checkpoint on all ranks                                               │
│   160 │   if args.need_optim_ckpt:                                                               │
│   161 │   │   strategy.save_optimizer(trainer.optimizer,                                         │
│                                                                                                  │
│ /opt/conda/lib/python3.9/site-packages/coati/trainer/sft.py:158 in save_model                    │
│                                                                                                  │
│   155 │   │   │   │   path: str,                                                                 │
│   156 │   │   │   │   only_rank0: bool = False,                                                  │
│   157 │   │   │   │   tokenizer: Optional[PreTrainedTokenizerBase] = None) -> None:              │
│ ❱ 158 │   │   self.strategy.save_model(model=self.model, path=path, only_rank0=only_rank0, tok   │
│   159                                                                                            │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: save_model() got an unexpected keyword argument 'tokenizer'
```

Environment

```
Python 3.9.16
torch 1.13.1
transformers 4.28.0.dev0
```

nvcc --version:

```
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0
```
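For context on the failure mode: the traceback shows the trainer code in `site-packages/coati/trainer/sft.py` forwarding a `tokenizer=` keyword to `self.strategy.save_model()`, which suggests the installed `coati` package's strategy class predates that argument (i.e. the checked-out examples and the installed package are at different versions). The classes below are a minimal, hypothetical sketch of that kind of mismatch — they are not coati's real implementations — showing how forwarding a keyword the callee does not accept produces exactly this `TypeError`:

```python
class OldStrategy:
    """Mimics an older strategy API whose save_model lacks `tokenizer`."""

    def save_model(self, model, path, only_rank0=False):
        # Older signature: saving the tokenizer was not supported here.
        pass


class Trainer:
    """Mimics newer trainer code that forwards a tokenizer to the strategy."""

    def __init__(self, strategy):
        self.strategy = strategy
        self.model = "model"

    def save_model(self, path, only_rank0=True, tokenizer=None):
        # Forwarding `tokenizer=` to the old signature raises TypeError,
        # matching the log above.
        self.strategy.save_model(model=self.model, path=path,
                                 only_rank0=only_rank0, tokenizer=tokenizer)


trainer = Trainer(OldStrategy())
try:
    trainer.save_model(path="/tmp/ckpt", only_rank0=True, tokenizer="tok")
except TypeError as e:
    print(e)  # message names the unexpected 'tokenizer' keyword argument
```

If this diagnosis is right, reinstalling `coati` from the same checkout as the examples (rather than mixing a pip-installed package with newer example scripts) should make the two signatures agree again.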