
[BUG]: ChatGPT inference: Error in load_state_dict for BloomForCausalLM #3061

@sdvies


🐛 Describe the bug

Hello!

Bug description

I've reproduced the ChatGPT stages described in the "Inference Example (After Stage 3)" section of https://github.com/hpcaitech/ColossalAI/tree/main/applications/ChatGPT/examples.

I've obtained the rm_checkpoint.pt from Stage 2 and I could obtain the actor_checkpoint_prompts.pt as well from Stage 3 (based on rm_checkpoint.pt).

The problem occurs when I try to run the inference step, because the following error is raised:

RuntimeError: Error(s) in loading state_dict for BloomForCausalLM:
	Unexpected key(s) in state_dict: "transformer.h.2.input_layernorm.weight", "transformer.h.2.input_layernorm.bias", "transformer.h.2.self_attention.query_key_value.weight", "transformer.h.2.self_attention.query_key_value.bias", "transformer.h.2.self_attention.query_key_value.lora_A", "transformer.h.2.self_attention.query_key_value.lora_B" ...
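For context, `load_state_dict` raises this `RuntimeError` whenever the checkpoint contains parameter names the target module does not declare. A minimal, framework-free sketch of the key comparison behind the "Unexpected key(s)" message (the key sets below are illustrative, not the real checkpoint contents):

```python
def find_unexpected_keys(checkpoint_keys, model_keys):
    """Return checkpoint keys the model does not expect, mimicking the
    check torch.nn.Module.load_state_dict performs when strict=True."""
    return sorted(set(checkpoint_keys) - set(model_keys))

# Illustrative key sets: the LoRA-wrapped actor saved extra lora_A/lora_B
# parameters that a plain BloomForCausalLM does not define.
checkpoint_keys = [
    "transformer.h.2.input_layernorm.weight",
    "transformer.h.2.self_attention.query_key_value.weight",
    "transformer.h.2.self_attention.query_key_value.lora_A",
    "transformer.h.2.self_attention.query_key_value.lora_B",
]
model_keys = [
    "transformer.h.2.input_layernorm.weight",
    "transformer.h.2.self_attention.query_key_value.weight",
]

print(find_unexpected_keys(checkpoint_keys, model_keys))
```

With strict=True (the default), any nonempty result of this comparison aborts the load, which matches the traceback above.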

To reproduce

In order to obtain the rm_checkpoint.pt I executed the following:
python train_reward_model.py --pretrain bigscience/bloom-560m --model bloom --batch_size 1 --lora_rank 16 --strategy naive

The actor_checkpoint_prompts.pt was obtained by executing:

python train_prompts.py prompts.csv --pretrain bigscience/bloom-560m --strategy naive --model bloom --train_batch_size 1 --lora_rank 16 --experience_batch_size 1 --max_epochs

Finally, I attempted the inference with:
python inference.py --model_path actor_checkpoint_prompts.pt --pretrain bigscience/bloom-560m --model bloom
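Since the unexpected keys include LoRA adapter parameters, one diagnostic I can try before further debugging is stripping the LoRA-specific entries from the saved state dict and retrying the load. This is only a hypothetical workaround sketch (the helper name and key suffixes are assumptions); whether the remaining keys then match BloomForCausalLM is exactly what this issue is about:

```python
def strip_lora_keys(state_dict):
    """Drop LoRA adapter parameters (lora_A / lora_B) so that only the
    base-model weights remain; the result could then be passed to
    model.load_state_dict(..., strict=False) for a retry."""
    return {
        k: v for k, v in state_dict.items()
        if not k.endswith((".lora_A", ".lora_B"))
    }

# Toy state dict standing in for torch.load("actor_checkpoint_prompts.pt")
sd = {
    "transformer.h.2.self_attention.query_key_value.weight": 0,
    "transformer.h.2.self_attention.query_key_value.lora_A": 1,
    "transformer.h.2.self_attention.query_key_value.lora_B": 2,
}
print(sorted(strip_lora_keys(sd)))
```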

Thank you very much!

Environment

I include the environment information as follows:

  • CUDA: 11.1
  • cuDNN: 8
  • Python: 3.8.8
  • PyTorch: 1.13.1

Labels: bug (Something isn't working)