
Diffusers example train_text_to_image_lora.py broken gradients? #6277

@jaymefosa


Describe the bug

Gradients won't backpropagate when running the example LoRA training script.

Reproduction

Running the command as specified:

export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export DATASET_NAME="lambdalabs/pokemon-blip-captions"

accelerate launch --mixed_precision="fp16" train_text_to_image_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$DATASET_NAME --caption_column="text" \
  --resolution=512 --random_flip \
  --train_batch_size=1 \
  --num_train_epochs=100 --checkpointing_steps=5000 \
  --learning_rate=1e-04 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --seed=42 \
  --output_dir="sd-pokemon-model-lora" \
  --validation_prompt="cute dragon creature"

Logs

File "train_text_to_image_lora.py", line 950, in <module>
    main()
  File "train_text_to_image_lora.py", line 801, in main
    accelerator.backward(loss)
  File "/home/fsa/anaconda3/envs/manimate/lib/python3.8/site-packages/accelerate/accelerator.py", line 1903, in backward
    self.scaler.scale(loss).backward(**kwargs)
  File "/home/fsa/anaconda3/envs/manimate/lib/python3.8/site-packages/torch/_tensor.py", line 492, in backward
    torch.autograd.backward(
  File "/home/fsa/anaconda3/envs/manimate/lib/python3.8/site-packages/torch/autograd/__init__.py", line 251, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
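
The error means the loss passed to accelerator.backward(loss) is not connected to any parameter that requires grad. For reference, below is a rough fp32 sanity check I would expect to pass outside the script (a minimal sketch, not the training script itself; the rank, shapes, and dummy loss are illustrative): freeze the UNet, add a LoRA adapter on the same target modules the example uses, and confirm the adapter parameters still require grad and the prediction carries a grad_fn.

```python
# Hypothetical sanity check (not part of the script); assumes diffusers with the
# PEFT backend, mirroring how the example adds the LoRA adapter to the UNet.
import torch
from diffusers import UNet2DConditionModel
from peft import LoraConfig

unet = UNet2DConditionModel.from_pretrained(
    "CompVis/stable-diffusion-v1-4", subfolder="unet"
)
unet.requires_grad_(False)  # base weights are frozen, as in the script

lora_config = LoraConfig(
    r=4,  # illustrative rank
    init_lora_weights="gaussian",
    target_modules=["to_k", "to_q", "to_v", "to_out.0"],
)
unet.add_adapter(lora_config)

# The adapter parameters should still require grad after freezing the base model.
lora_params = [p for n, p in unet.named_parameters() if "lora" in n]
print(len(lora_params), all(p.requires_grad for p in lora_params))

# Dummy forward/backward: the prediction must carry a grad_fn, otherwise
# loss.backward() raises exactly the RuntimeError shown above.
sample = torch.randn(1, 4, 64, 64)
timesteps = torch.tensor([10])
encoder_hidden_states = torch.randn(1, 77, 768)
pred = unet(sample, timesteps, encoder_hidden_states).sample
loss = torch.nn.functional.mse_loss(pred.float(), torch.zeros_like(pred).float())
print(pred.grad_fn is not None, loss.requires_grad)
loss.backward()
```

In the actual run with --mixed_precision="fp16", the equivalent checks apparently fail at the backward call shown in the traceback.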

System Info

torch 2.1.2
transformers 4.36.2
peft 0.7.1
diffusers 0.24.0
accelerate 0.25.0

Who can help?

@sayakpaul
