[Dreambooth] number of channels error in train_dreambooth.py 

### Describe the bug

I am trying to use `train_dreambooth.py` to train a personalized model by following https://github.com/huggingface/diffusers/tree/main/examples/dreambooth. I got the following error:
```
  File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\nn\modules\conv.py", line 458, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [320, 4, 3, 3], expected input[1, 3, 512, 512] to have 4 channels, but got 3 channels instead
```

### A bit of context: 

The error is a mismatch in the number of channels between the conv2d weights and the input: the input has 3 channels, but conv2d expects 4. However, when I run `train_dreambooth_lora.py` with the same input, no such mismatch occurs. In fact, the same set of images is used in dreambooth_sdxl_lora without problems.
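
To make the shapes in the error concrete, here is a minimal pure-Python sketch of the channel check that conv2d performs. It relies only on the PyTorch convention that a Conv2d weight is laid out as `(out_channels, in_channels // groups, kH, kW)` and an input as `(N, C, H, W)`. The helper names `expected_in_channels` and `check_conv2d_input` are hypothetical and are not part of PyTorch or diffusers.

```python
def expected_in_channels(weight_shape, groups=1):
    """Return the number of input channels a conv2d with this weight accepts.

    Assumes the PyTorch layout (out_channels, in_channels // groups, kH, kW).
    """
    return weight_shape[1] * groups


def check_conv2d_input(weight_shape, input_shape, groups=1):
    """Raise ValueError if an (N, C, H, W) input does not match the weight."""
    expected = expected_in_channels(weight_shape, groups)
    actual = input_shape[1]
    if actual != expected:
        raise ValueError(f"expected {expected} channels, got {actual}")
    return True


# Shapes copied from the error message above.
weight_shape = (320, 4, 3, 3)
input_shape = (1, 3, 512, 512)

try:
    check_conv2d_input(weight_shape, input_shape)
except ValueError as err:
    print(err)  # expected 4 channels, got 3
```

Read this way, the weight `[320, 4, 3, 3]` belongs to a layer with 320 output channels and 4 input channels, while the tensor it received has 3 channels at 512x512.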


### Reproduction

I ran this PowerShell script:
```
$env:MODEL_NAME = "jzli/majicMIX-realistic-7"
$env:INSTANCE_DIR = "dog"
$env:OUTPUT_DIR = "dreambooth-majicMIX-dog"

& accelerate launch train_dreambooth.py `
  --pretrained_model_name_or_path $env:MODEL_NAME `
  --instance_data_dir $env:INSTANCE_DIR `
  --output_dir $env:OUTPUT_DIR `
  --mixed_precision "fp16" `
  --instance_prompt "a photo of a [V] dog" `
  --class_prompt "a photo of a dog" `
  --resolution 512 `
  --train_batch_size 1 `
  --gradient_accumulation_steps 2 `
  --learning_rate 5e-6 `
  --report_to "wandb" `
  --lr_scheduler "constant" `
  --gradient_checkpointing `
  --use_8bit_adam `
  --train_text_encoder `
  --lr_warmup_steps 0 `
  --max_train_steps 800 `
  --checkpointing_steps 50 `
  --validation_prompt "A photo of a [V] dog in a bucket" `
  --seed "0"

```


### Logs

```shell
C:\DL\diffusers\examples\dreambooth\train_dreambooth.py:602: UserWarning: You need not use --class_prompt without --with_prior_preservation.
  warnings.warn("You need not use --class_prompt without --with_prior_preservation.")
04/09/2024 13:43:45 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda

Mixed precision type: fp16

You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'dynamic_thresholding_ratio', 'variance_type', 'rescale_betas_zero_snr', 'thresholding', 'clip_sample_range', 'sample_max_value'} was not found in config. Values will be initialized to default values.
{'reverse_transformer_layers_per_block'} was not found in config. Values will be initialized to default values.
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: 3
wandb: You chose "Don't visualize my results"
wandb: Tracking run with wandb version 0.16.5
wandb: W&B syncing is set to `offline` in this directory.
wandb: Run `wandb online` or set WANDB_MODE=online to enable cloud syncing.
04/09/2024 13:44:02 - INFO - __main__ - ***** Running training *****
04/09/2024 13:44:02 - INFO - __main__ -   Num examples = 5
04/09/2024 13:44:02 - INFO - __main__ -   Num batches each epoch = 5
04/09/2024 13:44:02 - INFO - __main__ -   Num Epochs = 267
04/09/2024 13:44:02 - INFO - __main__ -   Instantaneous batch size per device = 1
04/09/2024 13:44:02 - INFO - __main__ -   Total train batch size (w. parallel, distributed & accumulation) = 2
04/09/2024 13:44:02 - INFO - __main__ -   Gradient Accumulation steps = 2
04/09/2024 13:44:02 - INFO - __main__ -   Total optimization steps = 800
Steps:   0%|                                                                                                      | 0/800 [00:00<?, ?it/s]img dim:  torch.Size([1, 3, 512, 512])
Traceback (most recent call last):
  File "C:\DL\diffusers\examples\dreambooth\train_dreambooth.py", line 1440, in <module>
    main(args)
  File "C:\DL\diffusers\examples\dreambooth\train_dreambooth.py", line 1270, in main
    model_pred = unet(
  File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\accelerate\utils\operations.py", line 822, in forward
    return model_forward(*args, **kwargs)
  File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\accelerate\utils\operations.py", line 810, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\amp\autocast_mode.py", line 16, in decorate_autocast
    return func(*args, **kwargs)
  File "C:\DL\diffusers\src\diffusers\models\unets\unet_2d_condition.py", line 1169, in forward
    sample = self.conv_in(sample)
  File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\nn\modules\conv.py", line 462, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\nn\modules\conv.py", line 458, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [320, 4, 3, 3], expected input[1, 3, 512, 512] to have 4 channels, but got 3 channels instead
wandb: You can sync this run to the cloud by running:
wandb: wandb sync C:\DL\diffusers\examples\dreambooth\wandb\offline-run-20240409_134401-ligzdtih
wandb: Find logs at: .\wandb\offline-run-20240409_134401-ligzdtih\logs
Traceback (most recent call last):
  File "C:\WPy64-31090\python-3.10.9.amd64\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\WPy64-31090\python-3.10.9.amd64\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\WPy64-31090\python-3.10.9.amd64\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\accelerate\commands\accelerate_cli.py", line 46, in main
    args.func(args)
  File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\accelerate\commands\launch.py", line 1057, in launch_command
    simple_launcher(args)
  File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\accelerate\commands\launch.py", line 673, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\\WPy64-31090\\python-3.10.9.amd64\\python.exe', 'train_dreambooth.py', '--pretrained_model_name_or_path', 'jzli/majicMIX-realistic-7', '--instance_data_dir', 'dog', '--output_dir', 'dreambooth-majicMIX-dog', '--mixed_precision', 'fp16', '--instance_prompt', 'a photo of a [V] dog', '--class_prompt', 'a photo of a dog', '--resolution', '512', '--train_batch_size', '1', '--gradient_accumulation_steps', '2', '--learning_rate', '5e-6', '--report_to', 'wandb', '--lr_scheduler', 'constant', '--gradient_checkpointing', '--use_8bit_adam', '--train_text_encoder', '--lr_warmup_steps', '0', '--max_train_steps', '800', '--checkpointing_steps', '50', '--validation_prompt', 'A photo of a [V] dog in a bucket', '--seed', '0']' returned non-zero exit status 1.
```
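
The traceback shows the tensor of shape `[1, 3, 512, 512]` reaching `unet.conv_in`, whose weight has 4 input channels. One possible reading, not confirmed by these logs, is that the UNet received raw RGB pixels instead of VAE latents. The sketch below computes the latent shape one would expect under the assumption of a Stable Diffusion 1.x-style VAE (4 latent channels, 8x spatial downsampling). The helper `expected_latent_shape` and its default values are illustrative assumptions, not diffusers API.

```python
def expected_latent_shape(batch, height, width, latent_channels=4, scale_factor=8):
    """Shape of VAE latents for an image batch.

    Assumption: SD 1.x-style VAE with 4 latent channels and 8x downsampling.
    """
    return (batch, latent_channels, height // scale_factor, width // scale_factor)


logged_shape = (1, 3, 512, 512)               # from the "img dim:" line in the logs
latent_shape = expected_latent_shape(1, 512, 512)

print(latent_shape)                           # (1, 4, 64, 64)
print(logged_shape[1] == latent_shape[1])     # False: 3 channels vs 4
```

If that assumption holds for this checkpoint, the input to the UNet should be `(1, 4, 64, 64)`, so checking whether the VAE was loaded and applied before the UNet call would be a reasonable next debugging step.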

### System Info

- `diffusers` version: 0.28.0.dev0
- Platform: Windows-10-10.0.22631-SP0
- Python version: 3.10.9
- PyTorch version (GPU?): 2.2.2+cu121 (True)
- Huggingface_hub version: 0.22.2
- Transformers version: 4.39.2
- Accelerate version: 0.28.0
- xFormers version: 0.0.25.post1
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>

### Who can help?

 @sayakpaul

[Dreambooth] number of channels error in train_dreambooth.py #7619

Describe the bug

A bit of context:

Reproduction

Logs

System Info

Who can help?

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Dreambooth] number of channels error in train_dreambooth.py #7619

Description

Describe the bug

A bit of context:

Reproduction

Logs

System Info

Who can help?

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions