Closed
Labels: bug (Something isn't working)
Description
Describe the bug
I am trying to use train_dreambooth.py to train a personalized model by following https://github.com/huggingface/diffusers/tree/main/examples/dreambooth. I got the following error:
File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\nn\modules\conv.py", line 458, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [320, 4, 3, 3], expected input[1, 3, 512, 512] to have 4 channels, but got 3 channels instead
A bit of context:
The error comes from a channel mismatch between the conv2d weights and the input: the input has 3 channels, but the conv2d layer expects 4. However, when I run train_dreambooth_lora.py with the same input, no such mismatch occurs. In fact, the same set of images is used with dreambooth_sdxl_lora without problems.
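To make the mismatch concrete, here is a minimal sketch of the channel check behind the error message, written in plain Python so it runs without PyTorch. The helper names `conv2d_channel_mismatch` and `assumed_latent_shape` are hypothetical and are not part of diffusers or torch. The latent shape assumes a Stable Diffusion 1.x-style VAE (4 latent channels, 8x spatial downsampling), which may not hold for every checkpoint.

```python
from typing import Optional, Sequence, Tuple


def conv2d_channel_mismatch(
    weight_shape: Sequence[int], input_shape: Sequence[int], groups: int = 1
) -> Optional[str]:
    """Return None if the input channels fit the conv weight, else a short message.

    weight_shape is (out_channels, in_channels // groups, kH, kW);
    input_shape is (batch, channels, H, W).
    """
    expected = weight_shape[1] * groups
    actual = input_shape[1]
    if actual == expected:
        return None
    return f"expected {expected} channels, but got {actual}"


def assumed_latent_shape(
    batch: int, resolution: int, latent_channels: int = 4, scale_factor: int = 8
) -> Tuple[int, int, int, int]:
    """Latent shape under the ASSUMED SD 1.x-style VAE (4 channels, 8x downsampling)."""
    side = resolution // scale_factor
    return (batch, latent_channels, side, side)


weight_shape = (320, 4, 3, 3)      # conv weight size taken from the error message
pixel_shape = (1, 3, 512, 512)     # input shape reported in the error message
latent_shape = assumed_latent_shape(batch=1, resolution=512)

print(conv2d_channel_mismatch(weight_shape, pixel_shape))   # mismatch message
print(conv2d_channel_mismatch(weight_shape, latent_shape))  # None: shapes agree
```

The reported input shape [1, 3, 512, 512] looks like a raw RGB image batch at the training resolution, whereas under the stated assumption the first UNet convolution would receive a [1, 4, 64, 64] latent. This suggests the images may be reaching the UNet without being encoded to latents first, though the sketch alone does not confirm why.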
Reproduction
I ran this PowerShell script:
$env:MODEL_NAME = "jzli/majicMIX-realistic-7"
$env:INSTANCE_DIR = "dog"
$env:OUTPUT_DIR = "dreambooth-majicMIX-dog"
& accelerate launch train_dreambooth.py `
--pretrained_model_name_or_path $env:MODEL_NAME `
--instance_data_dir $env:INSTANCE_DIR `
--output_dir $env:OUTPUT_DIR `
--mixed_precision "fp16" `
--instance_prompt "a photo of a [V] dog" `
--class_prompt "a photo of a dog" `
--resolution 512 `
--train_batch_size 1 `
--gradient_accumulation_steps 2 `
--learning_rate 5e-6 `
--report_to "wandb" `
--lr_scheduler "constant" `
--gradient_checkpointing `
--use_8bit_adam `
--train_text_encoder `
--lr_warmup_steps 0 `
--max_train_steps 800 `
--checkpointing_steps 50 `
--validation_prompt "A photo of a [V] dog in a bucket" `
--seed "0"
Logs
C:\DL\diffusers\examples\dreambooth\train_dreambooth.py:602: UserWarning: You need not use --class_prompt without --with_prior_preservation.
warnings.warn("You need not use --class_prompt without --with_prior_preservation.")
04/09/2024 13:43:45 - INFO - __main__ - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda
Mixed precision type: fp16
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'dynamic_thresholding_ratio', 'variance_type', 'rescale_betas_zero_snr', 'thresholding', 'clip_sample_range', 'sample_max_value'} was not found in config. Values will be initialized to default values.
{'reverse_transformer_layers_per_block'} was not found in config. Values will be initialized to default values.
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't visualize my results
wandb: Enter your choice: 3
wandb: You chose "Don't visualize my results"
wandb: Tracking run with wandb version 0.16.5
wandb: W&B syncing is set to `offline` in this directory.
wandb: Run `wandb online` or set WANDB_MODE=online to enable cloud syncing.
04/09/2024 13:44:02 - INFO - __main__ - ***** Running training *****
04/09/2024 13:44:02 - INFO - __main__ - Num examples = 5
04/09/2024 13:44:02 - INFO - __main__ - Num batches each epoch = 5
04/09/2024 13:44:02 - INFO - __main__ - Num Epochs = 267
04/09/2024 13:44:02 - INFO - __main__ - Instantaneous batch size per device = 1
04/09/2024 13:44:02 - INFO - __main__ - Total train batch size (w. parallel, distributed & accumulation) = 2
04/09/2024 13:44:02 - INFO - __main__ - Gradient Accumulation steps = 2
04/09/2024 13:44:02 - INFO - __main__ - Total optimization steps = 800
Steps: 0%| | 0/800 [00:00<?, ?it/s]img dim: torch.Size([1, 3, 512, 512])
Traceback (most recent call last):
File "C:\DL\diffusers\examples\dreambooth\train_dreambooth.py", line 1440, in <module>
main(args)
File "C:\DL\diffusers\examples\dreambooth\train_dreambooth.py", line 1270, in main
model_pred = unet(
File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\accelerate\utils\operations.py", line 822, in forward
return model_forward(*args, **kwargs)
File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\accelerate\utils\operations.py", line 810, in __call__
return convert_to_fp32(self.model_forward(*args, **kwargs))
File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\amp\autocast_mode.py", line 16, in decorate_autocast
return func(*args, **kwargs)
File "C:\DL\diffusers\src\diffusers\models\unets\unet_2d_condition.py", line 1169, in forward
sample = self.conv_in(sample)
File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\nn\modules\conv.py", line 462, in forward
return self._conv_forward(input, self.weight, self.bias)
File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\nn\modules\conv.py", line 458, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [320, 4, 3, 3], expected input[1, 3, 512, 512] to have 4 channels, but got 3 channels instead
wandb: You can sync this run to the cloud by running:
wandb: wandb sync C:\DL\diffusers\examples\dreambooth\wandb\offline-run-20240409_134401-ligzdtih
wandb: Find logs at: .\wandb\offline-run-20240409_134401-ligzdtih\logs
Traceback (most recent call last):
File "C:\WPy64-31090\python-3.10.9.amd64\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\WPy64-31090\python-3.10.9.amd64\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "C:\WPy64-31090\python-3.10.9.amd64\Scripts\accelerate.exe\__main__.py", line 7, in <module>
File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\accelerate\commands\accelerate_cli.py", line 46, in main
args.func(args)
File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\accelerate\commands\launch.py", line 1057, in launch_command
simple_launcher(args)
File "C:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\accelerate\commands\launch.py", line 673, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\\WPy64-31090\\python-3.10.9.amd64\\python.exe', 'train_dreambooth.py', '--pretrained_model_name_or_path', 'jzli/majicMIX-realistic-7', '--instance_data_dir', 'dog', '--output_dir', 'dreambooth-majicMIX-dog', '--mixed_precision', 'fp16', '--instance_prompt', 'a photo of a [V] dog', '--class_prompt', 'a photo of a dog', '--resolution', '512', '--train_batch_size', '1', '--gradient_accumulation_steps', '2', '--learning_rate', '5e-6', '--report_to', 'wandb', '--lr_scheduler', 'constant', '--gradient_checkpointing', '--use_8bit_adam', '--train_text_encoder', '--lr_warmup_steps', '0', '--max_train_steps', '800', '--checkpointing_steps', '50', '--validation_prompt', 'A photo of a [V] dog in a bucket', '--seed', '0']' returned non-zero exit status 1.
System Info
- diffusers version: 0.28.0.dev0
- Platform: Windows-10-10.0.22631-SP0
- Python version: 3.10.9
- PyTorch version (GPU?): 2.2.2+cu121 (True)
- Huggingface_hub version: 0.22.2
- Transformers version: 4.39.2
- Accelerate version: 0.28.0
- xFormers version: 0.0.25.post1
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?