
Add skip_image_resolution to deduplicate multi-resolution dataset#2273

Open
woct0rdho wants to merge 5 commits into kohya-ss:main from woct0rdho:min-max-orig-reso

Conversation

@woct0rdho
Contributor

This PR is an alternative to #2270.

I propose adding a dataset property min_orig_resolution, so we can write a multi-resolution dataset config like this:

[general]
bucket_no_upscale = true

[[datasets]]
resolution = 768
[[datasets.subsets]]
image_dir = 'path/to/image/dir'

[[datasets]]
resolution = 1024
min_orig_resolution = 768
[[datasets.subsets]]
image_dir = 'path/to/image/dir'

[[datasets]]
resolution = 1280
min_orig_resolution = 1024
[[datasets.subsets]]
image_dir = 'path/to/image/dir'

I've also added max_orig_resolution, because it seems natural to have a matching upper bound.

We filter the images by their original resolutions in BaseDataset.make_buckets and update num_train_images and num_reg_images. For DreamBoothDataset, we rebalance the number of regularization images after filtering. For ControlNetDataset, we check for missing conditioning images after filtering and ignore extra conditioning images.

There is no overhead if the user sets neither min_orig_resolution nor max_orig_resolution.
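The area-based bounds described above can be sketched as a small helper. This is an illustrative sketch, not the PR's actual code: the parameter names follow the PR's min_orig_resolution / max_orig_resolution options (assumed here to be (width, height) tuples, expanded from a scalar the same way resolution is), and the exact comparison operators are assumptions.

```python
from typing import Optional, Tuple


def passes_resolution_filter(
    size: Tuple[int, int],
    min_orig_resolution: Optional[Tuple[int, int]] = None,
    max_orig_resolution: Optional[Tuple[int, int]] = None,
) -> bool:
    """Keep an image only if its original pixel area lies within the bounds.

    Images at or below the minimum area are dropped (matching the `<=`
    comparison in the snippet discussed later in this thread); the strictness
    of the upper bound is an assumption for illustration.
    """
    area = size[0] * size[1]
    if min_orig_resolution is not None:
        if area <= min_orig_resolution[0] * min_orig_resolution[1]:
            return False
    if max_orig_resolution is not None:
        if area > max_orig_resolution[0] * max_orig_resolution[1]:
            return False
    return True
```

With the config above, a 1024x1024 source image would pass the resolution = 1024 dataset's filter (min_orig_resolution = 768) but be dropped by the resolution = 768 dataset, so each image lands in exactly one dataset.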

@kohya-ss
Owner

Thank you for this PR!

However, this option seems a bit complicated and confusing. Please tell me why #2270's skip_image_resolution is not enough.

@woct0rdho
Contributor Author

woct0rdho commented Feb 20, 2026

min_orig_resolution is exactly your skip_image_resolution, but I renamed it because I think min_orig_resolution is more self-explanatory.

I can rename it back to skip_image_resolution and remove max_orig_resolution if you think that's better.

The code is indeed more complicated than I expected at first, but this is the best way I (and the AI tools I use) could find to implement:

  1. Filtering by original resolution, which can only be done once the original resolutions are known in make_buckets
  2. Making regularization images work correctly with the filter
  3. Making conditioning images work correctly with the filter
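Point 2 above is about keeping the DreamBooth regularization set balanced against a training set that has just shrunk. A hypothetical sketch of that rebalancing, assuming the common scheme where each regularization image is repeated enough times to at least cover the training images (the helper name and the exact rounding are illustrative, not the PR's code):

```python
import math


def reg_repeats_after_filter(num_train_images: int, num_reg_images: int) -> int:
    """Repeats per regularization image so the regularization set at least
    matches the (filtered) training set; 0 if there are no reg images."""
    if num_reg_images == 0:
        return 0
    return math.ceil(num_train_images / num_reg_images)
```

For example, if resolution filtering reduces the training set to 100 images while 30 regularization images remain, each regularization image would be used 4 times.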

@kohya-ss
Owner

Thanks for the explanation, I understand now.

skip_image_resolution states explicitly that images of that resolution will not be included, whereas min_orig_resolution doesn't make clear whether images at exactly that resolution are included or not.

I'll try to find out if there's a simpler way to implement this.

@woct0rdho woct0rdho changed the title Add min_orig_resolution and max_orig_resolution to deduplicate multi-resolution dataset Add skip_image_resolution to deduplicate multi-resolution dataset Feb 20, 2026
@kohya-ss
Owner

kohya-ss commented Feb 22, 2026

Thank you for the update!

I think we could simply filter images with the following code.
Note that skip_image_resolution should be a tuple, just like resolution.

                            size_set_count += 1
                    logger.info(f"set image size from cache files: {size_set_count}/{len(img_paths)}")

            # from here
            if self.skip_image_resolution is not None:
                filtered_img_paths = []
                filtered_sizes = []
                skip_image_area = self.skip_image_resolution[0] * self.skip_image_resolution[1]
                for img_path, size in zip(img_paths, sizes):
                    if size is None:  # no latents cache file, get image size by reading image file (slow)
                        size = self.get_image_size(img_path)
                    if size[0] * size[1] <= skip_image_area:
                        continue
                    filtered_img_paths.append(img_path)
                    filtered_sizes.append(size)
                img_paths = filtered_img_paths
                sizes = filtered_sizes
                # add some logging here
            # to here

            # We want to create a training and validation split. This should be improved in the future
            # to allow a clearer distinction between training and validation. This can be seen as a

In FineTuningDataset, we can use the image size from the metadata.
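Reading the size from the metadata avoids opening each image file. A minimal sketch of that idea, assuming a metadata dict keyed by image path whose entries record a resolution; the "train_resolution" key and the helper name are assumptions for illustration, not the actual sd-scripts metadata schema:

```python
from typing import Dict, Optional, Tuple


def filter_metadata_by_skip_resolution(
    metadata: Dict[str, dict],
    skip_image_resolution: Optional[Tuple[int, int]],
) -> Dict[str, dict]:
    """Drop entries whose recorded area is at or below the threshold,
    mirroring the `<=` comparison in the snippet above."""
    if skip_image_resolution is None:
        return metadata
    skip_area = skip_image_resolution[0] * skip_image_resolution[1]
    return {
        key: entry
        for key, entry in metadata.items()
        if entry["train_resolution"][0] * entry["train_resolution"][1] > skip_area
    }
```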

@woct0rdho
Contributor Author

Yes, this makes the PR simpler. I've moved the filtering from make_buckets to __init__.

@kohya-ss
Owner

Thank you for the update! I will create a test dataset and review/test this soon.
