CDC-FM (Carré du Champ Flow Matching) Implementation by rockerBOO · Pull Request #2221 · kohya-ss/sd-scripts

rockerBOO · 2025-10-09T22:25:56Z

Add support for CDC-FM, a geometry-aware noise generation method that improves diffusion model training by adapting noise to the local geometry of the latent space. CDC-FM replaces standard Gaussian noise with geometry-informed noise that better preserves the structure of the data manifold.

Note: Only implemented for Flux network training so far. It can be expanded to other flow matching models such as SD3 and Lumina Image 2.

Deep generative models often face a fundamental tradeoff: high sample quality can come at the cost of memorisation, where the model reproduces training data rather than generalising across the underlying data geometry. We introduce Carré du champ flow matching (CDC-FM), a generalisation of flow matching (FM), that improves the quality-generalisation tradeoff by regularising the probability path with a geometry-aware noise. Our method replaces the homogeneous, isotropic noise in FM with a spatially varying, anisotropic Gaussian noise whose covariance captures the local geometry of the latent data manifold. We prove that this geometric noise can be optimally estimated from the data and is scalable to large data. Further, we provide an extensive experimental evaluation on diverse datasets (synthetic manifolds, point clouds, single-cell genomics, animal motion capture, and images) as well as various neural network architectures (MLPs, CNNs, and transformers). We demonstrate that CDC-FM consistently offers a better quality-generalisation tradeoff. We observe significant improvements over standard FM in data-scarce regimes and in highly non-uniformly sampled datasets, which are often encountered in AI for science applications. Our work provides a mathematical framework for studying the interplay between data geometry, generalisation and memorisation in generative models, as well as a robust and scalable algorithm that can be readily integrated into existing flow matching pipelines.

https://arxiv.org/abs/2510.05930

[Two screenshots from the paper PDF (arXiv:2510.05930v1)]

Note: Written with AI but I guided how it was implemented.

Recommended Configurations:

Single Resolution (e.g., all 512×512):

  --use_cdc_fm \
  --cdc_k_neighbors 256 \
  --cdc_k_bandwidth 8 \
  --cdc_d_cdc 8 \
  --cdc_gamma 1.0

Multi-Resolution with Bucketing (FLUX/SDXL):

  --use_cdc_fm \
  --cdc_k_neighbors 256 \
  --cdc_adaptive_k \
  --cdc_min_bucket_size 16 \
  --cdc_k_bandwidth 8 \
  --cdc_d_cdc 8 \
  --cdc_gamma 0.5

Small Dataset (<1000 images):

  --use_cdc_fm \
  --cdc_k_neighbors 128 \
  --cdc_adaptive_k \
  --cdc_min_bucket_size 8 \
  --cdc_k_bandwidth 8 \
  --cdc_d_cdc 8 \
  --cdc_gamma 1.5

Parameter Guide:

--cdc_k_neighbors

Recommended: 256 (based on paper's CIFAR-10 experiments)
Small datasets (<1000): 128
Medium datasets (1000-10k): 256
Large datasets (>10k): 256-512
Rule: k = min(256, dataset_size / 4)
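
As a rough illustration, the rule of thumb above can be written as a small helper. The function name `suggest_k_neighbors` is hypothetical and not part of the PR; integer division and the clamp to `dataset_size - 1` are assumptions, since the guide does not say how to round.

```python
def suggest_k_neighbors(dataset_size: int, cap: int = 256) -> int:
    """Rule of thumb from the guide: k = min(cap, dataset_size / 4)."""
    if dataset_size < 2:
        raise ValueError("need at least 2 samples to have a neighbor")
    # Assumption: round down, and never ask for more neighbors than
    # there are other samples in the dataset.
    return max(1, min(cap, dataset_size // 4, dataset_size - 1))
```

For a 512-image dataset this gives 128, which lines up with the small-dataset recommendation; anything from 1024 images upward hits the cap of 256.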

--cdc_adaptive_k

Recommended: Enable for multi-resolution/bucketed training
Without flag (default): Strict paper methodology - skips buckets with < k_neighbors samples
With flag: Pragmatic approach - uses k = min(k_neighbors, bucket_size - 1) for buckets ≥ min_bucket_size
When to use:
- Multi-resolution training (FLUX with various aspect ratios)
- Training with bucketing enabled
- Datasets where resolution distribution varies widely
When not to use:
- Single resolution datasets (all images same size)
- When you want strict adherence to paper's methodology
- Academic/research settings requiring exact paper reproduction

--cdc_min_bucket_size

Recommended: 16 (default)
Only relevant when --cdc_adaptive_k is enabled
Buckets below this threshold use Gaussian fallback (no CDC)
Range: 8-32 depending on dataset
Lower values (8-12): More buckets get CDC, but less stable for very small buckets
Higher values (24-32): More conservative, only well-populated buckets get CDC
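
The interaction between `--cdc_k_neighbors`, `--cdc_adaptive_k` and `--cdc_min_bucket_size` can be sketched as a single decision per bucket. This is a minimal sketch of the rules as stated in this guide, not the code from the PR; `effective_k` is a hypothetical name, and returning `None` stands for "no CDC for this bucket".

```python
from typing import Optional


def effective_k(
    bucket_size: int,
    k_neighbors: int = 256,
    adaptive_k: bool = False,
    min_bucket_size: int = 16,
) -> Optional[int]:
    """Return the k to use for a bucket, or None if the bucket gets no CDC."""
    if not adaptive_k:
        # Fixed mode: buckets with fewer than k_neighbors samples are skipped.
        return k_neighbors if bucket_size >= k_neighbors else None
    if bucket_size < min_bucket_size:
        # Adaptive mode: buckets below the threshold fall back to Gaussian noise.
        return None
    # Adaptive mode: shrink k so it fits inside the bucket.
    return min(k_neighbors, bucket_size - 1)
```

With the defaults, a 100-image bucket gets no CDC in fixed mode but uses k = 99 in adaptive mode, while a 10-image bucket is skipped in both.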

--cdc_k_bandwidth

Recommended: 8 (paper uses this consistently)
Don't change unless you have specific reasons
This determines variable-bandwidth Gaussian kernels

--cdc_gamma

Small datasets (<1000): 1.0-2.0 (stronger regularization)
Medium datasets (1000-5000): 0.8-1.0
Large datasets (>5000): 0.5-0.8
Paper showed γ=2.0 optimal for 250 samples, γ=0.5-1.0 for 2000-5000 samples

--cdc_d_cdc

Recommended: 8-16 for high-dimensional image data
Paper tested 2, 4, 8, 16 - found trade-off between quality and generalization
Higher values capture more geometric structure but may include noise

Implements geometry-aware noise generation for FLUX training based on arXiv:2510.05930v1.

- Cache all shapes during GammaBDataset initialization
- Eliminates file I/O on every training step (9.5M accesses/sec)
- Reduces get_shape() from file operation to dict lookup
- Memory overhead: ~126 bytes/sample (~12.6 MB per 100k images)
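
The idea behind this change can be sketched as follows. `ShapeCache` and its `read_shape` callback are hypothetical stand-ins, not the actual `GammaBDataset` code; the sketch only shows the pattern of reading every shape once at construction so that later lookups are plain dict accesses.

```python
from typing import Callable, Dict, Iterable, Tuple

Shape = Tuple[int, ...]


class ShapeCache:
    """Hypothetical illustration of caching shapes at initialization."""

    def __init__(self, image_keys: Iterable[str], read_shape: Callable[[str], Shape]):
        # Pay the file I/O cost once, up front.
        self._shapes: Dict[str, Shape] = {
            key: tuple(read_shape(key)) for key in image_keys
        }

    def get_shape(self, image_key: str) -> Shape:
        # Plain dict lookup; no file access on the training hot path.
        return self._shapes[image_key]
```

The trade-off is memory for speed: every sample's shape stays resident for the whole run, which is what the per-sample overhead figure above accounts for.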

- Create apply_cdc_noise_transformation() for better modularity
- Implement fast path for batch processing when all shapes match
- Implement slow path for per-sample processing on shape mismatch
- Clone noise tensors in fallback path for gradient consistency

- Remove @torch.no_grad() decorator from compute_sigma_t_x()
- Gradients now properly flow through CDC transformation during training
- Add comprehensive gradient flow tests for fast/slow paths and fallback
- All 25 CDC tests passing

- Track warned samples in global set to prevent log spam
- Each sample only warned once per training session
- Prevents thousands of duplicate warnings during training
- Add tests to verify throttling behavior

- Check that noise and CDC matrices are on same device
- Automatically transfer noise if device mismatch detected
- Warn user when device transfer occurs
- Add tests to verify device handling

- Treat cuda and cuda:0 as compatible devices
- Only warn on actual device mismatches (cuda vs cpu)
- Eliminates warning spam during multi-subset training
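
One way to express this check is on device strings, as in the sketch below. `devices_compatible` is a hypothetical helper, and treating an omitted index as a wildcard is an assumption; the PR may instead resolve `cuda` to a specific index.

```python
def devices_compatible(a: str, b: str) -> bool:
    """Return True when two device strings should not trigger a mismatch warning."""
    type_a, _, index_a = a.partition(":")
    type_b, _, index_b = b.partition(":")
    if type_a != type_b:
        # A real mismatch, e.g. "cuda" vs "cpu".
        return False
    # Assumption: a bare device type ("cuda") matches any index of that type.
    return index_a == "" or index_b == "" or index_a == index_b
```

With PyTorch tensors the inputs would be `str(tensor.device)`, so `devices_compatible("cuda", "cuda:0")` is true while `devices_compatible("cuda", "cpu")` is not.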

Fixes shape mismatch bug in multi-subset training where CDC preprocessing and training used different index calculations, causing wrong CDC data to be loaded for samples.

Changes:
- CDC cache now stores/loads data using image_key strings instead of integer indices
- Training passes image_key list instead of computed integer indices
- All CDC lookups use stable image_key identifiers
- Improved device compatibility check (handles "cuda" vs "cuda:0")
- Updated all 30 CDC tests to use image_key-based access

Root cause: Preprocessing used cumulative dataset indices while training used sorted keys, resulting in mismatched lookups during shuffled multi-subset training.
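
The failure mode is easy to reproduce in miniature. The sketch below uses made-up file names and placeholder strings in place of real CDC data; it only illustrates why positional indices drift when two code paths enumerate the same samples in different orders, while key-based lookups do not.

```python
# Order in which a hypothetical preprocessing pass visits samples
# (cumulative over subsets).
preprocess_order = ["subset_a/cat.png", "subset_b/dog.png", "subset_a/bird.png"]

# Cache keyed by position in the preprocessing order (the buggy scheme).
by_index = {i: f"cdc-for-{key}" for i, key in enumerate(preprocess_order)}

# Cache keyed by image_key (the fixed scheme).
by_key = {key: f"cdc-for-{key}" for key in preprocess_order}

# Training enumerates the same samples in sorted order instead.
training_order = sorted(preprocess_order)


def lookup_by_key(image_keys):
    """Fetch CDC data for a batch using stable image_key identifiers."""
    return [by_key[key] for key in image_keys]


# Index computed during training points at a different sample's data.
wrong = by_index[training_order.index("subset_a/cat.png")]
right = by_key["subset_a/cat.png"]
```

Here `wrong` ends up holding the entry for `subset_b/dog.png`, whereas `lookup_by_key` returns the correct entry regardless of shuffling or subset order.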

- Add --cdc_debug flag to enable verbose bucket-by-bucket output
- When debug=False (default): Show tqdm progress bar, concise logging
- When debug=True: Show detailed bucket information, no progress bar
- Improves user experience during CDC cache generation

kohya-ss · 2025-10-09T22:55:39Z

Thank you for this! This seems to be effective when the data set is limited, so it looks very good.

I plan to merge the sd3 branch into main soon, so I'd like to merge this (and a few other PRs) before then.

- Add ss_use_cdc_fm, ss_cdc_k_neighbors, ss_cdc_k_bandwidth, ss_cdc_d_cdc, ss_cdc_gamma
- Ensures CDC-FM training parameters are tracked in model metadata
- Enables reproducibility and model provenance tracking
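
A sketch of how these metadata entries could be derived from the parsed arguments. The helper name `cdc_metadata` is hypothetical, and converting every value to a string is an assumption about how the metadata is stored; only the five `ss_` keys come from the commit message.

```python
import argparse
from typing import Dict

# Argument names whose values are mirrored into metadata with an "ss_" prefix.
CDC_ARG_NAMES = (
    "use_cdc_fm",
    "cdc_k_neighbors",
    "cdc_k_bandwidth",
    "cdc_d_cdc",
    "cdc_gamma",
)


def cdc_metadata(args: argparse.Namespace) -> Dict[str, str]:
    """Build the ss_* metadata entries for the CDC-FM settings in `args`."""
    # Assumption: values are stored as strings.
    return {f"ss_{name}": str(getattr(args, name)) for name in CDC_ARG_NAMES}
```

Given the single-resolution configuration above, this yields entries such as `ss_cdc_k_neighbors: "256"` and `ss_cdc_gamma: "1.0"`.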

- Add --cdc_adaptive_k flag to enable adaptive k based on bucket size
- Add --cdc_min_bucket_size to set minimum bucket threshold (default: 16)
- Fixed mode (default): Skip buckets with < k_neighbors samples
- Adaptive mode: Use k=min(k_neighbors, bucket_size-1) for buckets >= min_bucket_size
- Update CDCPreprocessor to support adaptive k per bucket
- Add metadata tracking for adaptive_k and min_bucket_size
- Add comprehensive pytest tests for adaptive k behavior

This allows CDC-FM to work effectively with multi-resolution bucketing where bucket sizes may vary widely. Users can choose between strict paper methodology (fixed k) or pragmatic approach (adaptive k).

rockerBOO · 2025-10-10T03:34:15Z

The issue right now is that we cache the neighbors into a file but save it into the output_dir, which means each run creates a new file. We could:

- Only keep this in memory and not write it to a file.
- Allow users to set the cache file location.

I'd usually store it with the dataset, but if multiple subsets are set there isn't one place for it.

- Make FAISS import optional with try/except
- CDCPreprocessor raises helpful ImportError if FAISS unavailable
- train_util.py catches ImportError and returns None
- train_network.py checks for None and warns user
- Training continues without CDC-FM if FAISS not installed
- Remove benchmark file (not needed in repo)

This allows users to run training without FAISS dependency. CDC-FM will be automatically disabled with a warning if FAISS is missing.
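
The optional-dependency pattern described in this commit can be sketched generically as below. The helper names `optional_import` and `load_cdc_backend` are hypothetical, and the sketch collapses the three-file flow (preprocessor raises, train_util returns None, train_network warns) into two functions.

```python
import importlib
import logging
from types import ModuleType
from typing import Optional

logger = logging.getLogger(__name__)


def optional_import(name: str) -> Optional[ModuleType]:
    """Import a module by name, returning None instead of raising if it is missing."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def load_cdc_backend(module_name: str = "faiss") -> Optional[ModuleType]:
    """Return the neighbor-search backend, or None (with a warning) if unavailable."""
    backend = optional_import(module_name)
    if backend is None:
        logger.warning(
            "%s is not installed; CDC-FM is disabled and training continues without it",
            module_name,
        )
    return backend
```

A caller would then check the return value for `None` and skip the CDC preprocessing step rather than aborting the run.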

FurkanGozukara · 2025-10-11T08:51:00Z

@rockerBOO i plan to test this

this is only for flux lora?

- Add explicit warning and tracking for multiple unique latent shapes
- Simplify test imports by removing unused modules
- Minor formatting improvements in print statements
- Ensure log messages provide clear context about dimension mismatches

- Merged redundant test files
- Removed 'comprehensive' from file and docstring names
- Improved test organization and clarity
- Ensured all tests continue to pass
- Simplified test documentation

rockerBOO · 2025-10-12T05:53:18Z

> @rockerBOO i plan to test this
>
> this is only for flux lora?

Yes, only Flux LoRA for the moment.

recris · 2025-11-04T22:08:42Z

flux_train_network.py

+        # If CDC is enabled, this will transform the noise with geometry-aware covariance
+        noisy_model_input, timesteps, sigmas = flux_train_utils.get_noisy_model_input_and_timestep(
+            args, noise_scheduler, latents, noise, accelerator.device, weight_dtype,
+            gamma_b_dataset=gamma_b_dataset, latents_npz_paths=latents_npz_paths, timestep_index=timestep_index


timestep_index doesn't seem to be defined?

Thank you @recris. Removed. Was from another PR.

rockerBOO added 11 commits October 9, 2025 18:28

Add CDC-FM (Carré du Champ Flow Matching) support

f552f9a

Add warning throttling for CDC shape mismatches

ce17007

Add device consistency validation for CDC transformation

ee8ceee

Fix: Prevent false device mismatch warnings for cuda vs cuda:0

4bea582

Use logger instead of print for CDC loading messages

7a7110c

Formatting cleanup

f128f5a

rockerBOO force-pushed the cdc_fm branch from a0f1678 to f128f5a October 9, 2025 22:29

Add faiss to github action

20c6ae5

rockerBOO added 2 commits October 9, 2025 22:51

Add CDC-FM parameters to model metadata

f450443

rockerBOO added 3 commits October 11, 2025 16:15

Slight cleanup

aa3a216

Consolidate and simplify CDC test files

1f79115

rockerBOO added 7 commits October 18, 2025 14:07

Remove faiss, save per image cdc file

83c17de

Fix CDC tests to new format and deprecate old tests

c820ace

Remove deprecated cdc cache path

0dfafb4

Fix multi-resolution support in cached files

b4e5d09

Add multi-resolution test

03947ca

Fix cdc cache file validation

3772998

Add error if with CDC if cache_latents or cache_latents_to_disk is not set

7a08c52

recris reviewed Nov 4, 2025

rockerBOO added 2 commits November 17, 2025 11:26

Remove timestep_index

cc0e4ac

Fix tests

4888327

rockerBOO wants to merge 27 commits into kohya-ss:sd3 from rockerBOO:cdc_fm

4 participants
