Generating Synthetic Cardiac Tomography Images with Latent Diffusion Models
This study investigates whether fine-tuning can adapt pretrained latent diffusion models to generate synthetic cardiac tomography images from text prompts related to congenital heart disease. Two fine-tuning methods, DreamBooth and Textual Inversion, are evaluated in low-data, low-compute settings using a private dataset of cardiac tomography scans and their corresponding reports. The experiments show that fine-tuning the U-Net with DreamBooth is promising when trained on specific image sets, whereas fine-tuning the text encoder leads to overfitting, and Textual Inversion fails to produce satisfactory results. These results highlight the limitations of these fine-tuning strategies in improving the representation of cardiac tomography concepts and the need for further work in this domain. The findings contribute to the understanding of generative models in medical imaging and can guide future research on synthesis techniques, both for augmenting supervised machine learning pipelines in healthcare and for generating images from medical reports.