TY - GEN
T1 - Data Augmentation Techniques Using Text-to-Image Diffusion Models for Enhanced Data Diversity
AU - Shin, Jeongmin
AU - Jang, Hyeryung
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Data augmentation is a widely used technique to enhance the performance of deep learning models. However, traditional augmentation methods, dependent solely on original data, often fall short in maintaining data diversity and generalization capabilities. In this paper, we propose a novel data augmentation approach leveraging pretrained text-to-image diffusion models to generate diverse and contextually rich images. Our approach integrates three advanced techniques: rich-text prompts, multi-object image generation, and inpainting. We demonstrate the effectiveness of these methods through extensive experiments on the Oxford-IIIT Pets and Caltech-101 datasets, where our diffusion-based augmentations significantly improved downstream classification accuracy and model generalization. Notably, the inpainting technique excels in handling class imbalances by balancing the diversity and structural integrity of original data, while rich-text prompts and multi-object generation offer substantial gains by enhancing diversity and realism. Additionally, our methods show enhanced generalization to unseen data, proving their robustness and applicability to various deep learning tasks.
AB - Data augmentation is a widely used technique to enhance the performance of deep learning models. However, traditional augmentation methods, dependent solely on original data, often fall short in maintaining data diversity and generalization capabilities. In this paper, we propose a novel data augmentation approach leveraging pretrained text-to-image diffusion models to generate diverse and contextually rich images. Our approach integrates three advanced techniques: rich-text prompts, multi-object image generation, and inpainting. We demonstrate the effectiveness of these methods through extensive experiments on the Oxford-IIIT Pets and Caltech-101 datasets, where our diffusion-based augmentations significantly improved downstream classification accuracy and model generalization. Notably, the inpainting technique excels in handling class imbalances by balancing the diversity and structural integrity of original data, while rich-text prompts and multi-object generation offer substantial gains by enhancing diversity and realism. Additionally, our methods show enhanced generalization to unseen data, proving their robustness and applicability to various deep learning tasks.
UR - http://www.scopus.com/inward/record.url?scp=85217671096&partnerID=8YFLogxK
U2 - 10.1109/ICTC62082.2024.10827311
DO - 10.1109/ICTC62082.2024.10827311
M3 - Conference contribution
AN - SCOPUS:85217671096
T3 - International Conference on ICT Convergence
SP - 2027
EP - 2032
BT - ICTC 2024 - 15th International Conference on ICT Convergence
PB - IEEE Computer Society
T2 - 15th International Conference on Information and Communication Technology Convergence, ICTC 2024
Y2 - 16 October 2024 through 18 October 2024
ER -