Get to know the Faculty of Informatics of the UPV/EHU

The reference center for technical and scientific training and knowledge in informatics and artificial intelligence.

Doctoral thesis defense: Contribution to Data Augmentation for Image Classification and Segmentation

First published: 12 July 2024

Author: Danyang Sun

Thesis: "Contribution to Data Augmentation for Image Classification and Segmentation"

Supervisor: Fadi Dornaika

Date: 19 July 2024
Time: 10:30
Venue: Ada Lovelace aretoa

Abstract:

"Deep learning has spurred remarkable advancements in vision tasks. Particularly, large-scale models with a high number of trainable parameters can significantly boost performance. However, as the size and complexity of models increase, the required training data often escalates in tandem, especially for Vision Transformers. Yet, collecting and annotating data is often time-consuming, costly, or impractical. Overfitting, a common occurrence in the face of data scarcity, poses a significant challenge for deep learning. This challenge is particularly pronounced in medical image segmentation tasks. Addressing the issue of data scarcity and mitigating the resulting overfitting phenomenon are crucial endeavors in the field of deep learning, especially when dealing with large-scale models.

To address the challenge of data scarcity, data augmentation strategies stand out as the most widely recognized solution, known for their effectiveness and efficiency. These strategies aim to combat data scarcity and overfitting by generating additional training samples. Various methodologies exist, including basic data augmentation, Generative Adversarial Network (GAN)-based data augmentation, automatic augmentation, and regional dropout regularization data augmentation methods. Basic data augmentation is straightforward but offers limited diversity in the augmented space. GAN-based data augmentation, on the other hand, can produce high-quality and plausible samples, although it relies heavily on GAN models, which carry substantial overhead. Automatic augmentation, which achieves superior performance through automated augmentation policy search, often requires trade-offs between complexity, cost, and performance. Meanwhile, regional dropout regularization data augmentation has demonstrated effectiveness, but existing methods have some shortcomings:

(i) Existing methods perform cutting and pasting with square or rectangular regions, resulting in incomplete object-part information for classification and loss of contour information for segmentation.

(ii) Current regional dropout regularization data augmentation techniques are primarily designed for classification tasks, with limited research conducted on the effectiveness of the cut-and-paste strategy in segmentation.

(iii) Most methods only utilize global semantics along with image-level constraints and overlook local context constraints.

(iv) For the classification task, the generated labels are often inconsistent with the content of the augmented images.

(v) For the segmentation task, existing regional dropout augmentations do not fully utilize prior knowledge from segmentation masks or images.

In this thesis, we propose a range of data augmentation methods for image classification and segmentation. The primary objective is to develop contour-aware and local-aware regional dropout regularization data augmentation approaches for vision tasks, based on superpixel-grid mixing. Our motivation is to alleviate data scarcity and the resulting overfitting, as well as to address the limitations of existing regional dropout regularization data augmentation methods. We develop several data augmentation approaches, each focusing on either image classification or image segmentation, together with a number of loss functions that improve their efficacy. The main contributions of the thesis are outlined below.

(1) We present contour-aware regional dropout data augmentation techniques for both image classification and segmentation tasks, employing superpixel-grid-based mixing.

(2) We introduce local-aware regional dropout regularization data augmentation methods and incorporate local constraints to encourage the model to prioritize local regions. The associated loss functions have been shown to significantly improve the effectiveness of data augmentation techniques.

(3) We present efficient attention-guided superpixel-based data augmentation methods for classification tasks, ensuring consistency between augmented images and generated labels. Our approach applies attention mechanisms in both image space and label space.

(4) We propose regional dropout regularization data augmentation methods tailored for medical image segmentation. To our knowledge, this is the first contour-aware superpixel-mixing-based data augmentation specifically designed for segmentation tasks.

(5) We extensively leverage the prior segmentation mask knowledge of the training samples and investigate loss functions that can enhance the training process. In particular, this thesis introduces two novel losses: a superpixel-wise adaptive focal margin classification loss and a reconstruction loss on mixed images.

(6) Comprehensive experiments have conclusively shown the superior performance of the proposed methods across various image datasets and benchmarks, encompassing both classification and semantic segmentation tasks.

Keywords: Deep learning, data augmentation, superpixel, image classification, image segmentation."
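
Illustrative note (not part of the thesis abstract): the following is a minimal sketch of the general idea behind region-based mixing with non-rectangular regions, written under several assumptions. It is not the method proposed in the thesis. The segment map `segments` is hand-written here; in practice it would come from a superpixel algorithm such as SLIC. The function names `region_mask`, `mix_images`, and `mix_labels` are hypothetical, and the area-proportional label rule is the one commonly used by CutMix-style methods, not the attention-guided labeling described above.

```python
def region_mask(segments, chosen_ids):
    """Binary mask: 1 where the pixel's segment id is in chosen_ids."""
    return [[1 if s in chosen_ids else 0 for s in row] for row in segments]


def mix_images(img_a, img_b, mask):
    """Take pixels from img_b where mask == 1, and from img_a elsewhere."""
    return [
        [b if m else a for a, b, m in zip(row_a, row_b, row_m)]
        for row_a, row_b, row_m in zip(img_a, img_b, mask)
    ]


def mix_labels(label_a, label_b, mask):
    """Area-proportional label mixing (an assumption, CutMix-style).

    lam is the fraction of pixels that still come from image A.
    """
    total = sum(len(row) for row in mask)
    pasted = sum(sum(row) for row in mask)
    lam = 1.0 - pasted / total
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed, lam


# Toy 4x4 example. The segment ids are made up for illustration only.
segments = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [3, 2, 2, 1],
    [3, 3, 3, 1],
]
img_a = [[0] * 4 for _ in range(4)]  # stand-in for a grayscale image A
img_b = [[9] * 4 for _ in range(4)]  # stand-in for a grayscale image B

mask = region_mask(segments, {2})    # paste the region covered by segment 2
mixed = mix_images(img_a, img_b, mask)
label, lam = mix_labels([1.0, 0.0], [0.0, 1.0], mask)
```

Because the mask follows segment boundaries rather than a box, the pasted region keeps the contour of the selected segment. For a segmentation task, the same `mix_images` call could be applied to the two ground-truth masks so that the mixed image and its mask stay aligned.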

