Benchmarking Contrastive Learning for Multimodal Medical Imaging

Silva, Martim Dourado da

http://hdl.handle.net/10400.5/102399

Use this identifier to reference this record.

Name:	Description:	Size:	Format:
TM_Martim_Silva.pdf		32.12 MB	Adobe PDF

Authors

Silva, Martim Dourado da

Advisor(s)

Garcia, Nuno Cruz

Abstract(s)

Deep learning has achieved remarkable success in complex tasks, but its reliance on large annotated datasets limits scalability in medical imaging, where expert labeling is costly and scarce. Contrastive learning, a self-supervised approach, learns useful visual representations from unlabeled data by training models to distinguish between different images while aligning augmented views of the same instance. This thesis investigates the effectiveness of three state-of-the-art contrastive learning frameworks (SimCLR, MoCo, and BYOL) in generating transferable representations from medical images for two downstream tasks: multiclass classification and binary segmentation of breast tissue. It also examines whether combining ultrasound and mammography images during pretraining supports or hinders model generalization, reflecting real-world multimodal diagnostic workflows. Using seven public datasets, three pretraining sets were constructed (ultrasound, mammography, and a balanced multimodal mix), and each set was used to train the three frameworks. The resulting models were then fine-tuned for each downstream task. All networks shared a ResNet-18 backbone, and segmentation models used U-Net architectures with pretrained encoders. Results show that contrastive pretraining improves classification performance, particularly with ultrasound data and with BYOL or MoCo; these models outperformed randomly initialized baselines. For segmentation, however, random initialization yielded superior results, suggesting that standard contrastive objectives and augmentations do not capture the spatial precision needed for pixel-wise tasks. Mammography images posed further challenges due to small lesion size and detail loss from uniform resizing. This work underscores both the promise and the limitations of contrastive learning in clinical imaging: while the approach is effective for classification with limited labels, adapting contrastive methods to segmentation requires specialized, modality-aware pipelines.
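
The objective described in the abstract (pull two augmented views of the same image together, push views of different images apart) can be illustrated with a small sketch of an NT-Xent-style loss of the kind used by SimCLR. This is not the thesis's implementation: the function names, the temperature of 0.5, and the toy embeddings are assumptions chosen for illustration, and the code uses plain Python lists instead of a deep learning library. MoCo and BYOL differ in important ways (MoCo draws negatives from a queue filled by a momentum encoder, and BYOL uses no negative pairs), so the sketch covers only the SimCLR-style case.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """NT-Xent-style loss over a batch of paired embeddings.

    view_a[i] and view_b[i] are embeddings of two augmentations of image i.
    The temperature value is an illustrative assumption, not taken from the thesis.
    """
    n = len(view_a)
    z = [l2_normalize(v) for v in view_a + view_b]  # 2n unit vectors
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same image
        sims = [
            sum(a * b for a, b in zip(z[i], z[k])) / temperature
            for k in range(2 * n)
        ]
        # Denominator runs over every embedding except the anchor itself.
        others = [sims[k] for k in range(2 * n) if k != i]
        peak = max(others)
        log_denominator = peak + math.log(sum(math.exp(s - peak) for s in others))
        total += log_denominator - sims[positive]
    return total / (2 * n)


# Toy embeddings for two images (hypothetical values).
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9]]

aligned = nt_xent_loss(view_a, view_b)
# Swapping view_b pairs each image with a view of the *other* image.
mismatched = nt_xent_loss(view_a, list(reversed(view_b)))
print(aligned < mismatched)
```

When the paired views are close in embedding space the loss is low, and when each image is paired with a view of a different image the loss rises. Note that the loss depends only on whole-image embeddings, which is consistent with the abstract's suggestion that such objectives do not capture the spatial precision needed for pixel-wise segmentation.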

Description

Master's thesis, Data Science, 2025, Universidade de Lisboa, Faculdade de Ciências

Keywords

Computer Vision; Contrastive Self-Supervised Learning; Medical Image Analysis; Breast Cancer Detection/Diagnosis; Master's theses - 2025