Overview
- Date: Starts 12 January 2026, 10:00; ends 12 January 2026, 12:00
- Location: EDIT Room Analysen
- Opponent: Filip Malmberg
- Thesis: Read the thesis
Medical images are a crucial part of healthcare, but require the time and effort of trained experts to analyze. Machine learning based methods have the potential to decrease this workload, but their practical adoption remains challenging. In particular, practitioners often have limited access to training data. Furthermore, labeling the data can be difficult when the assessment is subjective in nature, leading to disagreements among experts. In this thesis, we address these challenges in several ways.
First, we construct a comparison-based image annotation system and evaluate it against standard rating-based annotation in a study with six clinicians, finding that it significantly increases inter-annotator agreement. In follow-up work, we mitigate the increased annotation cost of comparisons by leveraging per-item features such as image content. We introduce GURO, a novel criterion for selecting informative comparisons, and show that incorporating item attributes significantly improves sample efficiency, making it a more scalable solution for large-scale annotation.
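To make the idea of attribute-aware comparison ranking concrete, below is a minimal sketch of a Bradley–Terry model whose item scores are a linear function of per-item features. This is an illustration of the general approach of leveraging item attributes in pairwise comparisons, not the GURO selection criterion itself; the function names and the linear score model are assumptions for the example.

```python
import numpy as np

def bt_prob(w, x_i, x_j):
    """Probability that item i beats item j under a Bradley-Terry
    model with scores modeled linearly from features: s(x) = w . x."""
    return 1.0 / (1.0 + np.exp(-(w @ x_i - w @ x_j)))

def fit_bt(features, comparisons, lr=0.1, epochs=200):
    """Fit the weight vector by gradient ascent on the log-likelihood
    of observed pairwise outcomes; (i, j) means item i won."""
    w = np.zeros(features.shape[1])
    for _ in range(epochs):
        for i, j in comparisons:
            p = bt_prob(w, features[i], features[j])
            # gradient of log P(i beats j) with respect to w
            w += lr * (1.0 - p) * (features[i] - features[j])
    return w
```

Once fitted, each item's score `features @ w` induces a global ranking, and the feature-based parameterization is what lets observed comparisons generalize to unseen items, improving sample efficiency.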
Finally, we compare methods for leveraging radiology reports to train image-only classifiers more efficiently. We find that existing methods are overwhelmingly evaluated on diagnostic labels, overlooking tasks such as prognosis, where the label is less directly correlated with the report. This distinction is important, as we observe that text-supervised models do not show the same benefits over self-supervised models in the non-diagnostic setting. Additionally, we explore the potential of using reports when fine-tuning, a previously neglected aspect, through generalized distillation. We find that this can lead to significant improvements in the data-scarce setting, depending on the task.
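As background for the fine-tuning approach above, a generalized-distillation objective mixes the usual cross-entropy on hard labels with a cross-entropy term against softened predictions from a teacher that had access to privileged inputs (here, the report text). The sketch below shows only this generic loss; the mixing weight, temperature, and function names are illustrative assumptions, not the thesis's exact formulation.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax, numerically stabilized."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def generalized_distillation_loss(student_logits, teacher_logits,
                                  hard_labels, lam=0.5, T=2.0):
    """Mix of (1) cross-entropy between the student and the hard
    labels and (2) cross-entropy between the student and softened
    teacher predictions. The teacher is assumed to have been trained
    with privileged inputs (e.g. radiology reports) that the
    image-only student never sees at test time."""
    n = len(hard_labels)
    p_student = softmax(student_logits)            # for the hard term
    p_student_T = softmax(student_logits, T=T)     # softened student
    p_teacher_T = softmax(teacher_logits, T=T)     # softened teacher
    hard = -np.log(p_student[np.arange(n), hard_labels] + 1e-12).mean()
    soft = -(p_teacher_T * np.log(p_student_T + 1e-12)).sum(-1).mean()
    return (1.0 - lam) * hard + lam * soft
```

In the data-scarce setting, the soft term lets the report-informed teacher supply a richer training signal per example than the hard labels alone, which is the mechanism the fine-tuning experiments probe.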
This thesis offers practical guidance for developing medical image models and introduces annotation methods that reduce label disagreement while maintaining low annotation effort.
