Author ORCID Identifier
Date Available
12-19-2023
Year of Publication
2023
Document Type
Doctoral Dissertation
Degree Name
Doctor of Philosophy (PhD)
College
Engineering
Department/School/Program
Computer Science
Advisor
Brent Harrison
Co-Director of Graduate Studies
Nathan Jacobs
Abstract
The attention mechanism, an approach for capturing both local and global features of the input, is a crucial element of the Transformer. This dissertation explores structured attention for image analysis, proposing attention-based methods for multi-label learning and Alzheimer’s Disease (AD) diagnosis.
For the multi-label learning task, I present two works under the Vision Transformer (ViT) framework. The first focuses on supervised multi-label classification. I address the challenges of multi-label classification and propose a model named AssocFormer, which uses an association module to capture the association relationships among objects and thereby improve model performance. The second addresses semi-supervised multi-label classification. I work on Single-Positive Multi-Label Learning (SPML), an extremely challenging task in which only one positive label is known for each image and the remaining annotations are unknown. I present VLPL, a novel and efficient framework that leverages the similarity between visual and text embeddings to obtain pseudo-labels for a given image.
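The pseudo-labeling idea behind VLPL can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names, the fixed threshold, and the toy 3-dimensional embeddings are all hypothetical, and a real system would presumably take its embeddings from a pretrained vision-language model. The sketch only shows the general pattern of comparing an image embedding with label text embeddings and promoting sufficiently similar labels to pseudo-positives.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pseudo_labels(image_embedding, label_embeddings, known_positive, threshold):
    """Return the set of labels treated as positive for one image.

    The single observed positive label is always kept. Any other label whose
    text embedding is similar enough to the image embedding is added as a
    pseudo-positive. The threshold rule is an assumption made for illustration.
    """
    positives = {known_positive}
    for label, text_embedding in label_embeddings.items():
        if cosine_similarity(image_embedding, text_embedding) >= threshold:
            positives.add(label)
    return positives


# Toy, hand-made embeddings (hypothetical values, not real model outputs).
label_embeddings = {
    "dog": [1.0, 0.0, 0.0],
    "grass": [0.0, 1.0, 0.0],
    "car": [0.0, 0.0, 1.0],
}
image_embedding = [0.7, 0.7, 0.1]

result = pseudo_labels(
    image_embedding, label_embeddings, known_positive="dog", threshold=0.6
)
```

In this toy case the image embedding lies close to both "dog" and "grass", so "grass" is added as a pseudo-positive next to the single annotated label, while "car" stays unlabeled.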
In the context of AD diagnosis, this study addresses two tasks. The first centers on efficient training with 3D brain images of AD. A novel module is proposed that transforms 3D brain images into 2D fused images across the slice dimension. This conversion reduces the dimensionality of the input images and thereby improves training efficiency. The second task combines different positron emission tomography (PET) modalities under the ViT structure for AD diagnosis, in a model named ADViT.
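To make the 3D-to-2D conversion concrete, the following is a minimal sketch of one way a volume could be fused across the slice dimension. The abstract does not specify how the slices are combined, so the softmax-weighted sum used here is an assumption, and `fuse_slices` and `slice_scores` are hypothetical names. In a trained model the per-slice scores would be learned rather than supplied by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_slices(volume, slice_scores):
    """Fuse a (slices, height, width) volume into one (height, width) image.

    Each slice contributes in proportion to the softmax of its score.
    This weighting scheme is an illustrative assumption only.
    """
    weights = softmax(slice_scores)
    height = len(volume[0])
    width = len(volume[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for weight, image_slice in zip(weights, volume):
        for i in range(height):
            for j in range(width):
                fused[i][j] += weight * image_slice[i][j]
    return fused


# Toy volume with 2 slices of size 2 x 2 (hypothetical intensities).
volume = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]

# Equal scores give equal weights, so the result is the slice-wise mean.
fused = fuse_slices(volume, slice_scores=[0.0, 0.0])
```

The fused output has one fewer dimension than the input, which is the source of the training-efficiency gain described above: a downstream 2D model processes a single image instead of every slice of the volume.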
Throughout this work, I propose a collection of novel methods rooted in the attention framework. The results demonstrate that these methods yield significant improvements in computer vision and medical image analysis.
Digital Object Identifier (DOI)
https://doi.org/10.13023/etd.2023/477
Recommended Citation
Xing, Xin, "Structured Attention for Image Analysis" (2023). Theses and Dissertations--Computer Science. 140.
https://uknowledge.uky.edu/cs_etds/140