Author ORCID Identifier

https://orcid.org/0000-0001-7207-5149

Date Available

12-19-2023

Year of Publication

2023

Document Type

Doctoral Dissertation

Degree Name

Doctor of Philosophy (PhD)

College

Engineering

Department/School/Program

Computer Science

Advisor

Brent Harrison

Co-Director of Graduate Studies

Nathan Jacobs

Abstract

The attention mechanism, an approach for capturing both local and global features of the input, is the crucial element of the Transformer. This dissertation explores structured attention for image analysis, proposing attention-based methods for multi-label learning and Alzheimer’s Disease (AD) diagnosis.
For the multi-label learning task, I present two works under the Vision Transformer (ViT) framework. The first work focuses on supervised multi-label classification. I address the problems of multi-label classification and propose a model named AssocFormer, which uses an association module to capture the association relationships among objects and thereby improve model performance. The second work addresses semi-supervised multi-label classification. I work on Single-Positive Multi-Label Learning (SPML), an extremely challenging task in which only one positive label is known and the remaining annotations are unknown. I present VLPL, a novel and efficient framework that leverages the similarity between visual and text embeddings to obtain pseudo-labels for a given image.
In the context of AD diagnosis, this study works on two tasks. The first task centers on efficient training using 3D brain images of AD. A novel module is proposed that transforms 3D brain images into 2D fused images across the slice dimension. This conversion reduces the input image dimensions, enhancing training efficiency. The second task combines different positron emission tomography (PET) modalities under the ViT structure for AD diagnosis, in a model named ADViT.
Throughout my work, a collection of novel methods rooted in the attention framework is proposed. The results demonstrate that these methods yield significant improvements in computer vision and medical image analysis.
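The pseudo-labeling idea described for SPML, comparing an image embedding with the text embedding of each candidate label, can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function names, the toy 3-dimensional embeddings, and the 0.8 threshold are all hypothetical, and a real system would take its embeddings from a pretrained vision-language model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pseudo_labels(image_emb, label_embs, known_positive, threshold=0.8):
    """Assign pseudo-positives to labels whose text embedding is close to the image.

    The single annotated positive is kept as 1. Any other label becomes a
    pseudo-positive (1) when its similarity reaches the (hypothetical)
    threshold, and otherwise stays unknown (None).
    """
    labels = {}
    for name, text_emb in label_embs.items():
        if name == known_positive:
            labels[name] = 1
        elif cosine(image_emb, text_emb) >= threshold:
            labels[name] = 1
        else:
            labels[name] = None
    return labels


# Toy embeddings, made up purely for illustration.
image_emb = [0.9, 0.1, 0.4]
label_embs = {
    "dog": [1.0, 0.0, 0.5],
    "ball": [0.8, 0.2, 0.3],
    "car": [0.0, 1.0, 0.0],
}
result = pseudo_labels(image_emb, label_embs, known_positive="dog")
print(result)
```

In this toy case the label "ball" is close to the image embedding and becomes a pseudo-positive, while "car" stays unknown rather than being treated as a negative.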

Digital Object Identifier (DOI)

https://doi.org/10.13023/etd.2023/477

Recommended Citation

Xing, Xin, "Structured Attention for Image Analysis" (2023). Theses and Dissertations--Computer Science. 140.
https://uknowledge.uky.edu/cs_etds/140

Included in

Artificial Intelligence and Robotics Commons

Theses and Dissertations--Computer Science

Structured Attention for Image Analysis
