Author ORCID Identifier
https://orcid.org/0000-0002-7746-4247
Date Available
6-1-2026
Year of Publication
2026
Document Type
Doctoral Dissertation
Degree Name
Doctor of Philosophy (PhD)
College
Engineering
Department/School/Program
Computer Science
Faculty
Qiang Cheng
Faculty
Simone Silvestri
Abstract
The effective utilization of structured data is fundamental to modern machine learning, yet it presents distinct challenges in both predictive analysis and generative modeling. Traditional deep learning architectures, particularly Transformers, often suffer from computational complexity that is quadratic in sequence length, which makes long sequences costly to process. This dissertation addresses these limitations by introducing novel architectures based on State-Space Models (SSMs) and Diffusion Models.
In the area of predictive analysis, we focus on overcoming the computational bottlenecks of attention mechanisms for tabular and time-series data. First, we introduce MambaTab, a selective state-space architecture designed for efficient tabular classification. By leveraging the linear complexity of SSMs, MambaTab significantly reduces memory overhead while maintaining high accuracy. Second, for temporal data, we propose TimeMachine, a scalable architecture for long-term time-series forecasting. This model captures extended temporal dependencies efficiently, addressing the scalability issues inherent in Transformer-based forecasters. Third, we present TSCMamba, a specialized framework for multi-view time-series classification that optimizes feature extraction across diverse temporal benchmarks.
Turning to generative modeling, we address the challenge of synthesizing strictly constrained structured data. We introduce RefiDiff, a diffusion-based framework for missing data imputation. RefiDiff employs a progressive refinement strategy that bridges predictive initialization with generative synthesis to accurately recover lost information in tabular datasets. Finally, we propose Mol-CADiff, a causality-aware autoregressive diffusion model for molecular graph generation. Unlike standard generative models, which often produce similar structures, Mol-CADiff ensures chemical diversity by learning through a causality-integrated diffusion process.
Collectively, these contributions demonstrate that specialized State-Space and Diffusion architectures provide superior efficiency and accuracy for structured data tasks.
Digital Object Identifier (DOI)
https://doi.org/10.13023/etd.2026.313
Archival?
Archival
Recommended Citation
Ahamed, Md Atik, "EFFECTIVE DEEP LEARNING ARCHITECTURES FOR STRUCTURED DATA ANALYSIS AND GENERATION" (2026). Theses and Dissertations--Computer Science. 161.
https://uknowledge.uky.edu/cs_etds/161
