Doctor of Philosophy (PhD)
Arts and Sciences
Dr. Qiang Ye
The two main areas of Deep Learning are Unsupervised and Supervised Learning. Unsupervised Learning studies a class of data processing problems in which only descriptions of objects are known, without label information. Generative Adversarial Networks (GANs) have become among the most widely used unsupervised neural network models. A GAN combines two neural networks, a generator and a discriminator, that are trained simultaneously. We introduce a new family of discriminator loss functions that adopts a weighted sum of the real and fake parts, which we call adaptive weighted loss functions. Using gradient information, we adaptively choose the weights to train the discriminator in a direction that benefits the GAN's stability. We also propose several improvements to GAN training schemes. One is self-correcting optimization for training a GAN discriminator on Speech Enhancement tasks, which helps avoid "harmful" training directions for parts of the discriminator loss. The other is a consistency loss, which targets the inconsistency between the time and time-frequency domains caused by Fourier Transforms. In contrast to Unsupervised Learning, Supervised Learning uses a label for each object, and the goal is to find the relationship between objects and labels. Building computational methods that automatically interpret and represent human language is known as Natural Language Processing, which includes tasks such as word prediction and machine translation. In this area, we propose a novel Neumann-Cayley Gated Recurrent Unit (NC-GRU) architecture based on a Neumann series-based Scaled Cayley transformation. The NC-GRU uses orthogonal matrices to prevent exploding gradient problems and enhance long-term memory on various prediction tasks. In addition, we propose using the newly introduced NC-GRU unit inside a neural network model to create neural molecular fingerprints.
Integrating the novel NC-GRU fingerprints with Multi-Task Deep Neural Network schematics helps to improve performance on several molecular-related tasks. We also introduce a new normalization method, Assorted-Time Normalization, which preserves information from multiple consecutive time steps and normalizes using them in recurrent-network-like architectures. Finally, we propose a Symmetry Structured Convolutional Neural Network (SCNN), an architecture with 2D structured symmetric features over spatial dimensions, that generates and preserves the symmetry structure in the network's convolutional layers.
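As a rough illustration of the Neumann series-based Scaled Cayley transformation mentioned above, the following minimal sketch assumes the commonly used form W = (I + A)^{-1}(I - A)D, where A is skew-symmetric and D is a diagonal matrix with entries of ±1, and replaces the matrix inverse with a truncated Neumann series. The function names, the truncation order, and the rescaling of A are hypothetical choices made for this example; the dissertation's exact formulation may differ.

```python
import numpy as np

def neumann_inverse(A, order=20):
    """Approximate (I + A)^{-1} by the truncated Neumann series sum_k (-A)^k.

    The series converges only when the spectral norm of A is below 1
    (an assumption of this sketch).
    """
    n = A.shape[0]
    term = np.eye(n)   # current power (-A)^k, starting at k = 0
    total = np.eye(n)  # running partial sum of the series
    for _ in range(order):
        term = -term @ A
        total = total + term
    return total

def scaled_cayley(A, D, order=20):
    """Scaled Cayley transform W = (I + A)^{-1} (I - A) D with a Neumann inverse."""
    n = A.shape[0]
    return neumann_inverse(A, order) @ (np.eye(n) - A) @ D

# Build a small skew-symmetric matrix and rescale it to spectral norm 0.3.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M - M.T
A = 0.3 * A / np.linalg.norm(A, 2)

# Hypothetical scaling matrix with two negative entries.
D = np.diag([1.0, 1.0, -1.0, -1.0])

W = scaled_cayley(A, D)
# Deviation of W from orthogonality (Frobenius norm of W^T W - I).
orth_error = np.linalg.norm(W.T @ W - np.eye(4))
```

With an exact inverse, W is orthogonal whenever A is skew-symmetric; with the truncated series, the deviation from orthogonality shrinks as the truncation order grows, provided the norm condition holds.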
This research was supported in part by the National Science Foundation under Division of Mathematical Sciences grants (no.: 1620082, 1821144, 2053284, 2151802, 2208314), the National Science Foundation under Office of Integrative Activities grant (no.: 2040665), and the National Institutes of Health (no.: UH3 NS100606-05).
Zadorozhnyy, Vasily I., "Novel Architectures and Optimization Algorithms for Training Neural Networks and Applications" (2023). Theses and Dissertations--Mathematics. 97.
Artificial Intelligence and Robotics Commons, Numerical Analysis and Scientific Computing Commons, Other Applied Mathematics Commons, Other Mathematics Commons