Author ORCID Identifier

https://orcid.org/0000-0002-0379-7348

Date Available

4-30-2023

Year of Publication

2023

Degree Name

Doctor of Philosophy (PhD)

Document Type

Doctoral Dissertation

College

Arts and Sciences

Department/School/Program

Mathematics

First Advisor

Dr. Qiang Ye

Abstract

The two main areas of Deep Learning are Unsupervised and Supervised Learning. Unsupervised Learning studies a class of data processing problems in which only descriptions of objects are known, without label information. Generative Adversarial Networks (GANs) have become among the most widely used unsupervised neural network models. A GAN combines two neural networks, a generative network and a discriminative network, that are trained simultaneously. We introduce a new family of discriminator loss functions that adopts a weighted sum of real and fake parts, which we call adaptive weighted loss functions. Using gradient information, we adaptively choose the weights to train the discriminator in a direction that benefits the GAN's stability. We also propose several improvements to GAN training schemes. One is self-correcting optimization for training a GAN discriminator on Speech Enhancement tasks, which helps avoid "harmful" training directions for parts of the discriminator loss. The other is a consistency loss, which targets the inconsistency between the time and time-frequency domains caused by Fourier Transforms. In contrast to Unsupervised Learning, Supervised Learning uses a label for each object, and the goal is to find the relationship between objects and labels. Building computational methods to interpret and represent human language automatically is known as Natural Language Processing, which includes tasks such as word prediction and machine translation. In this area, we propose a novel Neumann-Cayley Gated Recurrent Unit (NC-GRU) architecture based on a Neumann series-based Scaled Cayley transformation. The NC-GRU uses orthogonal matrices to prevent exploding gradient problems and enhance long-term memory on various prediction tasks. In addition, we propose using our newly introduced NC-GRU inside a neural network model to create neural molecular fingerprints. Integrating the novel NC-GRU fingerprints with Multi-Task Deep Neural Network schematics helps improve performance on several molecular-related tasks. We also introduce a new normalization method, Assorted-Time Normalization, that preserves information from multiple consecutive time steps and uses it for normalization in recurrent-network-like architectures. Finally, we propose a Symmetry Structured Convolutional Neural Network (SCNN), an architecture with 2D structured symmetric features over spatial dimensions, that generates and preserves the symmetry structure in the network's convolutional layers.
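
As a rough illustration of the idea behind a Neumann series-based Scaled Cayley transformation, the sketch below builds an orthogonal matrix W = (I + A)^{-1} (I - A) D from a skew-symmetric matrix A and a diagonal matrix D with entries of ±1, and compares it with a version in which the inverse is replaced by a truncated Neumann series, (I + A)^{-1} ≈ I - A + A^2 - ... . This is a minimal sketch under stated assumptions, not the dissertation's implementation: the function name `neumann_cayley`, the truncation order, the matrix size, and the scaling of A are all hypothetical choices, and the series only converges when the spectral norm of A is below 1.

```python
import numpy as np

def neumann_cayley(A, D, order=8):
    """Approximate W = (I + A)^{-1} (I - A) D with a truncated Neumann series.

    Hypothetical helper for illustration; assumes the spectral norm of A is < 1.
    """
    n = A.shape[0]
    I = np.eye(n)
    inv_approx = np.zeros_like(A)
    term = I.copy()
    for _ in range(order + 1):
        inv_approx += term      # accumulate sum_{k=0}^{order} (-A)^k
        term = term @ (-A)
    return inv_approx @ (I - A) @ D

rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
S = M - M.T                               # skew-symmetric: S^T = -S
A = 0.25 * S / np.linalg.norm(S, 2)       # rescale so the spectral norm is 0.25
D = np.diag([1.0, 1.0, -1.0, -1.0])       # diagonal scaling matrix with +/-1 entries

# Exact scaled Cayley transform, computed with a linear solve instead of an inverse.
W_exact = np.linalg.solve(np.eye(n) + A, (np.eye(n) - A) @ D)
W_approx = neumann_cayley(A, D, order=8)

# How far W_exact is from orthogonal, and how far the series is from W_exact.
orth_error = np.linalg.norm(W_exact.T @ W_exact - np.eye(n))
approx_error = np.linalg.norm(W_approx - W_exact)
print(f"orthogonality error of exact transform: {orth_error:.2e}")
print(f"Neumann approximation error:            {approx_error:.2e}")
```

Because an orthogonal matrix preserves vector norms, repeated multiplication by W neither inflates nor shrinks a signal, which is the intuition behind using such matrices against exploding gradients. The truncated series trades an exact inverse for a few matrix products, at the cost of a small deviation from exact orthogonality.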

Digital Object Identifier (DOI)

https://doi.org/10.13023/etd.2023.125

Funding Information

This research was supported in part by the National Science Foundation under Division of Mathematical Sciences grants (no.: 1620082, 1821144, 2053284, 2151802, 2208314), the National Science Foundation under Office of Integrative Activities grant (no.: 2040665), and the National Institutes of Health (no.: UH3 NS100606-05).

Recommended Citation

Zadorozhnyy, Vasily I., "Novel Architectures and Optimization Algorithms for Training Neural Networks and Applications" (2023). Theses and Dissertations--Mathematics. 97.
https://uknowledge.uky.edu/math_etds/97

Included in

Artificial Intelligence and Robotics Commons, Numerical Analysis and Scientific Computing Commons, Other Applied Mathematics Commons, Other Mathematics Commons

Theses and Dissertations--Mathematics

Novel Architectures and Optimization Algorithms for Training Neural Networks and Applications
