Author ORCID Identifier
Date Available
4-17-2023
Year of Publication
2023
Document Type
Doctoral Dissertation
Degree Name
Doctor of Philosophy (PhD)
College
Arts and Sciences
Department/School/Program
Mathematics
Advisor
Dr. Qiang Ye
Abstract
Normalization methods have proven to be an invaluable tool in the training of deep neural networks. In particular, Layer Normalization and Batch Normalization are commonly used to mitigate the risks of exploding and vanishing gradients. This work presents two methods related to these normalization techniques. The first is Batch Normalized Preconditioning (BNP) for recurrent neural networks (RNNs) and graph convolutional networks (GCNs). BNP was previously proposed for fully connected and convolutional networks as a way to achieve performance benefits similar to those of Batch Normalization by controlling the condition number of the Hessian through preconditioning of the gradients. We extend this work by applying it to recurrent neural networks and graph convolutional networks, two architectures that are prone to high computational costs and therefore benefit from the training acceleration that BNP provides. The second method is Assorted-Time Normalization (ATN), a normalization technique designed for sequential problems. It combines information from the hidden layers of the model with temporal information across the sequence dimension, remedying a weakness of Layer Normalization in these applications.
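To make the two ideas concrete, the sketches below illustrate the general mechanisms described in the abstract. They are minimal, hypothetical PyTorch illustrations, not the dissertation's actual formulations; the function names, the window parameter, and the exact form of the preconditioner are assumptions. The first sketch assumes ATN computes normalization statistics jointly over the hidden dimension and a window of recent time steps, in contrast to Layer Normalization, which uses the hidden dimension of a single step alone.

```python
import torch

def assorted_time_normalize(h_window, eps=1e-5):
    """Hypothetical sketch of Assorted-Time Normalization (ATN).

    h_window: tensor of shape (batch, window, hidden) holding the current
    hidden state together with the preceding `window - 1` states.
    Statistics are computed jointly over the time and hidden dimensions.
    """
    mean = h_window.mean(dim=(1, 2), keepdim=True)
    var = h_window.var(dim=(1, 2), unbiased=False, keepdim=True)
    return (h_window - mean) / torch.sqrt(var + eps)
```

The second sketch assumes one simple form a BNP-style preconditioner could take for a linear layer: rescaling each input-coordinate column of the weight gradient by the inverse batch variance of that input, with the aim of improving the conditioning of the effective Hessian.

```python
def bnp_precondition_grad(grad_W, x_batch, eps=1e-5):
    """Hypothetical sketch of a BNP-style gradient preconditioner.

    grad_W:  weight gradient of shape (out_features, in_features)
    x_batch: layer inputs of shape (batch, in_features)
    """
    # Per-input-feature batch variance; broadcasting divides each
    # column of grad_W by the variance of the corresponding input.
    var = x_batch.var(dim=0, unbiased=False)
    return grad_W / (var + eps)
```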
Digital Object Identifier (DOI)
https://doi.org/10.13023/etd.2023.090
Funding Information
National Science Foundation Grant DMS-1821144 (2021)
Recommended Citation
Pospisil, Cole, "Normalization Techniques for Sequential and Graphical Data" (2023). Theses and Dissertations--Mathematics. 95.
https://uknowledge.uky.edu/math_etds/95