Author ORCID Identifier

https://orcid.org/0000-0002-9510-6662

Date Available

4-27-2018

Year of Publication

2018

Document Type

Master's Thesis

Degree Name

Master of Science (MS)

College

Engineering

Department/School/Program

Computer Science

Advisor

Dr. Sally Ellingson

Co-Director of Graduate Studies

Dr. Nathan Jacobs

Abstract

To reduce the time and cost of drug discovery, machine learning is being used to automate much of the work in this process. However, the size and complex nature of molecular data make the application of machine learning especially challenging. Much work must go into engineering the features that are then used to train machine learning models, which costs considerable time and requires the knowledge of domain experts to be most effective. The purpose of this work is to demonstrate data-driven approaches to the feature selection and extraction steps in order to decrease the amount of expert knowledge required to model interactions between proteins and drug molecules.

Digital Object Identifier (DOI)

https://doi.org/10.13023/ETD.2018.137
