Author ORCID Identifier

https://orcid.org/0009-0000-2101-6237

Date Available

5-10-2023

Year of Publication

2023

Degree Name

Doctor of Philosophy (PhD)

Document Type

Doctoral Dissertation

College

Engineering

Department/School/Program

Computer Science

First Advisor

Brent Harrison

Abstract

We introduce a novel approach for learning behaviors from human-provided feedback that is subject to systematic bias. Our method, BASIL, models the feedback signal as the combination of a heuristic evaluation of an action's utility and a bias value drawn from a probability distribution with unknown parameters. We present both the general framework for our technique and specific algorithms for biases drawn from a normal distribution. We evaluate our approach across a range of environments and tasks, comparing it to interactive and non-interactive machine learning methods, including deep learning techniques, with feedback from human trainers and from a synthetic oracle whose feedback is distorted to varying degrees. We demonstrate that our algorithm learns rapidly even in the presence of normally distributed bias, with which other methods struggle, while also exhibiting some resistance to other types of distortion.
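
The feedback model summarized in the abstract can be illustrated with a minimal sketch. This is not the BASIL algorithm itself, which the abstract does not specify; it only simulates a synthetic oracle whose feedback is a utility value plus a normally distributed bias, and then recovers the bias parameters from the residuals. The function names, the bias parameters (mean 2.0, standard deviation 0.5), and the assumption that the true utilities are available for comparison are all hypothetical choices made for illustration.

```python
import random
import statistics


def biased_feedback(utility, bias_mean, bias_std, rng):
    """Return utility plus a bias drawn from Normal(bias_mean, bias_std)."""
    return utility + rng.gauss(bias_mean, bias_std)


def estimate_bias(utilities, feedback):
    """Estimate the bias mean and standard deviation from feedback residuals.

    Assumes the utilities are known, which is an illustrative simplification.
    """
    residuals = [f - u for f, u in zip(feedback, utilities)]
    return statistics.mean(residuals), statistics.stdev(residuals)


rng = random.Random(0)            # fixed seed so the run is repeatable
true_mean, true_std = 2.0, 0.5    # hypothetical bias parameters

# Hypothetical action utilities, e.g. scores in [-1, 1].
utilities = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
feedback = [biased_feedback(u, true_mean, true_std, rng) for u in utilities]

est_mean, est_std = estimate_bias(utilities, feedback)
print(f"estimated bias mean={est_mean:.3f}, std={est_std:.3f}")
```

With enough samples the estimates should land close to the parameters used to generate the bias. In a real interactive setting the utilities are not observed directly, so the bias parameters and the behavior would have to be learned jointly.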

Digital Object Identifier (DOI)

https://doi.org/10.13023/etd.2023.162

Funding Information

This research was supported by a scholarship provided by the University of Kentucky from 2016 to 2023.
