Date Available
5-10-2023
Year of Publication
2023
Degree Name
Doctor of Philosophy (PhD)
Document Type
Doctoral Dissertation
College
Engineering
Department/School/Program
Computer Science
First Advisor
Brent Harrison
Abstract
We introduce a novel approach for learning behaviors from human-provided feedback that is subject to systematic bias. Our method, known as BASIL, models the feedback signal as the combination of a heuristic evaluation of an action's utility and a probabilistically drawn bias value characterized by unknown parameters. We present both the general framework for our technique and specific algorithms for biases drawn from a normal distribution. We evaluate our approach across various environments and tasks, comparing it to interactive and non-interactive machine learning methods, including deep learning techniques, using both human trainers and a synthetic oracle whose feedback is distorted to varying degrees. We demonstrate that our algorithm can learn rapidly even in the presence of normally distributed bias, a condition under which other methods struggle, while also exhibiting some resistance to other types of distortion.
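The feedback model described in the abstract can be illustrated with a minimal sketch. This is not the BASIL algorithm itself: it only simulates a synthetic oracle whose feedback is a heuristic utility plus a normally distributed bias, and then recovers the bias parameters from the residuals. The utility function, the additive combination, the bias parameters, and all function names are assumptions made for illustration. The sketch also assumes the heuristic utility is known to the estimator, which is a simplification the abstract does not state.

```python
import random
import statistics


def heuristic_utility(action):
    # Hypothetical utility: higher-numbered actions are treated as better.
    return float(action)


def biased_feedback(action, bias_mean, bias_std, rng):
    # Assumed model: feedback = heuristic utility + bias drawn from N(mean, std).
    return heuristic_utility(action) + rng.gauss(bias_mean, bias_std)


def estimate_bias(samples):
    # Subtract the (assumed known) utility from each feedback value and
    # summarize the residuals with their sample mean and standard deviation.
    residuals = [feedback - heuristic_utility(action) for action, feedback in samples]
    return statistics.mean(residuals), statistics.stdev(residuals)


rng = random.Random(0)  # fixed seed so the run is repeatable
actions = [rng.randrange(4) for _ in range(5000)]
samples = [
    (a, biased_feedback(a, bias_mean=-1.5, bias_std=0.5, rng=rng))
    for a in actions
]

est_mean, est_std = estimate_bias(samples)
print(f"estimated bias mean={est_mean:.2f}, std={est_std:.2f}")
```

With enough samples the estimates settle near the parameters used to generate the bias, which is the sense in which a systematically biased trainer can still be informative once the bias is modeled rather than ignored.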
Digital Object Identifier (DOI)
https://doi.org/10.13023/etd.2023.162
Funding Information
This research was supported by a scholarship provided by the University of Kentucky from 2016 to 2023.
Recommended Citation
Watson, Jonathan Indigo, "The BASIL technique: Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback" (2023). Theses and Dissertations--Computer Science. 134.
https://uknowledge.uky.edu/cs_etds/134
Included in
Artificial Intelligence and Robotics Commons, Data Science Commons, Gender and Sexuality Commons, Graphics and Human Computer Interfaces Commons, Theory and Algorithms Commons