Theses and Dissertations--Computer Science

The BASIL technique: Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback

Jonathan Indigo Watson, University of Kentucky

Author ORCID Identifier

https://orcid.org/0009-0000-2101-6237

Date Available

5-10-2023

Year of Publication

2023

Document Type

Doctoral Dissertation

Degree Name

Doctor of Philosophy (PhD)

College

Engineering

Department/School/Program

Computer Science

Advisor

Brent Harrison

Abstract

We introduce a novel approach for learning behaviors using human-provided feedback that is subject to systematic bias. Our method, known as BASIL, models the feedback signal as a combination of a heuristic evaluation of an action's utility and a probabilistically-drawn bias value, characterized by unknown parameters. We present both the general framework for our technique and specific algorithms for biases drawn from a normal distribution. We evaluate our approach across various environments and tasks, comparing it to interactive and non-interactive machine learning methods, including deep learning techniques, using human trainers and a synthetic oracle with feedback distorted to varying degrees. We demonstrate that our algorithm can rapidly learn even in the presence of normally distributed bias, which other methods struggle with, while also exhibiting some resistance to other types of distortion.
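The feedback model described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the dissertation's BASIL algorithm: it assumes the feedback is the sum of an action's utility and a bias drawn from a normal distribution, and, purely for illustration, it assumes the true utilities are available so the bias parameters can be recovered from the residuals. The function names `biased_feedback` and `estimate_bias`, and all numeric values, are hypothetical.

```python
import random
import statistics


def biased_feedback(utility, bias_mean, bias_std, rng):
    """Simulated trainer feedback: the action's utility plus a normal bias draw."""
    return utility + rng.gauss(bias_mean, bias_std)


def estimate_bias(feedback, utilities):
    """Estimate the bias mean and standard deviation from feedback residuals.

    Assumption for illustration only: the true utilities are known here.
    """
    residuals = [f - u for f, u in zip(feedback, utilities)]
    return statistics.fmean(residuals), statistics.stdev(residuals)


rng = random.Random(0)            # fixed seed so the sketch is repeatable
true_mean, true_std = 2.0, 0.5    # hypothetical bias parameters, unknown to the learner

# Hypothetical utilities for 5000 observed actions, and the distorted feedback.
utilities = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
feedback = [biased_feedback(u, true_mean, true_std, rng) for u in utilities]

est_mean, est_std = estimate_bias(feedback, utilities)
print(f"estimated bias mean={est_mean:.2f}, std={est_std:.2f}")
```

With enough samples, the estimated parameters should land close to the values used to generate the bias. Subtracting the estimated mean from new feedback then gives a less distorted view of each action's utility.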

Digital Object Identifier (DOI)

https://doi.org/10.13023/etd.2023.162

Funding Information

This research was supported by a scholarship provided by the University of Kentucky from 2016 to 2023.

Recommended Citation

Watson, Jonathan Indigo, "The BASIL technique: Bias Adaptive Statistical Inference Learning Agents for Learning from Human Feedback" (2023). Theses and Dissertations--Computer Science. 134.
https://uknowledge.uky.edu/cs_etds/134

Included in

Artificial Intelligence and Robotics Commons, Data Science Commons, Gender and Sexuality Commons, Graphics and Human Computer Interfaces Commons, Theory and Algorithms Commons
