Date Available

8-19-2025

Year of Publication

2025

Document Type

Doctoral Dissertation

Degree Name

Doctor of Philosophy (PhD)

College

Engineering

Department/School/Program

Computer Science

Faculty

Dr. Brent Harrison

Faculty

Dr. Simone Silvestri

Faculty

Dr. Stephen G. Ware

Abstract

While artificial intelligence (AI) and machine learning (ML) have proven effective at addressing many of the challenges that we face in our everyday lives, there are many situations in which these methods struggle. Examples include environments where AI or ML systems must perform complex behaviors and those where rewards are difficult to calculate. To address this limitation, interactive machine learning (IML) techniques have been introduced, which incorporate machine-understandable human feedback into traditional ML approaches. This feedback is often given as a discrete positive or negative numeric value and is typically provided as often as possible, so that it conveys a dense signal that an AI or ML system can easily learn from. One issue with this approach is that it is difficult to elicit this type of feedback from a human trainer. In addition, humans can struggle to give such feedback in a way that is useful for an AI or ML system, since it is not how humans communicate complex information.

One promising approach to address this limitation is to use natural language as a way to incorporate human intelligence into AI and ML systems. Natural language has the potential to be a useful tool for augmenting AI and ML systems because it can convey complex ideas concisely. Additionally, it would enable human trainers to provide feedback naturally, just as they would when communicating with another person. In this dissertation, we investigate how information encoded in natural language can enhance the learning capabilities of various AI systems. Specifically, we focus on three language elements: semantic information, procedural knowledge, and context-aware texts. Using these components, we develop three distinct approaches to examine their impact in different problem environments.

In the first phase of this work, we use real-world images for the Visual Question Answering (VQA) task. We investigate whether semantic information extracted from images, expressed in natural language, enables a machine learning system to better comprehend the image and provide more accurate answers. We find that semantic descriptions, used as an external knowledge source, significantly enhance the AI system’s learning efficacy. In the subsequent phases, we transition from episodic to sequential tasks. For the first sequential task, we explore how natural language can be used to provide dense feedback in an arcade game environment. This work shows how natural language feedback can help train an agent to navigate a complex, dynamic environment in both dense and sparse reward settings.

A limitation of this approach is that feedback must still be given at every time step. In the final phase of this dissertation, we address this limitation by using goal-level feedback rather than action-level feedback. We show that descriptions of the agent's current goal can guide agents without requiring constant human feedback. These context-aware goal texts guide the agent through the subgoals (or milestones) that it must complete in order to finish the overall task and reach the final objective. Our experiments show that goal-defining natural language advice significantly improves the learning performance of reinforcement learning (RL) agents in complex, sparse-reward environments.

Digital Object Identifier (DOI)

https://doi.org/10.13023/etd.2025.443
