Year of Publication


Degree Name

Doctor of Philosophy (PhD)

Document Type

Doctoral Dissertation




Computer Science

First Advisor

Dr. Nathan Jacobs


Images provide direct evidence for the position and orientation of the camera in space, known as camera pose. Traditionally, estimating the camera pose requires reference data to establish image correspondences and exploits geometric relationships between features in the image. Recent advances in deep learning have led to a new class of methods that regress the pose directly from a single image.

This thesis proposes methods for absolute camera pose regression. Absolute pose regression estimates the pose of a camera from a single image as the output of a fixed computation pipeline. These methods have many practical benefits over traditional methods, such as constant inference speed and simplicity of use. However, they also have severe limitations, the most significant of which are high pose error and the fact that a network must be trained for each new scene. Despite these drawbacks, absolute pose regression is an exciting line of research with many potential use cases.

Our work focuses on three areas. First, we investigate the use of absolute pose regression across multiple scenes. We propose a method that uses a mostly shared network to perform pose regression across multiple scenes without a significant increase in pose error relative to per-scene networks. With this approach, we also show that the features learned during multi-scene training do not transfer to pose regression in new scenes. Next, we propose a new convolutional network to improve the accuracy of absolute pose regression. The new network takes inspiration from traditional methods and is designed explicitly for camera pose regression. In contrast to the black-box approaches used by other methods, our method yields a significant decrease in pose error. Finally, we show an application of the new method that shares network weights to estimate camera pose in multiple scenes. Due to the more explicit design of the network, it is naturally partitioned into scene-dependent and scene-agnostic layers, allowing us to transfer pretrained weights to novel scenes without needing to retrain the entire network.

The contribution of this thesis is a novel architecture for absolute pose regression that directly uses well-known geometric relations, resulting in higher pose accuracy and allowing localization within novel scenes without retraining the full network.

Digital Object Identifier (DOI)

Funding Information

Teaching Assistant. Department of Computer Science. Fall 2016-Spring 2017

Research Assistant. Department of Computer Science. Fall 2017-Summer 2021