Date Available
7-25-2019
Year of Publication
2019
Degree Name
Doctor of Philosophy (PhD)
Document Type
Doctoral Dissertation
College
Engineering
Department/School/Program
Computer Science
First Advisor
Dr. Nathan Jacobs
Abstract
The appearance of the world varies dramatically not only from place to place but also from hour to hour and month to month. Billions of images that capture this complex relationship are uploaded to social-media websites every day, often with precise time and location metadata. This rich source of data can improve our understanding of the globe. In this work, we propose a general framework that uses these publicly available images to construct dense maps of ground-level attributes from overhead imagery. In particular, we use well-defined probabilistic models and a weakly supervised, multi-task training strategy to estimate the expected visual and auditory ground-level attributes: the types of scenes, objects, and sounds a person can experience at a location. Through a large-scale evaluation on real data, we show that our learned models can be used for applications including mapping, image localization, image retrieval, and metadata verification.
Digital Object Identifier (DOI)
https://doi.org/10.13023/etd.2019.340
Recommended Citation
Salem, Tawfiq, "Learning to Map the Visual and Auditory World" (2019). Theses and Dissertations--Computer Science. 86.
https://uknowledge.uky.edu/cs_etds/86
Included in
Artificial Intelligence and Robotics Commons, Databases and Information Systems Commons, Software Engineering Commons, Theory and Algorithms Commons