University of Kentucky Doctoral Dissertations

Archived

This content is available here for research, reference, and/or recordkeeping.

NEW BIOINFORMATIC TECHNIQUES FOR THE ANALYSIS OF LARGE DATASETS

Justin Clay Harris, University of Kentucky

Date Available

12-14-2011

Year of Publication

2007

Document Type

Dissertation

College

Arts and Sciences

Department/School/Program

Chemistry

Faculty

Robert A. Lodder

Abstract

A new era of chemical analysis is upon us. In the past, a small number of samples were selected from a population for use as a statistical representation of the entire population. More recently, advancements in data collection rate, computer memory, and processing speed have allowed entire populations to be sampled and analyzed. The result is massive amounts of data that convey relatively little information, even though they may contain a great deal of it. These large quantities of data have already begun to cause bottlenecks in areas such as genetics, drug development, and chemical imaging. The problem is straightforward: condense a large quantity of data into only the useful portions without ignoring or discarding anything important. Performing the condensation in the hardware of the instrument, before the data ever reach a computer, is even better. The proposed research tests the hypothesis that clusters of data may be rapidly identified by linear fitting of quantile-quantile plots produced from each principal component of principal component analysis. Integrated Sensing and Processing (ISP) is tested as a means of generating clusters of principal component scores from samples in a hyperspectral near-field scanning optical microscope. Distances from these multidimensional cluster centers to all other points in hyperspace can be calculated. The result is a novel digital staining technique for identifying anomalies in hyperspectral microscopic and nanoscopic imaging of human atherosclerotic tissue. This general method can be applied to other analytical problems as well.
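The quantile-quantile idea in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation, and it rests on several assumptions: the scores along one principal component are already available as a plain list, the reference distribution is the standard normal, and the function names `qq_line_fit` and `distance_in_sd` are hypothetical. The sketch fits a straight line to the Q-Q plot of the scores and reports the squared correlation of the fit; a single, roughly normal cluster should lie close to a line, while a mixture of clusters should bend away from it. A second helper computes the distance of a point from a cluster center, scaled per axis by the cluster's spread.

```python
import math
import random
from statistics import NormalDist, mean


def qq_line_fit(scores):
    """Fit a line to the normal Q-Q plot of one component's scores.

    Returns (slope, intercept, r_squared) of the least-squares line
    through the points (theoretical quantile, observed quantile).
    """
    n = len(scores)
    observed = sorted(scores)
    # Standard-normal quantiles at plotting positions (i + 0.5) / n (an assumed choice).
    std_normal = NormalDist()
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]

    mx, my = mean(theoretical), mean(observed)
    sxx = sum((x - mx) ** 2 for x in theoretical)
    syy = sum((y - my) ** 2 for y in observed)
    sxy = sum((x - mx) * (y - my) for x, y in zip(theoretical, observed))

    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def distance_in_sd(point, center, spread):
    """Euclidean distance from a cluster center, each axis divided by the cluster's spread."""
    return math.sqrt(sum(((p - c) / s) ** 2 for p, c, s in zip(point, center, spread)))


# Synthetic scores, for illustration only (not data from the dissertation).
random.seed(0)
one_cluster = [random.gauss(0.0, 1.0) for _ in range(500)]
two_clusters = ([random.gauss(-5.0, 1.0) for _ in range(250)]
                + [random.gauss(5.0, 1.0) for _ in range(250)])

_, _, r2_one = qq_line_fit(one_cluster)
_, _, r2_two = qq_line_fit(two_clusters)
print(f"single cluster r^2: {r2_one:.3f}")
print(f"two clusters   r^2: {r2_two:.3f}")
```

In this sketch a poor linear fit on a component is taken as a hint that the scores along it contain more than one cluster. How the dissertation turns the fit into cluster assignments, and which distance measure it uses for digital staining, is not stated in the abstract.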

Recommended Citation

Harris, Justin Clay, "NEW BIOINFORMATIC TECHNIQUES FOR THE ANALYSIS OF LARGE DATASETS" (2007). University of Kentucky Doctoral Dissertations. 544.
https://uknowledge.uky.edu/gradschool_diss/544
