Date Available
11-28-2011
Year of Publication
2011
Document Type
Thesis
Degree Name
Master of Science (MS)
College
Engineering
Department
Computer Science
Advisor
Dr. Jun Zhang
Abstract
A term weighting scheme measures the importance of a term in a collection. A document ranking model uses these term weights to compute the rank or score of a document in a collection. We present a series of cluster-based term weighting and document ranking models based on the TF-IDF and Okapi BM25 models. These models update inter-cluster and intra-cluster frequency components based on the generated clusters, and use those components, in addition to the term and document frequency components, to weight the importance of a term. In this thesis, we show how these models outperform the TF-IDF and Okapi BM25 models in document clustering and ranking.
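For readers unfamiliar with the two baselines named in the abstract, the following is a minimal sketch of textbook TF-IDF term weighting and Okapi BM25 document scoring. It is not the thesis's method: the abstract does not give the formulas for the inter-cluster and intra-cluster frequency components, so they are not reproduced here. The function names, the toy corpus, the parameter defaults k1=1.2 and b=0.75, and the "+1" smoothed IDF variant used in BM25 are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(term, doc, docs):
    """Textbook TF-IDF weight of `term` in `doc` (raw tf times log(N/df)).

    `doc` is a list of tokens; `docs` is the whole collection (list of token lists).
    """
    tf = doc.count(term)                      # term frequency in this document
    df = sum(1 for d in docs if term in d)    # number of documents containing the term
    if tf == 0 or df == 0:
        return 0.0
    return tf * math.log(len(docs) / df)


def bm25_score(query, doc, docs, k1=1.2, b=0.75):
    """Okapi BM25 score of `doc` for a list of query terms.

    Assumption: uses the smoothed IDF log((N - df + 0.5) / (df + 0.5) + 1),
    a common variant that keeps the IDF non-negative.
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n     # average document length
    counts = Counter(doc)
    score = 0.0
    for term in query:
        df = sum(1 for d in docs if term in d)
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1.0)
        tf = counts[term]
        # Length-normalized, saturating term-frequency component
        norm = tf + k1 * (1.0 - b + b * len(doc) / avgdl)
        score += idf * tf * (k1 + 1.0) / norm
    return score


# Hypothetical toy collection, already tokenized
docs = [
    ["cluster", "term", "weight"],
    ["term", "rank"],
    ["document", "rank", "rank"],
]

cluster_weight = tf_idf("cluster", docs[0], docs)
rank_scores = [bm25_score(["rank"], d, docs) for d in docs]
```

In this sketch, a term that appears in fewer documents receives a larger TF-IDF weight, and BM25 ranks the third toy document highest for the query "rank" because the term occurs twice there. A cluster-based model of the kind the abstract describes would add further frequency components on top of these term and document frequency components.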
Recommended Citation
Murugesan, Keerthiram, "CLUSTER-BASED TERM WEIGHTING AND DOCUMENT RANKING MODELS" (2011). University of Kentucky Master's Theses. 651.
https://uknowledge.uky.edu/gradschool_theses/651