Year of Publication: 2011
Master of Science (MS)
Dr. Jun Zhang
A term weighting scheme measures the importance of a term in a collection. A document ranking model uses these term weights to compute the rank or score of a document in that collection. We present a series of cluster-based term weighting and document ranking models built on the TF-IDF and Okapi BM25 models. These models update inter-cluster and intra-cluster frequency components based on the generated clusters, and use them alongside the term frequency and document frequency components to weight the importance of a term. In this thesis, we show that these models outperform the TF-IDF and Okapi BM25 models in document clustering and ranking.
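For orientation, the two baseline models named above can be sketched as follows. This is a minimal illustration of one common textbook form of TF-IDF and Okapi BM25, not the thesis's implementation. The function names, the token-list document representation, the smoothed IDF variant, and the defaults `k1=1.2` and `b=0.75` are all assumptions made for this example. The inter-cluster and intra-cluster frequency components are not specified in the abstract and are therefore not modeled here.

```python
import math
from collections import Counter


def document_frequency(term, docs):
    """Number of documents (token lists) that contain the term."""
    return sum(1 for d in docs if term in d)


def tf_idf(term, doc, docs):
    """Raw term frequency times log(N / df); 0.0 if the term is absent."""
    tf = Counter(doc)[term]
    df = document_frequency(term, docs)
    if tf == 0 or df == 0:
        return 0.0
    return tf * math.log(len(docs) / df)


def bm25_score(query, doc, docs, k1=1.2, b=0.75):
    """Okapi BM25 score of one document for a list of query terms.

    k1 and b are commonly used defaults, assumed here for illustration.
    """
    avgdl = sum(len(d) for d in docs) / len(docs)  # average document length
    counts = Counter(doc)
    score = 0.0
    for term in query:
        tf = counts[term]
        if tf == 0:
            continue
        df = document_frequency(term, docs)
        # Smoothed IDF variant that stays non-negative for frequent terms.
        idf = math.log((len(docs) - df + 0.5) / (df + 0.5) + 1.0)
        # Term-frequency saturation with document-length normalization.
        norm = tf + k1 * (1 - b + b * len(doc) / avgdl)
        score += idf * tf * (k1 + 1) / norm
    return score
```

Both functions take a document as a list of tokens and the collection as a list of such documents. A cluster-based variant in the spirit of the abstract would add further frequency components computed over the generated clusters to these weights.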
Murugesan, Keerthiram, "CLUSTER-BASED TERM WEIGHTING AND DOCUMENT RANKING MODELS" (2011). University of Kentucky Master's Theses. 651.