Document clustering is an important tool for applications such as Web search engines. It can be defined as the process of organizing documents into groups such that members of the same group have a high degree of association with one another and members of different groups have a low degree of association. The goal of this paper is to present an experiment on one of the most widely used document clustering algorithms, namely the agglomerative hierarchical algorithm. In our experiment, two sets of graduate theses are clustered based on the key phrases assigned to each document by its author(s). Overall, the results produced by our clustering scheme are judged to be very good.
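As a rough illustration of the kind of procedure described above, the following minimal sketch clusters documents bottom-up from their key-phrase sets. The abstract does not state which similarity measure, linkage criterion, or stopping rule the experiment uses, so the Jaccard similarity, average linkage, and the `threshold` value here are assumptions, and the sample key phrases are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two key-phrase sets (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def average_link(c1, c2, docs):
    """Mean pairwise similarity between the members of two clusters."""
    total = sum(jaccard(docs[i], docs[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(docs, threshold=0.2):
    """Merge the two most similar clusters until none reaches the threshold."""
    # Start with every document in its own cluster.
    clusters = [[i] for i in range(len(docs))]
    while len(clusters) > 1:
        # Find the pair of clusters with the highest average-link similarity.
        (x, y), best = max(
            (((p, q), average_link(clusters[p], clusters[q], docs))
             for p, q in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        # x < y always holds, so deleting index y leaves index x valid.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical key phrases for four theses (not data from the experiment).
theses = [
    {"document clustering", "information retrieval"},
    {"document clustering", "hierarchical clustering"},
    {"network security", "intrusion detection"},
    {"intrusion detection", "firewall"},
]
print(agglomerate(theses))
```

With these sample inputs, the first two theses share a key phrase and merge into one group, as do the last two, while the two groups share nothing and stay separate. Raising `threshold` yields more, smaller clusters; lowering it yields fewer, larger ones.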
Wang, Jau-Hwang and Hsieh, Ju-Cheng, "Clustering Graduate Theses Based on Key Phrases Using Agglomerative Hierarchical Methods: An Experiment" (2002). ICEB 2002 Proceedings (Taipei, Taiwan). 112.