Abstract

Keywords represent the concepts of a document and together form a summary of it. They can be used when indexing documents for information retrieval to help users locate the information they want. This paper presents an approach, called CBKE, that extracts keywords based on the idea of lexical cohesion, i.e., that terms are collocated to reinforce sub-themes in a document. CBKE clusters the primary terms in a document to form subtopics. The association strength between terms and subtopics is then analyzed. Keywords are identified as the terms that relate strongly to many subtopics. An example of keyword extraction using the proposed approach is presented. It shows that the approach can discover essential terms in the document.
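
The pipeline outlined above (cluster terms into subtopics, measure term-to-subtopic association, rank terms by how many subtopics they relate to) can be illustrated with a minimal Python sketch. It is not the CBKE implementation: the abstract does not specify the clustering method or the association measure, so the sketch assumes sentence-level co-occurrence counts as the association measure and connected components of a thresholded co-occurrence graph as the clustering step. The function names and the parameters `link_threshold` and `assoc_threshold` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(sentences):
    """Count how often two distinct terms appear in the same sentence."""
    counts = defaultdict(int)
    for sentence in sentences:
        # Sorting makes every pair key (a, b) satisfy a < b.
        for a, b in combinations(sorted(set(sentence)), 2):
            counts[(a, b)] += 1
    return counts


def pair_count(counts, a, b):
    """Look up the co-occurrence count of an unordered pair of terms."""
    return counts.get((min(a, b), max(a, b)), 0)


def cluster_terms(terms, counts, link_threshold):
    """Assumed clustering step: connected components of the graph whose
    edges join terms that co-occur at least link_threshold times.
    Single-term components are dropped and not treated as subtopics."""
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for (a, b), n in counts.items():
        if n >= link_threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for t in terms:
        groups[find(t)].add(t)
    return [g for g in groups.values() if len(g) > 1]


def rank_keywords(sentences, link_threshold=2, assoc_threshold=1):
    """Rank terms by the number of subtopics they are associated with,
    breaking ties by total association strength, then alphabetically."""
    counts = cooccurrence(sentences)
    terms = sorted({t for s in sentences for t in s})
    subtopics = cluster_terms(terms, counts, link_threshold)

    scored = []
    for term in terms:
        # Assumed association measure: summed co-occurrence with the
        # other members of each subtopic.
        strengths = [
            sum(pair_count(counts, term, other) for other in topic if other != term)
            for topic in subtopics
        ]
        breadth = sum(1 for s in strengths if s >= assoc_threshold)
        scored.append((breadth, sum(strengths), term))

    scored.sort(key=lambda item: (-item[0], -item[1], item[2]))
    return [term for _, _, term in scored]
```

Here each document is assumed to be pre-tokenized into sentences, each a list of candidate terms. A term that co-occurs with members of several subtopics ranks above a term that is strongly tied to only one, which mirrors the selection criterion stated in the abstract.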
