Location

Grand Wailea, Hawaii

Event Website

https://hicss.hawaii.edu/

Start Date

January 8, 2019

End Date

January 11, 2019

Description

Topic extraction is a major field in text mining. Key noun-phrases play an important role in identifying the main topics of a document because the primary information of a document is carried by its noun-phrases. In this paper, we propose a new topic extraction scheme that identifies key noun-phrases by constructing a context-free grammar (CFG) from input documents. In our method, documents are reconstructed as a set of CFG rules using an existing algorithm called Sequitur. Sequitur infers context-free grammar rules, which can be viewed as a hierarchical structure, from a sequence of discrete symbols. This hierarchy exposes the underlying structure of the input sequence and helps capture meaningful regularities. Based on this hierarchical structure of the input document, we designed a new algorithm to identify noun-phrases and extract the key noun-phrases.
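To illustrate how a hierarchy of CFG rules can be inferred from a token sequence, the following is a minimal sketch. It is not Sequitur itself and not the authors' algorithm: it uses a simpler offline digram-replacement scheme (in the style of Re-Pair) that repeatedly replaces the most frequent adjacent pair of symbols with a new rule. The function names `infer_grammar` and `expand`, the rule names `R1`, `R2`, ..., and the start symbol `S` are assumptions made for this example, and input tokens are assumed not to collide with those names.

```python
from collections import Counter


def infer_grammar(tokens):
    """Build a hierarchical grammar by offline digram replacement.

    Simplified stand-in for Sequitur (assumption): Sequitur works online and
    enforces digram uniqueness and rule utility; this sketch only replaces
    the most frequent adjacent pair until no pair occurs twice.
    """
    seq = list(tokens)
    rules = {}
    next_id = 1
    while True:
        # Count every adjacent pair of symbols in the current sequence.
        counts = Counter(zip(seq, seq[1:]))
        if not counts:
            break
        pair, freq = counts.most_common(1)[0]
        if freq < 2:
            break  # no repeated pair left, so no further regularity to capture
        name = f"R{next_id}"  # hypothetical naming scheme for nonterminals
        next_id += 1
        rules[name] = list(pair)
        # Replace non-overlapping occurrences of the pair, left to right.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    rules["S"] = seq  # start rule: what remains of the compressed sequence
    return rules


def expand(symbol, rules):
    """Recursively expand a symbol back into its terminal tokens."""
    if symbol not in rules:
        return [symbol]
    result = []
    for child in rules[symbol]:
        result.extend(expand(child, rules))
    return result


if __name__ == "__main__":
    words = "the key noun phrase and the key noun phrase".split()
    grammar = infer_grammar(words)
    for rule_name, body in grammar.items():
        print(rule_name, "->", " ".join(body))
```

In a grammar built this way, rules that expand to repeated multi-word spans are natural candidates for phrases. Deciding which of them are noun-phrases, and which are key, is the part addressed by the paper's own algorithm and is not shown here.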

A Context Free Grammar for Key Noun-Phrase Extraction from Text

https://aisel.aisnet.org/hicss-52/da/data_text_web_mining/9