This study develops an abstractive text summarization method for multi-document inputs using a two-fold graph-based approach. First, key opinions, in the form of topic sentences and feature words, are identified using the Stochastic Block Model, a graph-based topic modelling technique. This step filters noisy and irrelevant data from the inputs and identifies key aspects, which serve as labels for the sentence classification task. The second graph-based method is used in sentence generation: words and their co-occurrence relationships are represented as nodes and directed edges. Word clusters capture the major opinion for each aspect, and paths shared by multiple sentences are ranked and modelled to generate summary sentences. We demonstrate this approach by summarizing Vietnamese reviews from the Tripadvisor and VnExpress datasets in two case studies.
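The word-graph idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the whitespace tokenizer, the average-edge-weight score, and the English toy reviews are all assumptions made for illustration. Vietnamese text in particular would need a proper word segmenter, since a single word can span several space-separated syllables.

```python
from collections import defaultdict


def build_word_graph(sentences):
    """Count directed edges (left_word, right_word) between adjacent words.

    Hypothetical helper: whitespace tokenization stands in for a real
    word segmenter.
    """
    edges = defaultdict(int)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for left, right in zip(tokens, tokens[1:]):
            edges[(left, right)] += 1
    return edges


def score_path(edges, path):
    """Average edge weight along a word path; 0.0 if any edge is absent.

    This scoring rule is an assumption, chosen so that paths shared by
    several sentences score higher than paths seen only once.
    """
    pairs = list(zip(path, path[1:]))
    if not pairs or any(pair not in edges for pair in pairs):
        return 0.0
    return sum(edges[pair] for pair in pairs) / len(pairs)


def rank_sentences(sentences):
    """Rank input sentences by how heavily their word paths are shared."""
    edges = build_word_graph(sentences)
    scored = [(score_path(edges, s.lower().split()), s) for s in sentences]
    return sorted(scored, key=lambda item: -item[0])


# Toy input, invented for illustration only.
reviews = [
    "the room is clean",
    "the room is small",
    "the staff is friendly",
]
ranking = rank_sentences(reviews)
```

In this toy input the two sentences about the room share the path "the room is", so they rank above the sentence about the staff. A fuller treatment would cluster words by aspect before ranking, as the abstract describes.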
Dinh, Minh N.; Tang, An Quang; and La, Ngoc Minh, "Abstractive summarization of public opinion in Vietnamese using graph-based method" (2022). PACIS 2022 Proceedings. 154.