Paper Type
Complete
Abstract
Text summarization is critical in modern information management, enabling organizations to extract valuable information from large volumes of text. However, current summarization approaches face several limitations, including inadequate interpretability, factually inconsistent output, and heavy dependence on large language models (LLMs). These limitations present obstacles in environments where efficiency and reliability are paramount or where resources are constrained. To address these challenges, we introduce DocuSage, an interpretable text summarization framework. DocuSage combines novel sentence extraction techniques with LLM-based abstraction to reduce hallucinations, preserve context, and lower computational overhead, while mimicking human summarization through hierarchical clustering and tree structures. Empirical tests on news articles show that its summaries closely align with human judgment and outperform baseline models. Because it uses LLMs primarily to enhance coherence, DocuSage can be deployed with a smaller, more cost-effective model, offering a scalable, transparent, and adaptable solution for organizational information systems.
Paper Number
1748
Recommended Citation
Sadmanee, Akib; Hong, Sukhwa; Leigh, Jason; and Belcaid, Mahdi, "Harnessing Hierarchical Clustering in Salience-Driven Text Summarization" (2025). AMCIS 2025 Proceedings. 16.
https://aisel.aisnet.org/amcis2025/sig_osra/sig_osra/16
Harnessing Hierarchical Clustering in Salience-Driven Text Summarization
Comments
SIGOSRA