Paper Type

Complete

Abstract

Text summarization is critical in modern information management, enabling organizations to extract valuable information from vast amounts of text data. However, current summarization approaches face several limitations, such as inadequate interpretability, factually inconsistent output, and heavy dependence on large language models (LLMs). These limitations present obstacles in environments where efficiency and reliability are paramount or where resources are constrained. To address these challenges, we introduce DocuSage, an interpretable text summarization framework. DocuSage combines novel sentence extraction techniques with LLM-based abstraction to reduce hallucinations, maintain context, and lower computational overhead, while mimicking human summarization through hierarchical clustering and tree structures. Empirical tests on news articles show that its summaries closely align with human judgment and outperform baseline models. By using LLMs primarily to enhance coherence, DocuSage enables the deployment of a smaller, cost-effective model, offering a scalable, transparent, and adaptable solution for organizational information systems.

Paper Number

1748

Author Connect URL

https://authorconnect.aisnet.org/conferences/AMCIS2025/papers/1748

Comments

SIGOSRA

Aug 15th, 12:00 AM

Harnessing Hierarchical Clustering in Salience-Driven Text Summarization

