PACIS 2021 Proceedings

Paper Type

FP

Paper Number

119

Abstract

BERT has attained state-of-the-art performance on extractive summarization tasks on the CNN/Daily Mail dataset. We discuss several variants of the BERT model and present a novel approach to fine-tuning pre-trained embeddings at the sentence level. This paper focuses on solving the extractive text summarization task with the BERTSUM model. To improve performance, we extend BERTSUM in three directions: first, using different summarization layers after BERT (a classifier or a transformer); second, feeding the summarization layer the output of the penultimate or antepenultimate BERT layer rather than the final layer; and third, freezing the first three BERT layers during fine-tuning, which guards those initial layers against catastrophic forgetting. Our proposed BERTSUM+Classifier and BERTSUM Penultimate+Transformer models outperform all baselines with respect to ROUGE-1, ROUGE-2, and ROUGE-L F1 scores.
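
As an illustrative sketch only (not the authors' released code), two of the modifications described above, taking the penultimate encoder layer's output and freezing the first three BERT layers during fine-tuning, might look roughly like the following with the Hugging Face transformers library; the model name, classifier head, and layer indices are assumptions made for illustration.

```python
# Hypothetical sketch of two modifications described in the abstract:
# (a) feeding the summarization head the penultimate encoder layer's output
#     instead of the final layer's, and
# (b) freezing the first three BERT layers while fine-tuning.
# Model name, head design, and indices are illustrative assumptions.
import torch
from transformers import BertModel

bert = BertModel.from_pretrained("bert-base-uncased")

# (b) Freeze the first three encoder layers so fine-tuning cannot
# overwrite them (one way to limit catastrophic forgetting there).
for layer in bert.encoder.layer[:3]:
    for param in layer.parameters():
        param.requires_grad = False

# Simple sentence-level scoring head: one extraction score per sentence.
classifier = torch.nn.Linear(bert.config.hidden_size, 1)

def score_sentences(input_ids, attention_mask, cls_positions):
    # Request all hidden states so an earlier layer can be selected.
    outputs = bert(input_ids=input_ids,
                   attention_mask=attention_mask,
                   output_hidden_states=True)
    # hidden_states[0] is the embedding output and [-1] the final layer,
    # so [-2] is the penultimate encoder layer.
    penultimate = outputs.hidden_states[-2]
    # (a) Gather the vectors at the [CLS] positions marking each sentence.
    batch_idx = torch.arange(penultimate.size(0)).unsqueeze(1)
    sent_vecs = penultimate[batch_idx, cls_positions]
    return torch.sigmoid(classifier(sent_vecs)).squeeze(-1)
```

A transformer summarization layer, as in the BERTSUM+Transformer variant, would replace the linear head above with a small stack of inter-sentence Transformer layers over the gathered sentence vectors before scoring.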
