Paper Type

Complete

Paper Number

1386

Description

The Fog index is a text readability measure widely used to assess corporate disclosures. However, the accuracy of its two inputs, average sentence length and complex word count, has been questioned when they are estimated from corporate disclosures. To address this issue, we propose noise-robust readability (NRR), a module that tidies corporate disclosure texts using language models and natural language processing techniques. In contrast to the rule-based methods adopted in previous literature, NRR is designed to reduce noise extraneous to the disclosure, thereby mitigating its negative impact on readability measurement. Experimental results indicate that our two text-tidying approaches provide noise-reduced documents for computing the Fog index, with the approach based on the Text-to-Text Transfer Transformer (T5) demonstrating better explanatory power for valuation-relevant information. After applying NRR, the Fog index remains statistically significant in capturing the uncertainty of information dissemination, even when file size, another readability variable, is included.
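For readers unfamiliar with the measure, the sketch below computes the standard Gunning Fog index, 0.4 × (average words per sentence + percentage of complex words), where a complex word has three or more syllables. It is not the NRR module and not the authors' pipeline: the function names `count_syllables` and `fog_index`, the punctuation-based sentence splitter, and the vowel-group syllable heuristic are all assumptions made for illustration. It is only meant to show why extraneous text can distort both inputs.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count groups of consecutive vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Drop a silent trailing "e" (as in "size") but keep "-le" endings (as in "table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def fog_index(text: str) -> float:
    """Gunning Fog index: 0.4 * (words per sentence + percent complex words)."""
    # Naive sentence split on terminal punctuation; real filings need more care.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    complex_words = [w for w in words if count_syllables(w) >= 3]
    avg_sentence_length = len(words) / len(sentences)
    pct_complex = 100.0 * len(complex_words) / len(words)
    return 0.4 * (avg_sentence_length + pct_complex)


if __name__ == "__main__":
    clean = "The company reported higher revenue. Costs fell."
    # Hypothetical layout residue with no sentence punctuation of its own.
    noisy = clean + " Table of Contents Item Page Financial Statements Supplementary Information"
    print(f"clean: {fog_index(clean):.2f}")
    print(f"noisy: {fog_index(noisy):.2f}")
```

In this toy input the appended residue is counted as one long, punctuation-free sentence full of multi-syllable words, so it changes both the sentence-length and complex-word terms even though it carries no disclosure content.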

Comments

AI

Jul 2nd, 12:00 AM

Noise-Robust Readability for Corporate Disclosures

