Paper Type

Complete

Paper Number

1074

Description

Missing data is a pervasive challenge in data science and research. This paper introduces a novel approach for imputing missing values. Our method heuristically integrates the missing data mechanism into traditional supervised imputation by using nonparametric estimation of the conditional distributions of the target incomplete variable. Analytical proofs, complemented by simulation results, show that our method outperforms traditional imputation approaches that neglect the missing data mechanism. Real-world prediction applications in consumer credit default and firm earnings prediction further validate the approach. In these applications we impute crucial predictor variables (third-party credit scores and financial analysts' consensus forecasts, respectively), and our method consistently outperforms benchmark imputation methods, leading to higher prediction accuracy.
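
The problem described above can be illustrated with a small simulation. The sketch below is not the paper's method. It shows only the baseline the abstract argues against: a supervised imputation model fitted on observed rows while ignoring why values are missing. The data-generating process (a linear model with a logistic missingness rule that depends on the target itself) and all function names are assumptions made for illustration.

```python
import numpy as np

def simulate(n=20000, seed=0):
    """Generate a predictor x, a target y, and a mask of missing y values.

    Assumed setup: y is more likely to be missing when it is large,
    so the missingness depends on the unobserved value itself.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                    # fully observed predictor
    y = 2.0 * x + rng.normal(size=n)          # incomplete target variable
    p_missing = 1.0 / (1.0 + np.exp(-y))      # missingness probability rises with y
    missing = rng.random(n) < p_missing
    return x, y, missing

def supervised_impute(x, y, missing):
    """Fit y ~ a + b*x on observed rows only and predict the missing rows.

    This ignores the missing data mechanism entirely.
    """
    obs = ~missing
    b, a = np.polyfit(x[obs], y[obs], deg=1)  # polyfit returns slope first
    return a + b * x[missing]

x, y, missing = simulate()
imputed = supervised_impute(x, y, missing)

# Compare imputed values with the true (normally unobservable) values.
bias = imputed.mean() - y[missing].mean()
print(f"mean imputation bias: {bias:.3f}")
```

In this assumed setup the observed rows under-represent large values of y, so the fitted model tends to impute values that are too low. Closing that gap is the motivation for bringing the missing data mechanism into the imputation step.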

Comments

Analytics

Jul 2nd, 12:00 AM

A Semi-Supervised Learning Approach to Handling Missing Data in Predictive Analytics

