Abstract
Online instructors’ performance datasets extracted from a learning management system (LMS) are usually imbalanced: good-performance cases severely outnumber low-performance cases. Addressing this data skew is crucial when training machine learning models for prediction. This study examines a dataset of 3,731 online classes to predict instructors’ performance evaluation results. To identify the most effective imbalance-handling techniques for the online class dataset, the research conducts a comparative analysis of four Synthetic Minority Over-sampling Technique (SMOTE) variants, namely SMOTE, Borderline-SMOTE, SMOTE-ENN, and SMOTE-Tomek, applied to Logistic Regression, Decision Tree, Torch Neural Network, and Random Forest prediction models. The findings indicate that addressing class imbalance improves the classification performance of most models. Among the combinations assessed, the Random Forest classifier with SMOTE achieves the best predictive performance and is the most suitable for the online class dataset used in this study. The code for this research is accessible at https://github.com/Garyzhao231/AFM-prediction/.
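The comparison described above can be sketched with scikit-learn and imbalanced-learn. The snippet below is a minimal illustration, not the authors' implementation (which is in the linked repository): it uses a synthetic imbalanced dataset in place of the LMS data, default hyperparameters, and omits the Torch neural network for brevity. It pairs each SMOTE variant with each classifier and reports cross-validated F1 scores, applying resampling only inside the training folds.

```python
# Hedged sketch of a SMOTE-variant x classifier comparison.
# Dataset, hyperparameters, and scoring metric are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from imblearn.over_sampling import SMOTE, BorderlineSMOTE
from imblearn.combine import SMOTEENN, SMOTETomek
from imblearn.pipeline import Pipeline

# Synthetic stand-in for the 3,731-class LMS dataset (~10% minority class).
X, y = make_classification(n_samples=3731, n_features=20,
                           weights=[0.9, 0.1], random_state=42)

samplers = {
    "SMOTE": SMOTE(random_state=42),
    "Borderline-SMOTE": BorderlineSMOTE(random_state=42),
    "SMOTE-ENN": SMOTEENN(random_state=42),
    "SMOTE-Tomek": SMOTETomek(random_state=42),
}
models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "DecisionTree": DecisionTreeClassifier(random_state=42),
    "RandomForest": RandomForestClassifier(random_state=42),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for s_name, sampler in samplers.items():
    for m_name, model in models.items():
        # Resampling inside the pipeline is fitted on training folds only,
        # so synthetic minority samples never leak into validation folds.
        pipe = Pipeline([("resample", sampler), ("clf", model)])
        scores = cross_val_score(pipe, X, y, scoring="f1", cv=cv)
        print(f"{s_name:16s} + {m_name:18s} F1 = {scores.mean():.3f}")
```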
Recommended Citation
Zhao, Gary Yu and Tu, Cindy Zhiling, "Comparative Analysis of Applying SMOTE Techniques to Imbalanced Data for Online Instructors’ Performance Prediction" (2025). MWAIS 2025 Proceedings. 29.
https://aisel.aisnet.org/mwais2025/29