Abstract

Student dropout is a significant challenge for educational institutions, affecting approximately 40% of students in Higher Education. This study analyzes data from students enrolled in Distance Education (EAD) courses at a Higher Education Institution (HEI). It applies statistical techniques and Machine Learning (ML) to categorize students and identify the variables that influence dropout. Using data visualization, Logistic Regression, and K-Means clustering, the study identifies key factors affecting dropout and develops a model that groups students by academic profile. The authors also write a descriptive text for each group, which makes it easier to identify students with a high-risk profile. The study is limited by incomplete data entry by the HEI, which left important fields unfilled.
