Predicting the final outcome of an ongoing process instance is a key problem in many real-life contexts. It has mainly been addressed by discovering a prediction model with traditional machine learning methods and, more recently, deep learning methods, both of which exploit the supervision provided by outcome-class labels associated with historical log traces. However, a fully supervised learning strategy is unsuitable for important application scenarios where outcome labels are known for only a small fraction of the log traces. To address these challenging scenarios, a semi-supervised learning approach is proposed here, which leverages a multi-target deep neural network (DNN) model that supports both outcome prediction and the auxiliary task of next-activity prediction. The auxiliary task helps the DNN model avoid spurious trace embeddings and overfitting. In extensive experiments, this approach is shown to outperform both fully-supervised and semi-supervised discovery methods that use similar DNN architectures, across different real-life datasets and label-scarce settings.
Folino, Francesco; Folino, Gianluigi; Guarascio, Massimo; and Pontieri, Luigi
"Semi-Supervised Discovery of DNN-Based Outcome Predictors from Scarcely-Labeled Process Logs,"
Business & Information Systems Engineering:
Vol. 64: Iss. 6, 729-749.
Available at: https://aisel.aisnet.org/bise/vol64/iss6/3
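
The multi-target idea summarized in the abstract can be illustrated with a small loss computation. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not give the network architecture, the loss, or how the two tasks are weighted. Here it is assumed that both tasks use cross-entropy, that the outcome term is averaged over labeled traces only (an unlabeled trace is marked with `None`), and that the next-activity term is averaged over all traces. The names `multi_target_loss` and `aux_weight` are hypothetical.

```python
import math
from typing import Optional, Sequence


def cross_entropy(probs: Sequence[float], target: int) -> float:
    """Negative log-likelihood of the target class under a probability vector."""
    return -math.log(max(probs[target], 1e-12))


def multi_target_loss(
    outcome_probs: Sequence[Sequence[float]],
    next_act_probs: Sequence[Sequence[float]],
    outcome_labels: Sequence[Optional[int]],
    next_act_labels: Sequence[int],
    aux_weight: float = 1.0,  # hypothetical trade-off parameter (assumption)
) -> float:
    """Combine a masked outcome loss with a next-activity loss."""
    # Auxiliary task: every trace prefix has a known next activity,
    # so this term is averaged over all examples.
    aux = sum(
        cross_entropy(p, y) for p, y in zip(next_act_probs, next_act_labels)
    ) / len(next_act_labels)

    # Main task: only traces with a known outcome label contribute.
    labeled = [(p, y) for p, y in zip(outcome_probs, outcome_labels) if y is not None]
    main = (
        sum(cross_entropy(p, y) for p, y in labeled) / len(labeled)
        if labeled
        else 0.0
    )
    return main + aux_weight * aux
```

Under these assumptions, a batch with no labeled traces still yields a training signal through the next-activity term, which is one way the unlabeled part of a log could contribute to the learned trace representation.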