Abstract

This study empirically evaluates the performance of Chronos, a recent foundation model pre-trained on a large corpus of time series data, for the task of daily stock index forecasting. Using a rolling window framework on historical Nasdaq-100 and S&P 500 data from 1995 to early 2025, we compare zero-shot and fine-tuned Chronos variants against a diverse set of established forecasting methods, including statistical benchmarks (AutoARIMA, ETS), standard deep learning models (DeepAR, DLinear, SimpleFeedForward), other Transformer-based architectures (PatchTST), and ensemble approaches. Our results, based on standard forecasting metrics and simulated trading performance, indicate that zero-shot Chronos provides competitive forecasting accuracy: it is statistically comparable to the best traditional methods, but its derived trading performance lags behind the top benchmarks. The fine-tuned Chronos variant statistically underperformed the zero-shot version in forecast accuracy. These findings highlight the potential of foundation models and underline the significant challenges of fine-tuning them effectively.

Recommended Citation

Łaniewski, S. & Ślepaczuk, R. (2025). Evaluating the Chronos Foundation Model for Daily Stock Index Forecasting. In I. Luković, S. Bjeladinović, B. Delibašić, D. Barać, N. Iivari, E. Insfran, M. Lang, H. Linger, & C. Schneider (Eds.), Empowering the Interdisciplinary Role of ISD in Addressing Contemporary Issues in Digital Transformation: How Data Science and Generative AI Contributes to ISD (ISD2025 Proceedings). Belgrade, Serbia: University of Gdańsk, Department of Business Informatics & University of Belgrade, Faculty of Organizational Sciences. ISBN: 978-83-972632-1-5. https://doi.org/10.62036/ISD.2025.48

Paper Type

Short Paper

DOI

10.62036/ISD.2025.48

