Abstract

Feature selection is important for simplifying machine learning models and improving computational efficiency, especially when working with high-dimensional datasets. The rise of Large Language Models (LLMs) offers new opportunities for selecting predictive features. This paper evaluates the potential of LLMs for feature selection tasks and examines whether a hybrid approach can improve predictive performance. Experiments with the DeepSeek-R1 model on publicly available datasets show that LLM-driven feature selection holds significant promise. Furthermore, the performance of hybrid approaches highlights the value of LLMs as a complementary tool to traditional feature selection methods. Across the experiments, the hybrid approach either achieved the highest performance or ranked among the top-performing methods.

Recommended Citation

Đukić, M. & Sretenović, A. (2025). Feature Selection in the Age of Large Language Models: Insights from DeepSeek. In I. Luković, S. Bjeladinović, B. Delibašić, D. Barać, N. Iivari, E. Insfran, M. Lang, H. Linger, & C. Schneider (Eds.), Empowering the Interdisciplinary Role of ISD in Addressing Contemporary Issues in Digital Transformation: How Data Science and Generative AI Contributes to ISD (ISD2025 Proceedings). Belgrade, Serbia: University of Gdańsk, Department of Business Informatics & University of Belgrade, Faculty of Organizational Sciences. ISBN: 978-83-972632-1-5. https://doi.org/10.62036/ISD.2025.101

Paper Type

Poster

DOI

10.62036/ISD.2025.101
