Abstract
Feature selection plays a significant role in the development of decision-support information systems, such as diagnostic or recommendation systems. Such systems should make it possible to identify the most important features and to analyse data from different locations, taking into account the specific characteristics of the local data sources. Data preparation, including the transformation of continuous attribute domains into intervals (discretisation), is an important stage of data analysis, because its outcome influences all subsequent stages. This paper proposes an approach to creating a global feature ranking that takes into account the characteristics of different discretisation algorithms. A new weight for estimating attribute importance is defined and compared with a measure implemented in a Python library. Both types of weights are used to create a hierarchical structure of the global ranking of features. The experiments were carried out on datasets from the stylometry domain, dedicated to the task of authorship attribution.
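The general idea of ranking features across several discretisation algorithms can be sketched as follows. This is only an illustration under stated assumptions, not the method of the paper: the abstract does not name the Python library, so scikit-learn and its impurity-based `feature_importances_` are assumed here; the three `KBinsDiscretizer` strategies stand in for the discretisation algorithms; the mean-rank aggregation and the helper names `local_importances` and `global_ranking` are hypothetical; and the data are synthetic rather than stylometric. The new weight defined in the paper is not reproduced.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.tree import DecisionTreeClassifier


def local_importances(X, y, strategy, n_bins=4, seed=0):
    """Discretise X with one strategy, fit a decision tree, and return
    scikit-learn's impurity-based feature importances (an assumed measure)."""
    disc = KBinsDiscretizer(n_bins=n_bins, encode="ordinal", strategy=strategy)
    X_disc = disc.fit_transform(X)
    tree = DecisionTreeClassifier(random_state=seed).fit(X_disc, y)
    return tree.feature_importances_


def global_ranking(X, y, strategies=("uniform", "quantile", "kmeans")):
    """Combine per-strategy rankings into one order by mean rank
    (a hypothetical aggregation rule chosen for illustration)."""
    per_strategy = {s: local_importances(X, y, s) for s in strategies}
    # argsort of argsort converts scores to ranks; negation makes
    # the highest importance receive rank 1.
    ranks = np.array(
        [np.argsort(np.argsort(-w)) + 1 for w in per_strategy.values()]
    )
    mean_rank = ranks.mean(axis=0)
    order = np.argsort(mean_rank, kind="stable")  # best feature first
    return per_strategy, mean_rank, order


# Synthetic stand-in data: 6 continuous features, 3 of them informative.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=3, n_redundant=0, random_state=0
)
per_strategy, mean_rank, order = global_ranking(X, y)
for idx in order:
    print(f"feature {idx}: mean rank {mean_rank[idx]:.2f}")
```

Averaging ranks rather than raw importances keeps one discretiser with unusually concentrated weights from dominating the combined order; other aggregation rules would be equally plausible in this sketch.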
Paper Type
Full Paper
DOI
10.62036/ISD.2025.51
Feature Evaluation Through Decision Trees Structure
Recommended Citation
Zielosko, B., Stańczyk, U., Jabloński, K. & Moshkov, M. (2025). Feature Evaluation Through Decision Trees Structure. In I. Luković, S. Bjeladinović, B. Delibašić, D. Barać, N. Iivari, E. Insfran, M. Lang, H. Linger, & C. Schneider (Eds.), Empowering the Interdisciplinary Role of ISD in Addressing Contemporary Issues in Digital Transformation: How Data Science and Generative AI Contributes to ISD (ISD2025 Proceedings). Belgrade, Serbia: University of Gdańsk, Department of Business Informatics & University of Belgrade, Faculty of Organizational Sciences. ISBN: 978-83-972632-1-5. https://doi.org/10.62036/ISD.2025.51