This research in progress explores a contingency approach to the construction and selection of data mining models for predictive classification. The approach considers the structure of the data set and the relationships among the attributes that characterize it, with the goal of selecting a model that provides greater insight into the data, and therefore predicts more accurately, given a particular data structure. Preliminary results from an analysis of hospital patient records indicate that concentration indices, commonly used to measure firm concentration within an industry, are useful in characterizing data set structures and therefore in guiding the model selection process. The eventual goal of this research is the construction of a decision support system that can aid decision makers in the model selection task.
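To illustrate what a concentration index measures when applied to a data attribute rather than to firms in an industry, the following is a minimal sketch. The abstract does not name a specific index; the Herfindahl-Hirschman index (the sum of squared shares) is assumed here only because it is a widely used measure of firm concentration. The function name `herfindahl_index` and the sample values are hypothetical and are not taken from the hospital patient records.

```python
from collections import Counter


def herfindahl_index(values):
    """Sum of squared category shares for one categorical attribute.

    Assumption: the Herfindahl-Hirschman index stands in for the
    unspecified "concentration indices" mentioned in the abstract.
    The result is 1/k for k equally frequent categories and 1.0 when
    every record falls into a single category.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("values must not be empty")
    return sum((n / total) ** 2 for n in counts.values())


# Hypothetical attribute values, invented for illustration only.
balanced = ["a", "b", "c", "d"]   # four equally frequent categories
skewed = ["a", "a", "a", "b"]     # one category dominates

balanced_hhi = herfindahl_index(balanced)
skewed_hhi = herfindahl_index(skewed)
print(balanced_hhi, skewed_hhi)
```

A higher value indicates that records are concentrated in fewer categories. How such a value would map to a preferred classification model is the question the research addresses and is not encoded in this sketch.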
Spangler, William E.; May, Jerold H.; Strum, David P.; and Vargas, Luis G., "The Impact of Data Characteristics on the Selection of Data Mining Methods for Predictive Classification" (2000). AMCIS 2000 Proceedings. 435.