In the age of E-Business, many companies face massive data sets that must be analysed to gain a competitive edge. These data sets are in many instances incomplete and often of poor quality. Although statistical analysis can be used to pre-process such data sets, this technique has its own limitations. In this paper we present a system, LH5, and its underlying model, which can be used to investigate the integrity of existing data and to pre-process the data into clearer data sets to be mined. LH5 is a rule-based system capable of self-learning, and it is illustrated using a medical data set.
Parapadakis, Dimitris and El-Darzi, Elia, "The LH5 Model for Data Mining" (2001). ICEB 2001 Proceedings. 156.