Negative data are observations of unsuccessful events or poor performance. Conventional wisdom dictates that negative data be eliminated from training data sets. This paper presents a three-step method for incorporating negative data into the rule induction process. The first step is to apply rule induction to a data set containing only positive data, which is how rule induction techniques such as ID3, C4.5 and CART are traditionally used. The second step is to create a training data set that contains all of the positive data from Step 1 and also incorporates negative data. The dependent variable from Step 1 becomes an independent variable in the new data set, and a new performance-related dependent variable is defined. Decision rules are then generated using the same rule induction algorithm as in Step 1. The third and final step is to reconcile the two rule sets, for which a step-wise procedure for creating a final, robust rule set is proposed. An example application related to Just-In-Time manufacturing is presented, in which decision rules are generated using the classification and regression tree (CART) technique.
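The three steps might be sketched roughly as follows. This is an illustration only, not the authors' implementation. The small CART-style tree (binary Gini splits on numeric features), the toy Just-In-Time records (demand variability, setup time, number of kanbans) and the `recommend` reconciliation rule are all assumptions made for the example; the abstract does not spell out the paper's step-wise reconciliation procedure.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for j in range(len(rows[0])):
        values = sorted({r[j] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score - 1e-12:
                best, best_score = (j, t), score
    return best

def fit_tree(rows, labels, depth=0, max_depth=3):
    """Grow a small CART-style binary tree; leaves hold the majority class."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    j, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return (j, t,
            fit_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            fit_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    """Follow splits down to a leaf and return its class."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

# Hypothetical records: (demand_variability, setup_time, kanbans, performance).
positive = [(0.10, 5, 2, "good"), (0.20, 6, 2, "good"), (0.15, 4, 2, "good"),
            (0.80, 5, 4, "good"), (0.90, 6, 4, "good"), (0.85, 4, 4, "good")]
negative = [(0.80, 5, 2, "poor"), (0.90, 4, 2, "poor"), (0.85, 6, 2, "poor")]

# Step 1: positive data only; the decision (kanbans) is the dependent variable.
decision_tree = fit_tree([r[:2] for r in positive], [r[2] for r in positive])

# Step 2: positive plus negative data; kanbans becomes an independent variable
# and performance is the new dependent variable.
combined = positive + negative
perf_tree = fit_tree([r[:3] for r in combined], [r[3] for r in combined])

# Step 3 (assumed rule, not the paper's procedure): keep the Step 1 decision
# unless the Step 2 tree predicts poor performance, then try the alternatives.
def recommend(conditions, decision_tree, perf_tree, candidates):
    first = predict(decision_tree, conditions)
    for decision in [first] + [c for c in candidates if c != first]:
        if predict(perf_tree, conditions + (decision,)) == "good":
            return decision
    return None  # no candidate decision is predicted to perform well

print(predict(perf_tree, (0.85, 5, 2)))
print(recommend((0.85, 5), decision_tree, perf_tree, [2, 4]))
```

On this toy data the Step 2 tree learns something the positive-only data cannot express: under high demand variability, using only two kanbans is associated with poor performance. The Step 3 rule uses that knowledge to check, and if necessary override, the decision proposed by the Step 1 tree.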
Mathieu, Richard; Huntley, Christopher; and Wray, Barry, "A Method for Incorporating Negative Data into Rule Induction" (1998). AMCIS 1998 Proceedings. 62.