Prior art retrieval is the process of identifying a set of potentially relevant prior art documents for a specific patent or patent application. This process is essential to various patent practices, such as patentability, validity, and infringement searches. To support the automatic retrieval of prior art, existing studies generally adopt the traditional information retrieval (IR) approach or extend it by incorporating additional information such as patent citations and classifications. These approaches exploit only part of the information contained in patents, which may limit the performance of prior art retrieval. In response, we propose a novel approach that employs comprehensive patent information and applies supervised learning to prior art retrieval. Unlike traditional supervised learning approaches, which require manual preparation of a set of positive and negative training examples, the proposed supervised technique includes a simple but effective mechanism for automatically generating training examples. Our empirical evaluation on a large dataset consisting of 52,311 semiconductor-related patents indicates that the proposed supervised technique significantly outperforms the traditional full-text-based IR approach.
Chen, Hung-Chen; Lin, Yu-Kai; Wei, Chih-Ping; and Yang, Chin-Sheng, "Automatic Learning of A Supervised Classifier for Patent Prior Art Retrieval" (2010). PACIS 2010 Proceedings. 201.