Abstract

The rapid growth of the World Wide Web and its unprecedented popularity as a de facto global digital library have highlighted both the strengths and the weaknesses of the Information Retrieval techniques used by popular search engines. Most queries are short, incomplete attempts to describe or characterize the documents that might be relevant. It therefore seems natural to expand queries with additional terms that are semantically and/or statistically associated with the original query terms. In this paper we examine the mining of associations between terms, both for exploring the terminology of a corpus and for automatic query expansion. We discover these associations using association rule mining [Agrawal 96]. The proposed technique is more flexible than previous techniques based on term co-occurrence because it takes into account not only the co-occurrence frequency but also the confidence and direction of the association rules. Our preliminary experimental results indicate that this novel technique is beneficial.
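To illustrate how support, confidence, and direction might enter query expansion, here is a minimal sketch. It is not the authors' implementation: the function names (mine_term_rules, expand_query), the thresholds, the whitespace tokenization, and the sample documents are all assumptions made for illustration, and only rules between single terms are considered.

```python
from collections import Counter
from itertools import combinations


def mine_term_rules(docs, min_support=0.4, min_confidence=0.7):
    """Mine directed one-term rules x -> y from a list of document strings.

    support(x, y)  = fraction of documents containing both x and y
    confidence(x -> y) = (docs containing both) / (docs containing x)
    Returns a list of (x, y, support, confidence) tuples.
    """
    n = len(docs)
    # Each document is treated as a set of terms (a "transaction").
    term_sets = [set(d.lower().split()) for d in docs]
    term_count = Counter()
    pair_count = Counter()
    for terms in term_sets:
        term_count.update(terms)
        pair_count.update(combinations(sorted(terms), 2))

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        # Confidence is asymmetric, so each direction is tested separately.
        for x, y in ((a, b), (b, a)):
            confidence = count / term_count[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return rules


def expand_query(query, rules):
    """Append the consequent of every rule whose antecedent is a query term."""
    terms = query.lower().split()
    expanded = list(terms)
    # Apply the most confident rules first.
    for x, y, _, _ in sorted(rules, key=lambda r: -r[3]):
        if x in terms and y not in expanded:
            expanded.append(y)
    return expanded


# Hypothetical toy corpus, used only to exercise the sketch.
sample_docs = [
    "java programming language",
    "java coffee island",
    "java programming tutorial",
    "python programming language",
    "coffee beans",
]
```

In this toy corpus, "language" and "programming" co-occur as often in either direction, yet the rule language -> programming has confidence 1.0 while programming -> language has only about 0.67. A purely symmetric co-occurrence measure would not distinguish the two, which is the kind of flexibility the direction and confidence of a rule are meant to provide.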
