Abstract

The Internet and corporate intranets provide far more information than anybody can absorb, so people rely on search engines to find the information they require. However, these systems tend to use a single fixed term weighting strategy regardless of the context in which it is applied, which poses serious performance problems when the characteristics of different users, queries, and text collections are taken into consideration. In this paper, we argue that the term weighting strategy should be context specific: different term weighting strategies should be applied to different contexts. We propose a new systematic approach, based on genetic programming (GP), that automatically generates term weighting strategies for different contexts. The proposed framework was tested on TREC data, and the results are very promising.
