Start Date

11-8-2016

Description

The proliferation of unstructured data is a growing threat to effective enterprise performance management. Enterprise search is a tool that helps organizations manage this document-based information more effectively. The success of full-text enterprise search is limited by ambiguity in word meanings, which can cause many of the returned documents to be irrelevant to the searcher. While early work by Zipf provided a first attempt at quantifying the impact of this issue on search, little work has been done to demonstrate the applicability of Zipf’s work to contemporary document collections. In this paper, we examine whether the frequency-meaning relationship discovered by Zipf holds for contemporary document collections, and whether it holds consistently across different subject domains. We then discuss the implications of our results for the development and use of user-centered KPIs designed to measure the enterprise-wide effectiveness of search activities.

Aug 11th, 12:00 AM

Word Ambiguity and Search: Implications for Enterprise Performance Management
