Location

260-005, Owen G. Glenn Building

Start Date

12-15-2014

Description

Large-scale data generated by crowds provide a myriad of opportunities for monitoring and modeling people's intentions, preferences, and opinions. A crucial step in analyzing such "Big Data" is identifying the relevant data items that should be provided as input to the modeling process. Interestingly, this important step has received limited attention in previous research. This paper proposes a novel crowd-based approach to this data selection problem: leveraging crowds to amplify the predictive capacity of search trend data (Google Trends). We developed an online word association task that taps into people's "thought-collection" process when thinking about a focal term. We empirically tested this method in two domains that have been used as test-beds for prediction. The method yields predictions that are equivalent or superior to those obtained in previous studies (using alternative data selection methods) and to predictions obtained using various benchmark data selection methods.


Using Crowd-Based Data Selection to Improve the Predictive Power of Search Trend Data
