Human Computer Interaction, Artificial Intelligence and Intelligent Augmentation

Paper Type

Complete

Paper Number

2614

Description

As a cutting-edge frontier, AI agents bring many opportunities for personalized promotions. Current state-of-the-art targeting methods are myopic: they optimize only the immediate reward and ignore the effect that current promotions may have on future revenue. An AI agent can instead take the company’s perspective and behave as a forward-looking manager that weighs current reward and future revenue simultaneously when designing a targeting policy to maximize the company's long-term revenue. At the same time, the AI agent can account for managerial risk preferences (e.g., risk-seeking). In this study, we illustrate the design and implementation of a deep reinforcement learning (DRL)-based AI agent for sequential personalized promotions, based on a large field experiment with randomized promotions on a major mobile app platform. The results suggest that AI agents with “risk-seeking” exploration identify the optimal personalized promotion policy faster and reach higher long-term revenue for the platform than agents with more conservative exploration. In addition, a forward-looking AI agent accounting for 90% of the future delayed reward works best among the decision horizons considered. We further demonstrate that our DRL-based AI agent generates 27.8% more long-term revenue than non-personalized mass promotions and 24.3% more long-term revenue than various myopic personalized approaches. Finally, we examine what explains the differences in outcomes and interventions between the proposed dynamic DRL-based AI agent and the benchmark policies, and find that the comparative advantage is driven by incorporating intertemporal trade-offs.
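
The contrast between a myopic policy and a forward-looking one can be illustrated with a discounted return. The sketch below is not the paper's implementation: it assumes that "accounting for 90% of the future delayed reward" corresponds to a discount factor of 0.9, and the function name, plan names, and revenue figures are all hypothetical, chosen only to show how the ranking of two promotion plans can flip once future revenue is counted.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Return sum_t gamma**t * rewards[t], computed backwards from the last period."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical per-period revenue from one user under two promotion plans.
# These numbers are illustrative only and do not come from the field experiment.
deep_discount_now = [5.0, 1.0, 1.0, 1.0]  # large immediate gain, weak later periods
lighter_promotion = [2.0, 3.0, 3.0, 3.0]  # smaller immediate gain, stronger later periods

for gamma in (0.0, 0.9):  # 0.0 = myopic; 0.9 = assumed forward-looking setting
    value_now = discounted_return(deep_discount_now, gamma)
    value_later = discounted_return(lighter_promotion, gamma)
    preferred = "deep_discount_now" if value_now > value_later else "lighter_promotion"
    print(f"gamma={gamma}: {value_now:.3f} vs {value_later:.3f} -> {preferred}")
```

With a discount factor of 0 only the first-period revenue counts, so the plan with the larger immediate gain is preferred; with 0.9 the later periods carry enough weight to reverse that choice, which is the kind of intertemporal trade-off the abstract describes.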

Dec 14th, 12:00 AM

AI Agents for Sequential Promotions: Combining Deep Reinforcement Learning and Dynamic Field Experimentation

