Paper Number
2463
Paper Type
Complete
Abstract
Algorithmic agents are used in a variety of competitive decision settings, most notably to make pricing decisions in contexts that range from online retail to residential home rentals. We study the emergent behavior of bandit learning algorithms used by competing agents who have no information about the strategic interaction they are engaged in. We use a general-form repeated Prisoner's Dilemma game as our model of strategic interaction. Agents engage in online learning with no prior model of game structure and no knowledge of competitors' states or actions (e.g., no observation of competing prices). We show that context-free bandits with no knowledge of their game environment and no information about their opponents' choices or outcomes will still consistently learn collusive behavior, an outcome we call "naive collusion." We then analytically describe which characteristics of bandit learning algorithms will lead to collusion and which will not.
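The sketch below is not from the paper; it is a minimal illustration of the setting the abstract describes: two context-free bandit learners repeatedly play a Prisoner's Dilemma, and each observes only its own action and its own payoff, never the opponent's choice or reward. The epsilon-greedy rule, the specific payoff values, and all names are illustrative assumptions; whether such learners end up cooperating or defecting depends on the algorithm's characteristics, which is the question the paper answers analytically.

# Illustrative sketch (not the paper's model): two context-free epsilon-greedy
# bandits repeatedly play a Prisoner's Dilemma. Each agent sees only its own
# action and its own payoff -- never the opponent's choice or reward.
# Payoff values and the epsilon-greedy rule are assumptions for illustration.
import random

# Actions: 0 = cooperate (e.g., keep price high), 1 = defect (e.g., undercut).
# PAYOFF[(my_action, their_action)] -> my reward, with the standard
# Prisoner's Dilemma ordering T > R > P > S.
PAYOFF = {
    (0, 0): 3.0,  # mutual cooperation (R)
    (0, 1): 0.0,  # I cooperate, opponent defects (S)
    (1, 0): 5.0,  # I defect, opponent cooperates (T)
    (1, 1): 1.0,  # mutual defection (P)
}

class EpsilonGreedyBandit:
    """Context-free bandit: tracks a running mean reward per action."""
    def __init__(self, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0, 0]
        self.values = [0.0, 0.0]

    def act(self):
        if random.random() < self.epsilon:
            return random.randrange(2)                           # explore
        return max(range(2), key=lambda a: self.values[a])       # exploit

    def update(self, action, reward):
        # Incremental sample-average update of the chosen action's value.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]

def simulate(rounds=100_000, epsilon=0.1, seed=0):
    random.seed(seed)
    a1, a2 = EpsilonGreedyBandit(epsilon), EpsilonGreedyBandit(epsilon)
    mutual_coop = 0
    for _ in range(rounds):
        x, y = a1.act(), a2.act()
        a1.update(x, PAYOFF[(x, y)])   # each agent learns from its own payoff only
        a2.update(y, PAYOFF[(y, x)])
        mutual_coop += (x == 0 and y == 0)
    return mutual_coop / rounds

if __name__ == "__main__":
    print(f"fraction of rounds with mutual cooperation: {simulate():.3f}")

Running simulate() reports the long-run fraction of rounds with mutual cooperation under these assumed parameters; it is a harness for experimenting with the information constraints the abstract describes, not a reproduction of the paper's results.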
Recommended Citation
Douglas, Connor; Provost, Foster; and Sundararajan, Arun, "Naive Algorithmic Collusion: When Do Bandit Learners Cooperate and When Do They Compete?" (2024). ICIS 2024 Proceedings. 15.
https://aisel.aisnet.org/icis2024/aiinbus/aiinbus/15
Naive Algorithmic Collusion: When Do Bandit Learners Cooperate and When Do They Compete?
Comments
10-AI