An agent that wants to buy products from an e-market often encounters unknown suppliers. It must then choose between maximizing its expected utility with the known suppliers and trying to learn more about the unknown suppliers, since doing so may improve its future rewards. This issue is known as the trade-off between exploitation and exploration. In this research, we study how an agent should select suppliers in electronic markets with incomplete information. The agent has no prior knowledge about the suppliers, so it must learn about them by consuming their products, and its objective is to maximize total utility. We consider two scenarios. In the first, the agent selects a single supplier in each time period; we show that by using the Gittins index, the agent can achieve the optimal solution. In the second, the agent can select several suppliers in each time period; we propose four heuristic policies and evaluate them with a simulation tool that we built.
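For readers unfamiliar with the Gittins index, the single-supplier scenario can be sketched in the standard discounted multi-armed bandit form. The notation below is an assumption made for illustration and is not taken from this work: $\beta \in (0,1)$ is a discount factor, $x_i(t)$ is the agent's information state about supplier $i$ after its past purchases, $r_i(x)$ is the expected one-period utility of buying from supplier $i$ in state $x$, and $\tau$ is a stopping time counted in purchases from supplier $i$ alone.

```latex
% Objective (assumed discounted form): choose one supplier i_t per period.
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \beta^{t}\, r_{i_t}\!\big(x_{i_t}(t)\big) \right]

% Gittins index of supplier i in information state x (standard definition).
\nu_i(x) \;=\; \sup_{\tau \ge 1}
\frac{\mathbb{E}\!\left[ \sum_{t=0}^{\tau-1} \beta^{t}\, r_i\!\big(x_i(t)\big) \,\middle|\, x_i(0) = x \right]}
     {\mathbb{E}\!\left[ \sum_{t=0}^{\tau-1} \beta^{t} \,\middle|\, x_i(0) = x \right]}

% Index policy: in each period, buy from a supplier with the largest current index.
i_t \in \arg\max_{i} \; \nu_i\!\big(x_i(t)\big)
```

Under these assumptions, each index depends only on the state of its own supplier, so the agent can compare suppliers one at a time instead of solving a joint problem over all of them.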