Abstract

In his book Superintelligence, Nick Bostrom points to several ways in which the development of Artificial Intelligence (AI) might fail, turn out to be malignant, or even induce an existential catastrophe. He describes ‘Perverse Instantiations’ (PI) as cases in which an AI figures out how to satisfy some goal in unintended ways. For instance, an AI could attempt to paralyze human facial muscles into constant smiles to achieve the goal of making humans smile. According to Bostrom, cases like this ought to be avoided since they involve a violation of the human designers’ intentions. However, AI finding solutions that its designers have not yet thought of, and therefore could not have intended, is arguably one of the main reasons why we are so eager to apply it to a variety of problems. In this paper, I aim to show that the concept of PI is quite vague, mostly due to ambiguities surrounding the term ‘intention’. Ultimately, this text aims to serve as a starting point for further discussion of the research topic, the development of a research agenda, and future improvement of the terminology.
