Abstract

Recent advances in artificial intelligence have led to the creation of applications and commercial tools that generate images from text descriptions. These solutions rely on text-to-image models trained on large datasets, and similar technology is expected to soon enable users to generate videos from text prompts. These advancements, which process both text and visual data, have tremendous potential to enhance applications across many sectors. However, they are not without issues, and scholars are increasingly concerned about the ethical implications of these models, particularly how problems in the training data can skew or affect the images they generate. Several studies have explored biases, inaccuracies, and representation issues in broader AI models and datasets (Alsudais, 2021; Mehrabi et al., 2021). In one paper, the authors studied how NLP models perpetuate stereotypes and cause representational harms (Narayanan Venkit et al., 2023), defined as harms that “arise when a system (e.g., a search engine) represents some social groups in a less favorable light than others, demeans them, or fails to recognize their existence altogether” (Blodgett et al., 2020). Another study found that a large machine learning image dataset contained negative images labeled with specific nationalities (Alsudais, 2022). This study focuses on one area of interest: how text-to-image models generate images when specific nationalities are mentioned in the input prompts. The primary research question is whether certain nationalities are depicted negatively by text-to-image models, thereby causing representational harms. Initial investigations involved testing several text-to-image models using prompts such as “cooking a meal” or “playing soccer,” with a specific focus on nationality in the text input.
Early findings indicate that the images generated for some nationalities may unintentionally reinforce stereotypes or portray people in traditional attire that does not accurately represent their everyday activities. This tendency to default to stereotypes can be harmful and perpetuates outdated views. This research aims to shed new light on representational harms in text-to-image models, contributing to a broader understanding of bias in AI.