Communications of the Association for Information Systems

Author ORCID Identifier

Abdulkareem Alsudais: https://orcid.org/0000-0002-9961-6889

Abstract

This paper contributes to research on the ethics of using publicly available images and videos to train AI models by analyzing five prominent open research datasets containing images and videos collected from user-generated web content. The study investigates the current unavailability of these images and videos to understand the extent to which users remove or limit the visibility of their content. Such removal could indicate their opposition to the perpetual use of their images or videos in open datasets, current AI models, or the training of future models. The findings reveal that all five datasets contain a substantial number of items that are no longer accessible via their original URLs. Further, a longitudinal analysis over two and a half years reveals a statistically significant increase in this unavailability. The study identifies and categorizes the factors driving this unavailability, including account termination, content being made private by users, and items removed by platforms due to policy violations. The study shows that a significant portion of users may eventually choose to remove their content from the web. These findings add to AI ethics research by highlighting privacy and users' right to be forgotten in the context of publicly shared images and videos.
