Paper Number

ECIS2026-1673

Paper Type

SP

Abstract

This study addresses a central challenge in multimodal AI training data preparation: integrating privacy, copyright, and utility at the preprocessing stage. Existing approaches typically treat these regulatory dimensions in isolation, whereas our Compliance-Aware Data Pipeline (CAP) unifies them within a coherent technical artifact. Building on seven design requirements from the literature, the study develops a minimum viable product as a modular multi-agent pipeline that detects, transforms, and evaluates sensitive data towards compliance while preserving utility. An empirical evaluation using the Hateful Memes dataset confirms the expected compliance-utility trade-off, where higher compliance corresponds to reduced semantic proximity. Grounded in Design Science Research, the approach demonstrates the technical feasibility of proactive compliance and provides a foundation for further iterations incorporating additional modalities, human-in-the-loop mechanisms, and legal evaluation.

Jun 14th, 12:00 AM

A Design Science Research Approach Towards Compliance-Aware Multimodal AI Training Data Preparation: Integrating Privacy, Copyright, and Utility
