Abstract
In a recent talk, the CEO of Snowflake asserted that you cannot have an AI strategy without a data strategy. Drawing on our prior work, we aim to define AI Ready Data: data whose management is more standardized and whose production is automated and AI augmented. We propose three levels of data management, in increasing order: Foundational, Advanced, and AI Specific. Foundational data management covers concepts such as data lakes, data quality, data governance, data integration, data preparation, data catalogs, and data lineage tracing. The next level, advanced data management, improves on the foundational level but does not yet reach an AI ready stage. At this level, we propose that data lakehouses, data observability, active metadata, knowledge graphs, DataOps, data products, and a federated data architecture play a role. AI specific data management is the highest level, at which data is ready for the implementation of a robust AI strategy. At this level, we propose the following key characteristics: data labeling, synthetic data, data enrichment, data bias protocols, chunking/vector embedding, and prompt and feature engineering. Our work draws on the senior-level experience of one coauthor and uses design science (Hevner & Chatterjee, 2010) as a lens.
Recommended Citation
Nagpal, Pankaj and Alikhachkina, Elena, "Data Governance for AI" (2025). AMCIS 2025 TREOs. 184.
https://aisel.aisnet.org/treos_amcis2025/184
Comments
tpp1452