Abstract

As the volume of available data increases exponentially, traditional data warehouses struggle to transform this data into actionable knowledge. This study explores the potential of Hadoop as a data transformation tool within a traditional data warehouse environment. Hadoop’s distributed parallel execution model and horizontal scalability are well suited to cases where the amount of data to be processed requires the infrastructure to expand. By classifying the SQL statements responsible for the data transformation processes, we found that Hadoop and its distributed processing model deliver outstanding performance in the analytical layer, namely in the aggregation of large data sets. We demonstrate empirically the performance gains that Hadoop offers over a Relational Database Management System in speed, storage usage, and scalability potential, and suggest how these gains can be used to evolve data warehouses into the age of Big Data.

Recommended Citation

Dias, Henrique and Henriques, Roberto, "Augmenting data warehousing architectures with Hadoop" (2019). CAPSI 2019 Proceedings. 2.
https://aisel.aisnet.org/capsi2019/2
