Abstract

Interest in discovering information hidden in large amounts of data has exploded in the last decade, highlighting the need for efficient and effective tools to access all sources and kinds of data. At the same time, the need to secure and share valuable data has led to the development of new technologies, such as blockchain, that guarantee data integrity and transparency. Combining the two is a natural demand, but several issues become clear, such as inefficient data access and the need for data replication in common solutions. Indeed, the only existing approach is to emulate queries, mostly through Smart Contracts, and to apply traditional machine learning algorithms to the resulting data, which is stored externally to allow multiple accesses. In this paper, we present a systematic literature review that supports these conclusions. We then discuss a new system architecture for analyzing data stored in a blockchain, exploiting the scalability and high performance of data access in distributed file systems and the fast, up-to-date predictions of a streaming analysis approach.

Recommended Citation

Baptista, M.R., Mira da Silva, M., Rupino da Cunha, P., & Antunes, C. (2023). Data Analysis on Blockchain Distributed File Systems: Systematic Literature Review. In A. R. da Silva, M. M. da Silva, J. Estima, C. Barry, M. Lang, H. Linger, & C. Schneider (Eds.), Information Systems Development, Organizational Aspects and Societal Trends (ISD2023 Proceedings). Lisbon, Portugal: Instituto Superior Técnico. ISBN: 978-989-33-5509-1. https://doi.org/10.62036/ISD.2023.14

Paper Type

Full Paper

DOI

10.62036/ISD.2023.14
