Abstract

In big data problems the data usually are collected on many sites, have a huge volume, and new pieces of data are constantly generated. It is often impossible to collect all the data needed for a research project on one computer, and even impractical, since one computer would not be able to process it in a reasonable time. An appropriate data analysis algorithm should, working in parallel on many computers, extract from each set of raw data some intermediate compact “information”, gradually combine and update it, and finally, use the accumulated information to produce the result. When new data appears, it must extract information from them, add it to the accumulated one, and eventually update the result. We consider several examples of a suitable transformation of processing algorithms, discuss specific features of the emerging information spaces and, in particular, their algebraic properties. We also show that the information space often can be equipped with an order relation that reflects the "quality" of the information.

Recommended Citation

Golubtsov, Peter, "Algebra of Information in Big Data Processing" (2017). International Conference Information Systems 2017 Special Interest Group on Big Data Proceedings. 4.
https://aisel.aisnet.org/icis2017b/4

Download

COinS

International Conference Information Systems 2017 Special Interest Group on Big Data Proceedings

Algebra of Information in Big Data Processing

Abstract

Recommended Citation

Search

Links

Browse

Author Corner

International Conference Information Systems 2017 Special Interest Group on Big Data Proceedings

Algebra of Information in Big Data Processing

Authors

Abstract

Recommended Citation

Share

Search

Links

Browse

Author Corner