Abstract

Data integration is an important topic in the information age. Although structural aspects have been widely investigated, there is a lack of research on semantic discrepancies between data sources. Data integration should be able to handle input errors such as erroneous data and misspellings. Problems such as domain and data type mismatches, missing values, and duplicated records also need investigation. Object identification is essential for the task of integration, especially if keys are absent or incorrect. This approach uses properties that can be derived from the data sources and used for identification: the derivable attributes. Given two sources, the values of the derivable attributes of pairs of records are compared and classified. A random sample of pairs is used to detect similarities, rules, or classification criteria. Different statistical or data mining techniques can be applied to classify pairs of records from the two sources as linked or not linked.
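
The pairwise comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the record fields, the helper names (similarity, comparison_vector, classify_pair), the string-similarity measure, and the 0.85 threshold are all assumptions chosen for illustration, and the fixed threshold merely stands in for whatever statistical or data mining classifier would be learned from a sample of pairs.

```python
from difflib import SequenceMatcher
from itertools import product


def similarity(a, b):
    """String similarity in [0, 1]; a missing value on either side scores 0.0."""
    if a is None or b is None:
        return 0.0
    return SequenceMatcher(None, str(a).lower(), str(b).lower()).ratio()


def comparison_vector(rec_a, rec_b, attributes):
    """Compare two records attribute by attribute."""
    return [similarity(rec_a.get(attr), rec_b.get(attr)) for attr in attributes]


def classify_pair(vector, threshold=0.85):
    """Hypothetical rule: link the pair if the mean similarity reaches the threshold."""
    return "link" if sum(vector) / len(vector) >= threshold else "non-link"


# Two toy sources without a shared key (illustrative data only).
source_a = [{"name": "Hans Meier", "city": "Berlin", "birth_year": 1961}]
source_b = [
    {"name": "Hans Meyer", "city": "Berlin", "birth_year": 1961},
    {"name": "Anna Schmidt", "city": "Hamburg", "birth_year": 1975},
]
attributes = ["name", "city", "birth_year"]  # assumed derivable attributes

# Compare every record of source A with every record of source B.
results = [
    (a["name"], b["name"], classify_pair(comparison_vector(a, b, attributes)))
    for a, b in product(source_a, source_b)
]

for name_a, name_b, decision in results:
    print(f"{name_a} / {name_b}: {decision}")
```

In this toy data the misspelled pair "Hans Meier" / "Hans Meyer" is linked despite the absence of a key, while the unrelated pair is not. In practice the decision rule would be derived from a random sample of pairs rather than set by hand.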

Recommended Citation

Neiling, Mattis and Lenz, Hans-Joachim, "Data Integration by Means of Object Identification in Information Systems" (2000). ECIS 2000 Proceedings. 69.
https://aisel.aisnet.org/ecis2000/69


