PACIS 2016 Proceedings

OUTLIER DETECTION VIA MINIMUM SPANNING TREE

Xin Tang, Xi’an Jiaotong University
Wei Huang, Xi’an Jiaotong University
Xue Li, The University of Queensland
Shengli Li, Xi’an Jiaotong University
Yuewen Liu, Xi’an Jiaotong University

Abstract

In the big data era, the analysis of data sets is becoming more and more important, and the central concern is how to obtain valuable information from the data records. Most of the time, however, the data records contain outliers. Outliers can cause wrong information to be extracted from a data set, so detecting them can help us correct the extracted rules or obtain them more easily. In this paper, we combine distance-based and clustering-based outlier detection methods and use the theory of the minimum spanning tree and the standard normal distribution to define a new outlier detection method. At the same time, our algorithm can find the data records in a data set that deserve attention. The algorithm works in two phases. In the first phase, we build a minimum spanning tree over all data records and compute the average and the standard deviation of its edge weights. In the second phase, we use the distance from each data record to its $K$ nearest neighbours to discover the outliers. Experimental results show that our algorithm is more accurate and efficient.
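
The two phases described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the exact decision rule, so the threshold used here (mean MST edge weight plus `z` standard deviations, compared against the mean distance to the `k` nearest neighbours) is an assumption, as are the function names `mst_edge_weights` and `detect_outliers`, the Euclidean metric, and the default parameter values.

```python
import math
from statistics import mean, pstdev


def mst_edge_weights(points):
    """Phase 1 helper: Prim's algorithm on the complete Euclidean graph.

    Returns the n-1 edge weights of a minimum spanning tree.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge from the tree to each vertex
    best[0] = 0.0          # start the tree at vertex 0
    weights = []
    for step in range(n):
        # Pick the vertex outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:  # the root is not attached by an edge
            weights.append(best[u])
        # Relax the distances of the remaining vertices through u.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return weights


def detect_outliers(points, k=2, z=2.0):
    """Phase 2 (assumed rule): flag a record when the mean distance to its
    k nearest neighbours exceeds mean + z * std of the MST edge weights."""
    weights = mst_edge_weights(points)
    threshold = mean(weights) + z * pstdev(weights)  # assumed threshold
    outliers = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        if mean(dists[:k]) > threshold:
            outliers.append(i)
    return outliers


# A 3x3 unit grid plus one distant record at index 9.
points = [(x, y) for x in range(3) for y in range(3)] + [(10, 10)]
print(detect_outliers(points))  # -> [9]
```

The sketch recomputes all pairwise distances and therefore runs in quadratic time; it is meant only to make the two phases concrete, not to reflect the efficiency reported in the paper.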

Recommended Citation

Tang, Xin; Huang, Wei; Li, Xue; Li, Shengli; and Liu, Yuewen, "OUTLIER DETECTION VIA MINIMUM SPANNING TREE" (2016). PACIS 2016 Proceedings. 211.
https://aisel.aisnet.org/pacis2016/211
