Abstract

Social media websites generate a wealth of data every day, and this data is an essential component of the Big Data Revolution. In many cases, the data is anonymized before being disseminated for research and analysis. Anonymization distorts the data, so analysis methods that are not robust against such transformations may fail to capture some of its essential characteristics. In this paper we propose novel algorithms, for two-dimensional data, for computing a recently introduced statistical data analysis measure, the Ray Shooting Depth (RSD), which provides an affine-invariant ranking of data points. In addition, we prove some complexity results and illustrate some of the desirable properties of RSD through comparisons with other similar notions. We develop an open-source data visualization tool based on RSD and show its applications in distribution estimation, outlier detection, and 2D tolerance-region construction.
