Location

Online

Event Website

https://hicss.hawaii.edu/

Start Date

3-1-2023 12:00 AM

End Date

7-1-2023 12:00 AM

Description

Clustering is a common technique used to demonstrate relationships between data and information. Of recent interest is topological data analysis (TDA), which can represent and cluster data through persistent homology. The TDA algorithms used include the Topological Mode Analysis Tool (ToMATo) algorithm, Garin and Tauzin’s TDA Pipeline, and the Mapper algorithm. First, TDA is compared to ten other clustering algorithms on artificial 2D data, where it ranked third overall. TDA had the second-highest average accuracy (97.9%); however, its computation time ranked in the middle of the algorithms. TDA ranked fourth on the qualitative “visual trustworthiness” metric. On real-world data, TDA showed promising classification results (accuracy between 80% and 95%). Overall, this paper shows that TDA is competitive in clustering performance, though computationally expensive. When TDA is used for visualization, the Mapper algorithm allows for unique alternative views that are especially effective for visualizing high-dimensional data.
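To illustrate what clustering "through persistent homology" can mean in the simplest case, the following is a minimal sketch, not the paper's method and not the ToMATo, TDA Pipeline, or Mapper algorithms. It assumes a Vietoris-Rips-style filtration on a point cloud and tracks only 0-dimensional features (connected components): every point is born as its own component, and a component "dies" when an edge merges it into another. Stopping before the most persistent merges leaves the desired number of clusters, which is equivalent to single-linkage clustering. The names `h0_merges` and `cluster_by_persistence` are hypothetical and chosen only for this example.

```python
import math
from itertools import combinations


def _find(parent, i):
    """Return the root of i's component, compressing the path as we go."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def h0_merges(points):
    """Return the edges (length, i, j) that merge two components,
    in order of increasing length. Each length is the death time of
    one 0-dimensional homology class."""
    parent = list(range(len(points)))
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    merges = []
    for length, i, j in edges:
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:  # this edge joins two separate components
            parent[ri] = rj
            merges.append((length, i, j))
    return merges


def cluster_by_persistence(points, n_clusters):
    """Keep the n_clusters most persistent components by replaying
    only the earliest (shortest) merges."""
    n = len(points)
    parent = list(range(n))
    for _, i, j in h0_merges(points)[: n - n_clusters]:
        parent[_find(parent, i)] = _find(parent, j)
    label_of_root = {}
    # Label components 0, 1, 2, ... in order of first appearance.
    return [
        label_of_root.setdefault(_find(parent, i), len(label_of_root))
        for i in range(n)
    ]


# Two well-separated groups of three points each (made-up toy data).
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_labels = cluster_by_persistence(toy_points, 2)
print(toy_labels)  # [0, 0, 0, 1, 1, 1]
```

In this toy example the four short merges (length 1.0) die early, while the single merge that joins the two groups has a much larger death time; that gap is what a persistence-based method would read as two clusters. Density-aware methods such as ToMATo refine this idea, but that refinement is not shown here.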

Clustering and Topological Data Analysis: Comparison and Application

https://aisel.aisnet.org/hicss-56/da/big_data_and_analytics/4