ZADU: A Python Library for Evaluating the Reliability of Dimensionality Reduction Embeddings

Hyeon Jeon, Aeri Cho, Jinhwa Jang, Soohyun Lee, Jake Hyun, Hyung-Kwon Ko, Jaemin Jo, Jinwook Seo

Room: 104

2023-10-26T03:45:00Z
The UMAP embedding of the MNIST dataset (leftmost column), and two distortion visualizations generated by ZADUVis: CheckViz and the Reliability Map.
Keywords

Dimensionality reduction, Reliability, Visualization Library, Distortion measures, Distortion visualizations

Abstract

Dimensionality reduction (DR) techniques inherently distort the original structure of input high-dimensional data, producing imperfect low-dimensional embeddings. Diverse distortion measures have thus been proposed to evaluate the reliability of DR embeddings. However, implementing and executing distortion measures in practice has so far been time-consuming and tedious. To address this issue, we present ZADU, a Python library that provides distortion measures. ZADU is not only easy to install and execute but also enables comprehensive evaluation of DR embeddings through three key features. First, the library covers a wide range of distortion measures. Second, it automatically optimizes the execution of distortion measures, substantially reducing the running time required to execute multiple measures. Third, the library reports how individual points contribute to the overall distortion, facilitating detailed analysis of DR embeddings. By simulating a real-world scenario of optimizing DR embeddings, we verify that our optimization scheme substantially reduces the time required to execute distortion measures. Finally, as an application of ZADU, we present another library called ZADUVis that allows users to easily create distortion visualizations that depict the extent to which each region of an embedding suffers from distortions.
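
To make two of the ideas in the abstract concrete (sharing computation across measures, and reporting per-point contributions), the following is a minimal pure-Python sketch. It is not ZADU's API: the function names `rank_matrix` and `trust_and_continuity` are hypothetical, and the choice of trustworthiness and continuity as example measures is an assumption, since the abstract does not name specific measures. The sketch computes the neighbor rankings of the original and embedded data once, reuses them for both measures, and returns a local score per point whose mean equals the global score.

```python
import math
import random


def rank_matrix(points):
    """Return ranks[i][j]: the rank of point j among i's neighbors (1 = nearest)."""
    n = len(points)
    ranks = [[0] * n for _ in range(n)]
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        for position, j in enumerate(others, start=1):
            ranks[i][j] = position
    return ranks


def local_penalties(ranks_ref, ranks_other, k):
    """Per-point penalty for points that are k-nearest in `other` but not in `ref`."""
    n = len(ranks_ref)
    penalties = []
    for i in range(n):
        penalty = 0
        for j in range(n):
            if j != i and ranks_other[i][j] <= k and ranks_ref[i][j] > k:
                penalty += ranks_ref[i][j] - k
        penalties.append(penalty)
    return penalties


def trust_and_continuity(high, low, k):
    """Compute both measures from rank matrices that are built only once."""
    n = len(high)
    if len(low) != n:
        raise ValueError("high and low must contain the same number of points")
    if not 0 < k < n / 2:
        raise ValueError("k must satisfy 0 < k < n / 2")

    # Shared preprocessing: both measures reuse these two rank matrices.
    high_ranks = rank_matrix(high)
    low_ranks = rank_matrix(low)

    # Scales each point's worst-case penalty to 1, so local scores lie in [0, 1].
    scale = 2.0 / (k * (2 * n - 3 * k - 1))

    # Trustworthiness penalizes false neighbors (close only in the embedding).
    trust_local = [1.0 - scale * p for p in local_penalties(high_ranks, low_ranks, k)]
    # Continuity penalizes missing neighbors (close only in the original space).
    cont_local = [1.0 - scale * p for p in local_penalties(low_ranks, high_ranks, k)]

    return {
        "trustworthiness": sum(trust_local) / n,
        "continuity": sum(cont_local) / n,
        "local_trustworthiness": trust_local,
        "local_continuity": cont_local,
    }


if __name__ == "__main__":
    rng = random.Random(0)
    high = [[rng.random() for _ in range(5)] for _ in range(40)]
    projected = [p[:2] for p in high]  # crude "embedding": keep two coordinates
    result = trust_and_continuity(high, projected, k=5)
    print(result["trustworthiness"], result["continuity"])
```

The local score lists are the kind of per-point information a distortion visualization could color an embedding by. The quadratic-time ranking here is chosen for readability, not speed.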