DimLift: Interactive Hierarchical Data Exploration through Dimensional Bundling



Laura Garrison, Juliane Müller, Stefanie Schreiber, Steffen Oeltze-Jafra, Helwig Hauser, Stefan Bruckner


Presentation time: 2021-10-27T17:45:00Z (UTC)

Figure: DimLift is a mixed-initiative approach to creating and navigating dimensional bundles, defined as subsets of dimensions that contribute similarly to the overall variation of a dataset. Parallel coordinate axes (A) map to the first or second principal component (PC1, PC2) of a dimensional bundle. Glyphs (A1) provide feedback on bundle characteristics and composition. Interactions (A2) allow users to pan (D) through the dataset, swap axes between PC1 and PC2, drill down into a PC1 vs. PC2 score plot (B), or drill down further to the contained dimensions (C) and their relationships (C1). An overview panel (E) shows all dimensional bundles and unbundled dimensions.

Fast forward

Direct link to video on YouTube: https://youtu.be/uildnYPOsQ0

Keywords

Dimensionality reduction, interactive visual analysis, visual analytics, parallel coordinates

Abstract

The identification of interesting patterns and relationships is essential to exploratory data analysis. This becomes increasingly difficult in high-dimensional datasets. While dimensionality reduction techniques can be utilized to reduce the analysis space, these may unintentionally bury key dimensions within a larger grouping and obfuscate meaningful patterns. With this work we introduce DimLift, a novel visual analysis method for creating and interacting with dimensional bundles. Generated through either an iterative dimensionality reduction approach or a user-driven approach, dimensional bundles are expressive groups of dimensions that contribute similarly to the variance of a dataset. Interactive exploration and reconstruction methods via a layered parallel coordinates plot allow users to lift interesting and subtle relationships to the surface, even in complex scenarios with missing data and mixed data types. We exemplify the power of this technique in an expert case study on clinical cohort data alongside two additional case examples from nutrition and ecology.
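
To make the idea of a dimensional bundle more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes purely numeric, complete data and swaps in plain PCA on the correlation matrix; the function name `bundle_dimensions`, the `loading_threshold` parameter, and the greedy "bundle, remove, repeat" loop are all hypothetical choices made for illustration. On each pass, the dimensions that load strongly on PC1 of the remaining data are grouped into one bundle.

```python
import numpy as np


def bundle_dimensions(X, names, loading_threshold=0.5, min_size=2):
    """Greedy illustration of iterative bundling (hypothetical, PCA-based).

    Assumes numeric data without missing values or constant columns.
    Returns (bundles, unbundled), both as lists of dimension names.
    """
    X = np.asarray(X, dtype=float)
    remaining = list(range(X.shape[1]))
    bundles = []

    # PCA needs at least two dimensions to say anything about grouping.
    while len(remaining) >= max(min_size, 2):
        # Correlation matrix of the dimensions not yet bundled.
        corr = np.corrcoef(X[:, remaining], rowvar=False)
        # eigh returns eigenvalues in ascending order; the last is PC1.
        eigvals, eigvecs = np.linalg.eigh(corr)
        # Loading = correlation between each dimension and the PC1 scores.
        loadings = np.abs(eigvecs[:, -1]) * np.sqrt(eigvals[-1])

        members = [remaining[i]
                   for i in np.flatnonzero(loadings >= loading_threshold)]
        if len(members) < min_size:
            break  # no sufficiently large group remains

        bundles.append([names[i] for i in members])
        remaining = [i for i in remaining if i not in members]

    unbundled = [names[i] for i in remaining]
    return bundles, unbundled


# Synthetic demo: two latent factors drive five dimensions; "z" is noise.
rng = np.random.default_rng(0)
a, b = rng.normal(size=500), rng.normal(size=500)


def noise():
    return 0.1 * rng.normal(size=500)


X_demo = np.column_stack([a + noise(), a + noise(), -a + noise(),
                          b + noise(), b + noise(), rng.normal(size=500)])
names_demo = ["x1", "x2", "x3", "y1", "y2", "z"]
bundles, unbundled = bundle_dimensions(X_demo, names_demo)
print(bundles, unbundled)
```

In this toy setup, dimensions driven by the same latent factor should end up in the same bundle, while the independent dimension stays unbundled. Handling mixed or missing data, as the abstract describes, would need a different decomposition than the plain PCA assumed here.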