Visual Neural Decomposition to Explain Multivariate Data Sets

Johannes Knittel, Andrés Lalama, Steffen Koch, Thomas Ertl

View presentation: 2020-10-28T18:15:00Z
Exemplar figure
Investigating relationships between variables in multi-dimensional data sets is a common task for data analysts and engineers. However, with an increasing number of independent variables, this process may become cumbersome and time-consuming due to the many possible combinations that have to be explored. We propose a novel approach, scaling to hundreds of variables, for visualizing correlations between input variables and a target output variable. We developed a visual model based on neural networks that can be explored in a guided way to help analysts find and understand such correlations.
Fast forward

Direct link to video on YouTube: https://youtu.be/dD9wY6R_gt0

Keywords

Visual Analytics, Multivariate Data Analysis, Machine Learning

Abstract

Investigating relationships between variables in multi-dimensional data sets is a common task for data analysts and engineers. More specifically, it is often valuable to understand which ranges of which input variables lead to particular values of a given target variable. Unfortunately, with an increasing number of independent variables, this process may become cumbersome and time-consuming due to the many possible combinations that have to be explored. In this paper, we propose a novel approach, scaling to hundreds of variables, for visualizing correlations between input variables and a target output variable. We developed a visual model based on neural networks that can be explored in a guided way to help analysts find and understand such correlations. First, we train a neural network to predict the target from the input variables. Then, we visualize the inner workings of the resulting model to help understand relations within the data set. We further introduce a new regularization term for the backpropagation algorithm that encourages the neural network to learn representations that are easier to interpret visually. We apply our method to artificial and real-world data sets to show its utility.
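The abstract outlines a two-step pipeline: train a neural network to predict the target from the inputs, then visualize the trained model, with an extra regularization term during backpropagation to make the learned representation easier to interpret. The paper's specific regularizer is not stated on this page, so the sketch below is only a rough illustration of the general idea: it trains a one-hidden-layer regression network from scratch in NumPy and adds a stand-in L1 penalty on the input-to-hidden weights (a common way to encourage sparse, and thus more visually interpretable, input-neuron connections). All function and variable names here are hypothetical, not from the paper.

```python
import numpy as np

def train(X, y, hidden=4, lam=0.01, lr=0.1, epochs=500, seed=0):
    """Train a tiny tanh MLP by plain gradient descent on MSE.

    `lam` weights a stand-in L1 regularizer on W1 (the input weights),
    illustrating how an interpretability-oriented term can be folded
    into backpropagation; the paper's actual term is not given here.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.5, (d, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, (hidden, 1))
    b2 = np.zeros(1)
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)          # hidden activations, (n, hidden)
        pred = h @ W2 + b2                # linear output for regression
        err = pred - y                    # (n, 1)
        # backpropagate the mean-squared-error loss
        gW2 = h.T @ err / n
        gb2 = err.mean(axis=0)
        dh = (err @ W2.T) * (1.0 - h ** 2)
        # extra regularization term: subgradient of lam * |W1|
        gW1 = X.T @ dh / n + lam * np.sign(W1)
        gb1 = dh.mean(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return W1, b1, W2, b2

# usage: a toy data set where the target depends only on the first variable
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, :1]
W1, b1, W2, b2 = train(X, y)
```

After training, the rows of `W1` (one per input variable) are what a visualization of the model's "inner workings" would inspect; with the sparsity penalty, irrelevant inputs tend toward near-zero weights, which is easier to read off visually.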