Interactive Subspace Cluster Analysis Guided by Semantic Attribute Associations

Salman Mahmood, Klaus Mueller

Room: 104

2023-10-25T22:12:00Z
The Semantic Subspace Explorer learns the semantic associations among data attributes, exposing interesting data patterns. The interface consists of: the Control Panel, where users can determine the number of subspaces; the Semantic Space View, which visualizes the semantic space of all attributes; the Subspace Organizer, which shows an overview of the generated subspaces; and the Subspace View, which shows a user-selected subspace in more detail as a biplot.
Keywords

High-dimensional data; multivariate data; subspace clustering; subspace analysis; cluster analysis

Abstract

Multivariate datasets with many variables are increasingly common in many application areas. Most methods approach multivariate data from a single perspective. Subspace analysis techniques, on the other hand, provide the user with a set of subspaces that can be used to view the data from multiple perspectives. However, many subspace analysis methods produce a huge number of subspaces, many of which are redundant. The sheer number of subspaces can overwhelm analysts, making it difficult for them to find informative patterns in the data. In this paper, we propose a new paradigm that constructs semantically consistent subspaces. These subspaces can then be expanded into more general subspaces by way of conventional techniques. Our framework uses the labels and metadata of a dataset to learn the semantic meanings and associations of the attributes. We employ a neural network to learn a semantic word embedding of the attributes and then divide this attribute space into semantically consistent subspaces. The user is provided with a visual analytics interface that guides the analysis process. We show via various examples that these semantic subspaces can help organize the data and guide the user in finding interesting patterns in the dataset.
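
To make the idea of dividing an attribute embedding into subspaces more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the attribute embeddings have already been learned, and it uses a simple cosine-based k-means grouping as a stand-in, since the abstract does not specify the partitioning method. The attribute names, the 3-dimensional vectors, and the function `semantic_subspaces` are all hypothetical and exist only for illustration.

```python
import math

# Hypothetical, hand-made "embeddings" for six attribute labels.
# In the framework described above these would come from a learned
# word embedding; here they are toy values chosen for illustration.
toy_embeddings = {
    "engine_size":    [0.9, 0.1, 0.0],
    "horsepower":     [0.8, 0.2, 0.1],
    "price":          [0.1, 0.9, 0.1],
    "insurance_cost": [0.2, 0.8, 0.0],
    "length":         [0.0, 0.1, 0.9],
    "width":          [0.1, 0.0, 0.95],
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def semantic_subspaces(embeddings, k, iterations=20):
    """Group attribute names into at most k subspaces.

    Assumption: a k-means-style loop with cosine similarity stands in
    for whatever partitioning the actual system uses.
    """
    names = sorted(embeddings)
    # Deterministic farthest-point initialisation: start from the first
    # attribute, then repeatedly add the attribute least similar to the
    # centers chosen so far.
    centers = [embeddings[names[0]]]
    while len(centers) < k:
        farthest = min(
            names,
            key=lambda n: max(cosine(embeddings[n], c) for c in centers),
        )
        centers.append(embeddings[farthest])

    groups = []
    for _ in range(iterations):
        # Assign every attribute to its most similar center.
        groups = [[] for _ in range(k)]
        for n in names:
            best = max(range(k), key=lambda i: cosine(embeddings[n], centers[i]))
            groups[best].append(n)
        # Recompute each center as the mean of its members; an empty
        # group keeps its previous center.
        centers = [
            [sum(col) / len(g) for col in zip(*(embeddings[n] for n in g))]
            if g else c
            for g, c in zip(groups, centers)
        ]
    return [g for g in groups if g]


subspaces = semantic_subspaces(toy_embeddings, k=3)
for index, attributes in enumerate(subspaces):
    print(f"subspace {index}: {attributes}")
```

In this toy setting, `k` plays the role of the number of subspaces that a user would choose in the Control Panel; each resulting group of attributes could then be inspected on its own, for example as a biplot.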