Diagnosing Concept Drift with Visual Analytics

Weikai Yang, Zhen Li, Mengchen Liu, Yafeng Lu, Kelei Cao, Ross Maciejewski, Shixia Liu

Presentation: 2020-10-27T19:15:00Z
Exemplar figure
DriftVis, a visual analytics system for detecting, explaining, and correcting for concept drift. (a) The stream-level visualization consists of a line chart of drift degree (A), a feature selection list (B), and a streaming scatterplot (C) that visualizes the drift and the change in data distribution over time (e.g., density increases in G, H, and I). (b) The prediction-level visualization consists of a base learner view (D), a samples-of-interest view (E), and a performance view (F) for exploring the impact of drift adaptation on the model’s accuracy.
Fast forward

Direct link to video on YouTube: https://youtu.be/449t1pfeKq0

Keywords

Concept drift, streaming data, change detection, scatterplot, t-SNE.

Abstract

Concept drift is a phenomenon in which the distribution of a data stream changes over time in unforeseen ways, causing prediction models built on historical data to become inaccurate. While a variety of automated methods have been developed to identify when concept drift occurs, there is limited support for analysts who need to understand and correct their models when drift is detected. In this paper, we present a visual analytics method, DriftVis, to support model builders and analysts in the identification and correction of concept drift in streaming data. DriftVis combines a distribution-based drift detection method with a streaming scatterplot to support the analysis of drift caused by the distribution changes of data streams and to explore the impact of these changes on the model's accuracy. A quantitative experiment and two case studies on weather prediction and text classification have been conducted to demonstrate our proposed tool and illustrate how visual analytics can be used to support the detection, examination, and correction of concept drift.
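
To make the idea of a distribution-based drift degree concrete, here is a minimal sketch in Python. It is illustrative only and is not the DriftVis implementation: the abstract does not say which distance or windowing scheme the system uses, so the energy distance, the fixed reference window, and the names `energy_distance`, `drift_degree`, `reference_size`, and `window_size` are all assumptions made for this example. The synthetic stream is also invented for demonstration.

```python
import numpy as np


def energy_distance(x, y):
    """Energy distance between two samples of shape (n, d) and (m, d).

    Assumption: this distance is a stand-in chosen for the sketch; the
    abstract only says the detection method is distribution-based.
    """
    def mean_pairwise(a, b):
        # Mean Euclidean distance over all pairs (one row from a, one from b).
        diff = a[:, None, :] - b[None, :, :]
        return np.sqrt((diff ** 2).sum(axis=-1)).mean()

    return 2.0 * mean_pairwise(x, y) - mean_pairwise(x, x) - mean_pairwise(y, y)


def drift_degree(stream, reference_size, window_size):
    """Distance of each consecutive window to the initial reference window.

    Hypothetical helper: the first `reference_size` rows play the role of the
    historical data, and every later non-overlapping window is compared
    against them.
    """
    reference = stream[:reference_size]
    degrees = []
    for start in range(reference_size, len(stream) - window_size + 1, window_size):
        window = stream[start:start + window_size]
        degrees.append(energy_distance(reference, window))
    return np.array(degrees)


# Synthetic two-feature stream: 400 stable points, then 200 shifted points.
rng = np.random.default_rng(0)
stable = rng.normal(0.0, 1.0, size=(400, 2))
shifted = rng.normal(3.0, 1.0, size=(200, 2))
stream = np.vstack([stable, shifted])

degrees = drift_degree(stream, reference_size=200, window_size=100)
for index, degree in enumerate(degrees):
    print(f"window {index}: drift degree = {degree:.3f}")
```

In this sketch the first two windows come from the same distribution as the reference data, so their drift degree stays close to zero, while the last two windows are drawn from the shifted distribution and score much higher. Plotting `degrees` over time gives a curve in the spirit of the drift-degree line chart described in the figure caption.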