STULL: Unbiased Online Sampling for Visual Exploration of Large Spatiotemporal Data
Guizhen Wang, Jingjing Guo, Mingjie Tang, José Florencio de Queiroz, Calvin Yau, Anas Daghistani, Morteza Karimzadeh, Walid Aref, David Ebert
Presentation: Friday, October 30th, 2020 @ 15:00 GMT

Keywords
Geospatial Data, Data Management, Large-Scale Data Techniques, Visual Analytics
Abstract
Visual analytics supported by online sampling is increasingly important, as it allows users to explore large datasets with acceptable approximate answers at interactive rates. However, existing online spatiotemporal sampling techniques are often biased, because most research has focused primarily on reducing computational latency. Biased sampling approaches select data points with unequal probabilities and produce results that do not match the exact data distribution, leading end users to incorrect interpretations. In this paper, we propose a novel approach to unbiased online sampling of large spatiotemporal data. The proposed approach gives the same probability of selection to every point that satisfies the specifications of a user's multidimensional query. To achieve unbiased sampling for accurate, representative interactive visualizations, we design a novel data index and an associated sample retrieval plan. Our sampling approach is suitable for a wide variety of visual analytics tasks, e.g., tasks that run aggregate queries on spatiotemporal data. Extensive experiments confirm the superiority of our approach over a state-of-the-art spatial online sampling technique: within the same computational time, the data samples generated by our approach are at least 50% more accurate in representing the actual spatial distribution of the data, and they enable approximate visualizations that more closely resemble the exact ones.
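To make the unbiasedness property concrete, the following is a minimal sketch of what "the same probability of selection to every point that satisfies the query" means in practice. It is not the STULL index or its sample retrieval plan, which the abstract does not describe in detail. It swaps in plain reservoir sampling (Algorithm R) over a linear scan, and the names `Point`, `Query`, and `uniform_sample` are hypothetical, introduced only for illustration.

```python
import random
from typing import Iterable, List, NamedTuple, Optional


class Point(NamedTuple):
    """Hypothetical spatiotemporal record: 2D location plus timestamp."""
    x: float
    y: float
    t: float


class Query(NamedTuple):
    """Hypothetical multidimensional range query (inclusive bounds)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    t_min: float
    t_max: float

    def matches(self, p: Point) -> bool:
        return (self.x_min <= p.x <= self.x_max
                and self.y_min <= p.y <= self.y_max
                and self.t_min <= p.t <= self.t_max)


def uniform_sample(points: Iterable[Point], query: Query, k: int,
                   rng: Optional[random.Random] = None) -> List[Point]:
    """Reservoir sampling (Algorithm R) restricted to qualifying points.

    Every point that matches the query ends up in the result with the
    same probability, k / (number of matching points). This is a stand-in
    for illustration, not the paper's index-based method.
    """
    rng = rng or random.Random()
    reservoir: List[Point] = []
    seen = 0  # number of qualifying points encountered so far
    for p in points:
        if not query.matches(p):
            continue
        seen += 1
        if len(reservoir) < k:
            # Fill the reservoir with the first k qualifying points.
            reservoir.append(p)
        else:
            # Keep the new point with probability k / seen by drawing a
            # slot uniformly from [0, seen) and replacing only if it
            # falls inside the reservoir.
            j = rng.randrange(seen)
            if j < k:
                reservoir[j] = p
    return reservoir
```

A linear scan like this is unbiased but touches every record, so it would not meet interactive rates on large data; the abstract's contribution is an index and retrieval plan that keep the equal-probability guarantee without that cost.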