Net2Vis - A Visual Grammar for Automatically Generating Publication-Tailored CNN Architecture Visualizations



Alex Bäuerle, Christian van Onzenoodt, Timo Ropinski


Presentation: 2021-10-28T15:15:00Z

Exemplar figure: Title slide for a presentation called "Net2Vis - A Visual Grammar for Automatically Generating Publication-Tailored CNN Architecture Visualizations". The bottom left shows an image of the speaker and two links: viscom.net2vis.uni-ulm.de and a13x.io. The top right shows a logo depicting an eye and the author's affiliation: Visual Computing Group, Ulm University.

Fast forward

Direct link to video on YouTube: https://youtu.be/j8oUQ8EbyGw

Keywords

Neural networks, architecture visualization, graph layouting

Abstract

To convey neural network architectures in publications, appropriate visualizations are of great importance. While most current deep learning papers contain such visualizations, these are usually handcrafted just before publication, which results in the lack of a common visual grammar, significant time investment, errors, and ambiguities. Current automatic network visualization tools focus on debugging the network itself and are not ideal for generating publication visualizations. Therefore, we present an approach to automate this process by translating network architectures specified in Keras into visualizations that can be embedded directly into any publication. To do so, we propose a visual grammar for convolutional neural networks (CNNs), which has been derived from an analysis of such figures extracted from all ICCV and CVPR papers published between 2013 and 2019. The proposed grammar incorporates visual encoding, network layout, layer aggregation, and legend generation. We have further realized our approach in an online system available to the community, which we have evaluated through expert feedback and a quantitative study. It not only reduces the time needed to generate network visualizations for publications, but also enables a unified and unambiguous visualization design.
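
Two of the grammar components named in the abstract, layer aggregation and legend generation, can be illustrated with a minimal sketch. This is not the Net2Vis implementation: layers are represented as plain strings rather than Keras layer objects, and the names `PALETTE`, `build_legend`, and `aggregate`, as well as the colors, are hypothetical choices made for illustration only.

```python
from itertools import cycle

# Hypothetical palette; these are NOT the colors Net2Vis uses.
PALETTE = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd"]


def aggregate(layers, pattern, name):
    """Replace each non-overlapping occurrence of `pattern` with one named group."""
    pattern = list(pattern)
    n = len(pattern)
    out = []
    i = 0
    while i < len(layers):
        if n and layers[i:i + n] == pattern:
            out.append(name)  # collapse the matched run into a single glyph
            i += n
        else:
            out.append(layers[i])
            i += 1
    return out


def build_legend(layers):
    """Assign one color per distinct layer type, in order of first appearance."""
    legend = {}
    colors = cycle(PALETTE)  # reuse colors if there are more types than entries
    for layer_type in layers:
        if layer_type not in legend:
            legend[layer_type] = next(colors)
    return legend


# Assumed example architecture, written as a sequence of layer type names.
layers = ["Conv2D", "ReLU", "Conv2D", "ReLU", "MaxPooling2D", "Dense"]
grouped = aggregate(layers, ["Conv2D", "ReLU"], "ConvBlock")
legend = build_legend(grouped)
```

In this sketch, aggregation shortens the layer sequence before colors are assigned, so the legend describes the glyphs that would actually be drawn rather than every underlying layer type.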