Comparing Direct and Indirect Methods of Audio Quality Evaluation in Virtual Reality Scenes of Varying Complexity

Thomas Robotham, Olli S. Rummukainen, Miriam Kurz, Marie Eckert, Emanuël A. P. Habets

View presentation: 2022-10-20T20:00:00Z
Figure: Perceptual study focusing on evaluation methods in multi-modal interactive virtual environment settings with different levels of complexity.

Prerecorded Talk

The live footage of the talk, including the Q&A, can be viewed on the session page, VR Invited Talks.

Keywords

Multi-modal, virtual reality, 6-Degrees-of-freedom, audio quality, direct scaling, indirect scaling, evaluation methods

Abstract

Many quality evaluation methods are used to assess uni-modal audio or video content without considering perceptual, cognitive, and interactive aspects present in virtual reality (VR) settings. Consequently, little is known regarding the repercussions of the employed evaluation method, content, and subject behavior on the quality ratings in VR. This mixed between- and within-subjects study uses four subjective audio quality evaluation methods (viz. multiple-stimulus with and without reference for direct scaling, and rank-order elimination and pairwise comparison for indirect scaling) to investigate the contributing factors present in multi-modal 6-DoF VR on quality ratings of real-time audio rendering. For each between-subjects employed method, two sets of conditions in five VR scenes were evaluated within-subjects. The conditions targeted relevant attributes for binaural audio reproduction using scenes with various amounts of user interactivity. Our results show that all referenceless methods produce similar results using both condition sets. However, rank-order elimination proved to be the fastest method, required the least amount of repetitive motion, and yielded the highest discrimination between spatial conditions. Scene complexity was found to be a main effect within results, with behavioral and task load index results implying that more complex scenes and interactive aspects of 6-DoF VR can impede quality judgments.