
Information Visualization for Interactive Sound Recognition

Interactive machine learning holds significant potential for personalizing media recognition models, such as those for images and sounds, to individual users. However, GUIs (graphical user interfaces) for interactive machine learning have been studied mainly for images and text, and use cases targeting non-visual data such as sound remain underexplored. In this study, we envisioned a scenario in which users browse large amounts of sound data while labeling training samples for their target sound recognition classes, and we investigated visualization techniques that help users grasp the overall structure of those samples. We experimentally compared several techniques, ranging from sound spectrograms to deep-learning-based retrieval of images from sound. Based on the results, we discuss design guidelines for GUIs for interactive sound recognition that handle large volumes of sound data.
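As a concrete illustration of the spectrogram-based visualization mentioned above, the following is a minimal sketch (not code from the paper) that renders log-mel spectrogram thumbnails with librosa and matplotlib so that many sound clips can be skimmed side by side while labeling. The file names, sampling rate, and number of mel bands are illustrative assumptions, not values from the study.

import librosa
import librosa.display
import matplotlib.pyplot as plt

def spectrogram_thumbnail(path, ax, sr=22050, n_mels=64):
    """Draw a compact log-mel spectrogram of one audio file onto the given axes."""
    y, sr = librosa.load(path, sr=sr, mono=True)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=mel.max())  # log scale for readability
    librosa.display.specshow(mel_db, sr=sr, ax=ax)
    ax.set_axis_off()  # thumbnails: no axis ticks, just the time-frequency image

# Hypothetical list of unlabeled clips the user is browsing.
clips = ["clip_000.wav", "clip_001.wav", "clip_002.wav", "clip_003.wav"]
fig, axes = plt.subplots(1, len(clips), figsize=(3 * len(clips), 2))
for path, ax in zip(clips, axes):
    spectrogram_thumbnail(path, ax)
plt.tight_layout()
plt.show()

A grid of such thumbnails is one of the simpler options the study compares; its tradeoff is that spectrograms demand some expertise to read, which is part of what motivates the comparison with alternatives such as image retrieval from sound.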

Reference

  • Tatsuya Ishibashi, Yuri Nakao, Yusuke Sugano, “Investigating audio data visualization for interactive sound recognition”, in Proc. 25th International Conference on Intelligent User Interfaces (IUI 2020).

Contact: Inquiries / Participation

At xDiversity, we welcome inquiries from anyone who would like to participate in our research as a tester or user, and from businesses with data that could be used in our research. Please contact us via the form below.

Research collaborator recruitment form

Other Inquiries