Last week I was in Istanbul for ICMI ’14, the International Conference on Multimodal Interaction. ICMI is where signal processing and machine learning meet human-computer interaction, with the aim of finding ways to use and improve multimodal interaction.
Ask two people and you’ll get two different definitions of “multimodal interaction”. From my (HCI) perspective, it is interaction with technology using a variety of human capabilities, such as our perceptual abilities (like seeing, hearing and feeling) and motor control abilities (like speaking, gesturing and touching). In one of this year’s keynotes, Yvonne Rogers said we should design multimodal interfaces because we also experience the world using many modalities.
In this post I’m going to recap what I thought were the most interesting papers at the conference this year. There are also some photos of the sights, because why not?
Gesture Heatmaps: Understanding Gesture Performance with Colorful Visualizations
by Radu-Daniel Vatavu, Lisa Anthony and Jacob O. Wobbrock
Vatavu et al. presented a poster on Gesture Heatmaps, which are visualisations of how users perform touch-stroke gestures. The heatmaps encode characteristics of gesture articulation, such as stroke speed and distance error from a gesture template, summarising gesture performances at a glance. These could be used to identify problematic gestures or to understand which parts of a gesture users find difficult, for example. Something I particularly liked about this paper was the way they used these visualisations to create confusion matrices, showing where and why gestures were misclassified.
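To give a rough feel for the idea, here is my own minimal sketch (not the authors’ implementation) of one such visualisation: colouring a recorded stroke by its local speed, assuming gestures are logged as (x, y, timestamp) samples.

```python
# Minimal sketch of the general idea (not the Gesture Heatmaps implementation):
# colour each point along a recorded stroke by its local speed, showing where
# the user slowed down or sped up while articulating the gesture.
import numpy as np
import matplotlib.pyplot as plt

def plot_speed_heatmap(stroke):
    """stroke: (N, 3) array of (x, y, timestamp) samples for one gesture."""
    xy = stroke[:, :2]
    t = stroke[:, 2]
    # Local speed between consecutive samples (pixels per second).
    dists = np.linalg.norm(np.diff(xy, axis=0), axis=1)
    dts = np.clip(np.diff(t), 1e-6, None)
    speed = dists / dts
    # Colour each segment midpoint by its speed.
    mids = (xy[:-1] + xy[1:]) / 2
    sc = plt.scatter(mids[:, 0], mids[:, 1], c=speed, cmap="hot", s=20)
    plt.colorbar(sc, label="stroke speed (px/s)")
    plt.gca().invert_yaxis()  # screen coordinates: y grows downwards
    plt.title("Gesture heatmap: stroke speed")
    plt.show()

# Example: a fake circular gesture drawn at uneven speed.
theta = np.linspace(0, 2 * np.pi, 100)
ts = np.cumsum(0.01 + 0.02 * np.abs(np.sin(theta)))  # uneven sample timing
stroke = np.column_stack([100 + 50 * np.cos(theta), 100 + 50 * np.sin(theta), ts])
plot_speed_heatmap(stroke)
```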
CrossMotion: Fusing Device and Image Motion for User Identification, Tracking and Device Association
by Andrew D. Wilson and Hrvoje Benko
Wilson and Benko found that device acceleration (from accelerometers) is highly correlated with image acceleration (from a Kinect, in this case). This means that matching acceleration data from the two sources can identify which person in an image is carrying a particular device, even if the device itself isn’t visible (a phone in a pocket, for example). Some advantages of this approach are that users can be found in an image from their device movement alone (simplifying identification) and that devices can be identified and tracked even without a direct line of sight.
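A hedged sketch of how such a matching step might look (my own illustration under assumed data shapes, not Wilson and Benko’s implementation): correlate the magnitude of the device’s accelerometer signal with the image-derived acceleration of each tracked body, and associate the device with the best-matching body.

```python
# Sketch of the matching idea (assumed signal names and shapes, not the
# CrossMotion implementation): correlate device acceleration with the
# acceleration of each tracked body and pick the best match.
import numpy as np

def acceleration_magnitude(positions, dt):
    """positions: (T, 3) tracked positions of one body over time, sampled every dt seconds."""
    velocity = np.gradient(positions, dt, axis=0)
    accel = np.gradient(velocity, dt, axis=0)
    return np.linalg.norm(accel, axis=1)

def match_device_to_body(device_accel_mag, body_positions, dt):
    """
    device_accel_mag: (T,) accelerometer magnitude reported by the phone.
    body_positions: dict of body_id -> (T, 3) positions tracked in the image (e.g. from a Kinect).
    Returns the body_id whose image-derived acceleration correlates best with the device.
    """
    best_id, best_corr = None, -np.inf
    for body_id, positions in body_positions.items():
        image_accel = acceleration_magnitude(positions, dt)
        corr = np.corrcoef(device_accel_mag, image_accel)[0, 1]
        if corr > best_corr:
            best_id, best_corr = body_id, corr
    return best_id, best_corr
```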
SoundFLEX: Designing Audio to Guide Interactions with Shape-Retaining Deformable Interfaces
by Koray Tahiroğlu, Thomas Svedström, Valtteri Wikström, Simon Overstall, Johan Kildal and Teemu Ahmaniemi
Tahiroğlu et al. looked at how audio cues could be used to guide interactions with a deformable interface. They found that sound was an effective way of encouraging users to deform devices, and some of their designs were particularly effective at guiding users towards specific deformations. Based on these findings, they recommend using sound to help users discover deformations. Koray had a cool demo at the conference, which was the first time I’d tried a deformable device prototype. Pretty neat idea.
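To make the guidance idea concrete, here is a toy sketch of my own (not one of the SoundFLEX designs): map how close the current bend is to a target deformation onto the pitch of a short audio cue, so the sound rises as the user approaches the deformation. The sensor range and frequency mapping are assumptions for illustration.

```python
# Toy sketch of the guidance principle (assumed bend-sensor readings and
# frequency mapping, not the SoundFLEX designs): the cue's pitch rises as
# the user's bend gets closer to a target deformation.
import numpy as np

SAMPLE_RATE = 44100

def guidance_tone(bend_angle, target_angle, max_angle=90.0, duration=0.2):
    """Synthesise a cue whose pitch rises as bend_angle approaches target_angle (degrees).

    Returns (frequency, samples); the samples can be handed to any audio backend.
    """
    closeness = 1.0 - min(abs(target_angle - bend_angle) / max_angle, 1.0)
    freq = 220 + 660 * closeness  # 220 Hz far from the target, 880 Hz at it
    t = np.linspace(0, duration, int(SAMPLE_RATE * duration), endpoint=False)
    return freq, np.sin(2 * np.pi * freq * t)

# Example: cues for three bend readings approaching a (hypothetical) 60-degree fold.
for angle in (10, 40, 60):
    freq, _ = guidance_tone(angle, target_angle=60)
    print(f"bend {angle:>2} deg -> cue at {freq:.0f} Hz")
```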