Addressing In-Air Gesture Systems

Introduction

Users must address a system in order to interact with it: that is, they must discover where and how to direct input towards it. Sometimes this is simple. When using a touchscreen, users know to reach out and touch (how) the screen (where); when using a keyboard, they know to press (how) the keys (where). Addressing a system can be more complicated, however. With mid-air gestures, it is not always clear where users should perform gestures, nor how they should direct them towards the system. My PhD thesis [1] looked at the problem of addressing in-air gesture systems and investigated interaction techniques that help users do this.

Finding where to perform gestures

Users need to know where to perform gestures so that their actions can be sensed by the input device; if their movements cannot be sensed, their actions will have no effect on the system. Some gesture systems help users find where to gesture by showing them what the input device can see. For example, the images below show feedback that reveals what the input device can “see”. If users start gesturing outside of the sensor range, they know to move so that the input device can see them properly.

Not all systems are able to give this type of feedback. For example, if a system has no screen or only has a small display, then it may not be able to give such detailed feedback. My thesis investigated an interaction technique (sensor strength) that uses light, sound and vibration to help users find where to gesture. This means that systems do not need to show feedback on a screen. Sensor strength tells users how close their hand is to a “sweet spot” in the sensor space, which is a location where they can be seen easily by the sensors. If users are close to the sweet spot, they are more likely to be sensed correctly. If they are not close to it, their gestures may not be detected by the input sensors. My thesis [1] and my CHI 2016 paper [2] describe an evaluation of this technique. The results show that users could find the sweet spot with 51-80mm accuracy. In later work, we investigated how similar principles could be used in an ultrasound haptic interface [3].
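As a rough illustration of the idea (this is a hypothetical sketch, not the implementation from the thesis), the code below maps the distance between the tracked hand and a fixed sweet spot to a feedback level between 0 and 1, which could then drive light brightness, sound volume or vibration intensity. The sweet-spot coordinates and fall-off distance are made up for the example.

/* Hypothetical "sensor strength" sketch: 1.0 at the sweet spot,
 * falling linearly to 0.0 at MAX_DISTANCE. Coordinates are in the
 * sensor's own frame (millimetres) and are illustrative only. */
public final class SensorStrength {

    private static final double[] SWEET_SPOT = { 0.0, 200.0, 0.0 };
    private static final double MAX_DISTANCE = 300.0;

    /* Map the current hand position to a feedback level in [0, 1]. */
    public static double strength(double x, double y, double z) {
        double dx = x - SWEET_SPOT[0];
        double dy = y - SWEET_SPOT[1];
        double dz = z - SWEET_SPOT[2];
        double distance = Math.sqrt(dx * dx + dy * dy + dz * dz);
        return Math.max(0.0, 1.0 - distance / MAX_DISTANCE);
    }
}

The returned level could then be rescaled for each output modality, for example mapping it to LED brightness or vibrotactile amplitude.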

Discovering how to direct input

Users have to discover how to direct their gestures towards the system they want to interact with. This might not be necessary if there is only one gesture-sensing device in the room. However, in future there may be many devices which could all be detecting gestures at the same time; the image below illustrates this type of situation. Already, our mobile phones, televisions, game systems and laptops have the ability to sense gestures, so this problem may be closer than expected.

Users need to direct their input towards the intended device, since many sensor-based interfaces attempt to infer interactions within the same space.

If gestures are being sensed by many devices at the same time, users may unintentionally affect one (or more) of them. This is called the Midas Touch problem. Other researchers have developed clutching techniques to overcome the Midas Touch problem, although these are not always practical when more than one gesture system is active at a time. My thesis investigated an alternative interaction technique, rhythmic gestures, which lets users direct their input towards the one system they wish to interact with via spatial and temporal coincidence (motion correlation). This technique can be used by many systems at once with little interference.

Rhythmic gestures

Rhythmic gestures are simple hand movements that are repeated in time with an animation, shown to the user on a visual display. These gestures consist of a movement and an interval: for example, a user may move their hand repeatedly from side to side, every 500ms, or they may move their hand up and down, every 750ms. The image below shows how an interactive light display could be used to show users a side-to-side gesture movement. The animations could also be shown on the screen, if necessary. Users can perform a rhythmic gesture by following the animation, in time, with their hand.

An example of how a touchless gesture interface can convey the spatial and temporal characteristics of a rhythmic gesture, even if no screen is visible or available.

This interaction technique can be used to direct input. If every gesture-sensing system looks for a different rhythmic gesture, then users can use that gesture to show which system they want to interact with. This overcomes the Midas Touch problem, as it informs the other systems that input is not intended for them. My thesis [1] and CHI 2016 paper [2] describe evaluations of this interaction technique. I found that users could successfully perform rhythmic gestures, even without feedback about their hand movements.
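To make this concrete, here is a simplified, hypothetical sketch of how one system might check whether a stream of hand positions follows its own rhythm. The thesis technique uses motion correlation between the hand and an animation; this sketch only compares the time between direction reversals of the hand against the interval the system is listening for, and the class name, sample format and tolerance are all illustrative.

import java.util.ArrayDeque;
import java.util.Deque;

/* Hypothetical rhythm check: not the motion-correlation approach from
 * the thesis, just an illustration of matching movement timing to the
 * interval a particular system expects. */
public class RhythmMatcher {

    private final long targetIntervalMs;  // interval this system listens for
    private final long toleranceMs;       // allowed timing error
    private final Deque<Long> reversalTimes = new ArrayDeque<>();
    private double lastX = Double.NaN;
    private double lastDirection = 0.0;

    public RhythmMatcher(long targetIntervalMs, long toleranceMs) {
        this.targetIntervalMs = targetIntervalMs;
        this.toleranceMs = toleranceMs;
    }

    /* Feed in the hand's x position each frame; returns true once the
     * recent direction reversals match this system's rhythm. */
    public boolean onHandSample(double x, long timeMs) {
        if (!Double.isNaN(lastX)) {
            double direction = Math.signum(x - lastX);
            if (direction != 0.0 && direction != lastDirection) {
                reversalTimes.addLast(timeMs);  // hand changed direction
                if (reversalTimes.size() > 4) {
                    reversalTimes.removeFirst();
                }
                lastDirection = direction;
            }
        }
        lastX = x;
        return matchesRhythm();
    }

    private boolean matchesRhythm() {
        if (reversalTimes.size() < 4) {
            return false;
        }
        Long[] times = reversalTimes.toArray(new Long[0]);
        for (int i = 1; i < times.length; i++) {
            long interval = times[i] - times[i - 1];
            if (Math.abs(interval - targetIntervalMs) > toleranceMs) {
                return false;
            }
        }
        return true;
    }
}

A system listening for a side-to-side movement every 500 ms might construct new RhythmMatcher(500, 100) and feed it hand positions from its tracker; a neighbouring system listening for a different interval would ignore the same movement.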

Rhythmic micro-gestures

I extended the rhythmic gesture concept to use micro-gestures: very small movements of the hand, such as tapping the thumb against the side of the hand or opening and closing all of the fingers at once. These could be especially useful for interacting discreetly with mobile devices while in public or on the go. My ICMI 2017 paper [4] describes a user study of their performance, finding that users could perform them successfully with just audio feedback to convey the rhythm.

Videos

A video demonstration of the interaction techniques described here:

The 30-second preview video for the CHI 2016 paper:

Summary

To address an in-air gesture system, users need to know where to perform gestures and they need to know how to direct their input towards the system. My research investigated two techniques (sensor strength and rhythmic gestures) that help users solve these problems. I evaluated the techniques individually and found them successful. I also combined them, creating a single technique which shows users how to direct their input while also helping them find where to gesture. Evaluation of the combined technique found that performance was also good, with users rating the interaction as easy to use. Together, these techniques could be used to help users address in-air gesture systems.

References

[1] Interaction techniques with novel multimodal feedback for addressing gesture-sensing systems
E. Freeman.
PhD Thesis, University of Glasgow. 2016.

[2] Do That, There: An Interaction Technique for Addressing In-Air Gesture Systems
E. Freeman, S. Brewster, and V. Lantz.
In Proceedings of the 34th Annual ACM Conference on Human Factors in Computing Systems – CHI ’16, 2319-2331. 2016.

[3] HaptiGlow: Helping Users Position their Hands for Better Mid-Air Gestures and Ultrasound Haptic Feedback
E. Freeman, D. Vo, and S. Brewster.
In Proceedings of IEEE World Haptics Conference 2019, the 8th Joint Eurohaptics Conference and the IEEE Haptics Symposium, TP2A.09. 2019.

[4] Rhythmic Micro-Gestures: Discreet Interaction On-the-Go
E. Freeman, G. Griffiths, and S. Brewster.
In Proceedings of 19th ACM International Conference on Multimodal Interaction – ICMI ’17, 115-119. 2017.

Earcons and Tactons in Pure Data

In recent projects I’ve been using Pure Data to generate Earcons and Tactons; I’ve written more about generating Tactons with Pure Data here. I’m sharing these patches in the hope that they’ll be useful to someone – if you use or adapt them, I’d be very interested to hear what cool things you’re doing with them!

Everything is on github: https://github.com/efreeman/pd-earcons-tactons

Tactons for Selection Gestures


Source here. This patch generates the tactile feedback I used in the above-device tactile feedback project. Feedback was generated in Pure Data using commands sent over a TCP socket (localhost:34567; change port number in network-listener sub-patch). Looking into the network-listener patch shows the message protocol:

  • y: Enable output
  • z: Disable output
  • on: Currently does nothing (unused outlet)
  • off: Stops all output in progress
  • ramp_a: Ramp amplitude from 0 to 1, at 150 Hz over 1000 ms
  • ramp_a_exp: As above, with exponential increase rather than linear
  • ramp_f: Ramp frequency from 0 to 150 Hz over 1000 ms
  • ramp_rough: Modulate a 150 Hz wave with a wave whose frequency changes from 0 Hz to 75 Hz over 1000 ms
  • const: Produce a 150 Hz wave for 1000 ms
  • pulse: Produce a 150 Hz wave for 200 ms

We used 150 Hz for all sine waves as this was the resonant frequency of our actuator; change this value as needed. The 1000 ms duration can also be altered.

This is essentially a dumb patch that generates output on demand; it’s easier to have fine control over feedback in a more expressive programming language. Using sockets is a nice low-cost way of communicating between Pure Data and other applications – see the sketch below.
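For example, a small Java client could trigger these Tactons over the socket. This is a minimal sketch that assumes the network-listener sub-patch is built on Pd’s netreceive object, which expects semicolon-terminated (FUDI) messages on TCP port 34567; check the patch if the protocol differs.

import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;

/* Minimal sketch of a client for the patch above, assuming a netreceive
 * listening on TCP port 34567 that expects semicolon-terminated messages. */
public class TactonClient {
    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("localhost", 34567);
             Writer out = new OutputStreamWriter(socket.getOutputStream(), "UTF-8")) {
            out.write("y;\n");      // enable output
            out.write("pulse;\n");  // 150 Hz wave for 200 ms
            out.flush();
        }
    }
}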

Single Tone Generator


Source here. This patch generates an enveloped tone using parameters sent over a socket. As in my other patches, the TCP socket is on localhost port 34567; this can be changed in the network-listener sub-patch. Here I’ve used the left and right channel for different types of tone: audio in the left, vibrotactile in the right. The right channel is at a fixed frequency to drive an actuator, whereas the left channel can vary tone frequency. Audio and tactile feedback can be enabled/disabled on demand. Message protocol:

  • a0: Audio off
  • a1: Audio on
  • t0: Tactile off
  • t1: Tactile on
  • <attack> <decay> <freq>: Enveloped tone

Tone generation uses three parameters: attack time (in ms), decay time (in ms) and oscillating frequency (in Hz). Attack time specifies how long the tone takes to reach full amplitude. Decay time specifies how long the tone takes to return to zero amplitude. The tone is also sustained at full amplitude for the decay duration. Frequency specifies the audio frequency but not the tactile frequency, which is fixed.

As an example, “100 500 440” produces a 1100 ms-long tone at 440 Hz; it takes 100 ms to reach full amplitude, is sustained for 500 ms and then returns to zero amplitude after a further 500 ms. If you wish to further specify the sustain time, change the arguments passed into vline~ in the sineclick sub-patch.

This can be used to create a wide variety of sound effects with corresponding tactile patterns. Short envelope times produce “click” sounds, whereas longer envelope times produce much softer tones. Rhythmic patterns are best generated from a more expressive programming environment, as sequencing them dynamically in Pure Data would be cumbersome; a sketch follows below.
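As a sketch of what that could look like (with the same assumption as before, that the network-listener is a netreceive expecting semicolon-terminated messages on port 34567), the following sends three short enveloped tones 300 ms apart:

import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.Socket;

/* Hypothetical rhythm driver for the tone generator patch: three
 * click-like tones with a 300 ms inter-onset interval. */
public class RhythmExample {
    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("localhost", 34567);
             Writer out = new OutputStreamWriter(socket.getOutputStream(), "UTF-8")) {
            out.write("a1;\n");  // audio on
            out.write("t1;\n");  // tactile on
            out.flush();
            for (int i = 0; i < 3; i++) {
                out.write("10 50 440;\n");  // 10 ms attack, 50 ms decay, 440 Hz
                out.flush();
                Thread.sleep(300);          // inter-onset interval of the rhythm
            }
        }
    }
}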

Gestures

Hand gestures. Photo by Charles Haynes: CC BY-SA.

I want to make gesture interaction – interacting with computers through hand movements in mid-air – easier and more enjoyable to use. My PhD research focused on helping users address gesture systems, especially when those systems only have limited capabilities for providing feedback. I’ve written a lot about gestures and problems with gesture interaction, so this page brings that information together to give an overview of why gestures are difficult and how we might make them better.

Addressing Gesture Systems

When users want to interact with an in-air gesture system, they must first address it. This involves finding where to perform gestures, so that they can be sensed, and finding out how to direct input towards the system they want to interact with. During my PhD, I developed and evaluated interaction techniques for addressing in-air gesture systems. You can read more about this here. A related challenge is finding where to put your hands for mid-air interfaces, especially when mid-air haptics are used. I developed a system called HaptiGlow that helped users find a good hand position for mid-air input.

Above- and Around-Device Interaction

My research often looks at gestures in close proximity to small devices, either above (for example, gesturing over a phone on a table) or around (for example, gesturing behind a device you are holding with your other hand) those devices. I give an introduction to around-device interaction here, present some research and guidelines for above-device interaction with phones here, and discuss our work on above-device tactile feedback here. I also explain why we would want to use these types of gestures here.

Gestures Are Not “Natural”

In this post I outline three gesture interaction problems (the Midas Touch problem, the address problem and the sensing problem) and what implications these have for gesture interaction. In short, we should not think of gestures as being “natural” because there are many practical issues we must overcome to make them usable.

Novel Gesture Feedback

My PhD research looks at how we can move feedback about gestures off the screen and into the space around devices instead. I’ve written about tactile feedback for gestures here. I’ve also written about interactive light feedback, a novel type of display, for gestures here.

Gestures In Multimodal Interaction

Here I talk about two papers from 2014 where gestures are considered as part of multimodal interactions. While this idea was notably demonstrated in the 1980s, it still hasn’t reached mainstream computing. Perhaps this is about to change with new technologies.

Above-Device Tactile Feedback

Introduction

My PhD research looks at improving gesture interaction with small devices, like mobile phones, using multimodal feedback. One of the first things I looked at in my PhD was tactile feedback for above-device interfaces. Above-device interaction is gesture interaction over a device; for example, users could gesture at a phone on the table in front of them to dismiss unwanted interruptions, or gesture over a tablet on the kitchen counter to navigate a recipe. I look at above-device gesture interaction in more detail in my Mobile HCI ’14 poster paper [1], which gives a quick overview of some prior work on above-device interaction.

Tactile Feedback for Above-Device Interaction

In two studies, described in my ICMI ’14 paper [2], we looked at how above-device interfaces could give tactile feedback. Giving tactile feedback during gestures is a challenge because users don’t touch the device they are gesturing at; tactile feedback would go unnoticed unless users were holding the device while they gestured. We looked at ultrasound haptics and distal tactile feedback from wearables. In our studies, users interacted with a mobile phone interface (pictured above) which used a Leap Motion to track two selection gestures.

Gestures

An illustration of the count gesture. The user has extended four fingers on their right hand, thus selecting the fourth target.

Our studies looked at two selection gestures: Count (above) and Point (below). These gestures came from our user-designed gesture study [1]. With Count, users select from numbered targets by extending the appropriate number of fingers. When there are more than five targets, we partition the targets into groups, and users select a group by moving their hand. In the image above, the palm position is closest to the bottom half of the screen, so we activate the lower group of targets; if users moved their hand towards the upper half of the screen, we would activate the upper group of four targets. Users had to hold a Count gesture for 1000 ms to make a selection.

Illustration of the point gesture. A hand with an extended index finger is selecting one of the on-screen targets, using a circular cursor.

With Point, users controlled a cursor which was mapped to their finger position relative to the device. We used the space beside the device to avoid occluding the screen while gesturing. Users made selections by dwelling the cursor over a target for 1000 ms.

For a video demo of these gestures, see:

Tactile Feedback

In our first study we looked at different ways of giving tactile feedback. We compared feedback directly from the device when held, ultrasound haptics (using an array of ultrasound transducers, below) and distal feedback from wearable accessories. We used two wearable tactile feedback prototypes: a “watch” and a “ring” (vibrotactile actuators affixed to a watch strap and an adjustable velcro ring). We found that all were effective for giving feedback, although participants had divided preferences.

A photograph of an ultrasound haptics device.

Some preferred feedback directly from the phone because it was familiar, although holding the phone is unlikely in above-device interaction: an advantage of this modality is that users do not need to lift the phone or reach out to touch it in the first place. Some participants liked feedback from our ring prototype because it was close to the point of interaction (when using Point), and others preferred feedback from the watch (pictured below) because it was a more acceptable accessory than a vibrotactile ring. An advantage of ultrasound haptics is that users do not need to wear any accessories, and participants appreciated this, although the feedback was less noticeable than vibrotactile feedback. This was partly because of the small ultrasound array used (similar in size to a mobile phone) and partly because of the nature of ultrasound haptics.

Tactile Watch Prototype

In a second study we focused on feedback given on the wrist using our watch prototype. We were interested to see how tactile feedback affected interaction using our Point and Count gestures. We looked at three tactile feedback designs in addition to just visual feedback. Tactile feedback had no impact on performance (possibly because selection was too easy) although it had a significant positive effect on workload. Workload (measured using NASA-TLX) was significantly lower when dynamic tactile feedback was given. Users also preferred to receive tactile feedback to no tactile feedback.

A more detailed qualitative analysis and the results of both studies appear in our ICMI 2014 paper [2]. A position paper [3] from the CHI 2016 workshop on mid-air haptics and displays describes this work in the broader context of research towards more usable mid-air widgets.

Tactile Feedback Source Code

A Pure Data patch for generating our tactile feedback designs is available here.

References

[1] Towards Usable and Acceptable Above-Device Interactions
E. Freeman, S. Brewster, and V. Lantz.
In Mobile HCI ’14 Posters, 459-464. 2014.

[2] Tactile Feedback for Above-Device Gesture Interfaces: Adding Touch to Touchless Interactions
E. Freeman, S. Brewster, and V. Lantz.
In Proceedings of the International Conference on Multimodal Interaction – ICMI ’14, 419-426. 2014.

[3] Towards Mid-Air Haptic Widgets
E. Freeman, D. Vo, G. Wilson, G. Shakeri, and S. Brewster.
In CHI 2016 Workshop on Mid-Air Haptics and Displays: Systems for Un-instrumented Mid-Air Interactions. 2016.

ICMI ’14 Paper Accepted

My full paper, “Tactile Feedback for Above-Device Gesture Interfaces: Adding Touch to Touchless Interactions”, was accepted to ICMI 2014. It was also accepted for oral presentation rather than poster presentation, so I’m looking forward to that!

Tactile Feedback for Above-Device Interaction.

In this paper we looked at tactile feedback for above-device interaction with a mobile phone. We compared direct tactile feedback to distal tactile feedback from wearables (rings, smart-watches) and ultrasound haptic feedback. We also looked at different feedback designs and investigated the impact of tactile feedback on performance, workload and preference.

Array of Ultrasound Transducers for Ultrasound Haptic Feedback.

We found that tactile feedback had no impact on input performance but significantly reduced workload, making interaction feel easier. Users also significantly preferred tactile feedback to no tactile feedback. More details are in the paper [1], along with design recommendations for above- and around-device interface designers. I’ve written a bit more about this project here.

Video

The following video (including an awful typo in the last scene!) shows the two gestures we used in these studies.

References

[1] Tactile Feedback for Above-Device Gesture Interfaces: Adding Touch to Touchless Interactions
E. Freeman, S. Brewster, and V. Lantz.
In Proceedings of the International Conference on Multimodal Interaction – ICMI ’14, 419-426. 2014.

Creating Tactons in Pure Data

What are Tactons?

Tactons are “structured tactile messages” for communicating non-visual information. Structured patterns of vibration can be used to encode information, for example a quick buzz to tell me that I have a new email or a longer vibration to let me know that I have an incoming phone call.

Vibrotactile actuators are often used in HCI research to deliver Tactons as these provide a higher fidelity of feedback than the simple rotation motors used in mobile phones and videogame controllers. Sophisticated actuators allow us to change more vibrotactile parameters, providing more potential dimensions for Tacton design. Whereas my previous example used the duration of vibration to encode information (short = email, long = phone call), further information could also be encoded using a different vibrotactile parameter. Changing the “roughness” of the feedback could be used to indicate how important an email or phone call is, for example.

How do we create Tactons?

Now that we know what Tactons are and what they could be used for, how do we actually create them? How can we drive a vibrotactile actuator to produce different tactile sensations?

Linear and voice-coil actuators can be driven by providing a voltage but, rather than dabble in electronics, the HCI community typically uses audio signals to drive the actuator. A sine wave, for example, produces a smooth and continuous-feeling sensation. For more information on how audio signal parameters can be used to create different vibrotactile sensations, see [1], [2] and [3].

Tactons can be created statically, using an audio synthesiser or a sound editing program like Audacity to generate sine waves, or dynamically, using Pure Data. The rest of this post is a quick summary of the key Pure Data components I use when creating vibrotactile feedback in real time. With the components discussed, the following vibrotactile parameters can be manipulated: frequency, spatial location, amplitude, “roughness” (with amplitude modulation) and duration.

Tactons with Pure Data components

osc~ Generates a sine-wave. First inlet or argument can be used to set the frequency of the sine-wave, e.g. osc~ 250 creates a 250 Hz signal.

dac~ Audio output. First argument specifies the number of channels and each inlet is used to send an incoming signal to that channel, e.g. dac~ 4 creates a four-channel audio output. Driving different actuators with different audio channels can allow vibration to be encoded spatially.

*~ Multiply signal. Multiplies two signals to produce a single signal. Amplitude modulation (see [2] and [3] above) can be used to create different textures by multiplying two sine waves together: multiplying osc~ 250 with osc~ 30 creates quite a “rough” feeling texture (see the sketch after this list). Multiplication can also be used to change the amplitude of a signal: multiplying by 0 silences the signal, and multiplying by 0.5 reduces amplitude by 50%. Tactons can be turned on and off by multiplying the wave by 1 and 0, respectively.

delay Sends a bang after a delay. This can be used to provide precise timings for tacton design. To play a 300 ms vibration, for example, an incoming bang could send 1 to the right (multiplier) inlet of *~, enabling the tacton. Sending that same bang to delay 300 would produce a bang after 300 ms, which could then send 0 to the same inlet, ending the tacton.

phasor~ Creates a ramping waveform. Can be used to create sawtooth waves. This tutorial explains how this component can also be used to create square waveforms.
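If you want to see the same amplitude-modulation idea outside of Pure Data, the sketch below generates one second of a “rough” signal (a 250 Hz carrier multiplied by a 30 Hz modulator) in plain Java and plays it through the default audio output. It only illustrates the maths behind the *~ example above; it is not part of the patches discussed in these posts.

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.SourceDataLine;

/* Illustrative Java equivalent of [osc~ 250] multiplied by [osc~ 30]:
 * an amplitude-modulated sine wave that feels "rough" when played
 * through a vibrotactile actuator driven from an audio output. */
public class RoughTacton {
    public static void main(String[] args) throws Exception {
        float sampleRate = 44100f;
        int durationMs = 1000;
        double carrierHz = 250.0;   // near the actuator's resonant frequency
        double modulatorHz = 30.0;  // modulation creates the "rough" texture

        int samples = (int) (sampleRate * durationMs / 1000);
        byte[] buffer = new byte[samples * 2];  // 16-bit mono, little-endian
        for (int i = 0; i < samples; i++) {
            double t = i / sampleRate;
            double value = Math.sin(2 * Math.PI * carrierHz * t)
                         * Math.sin(2 * Math.PI * modulatorHz * t);
            short s = (short) (value * Short.MAX_VALUE * 0.8);
            buffer[2 * i] = (byte) (s & 0xff);
            buffer[2 * i + 1] = (byte) ((s >> 8) & 0xff);
        }

        AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);
        SourceDataLine line = AudioSystem.getSourceDataLine(format);
        line.open(format);
        line.start();
        line.write(buffer, 0, buffer.length);
        line.drain();
        line.close();
    }
}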

Multimodal Android Development Part 1

This post is the first of two which gives a brief introduction to creating multimodal interactions in Android applications. I’ll briefly cover some of the SDK features available to you as an Android developer which you can use to create richer interactions in your apps. Example code will be quite concise because I assume you have at least a basic knowledge of Android development. Feel free to leave any comments suggesting how I can better explain these concepts, or to let me know if I’ve made any mistakes or omissions.

What is “multimodal” interaction?

Multimodal interaction, put simply, is interaction involving more than one modality (e.g. multiple senses). For example, an application may provide a combination of visual and haptic (touch) feedback. These types of interaction design provide a number of benefits, for example allowing those with sensory impairment to interact using other senses, or allowing interaction in contexts where one sense may be otherwise occupied.

One of the most ubiquitous examples of a multimodal interaction is the way in which mobile phones combine visual, audible and haptic feedback to inform users of a new text, phone call, etc. This combination of modalities is particularly useful when your phone is, say, in your pocket. Obviously you can’t see the phone, but you will probably feel the phone vibrate or hear your ringtone as new notifications appear.

Haptic feedback in Android

Most handheld Android devices have some sort of rotation motor in them allowing simple haptic feedback. Although not common in tablets (largely due to size constraints), all modern Android phones will have tactile feedback available. You can control the phone vibrator through the Vibrator class. Note that in order to use this, your Manifest must request the following permission: android.permission.VIBRATE

/* Request the device's vibrator service. Remember to check
 * for null return value, in case this isn't available. */
Vibrator vibrator = (Vibrator) getSystemService(Context.VIBRATOR_SERVICE);

/* Two ways to control the vibrator:
 *  1. Turn on for a specific time
 *  2. Provide a vibration pattern */

/* 1. Vibrate for 200ms */
vibrator.vibrate(200);

/* 2. Vibrate for 200ms, pause for 100ms, vibrate for 300ms. */
long[] pattern = new long[] {0, 200, 100, 300};

/* Perform this pattern once only (repeat := -1). */
vibrator.vibrate(pattern, -1);

/* Vibrate for 200ms, followed by indefinite repeat of
 * 100ms pause followed by 300ms vibrate. Setting
 * repeat := 2 tells the vibrator to repeat at offset
 * 2 into the vibration pattern. */
vibrator.vibrate(pattern, 2);
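
/* An indefinitely repeating pattern keeps vibrating until it is
 * explicitly cancelled: */
vibrator.cancel();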


Touchscreen gestures

Using touchscreen gestures to interact with applications can be fun, efficient and useful when users are unable to select a particular action on the screen. For example, it can be difficult to select an on-screen button while running or walking; a touch gesture is a lot easier and requires less precision from the user. The disadvantage of touch gestures is that, if not used sparingly, there may be too many for the user to remember!

Creating a set of gestures for your application is simple: create a gesture library on an Android Virtual Device using the Gesture Builder application (available on the AVD by default) and add a GestureOverlayView to your activity layout. In your activity, you just have to load the gesture library from your resources and implement an OnGesturePerformedListener.


private GestureLibrary mLibrary;

public void onCreate(Bundle savedInstanceState) {
  ...
  /* 1. Load gesture library from the res/raw/gestures file */
  mLibrary = GestureLibraries.fromRawResource(this, R.raw.gestures);

  if (!mLibrary.load())
    /* Error: unable to load from resources! */
    ...

  /* 2. Find reference to the gesture overlay view */
  GestureOverlayView gov = (GestureOverlayView) findViewById(R.id.gestureOverlay);

  /* 3. Register callback for gesture input */
  gov.addOnGesturePerformedListener(this);
}

The callback method for gesture performance receives a Gesture as an argument. This can be used to obtain a list of predictions: the gestures in your library that Android thinks the input might have been. With these predictions, you can use the prediction score (or contextual information) to determine which gesture the user most likely performed. I find it useful to define a threshold for gesture acceptance, so that erroneous or inaccurate gestures can be rejected. The best way to choose this threshold value is through trial and error: see what works for you and your gestures.

private static final double ACCEPTANCE_THRESHOLD = 10.0;

public void onGesturePerformed(GestureOverlayView overlay, Gesture gesture) {
  /* 1. Get list of gesture predictions */
  ArrayList<Prediction> predictions = mLibrary.recognize(gesture);

  if (predictions.size() > 0) {
    /* 2. Find highest scoring prediction */
    Prediction bestPrediction = predictions.get(0);

    for (int i = 1; i < predictions.size(); i++) {
      Prediction p = predictions.get(i);
      if (p.score > bestPrediction.score)
        bestPrediction = p;
    }

    /* 3. Decide if we'll accept this gesture */
    if (bestPrediction.score > ACCEPTANCE_THRESHOLD)
      gestureAccepted(bestPrediction.name);
  }
}

private void gestureAccepted(String gestureName) {
  /* Respond appropriately to the gesture name */
  ...
}