
Challenge: Identifying Gesture Phases

What are the phases the gesture consists of?

Learning Objectives

Find out more about the second aspect of the prosodic dimension of gesture
Learn how to identify Gesture Phases
Explore the different movement phases of a gesture
Realize that one particular phase is carrying the overall gestural meaning
Discover the variety of phrasal structures that may exist
Learn how to annotate gesture phases in ELAN

In this challenge, we will:

Watch the explanatory video
Do a task on identifying Gesture Phases
Do a task on selecting the matching annotation for the video example
Watch an annotation tutorial
Practice annotating on our own

Explanatory Video: Identifying Gesture Phases

Let's watch the video to learn more about Gesture Phases! We recommend watching it in full screen to see the gestural movements in the examples better.

Transcript of the video

In this video we will learn about one component of the prosodic dimension of gestures called gesture phasing. First, we will define the Gesture Phases within the G-Unit. Then, we will explore a few examples together. Finally, we will describe the steps involved in annotating Gesture Phases.

The Prosodic Dimension of Gesture

According to M3D, gesture phasing belongs to the prosodic dimension of gesture. Remember that we use the term "prosodic dimension of gesture" to indicate the raw organizational structure of gestural movements, that is, how they are grouped together at various levels. Just as intonational phrases in speech prosody can be divided into smaller intermediate phrases, gestural movements can be divided into smaller units. In the Challenge about identifying Gesture Units, we described the G-Unit as the span of time from when the hands leave the rest position to when they return to it. Now, we want to explore what is happening within the G-Unit, during the execution of gestural movements. That is, we want to take that larger “chunk” of gestural movement and divide it up into smaller “chunks” of movement, called gesture phases.

What are gesture phases?

Remember, a gesture, as defined by Kendon (2004), is any visible bodily action when it is used as an utterance or as part of an utterance. Another way of putting it is that a gesture is all or part of a communicative act. A Gesture Unit must contain a minimum of one gesture, but often contains multiple gestures occurring one right after another. A single gesture can in turn be broken down into smaller Gesture Phases. Typically, gesture researchers have identified at least three different movement phases of gesture, namely the preparation, the stroke, and the recovery. Additionally, a fourth phase, the hold, may appear at any point within the stream of movement and refers to a pause in movement.

The stroke is the only obligatory part of a gesture and thus the only obligatory phase within a G-Unit. It is often the most kinematically salient movement, representing the highest effort, and it is said to carry the “meaning” of the gesture. In other words, the stroke largely corresponds to the part of the gesture that is acting as the utterance, as per our definition of gesture. It is important to remember that strokes do not have to be very large: even small, subtle movements, such as a small flick of the finger, should be considered gestures.

Given the central role of the stroke and its obligatory nature, we assume a one-to-one correspondence between strokes and gestures: one stroke corresponds to one gesture. If you want to do any quantitative analyses, such as calculating gesture rate or counting the number of gestures, you would want to use the stroke as the basis of measurement.
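
As a rough illustration of using the stroke as the basis of measurement, the following Python sketch counts gestures and computes a gesture rate from a list of phase annotations. The tuple layout (start and end in milliseconds, then a label), the lowercase label strings, and the function names are assumptions made for this example; they are not part of M3D or ELAN.

```python
from typing import List, Tuple

# One phase annotation: (start_ms, end_ms, label).
# This layout and the label strings are illustrative assumptions.
Phase = Tuple[int, int, str]


def count_gestures(phases: List[Phase]) -> int:
    """Count gestures as the number of phases labeled 'stroke'."""
    return sum(1 for _, _, label in phases if label == "stroke")


def gesture_rate_per_minute(phases: List[Phase], duration_ms: int) -> float:
    """Return strokes per minute over a stretch lasting duration_ms."""
    if duration_ms <= 0:
        raise ValueError("duration_ms must be positive")
    return count_gestures(phases) / (duration_ms / 60_000)


# A hypothetical G-Unit with two strokes separated by a hold.
example = [
    (0, 400, "preparation"),
    (400, 900, "stroke"),
    (900, 1300, "hold"),
    (1300, 1700, "stroke"),
    (1700, 2200, "recovery"),
]

print(count_gestures(example))                   # 2
print(gesture_rate_per_minute(example, 30_000))  # 4.0
```

Here the preparation, hold, and recovery phases do not add to the count: only the two strokes are counted as gestures.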

In addition to the stroke, a gesture may have a preparation phase, which includes any preparatory movement of the hands into position to execute a stroke. The recovery is then any movement from the stroke back to the rest position. Finally, holds may occur when the gesturing hand or hands show minimal movement, as if pausing before, during, or after any of the previously mentioned Gesture Phases. Importantly, holds are annotated when you perceive a momentary pause or minimal movement in the gesturing while playing the video at full speed. For example, if you do something like this, then this part would be the preparation, the part where my hand doesn't move would be the hold, the part where my hands carry out the kinematically salient movement would be the stroke, this part would again be a hold, and finally, the part where my hands move back to the rest position would be the recovery.

Examples of gesture phases within the G-Unit

Here are a few examples of G-Units which contain a single stroke, yet differ in the number of other phases.

Here is a Gesture Unit consisting of only a single stroke followed by a recovery [see example]. Here is another example where the G-Unit contains a preparation, a stroke, and a recovery [see example]. Finally, this G-Unit shows a preparation, a stroke, a hold, and a recovery [see example]. As we mentioned before, the only phase that is consistently present is the stroke.

However, gestural movements often are quite complex. As we said earlier, a single G-Unit may have multiple gestures occurring one after the other. For example, here is a single G-Unit that contains two strokes separated by a hold [see example]. Sometimes instead of being separated by a hold, the hands may execute a stroke, and immediately prepare for another stroke [see example]. In this example, three small strokes are each separated by a very small upward movement, which acts as a preparation to execute the downward stroke movement.

Sometimes gestural excursions can be long and quite complex. In such cases, it is important to determine whether the movements constitute a single, multidirectional stroke, subsequent strokes with no gesture phases separating them, or individual strokes separated by small gesture phases such as a preparation or hold. Again, this is where the labeler’s perception is key - you can ask yourself if one of the movement phases within this complex gestural excursion seems a bit slower or less kinematically salient than another, or if the two subsequent movements seem to be conveying two distinct meanings. Your answers may lead you to annotate them as separate stroke phases. For example, here the up/down movements really give the impression of three different gestures [see example]. However, if there does not seem to be a clear distinction in salience, yet we perceive two movements which convey meaning, then we may label them as two subsequent strokes, such as here: [see example]. If the movement is perceptually one meaningful “gesture”, labelers can label it as one long, multidirectional stroke, such as in this example [see example].

How to annotate gesture phases

So how do we annotate the phasing structure of hand movements? We really want to think of annotating gesture phasing as a holistic assessment: first finding the stroke, then identifying the other gesture phases that accompany it. Some may use a more “top-down” approach, finding movements that convey meaning and identifying them as strokes. Others may adopt a more “bottom-up” approach, identifying strokes on kinematics alone. M3D proposes a combination of the two, using two separate passes: a first pass without the audio, followed by a second pass with the audio on. The first pass makes use of kinematics to divide the stream of movement into different movement phases and to assign each an initial value, that is, a best estimate of whether the movement phase is a preparation, stroke, recovery, etc. The second pass, with speech, is where meaning is taken into account to validate and refine the labels that you assigned to those different movement phases.

For example, a movement phase that did not seem very salient may have been labeled as a preparation without speech. However, given the context in speech, it becomes clear that the movement phase is actually conveying meaning, and thus should be labeled as a stroke. Another example would be an up-down movement which may have been separated into two movement phases (preparation-stroke), but with the context given in the audio it becomes clear that both movements in fact convey one meaning, and thus those two segments should be combined into a single gesture phase. Alternatively, a biphasic movement may appear to be a single meaningful movement, and is thus labeled as a stroke without audio, but with audio it becomes clear that only one of the phases functions to convey meaning, and thus the annotation needs to be divided into two separate gesture phases. Following this two-pass procedure ensures a more holistic identification of gestures.
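
The three kinds of second-pass refinement described above (changing a label, combining two segments, and dividing one segment) can be sketched as operations on a list of phases. This is only a minimal Python illustration: the tuple layout (start and end in milliseconds, then a label), the label strings, and the function names are assumptions made for the example, not an M3D or ELAN interface.

```python
from typing import List, Tuple

# One phase annotation: (start_ms, end_ms, label). Illustrative layout only.
Phase = Tuple[int, int, str]


def relabel(phases: List[Phase], index: int, new_label: str) -> List[Phase]:
    """Change one label; the phase boundaries stay where they are."""
    start, end, _ = phases[index]
    return phases[:index] + [(start, end, new_label)] + phases[index + 1:]


def merge_with_next(phases: List[Phase], index: int, label: str) -> List[Phase]:
    """Delete the boundary between phase index and the phase after it."""
    start, _, _ = phases[index]
    _, end, _ = phases[index + 1]
    return phases[:index] + [(start, end, label)] + phases[index + 2:]


def split_phase(phases: List[Phase], index: int, at_ms: int,
                first_label: str, second_label: str) -> List[Phase]:
    """Insert a new boundary at at_ms inside phase index."""
    start, end, _ = phases[index]
    if not start < at_ms < end:
        raise ValueError("the new boundary must fall inside the phase")
    return (phases[:index]
            + [(start, at_ms, first_label), (at_ms, end, second_label)]
            + phases[index + 1:])


# Hypothetical first-pass result (no audio).
first_pass = [(0, 300, "preparation"), (300, 700, "stroke"), (700, 1100, "recovery")]

# With audio, the up-down movement turns out to convey one meaning.
print(merge_with_next(first_pass, 0, "stroke"))
# [(0, 700, 'stroke'), (700, 1100, 'recovery')]
```

Each function returns a new list and leaves the first-pass annotations unchanged, so the two passes can still be compared afterwards.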

Here are a few tips to help you when annotating phasing:

1.) Play the video at different speeds, from frame by frame to full speed, to really get a feel for the gestural movement.

2.) During the first pass without audio, you can make use of any form annotations you may have previously made to help guide you in determining phase onsets and offsets. For example, if you perceive a downward movement as a stroke, then you may annotate it so that the stroke onset coincides with the annotations in the “Trajectory direction” Form tier.

3.) Avoid bias from the speech signal. As we previously mentioned, the second pass is used mainly to refine your initial phase annotations. Though some refinement is necessary, labelers should stick to only changing labels, inserting additional phase boundaries, or deleting existing boundaries. Avoid shifting the existing phase boundaries, as this would introduce bias from, for example, pitch accentuation in speech.

4.) The entire span of the G-Unit should be divided and annotated as different gesture phases - there should be no gaps in your phasing annotations under a G-Unit annotation!

5.) When identifying strokes, try to think about the smallest “meaning-bearing unit” of the stream of movement - avoid labeling long stretches of movement as a single stroke, unless you truly perceive the whole movement as conveying a specific meaning. Try to identify potential movements that do not convey meaning and label them as other, non-stroke phases.
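
The "no gaps" rule in tip 4 can be checked mechanically once the annotations are available as time spans. The Python sketch below is one possible way to do this; the tuple layout (start and end in milliseconds, then a label) and the function name are assumptions made for this example, and reading the annotations from an ELAN file is not shown.

```python
from typing import List, Tuple

# One phase annotation: (start_ms, end_ms, label). Illustrative layout only.
Phase = Tuple[int, int, str]


def find_gaps(unit_start: int, unit_end: int,
              phases: List[Phase]) -> List[Tuple[int, int]]:
    """Return the spans inside the G-Unit that no phase annotation covers."""
    gaps = []
    cursor = unit_start  # end of the stretch covered so far
    for start, end, _ in sorted(phases):
        # Clip each phase to the G-Unit and skip phases lying outside it.
        start, end = max(start, unit_start), min(end, unit_end)
        if start >= end:
            continue
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < unit_end:
        gaps.append((cursor, unit_end))
    return gaps


# Hypothetical G-Unit from 0 to 1600 ms with two uncovered stretches.
phases = [(0, 400, "preparation"), (400, 900, "stroke"), (950, 1500, "recovery")]
print(find_gaps(0, 1600, phases))  # [(900, 950), (1500, 1600)]
```

An empty list means the whole G-Unit is covered by phase annotations.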

In this video, we’ve explored another aspect of the prosodic dimension of gesture, specifically the different Gesture Phases within a G-Unit. Remember: A Gesture Unit is made up of one or more Gesture Phases. The stroke is the only obligatory phase, as it conveys the meaning of the gesture. Optionally there may be a preparation phase when the hands move into position to execute the stroke, a hold when the hands pause before or after a stroke, or a recovery phase when the hand returns to rest. Annotating Gesture Phases takes place over two passes, the first without audio and the second with audio, where we refine our annotations. Thanks for watching!

Task 1: Recognize the Gesture Phases

Practice how to recognize the Gesture Phases within a Gesture Unit. Pay attention to the exercise question as it changes in the course of the task. Click on the "Let's go" button to start the task. The videos within the task might take a minute to load. If you have trouble accessing the task, please click here.

Duration: about 8 minutes

Task 2: Select the best Gesture Phase annotation

Click on the "Let's go" button to start the task. The videos within the task might take a minute to load. If you have trouble accessing the task, please click here.

Duration: about 13 minutes

This task requires basic knowledge of ELAN

Annotation Tutorial: Let's annotate together!

Watch this annotation tutorial to find out how to annotate Gesture Phases in ELAN and to prepare for the final task of this challenge.

Task 3: Now it's your turn!

This task requires knowledge of ELAN

For the same video as in the previous challenge, try annotating Gesture Phases on your own. Remember not to leave gaps between the annotations.

After annotating, click on the link below to download the solutions and compare them to yours. You'll also find the ELAN template and .mp4 files, in case you need them.

Click here to access the material

Useful Resources

Kendon, A. (2004). Gesture: Visible action as utterance. Cambridge University Press.

M3D Resources (Template, Manual, M3D-TED corpus): https://osf.io/ankdx/
