Main Page

Welcome to the GestureML Wiki. This wiki contains GML, knowledge base articles and tutorials designed for users and developers of Gesture Markup Language and the Gestureworks family of products. Gesture Markup Language (GML) is an XML-based multitouch gesture user interface language.

GML is an extensible markup language used to define gestures that describe interactive object behavior and the relationships between objects in an application. Gesture Markup Language has been designed to enhance the development of multiuser, multitouch and other HCI-device-driven applications.



GML provides tools for interaction developers to freely design unique, high-fidelity multitouch gesture interactions from a range of HCI input devices. A gesture can be as simple as a single tap or as complex as a series of detailed hand motion sequences that can lead to a gesture-based password or rich character behaviors.

Feature List

  • Custom multitouch gesture definition
  • Run-time gesture editing
  • Gesture action matching
  • Gesture property filtering
  • Gesture value boundaries
  • Gesture event mapping
  • Continuous and discrete gesture events
  • Concurrent parallel gesture support
  • Gesture set definition
  • Device specific gesture definition
  • Input specific gesture definition
  • Bi-manual gesture definition
  • Gesture sequence definition
  • Compound gesture definition
  • CML support

GestureML Overview

The declarative form of GML can be used to create complete, human-readable descriptions of multitouch gesture actions and to specify how events and commands are generated in an application layer. GML can be used in combination with Creative Markup Language (CML) to create rich, dynamically defined user experiences. When GML is used with a Gestureworks engine in combination with CML, objects can be dynamically constructed and managed along with well-defined, dynamic display properties and interactive behaviors.

Central to the design of the GML structure is the conceptual framework of Objects, Containers, Gestures and Manipulators (OCGM). Alongside OCGM, GML includes methods for defining Human Computer Interaction (HCI) design principles such as affordance and feedback. One of the primary goals of GML is to present a standard markup language for integrating a complete range of Natural User Interface (NUI) modes and models, which would allow for the creation of multiple discrete or blended user interfaces. GML can be used to construct gestures for a wide variety of input methods such as tangible objects, touch surfaces, body tracking, accelerometers, voice and brain-wave input. Combined with CML, GML has been designed to enable the development of the complete spectrum of post-WIMP NUIs (or RBIs) such as organic UIs, zoomable UIs, augmented reality, haptics, multiuser and full-range immersive multitouch environments.

GML has been developed as an open standard that can be used to rapidly create and share gestures for a wide variety of HCI devices. Presenting a user-friendly method for shaping complex interactions provides a cornerstone on which to build the next generation of dynamic, production-level HCI applications.

Current implementations of GML in the form of an external gesture engine (as in Gestureworks Core) present 300+ base gestures that can be integrated into an application layer using bindings (available in C++, C#, Java and Python). Because each gesture is defined in GML rather than in compiled code, this model allows an effectively unlimited number of gestures, each of which can be recast or refined after the related applications have been compiled and distributed. This approach puts interaction development directly into the hands of the UX designer and even allows independent end user management.

Examples of Use

From a UI development standpoint, multitouch gestures are relatively new, and in many cases best practice for UX development has remained closely linked to application type and to the available devices or modes. In order to effectively explore new UX paradigms, any complete gesture description must provide an inherent flexibility in the way gestural input is recognized and mapped within applications, but must also remain outside the compiled application. Loosely coupling gesture recognition to the application in this manner provides a standard method to dynamically define gestures. This model allows users to define equivalent gestures or variable gesture modes for different input types and device types without requiring further application-level development.

For example, as multitouch input devices continue to grow in size and to increase the number of supported touch points, touch screen UX is seeing a shift towards full hand multitouch and multi-user application spaces. Providing methods by which developers can create gestures that use either a two finger pinch to zoom or a five finger zoom will be an essential step in developing multitouch software.

GML Architectural Overview

Each gesture defined in the GML document uses a fully editable and extendable system that can be conceptually broken down into a four step process:

The first step is the definition of the gesture action. This definition is used to match the behavior of the input device and to trigger entry into the gesture pipeline. This can be as simple as defining the minimum number of touch points or as detailed as describing a vector path.

The second step is the assignment of the analysis module. Currently GML allows you to select an analysis module from the set of built-in compiled algorithms. However, the GML specification is also designed to accommodate custom code blocks that can be evaluated at run time and inserted directly into the gesture processing pipeline.

The third step is the establishment of post-processing filters. For example, values returned from the gesture analysis algorithm can be passed through a simple low pass filter, which helps smooth out high frequency noise that can appear as touch point “jitter”. This “noise filter” reduces the resulting wobble effect. In addition, the values returned from the noise filter can be fed into a secondary “inertial” filter that gives gestures the effect of inertial mass and friction, attributing pseudo-physical behavior to the touch objects associated with the gesture. In this way multiple cumulative filters can be applied to the gesture pipeline, in much the same way as multiple filters can be added to display objects in popular image editing apps.
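The filter chain described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Gestureworks implementation: the class names, the `alpha` smoothing factor and the `cutoff` value are assumptions made for the example, and exponential smoothing is used as one common form of low pass filter.

```python
class LowPassFilter:
    """Exponential smoothing, one common form of low pass filter (assumed here)."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha      # hypothetical smoothing factor in (0, 1]
        self.state = None

    def update(self, value):
        # The first sample seeds the filter; later samples are blended in.
        if self.state is None:
            self.state = value
        else:
            self.state = self.alpha * value + (1 - self.alpha) * self.state
        return self.state


class InertialFilter:
    """Keeps emitting a decaying delta after the touch input stops."""

    def __init__(self, friction=0.9, cutoff=0.01):
        self.friction = friction  # fraction of the delta kept per frame
        self.cutoff = cutoff      # hypothetical threshold below which motion stops
        self.velocity = 0.0

    def update(self, delta=None):
        if delta is not None:
            # Touch still active: follow the live delta.
            self.velocity = delta
        else:
            # Touch released: decay the last delta by the friction factor.
            self.velocity *= self.friction
            if abs(self.velocity) < self.cutoff:
                self.velocity = 0.0
        return self.velocity


# Chain the filters: smooth the noisy deltas first, then add inertia.
noise = LowPassFilter(alpha=0.5)
inertia = InertialFilter(friction=0.9)
for raw_dx in [10.0, 12.0, 9.0, 11.0]:
    dx = inertia.update(noise.update(raw_dx))
coasting = [inertia.update() for _ in range(3)]  # frames after release
```

Feeding the output of one filter into the next mirrors the cumulative filter model: each filter only sees a value and returns a value, so filters can be added or removed without touching the analysis step.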

The fourth and final step in defining a gesture using GML is a description of how to map returned values from analysis and processing directly to a defined touch object property or to a gesture event value for a gesture dispatched on the touch object.

With these four steps, GML can be used to define surface gestures by performing configured geometric analysis on clusters of points or on single touch points. The return values can then be processed and assigned to customizable display object properties. This can be done at runtime without re-compiling, which separates the gesture interactions from the application code and externalizes the scripting of touch UI/UX, enabling interaction designers to work alongside application developers.
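As a rough illustration of how the four steps fit together, the following Python sketch runs one frame of a drag gesture through match, analysis, filtering and mapping. All function names are hypothetical, the touch object is modelled as a plain dictionary, and the delta filter semantics (values outside the limits are dropped) are an assumption rather than documented engine behavior.

```python
def centroid(points):
    """Mean position of a cluster of (x, y) touch points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def matches(points, point_min=1, point_max=10):
    """Step 1: match on the number of touch points in the cluster."""
    return point_min <= len(points) <= point_max


def analyse_drag(prev_points, curr_points):
    """Step 2: a simple drag analysis, the displacement of the cluster centroid."""
    px, py = centroid(prev_points)
    cx, cy = centroid(curr_points)
    return {"drag_dx": cx - px, "drag_dy": cy - py}


def delta_filter(values, delta_min=0.5, delta_max=500.0):
    """Step 3: drop deltas outside [delta_min, delta_max] (assumed semantics)."""
    return {
        key: value if delta_min <= abs(value) <= delta_max else 0.0
        for key, value in values.items()
    }


def map_to_object(obj, values, mapping):
    """Step 4: add each returned value to the mapped object property."""
    for source, target in mapping.items():
        obj[target] += values[source]
    return obj


def process_frame(obj, prev_points, curr_points):
    """Run one frame of touch data through the four steps."""
    if not matches(curr_points) or len(prev_points) != len(curr_points):
        return obj
    values = delta_filter(analyse_drag(prev_points, curr_points))
    return map_to_object(obj, values, {"drag_dx": "x", "drag_dy": "y"})


touch_object = {"x": 100.0, "y": 50.0}
process_frame(touch_object, [(0, 0), (10, 0)], [(4, 3), (14, 3)])
```

Because the point limits, filter limits and property mapping are all plain data in this sketch, they could be read from a GML document at runtime instead of being compiled into the application.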

A single GML document can be used to define all gestures used in an application. These gestures can be divided into groups called gesture sets. Each gesture set consists of a series of defined gestures or “gesture objects” which can selectively be applied to any touch object defined in the CML or in the application code.

GML Example Syntax

<Gesture id="n-drag-inertia" type="drag">
    <match>
        <action>
            <initial>
                <cluster point_number="0" point_number_min="1" point_number_max="10"/>
            </initial>
        </action>
    </match>
    <analysis>
        <algorithm class="kinemetric" type="continuous">
            <library module="drag"/>
            <returns>
                <property id="drag_dx" result="dx"/>
                <property id="drag_dy" result="dy"/>
            </returns>
        </algorithm>
    </analysis>
    <processing>
        <inertial_filter>
            <property ref="drag_dx" active="true" friction="0.9"/>
            <property ref="drag_dy" active="true" friction="0.9"/>
        </inertial_filter>
        <delta_filter>
            <property ref="drag_dx" active="true" delta_min="0.5" delta_max="500"/>
            <property ref="drag_dy" active="true" delta_min="0.5" delta_max="500"/>
        </delta_filter>
    </processing>
    <mapping>
        <update dispatch_type="continuous">
            <gesture_event type="drag">
                <property ref="drag_dx" target="x"/>
                <property ref="drag_dy" target="y"/>
            </gesture_event>
        </update>
    </mapping>
</Gesture>
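Since GML is plain XML, a gesture definition can be inspected with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` module to pull the match limits, returned properties, filter settings and mapping out of an abridged copy of the example above (the delta filter is omitted for brevity). The `summarize_gesture` helper is hypothetical and is not part of any Gestureworks binding.

```python
import xml.etree.ElementTree as ET

GML = """
<Gesture id="n-drag-inertia" type="drag">
    <match>
        <action>
            <initial>
                <cluster point_number="0" point_number_min="1" point_number_max="10"/>
            </initial>
        </action>
    </match>
    <analysis>
        <algorithm class="kinemetric" type="continuous">
            <library module="drag"/>
            <returns>
                <property id="drag_dx" result="dx"/>
                <property id="drag_dy" result="dy"/>
            </returns>
        </algorithm>
    </analysis>
    <processing>
        <inertial_filter>
            <property ref="drag_dx" active="true" friction="0.9"/>
            <property ref="drag_dy" active="true" friction="0.9"/>
        </inertial_filter>
    </processing>
    <mapping>
        <update dispatch_type="continuous">
            <gesture_event type="drag">
                <property ref="drag_dx" target="x"/>
                <property ref="drag_dy" target="y"/>
            </gesture_event>
        </update>
    </mapping>
</Gesture>
"""


def summarize_gesture(gml_text):
    """Collect the main settings of one <Gesture> element into a dictionary."""
    root = ET.fromstring(gml_text.strip())
    cluster = root.find("match/action/initial/cluster")
    return {
        "id": root.get("id"),
        "points": (
            int(cluster.get("point_number_min")),
            int(cluster.get("point_number_max")),
        ),
        "module": root.find("analysis/algorithm/library").get("module"),
        "returns": [
            p.get("id") for p in root.findall("analysis/algorithm/returns/property")
        ],
        "friction": {
            p.get("ref"): float(p.get("friction"))
            for p in root.findall("processing/inertial_filter/property")
        },
        "mapping": {
            p.get("ref"): p.get("target")
            for p in root.findall("mapping/update/gesture_event/property")
        },
    }


summary = summarize_gesture(GML)
```

The `ref` attributes in the processing and mapping blocks point back to the `id` values declared under `<returns>`, which is how a value flows from the analysis step through the filters to a target property.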

Working with GML

The following list outlines the existing tools for working with GML. These methods can be used for editing, extending or constructing gestures using XML.

Editing Gestures in GML

Adding code comments to gestures
Editing point min and max number
Editing gesture event types
Customizing gesture dimensions
Editing hold event duration
Editing tap event translation max threshold
Editing double tap interevent duration
Editing flick gesture acceleration min value
Re-mapping gesture return values: the rotate-to-zoom gesture
Adding new strokes to the stroke library

Working with Gesture Filters in GML

Activating and adjusting delta limits
Activating and adjusting gesture boundaries
Using the "inertial filter" with the drag gesture
Using the "mean filter" with the rotate gesture
Using the "multiply filter" to create a fast-drag gesture

GML Example Index

There are over 300 ready-made gestures in the standard "my_gesture.gml" GML file that is distributed with Gestureworks products. This GML file documents standard common gestures that can be used as part of any application. Each and every gesture can be edited and extended in a wide variety of ways. The following index lists common gesture types and outlines the GML structures used to fully describe each gesture.

Simple GestureML Descriptions (surface touch gestures)

Simple N point drag gesture “n-drag”
One point drag gesture “1-finger-drag”
Two point drag gesture “2-finger-drag”
Three point drag gesture “3-finger-drag”
Four point drag gesture “4-finger-drag”
Five point drag gesture “5-finger-drag”

Simple N point rotate gesture “n-rotate”
Two point rotate gesture “2-finger-rotate”
Three point rotate gesture “3-finger-rotate”
Four point rotate gesture “4-finger-rotate”
Five point rotate gesture “5-finger-rotate”

Simple N point scale gesture “n-scale”
Two point scale gesture “2-finger-scale”
Three point scale gesture “3-finger-scale”
Four point scale gesture “4-finger-scale”
Five point scale gesture “5-finger-scale”

Compound N point drag, rotate and scale gesture “n-manipulate”

Simple N point tap gesture
3-point tap gesture

Double Tap
Simple N point double tap gesture
1 point double tap gesture

Triple Tap
Simple N point triple tap gesture
1 point triple tap gesture

Simple N point hold gesture
3 point hold gesture

Simple N point flick gesture “n-flick”

Simple N point swipe gesture “n-swipe”

Simple N point scroll gesture “n-scroll”

Three point tilt “3-finger-tilt”

Five point orient “5-finger-orient”

One point pivot “1-finger-pivot”

One point stroke letter “1-finger-stroke-letter”

Advanced GestureML Descriptions (surface gestures)

Gesture Filtering
N point drag gesture with inertial filter “n-drag-inertia”
N point scale gesture with inertial filter “n-scale-inertia”
N point rotate gesture with inertial filter “n-rotate-inertia”
One point pivot gesture with inertial filter “n-pivot-inertia”
N point "noise" filtered rotate gesture “n-rotate”
N point rotate gesture with inertial & "noise" filter “n-rotate”

Gesture Property Mapping
Simple mapping with target change
Linear mapping
Exponential mapping

Complex Gestures

Manipulation Gestures
Manipulation gestures are gestures that typically involve a direct transformation of a touch object based on a one-to-one mapping of cluster motion. These types of transformations require scale, rotation and translation information and processing, as well as specific matching criteria for each property. To accomplish this, a single gesture is created with a series of dimensions that can be treated independently and used to describe the change in each transformation property as an action is performed. This can be done using a single cluster analysis algorithm and a single gesture object, returning a single gesture event.

Creating consolidated gestures for complex manipulations
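To make the idea of independent dimensions in a single gesture event concrete, here is a minimal Python sketch that derives translation, scale and rotation deltas from a two point cluster across two frames. The function name and the returned keys are assumptions for illustration, and the geometry shown is a generic approach rather than the algorithm used by the built-in analysis modules.

```python
import math


def manipulate(prev, curr):
    """Return one event with translation, scale and rotation deltas.

    prev and curr are each a pair of (x, y) touch points for the same
    two fingers in consecutive frames.
    """
    (p0, p1), (c0, c1) = prev, curr

    # Translation: displacement of the midpoint between the two points.
    dx = (c0[0] + c1[0]) / 2 - (p0[0] + p1[0]) / 2
    dy = (c0[1] + c1[1]) / 2 - (p0[1] + p1[1]) / 2

    # Scale: ratio of the distances between the two points.
    prev_dist = math.dist(p0, p1)
    curr_dist = math.dist(c0, c1)
    dscale = curr_dist / prev_dist if prev_dist else 1.0

    # Rotation: change in the angle of the line joining the two points.
    prev_angle = math.atan2(p1[1] - p0[1], p1[0] - p0[0])
    curr_angle = math.atan2(c1[1] - c0[1], c1[0] - c0[0])
    dtheta = math.degrees(curr_angle - prev_angle)
    dtheta = (dtheta + 180) % 360 - 180  # wrap into [-180, 180)

    return {"dx": dx, "dy": dy, "dscale": dscale, "dtheta": dtheta}


event = manipulate(prev=((0, 0), (10, 0)), curr=((0, 0), (0, 20)))
```

Each key in the returned dictionary plays the role of one gesture dimension, so an application can consume all of them together or ignore the ones it does not need.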

Geometry Defined Gestures
Geometry defined gestures are gestures that can be characterized by the shape and size of a point cluster and by the relative position of the touch points within it.

The orientation gesture
The row gesture
The column gesture
The touch triangle gesture
The touch square gesture

Bi-manual Gestures

Bi-manual gestures are gestures whose actions "require" two hands in order to be completed efficiently. A good example of this is the split gesture. Although the split gesture can be performed using 2 to 10 touch points, it requires a critical separation between points in a cluster that typically exceeds the diameter of a hand. It is therefore best performed by placing fingers from both the left hand and the right hand on a touch object and then pulling the two hands apart. This has the effect of creating two discrete touch point clusters that are still associated with the touch object.

The split gesture
The view and position orientation gesture
The complementary rotate gesture
The complementary scale gesture
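The split condition described above (a separation between points that exceeds a critical distance) can be sketched as follows. The `split_cluster` function, the `separation_min` threshold of 250 units and the nearest-anchor grouping are all assumptions chosen for the example; they are not taken from the Gestureworks split algorithm.

```python
import math
from itertools import combinations


def split_cluster(points, separation_min=250.0):
    """Return two sub-clusters if the cluster is spread wide enough, else None.

    separation_min is a hypothetical threshold, roughly a hand width in pixels.
    """
    if len(points) < 2:
        return None

    # Find the two points that are farthest apart.
    a, b = max(combinations(points, 2), key=lambda pair: math.dist(*pair))
    if math.dist(a, b) < separation_min:
        return None

    # Assign every point to whichever of the two extremes is closer.
    first = [p for p in points if math.dist(p, a) <= math.dist(p, b)]
    second = [p for p in points if math.dist(p, a) > math.dist(p, b)]
    return first, second


two_hands = [(0, 0), (20, 10), (400, 0), (420, 15)]
one_hand = [(0, 0), (50, 0), (100, 20)]
split = split_cluster(two_hands)
no_split = split_cluster(one_hand)
```

Both resulting groups would remain associated with the same touch object, which matches the description of two discrete clusters on one object.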

Gesture Sequences

Gesture sequences are gestures defined by a set of sub gesture actions that are performed in series or in parallel to meet the matching criteria. Each gesture used in a sequence must be defined individually in the root GML document. Any gesture used as part of a sequence must then be explicitly referenced in the matching criteria of the sequence gesture.

Activation Gestures
Activation gestures are a set of gestures that use parallel gesture sequencing as matching criteria. The simplest type of activation gesture is the "hold" series, which requires "n" touch points to be locked on a location while a secondary gesture action is performed. These gestures are called activation gestures because they can be used effectively to activate new modes, behaviors or interactions in an application. Activation gestures are well suited to this task because they require a precise, sequence-matched action that is difficult to perform accidentally and unlikely to conflict with other common actions.

The hold-tap gesture
The hold-drag gesture
The hold-scale gesture
The hold-rotate gesture
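A hold-tap match of the kind described above can be sketched as a check that runs on each frame: some points are held still while a tap arrives from a different point. The dictionary layout of the touch points, the `hold_time_min` and `drift_max` thresholds and the function names are assumptions made for this example only.

```python
import math


def is_held(point, hold_time_min=0.5, drift_max=10.0):
    """A point counts as held if it has stayed down long enough without drifting."""
    return (
        point["duration"] >= hold_time_min
        and math.dist(point["start"], point["current"]) <= drift_max
    )


def hold_tap(points, tap_events, hold_point_number=1):
    """Match when hold_point_number points are held while another point taps."""
    held_ids = {p["id"] for p in points if is_held(p)}
    # Only taps from points that are not part of the hold count as the
    # secondary, parallel action.
    taps = [t for t in tap_events if t["id"] not in held_ids]
    if len(held_ids) == hold_point_number and taps:
        return {"type": "hold-tap", "held": sorted(held_ids), "tap": taps[0]["id"]}
    return None


points = [{"id": 1, "start": (100, 100), "current": (102, 101), "duration": 0.8}]
taps = [{"id": 2, "position": (300, 120)}]
event = hold_tap(points, taps)
```

Requiring both conditions at once is what makes the gesture hard to trigger by accident: a tap alone or a hold alone returns no event.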

Series Gestures
Series gestures are a set of gestures that use a series of gesture actions (discrete or continuous) as a gesture sequence to fulfill matching criteria. The first gesture in the sequence is used as the primary matching criterion. Once the first gesture in the sequence is completed, the second gesture algorithm is activated and used to determine whether the second gesture is present. If the second gesture action is matched, a gesture event is returned. This primary and secondary gesture sequence can be extended to include a chain of any number of gesture actions and types.

The tap sequence gesture
The scale-rotate gesture
The tap-rotate-tap gesture
The tap-drag-tap-drag-tap
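The series matching described above behaves like a small state machine: each recognised sub gesture either advances the sequence or resets it. The sketch below is one possible way to model this in Python. The `SeriesGesture` class and its `timeout` parameter (a maximum gap between steps) are assumptions, and the sub gestures are represented simply by their names.

```python
class SeriesGesture:
    """Matches sub gestures in order and reports when the whole chain is done."""

    def __init__(self, gesture_id, sequence, timeout=1.0):
        self.gesture_id = gesture_id
        self.sequence = list(sequence)
        self.timeout = timeout  # hypothetical max gap between steps, in seconds
        self.index = 0
        self.last_time = None

    def feed(self, gesture, timestamp):
        """Feed one recognised sub gesture; return an event when the chain completes."""
        if self.index > 0 and timestamp - self.last_time > self.timeout:
            self.index = 0  # too slow: start the sequence over
        if gesture == self.sequence[self.index]:
            self.index += 1
        else:
            # Wrong gesture: restart, letting it count as a new first step.
            self.index = 1 if gesture == self.sequence[0] else 0
        self.last_time = timestamp
        if self.index == len(self.sequence):
            self.index = 0
            return {"type": self.gesture_id, "time": timestamp}
        return None


matcher = SeriesGesture("tap-rotate-tap", ["tap", "rotate", "tap"])
results = [
    matcher.feed("tap", 0.0),
    matcher.feed("rotate", 0.4),
    matcher.feed("tap", 0.8),
]
```

Only the final step returns an event, which corresponds to the gesture event being dispatched once the last gesture action in the chain is matched.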

Augmented Gestures

Augmented gestures are common gestures that have been augmented with additional touch point properties such as "pressure" or point "width and height" requirements.

Pressure augmented gestures
Pressure augmented gestures are a set of gestures that have additional matching criteria and return values that use pressure data associated with touch points to augment the gesture properties. In most cases, pressure data associated with the touch point cluster is used to increase the fidelity of a gesture when using advanced pressure-sensitive multitouch input devices.

Accelerometer augmented gestures
Accelerometer augmented gestures are a set of gestures that have additional matching criteria that use a secondary source of action data from accelerometer input values. In most cases this comes from mobile devices which have accelerometers built in but it can also come from devices such as accelerometer gloves or wands.

Compound Gestures

Compound gestures are made from one or more discrete GML-defined gesture objects.

Multimodal Gestures
Multimodal gestures are compound gestures that are made up of multiple gestures that come from more than one input device or input mode. A good example is a gesture that uses both pen/stylus and touch input to create a gesture event.


  • Consistent model for understanding and describing gesture analysis
  • Separation of interactions and behaviors from content
  • Easy to read XML structure
  • A range of gestures can be defined for a single application
  • Allows for crowd-sourcing gesture development
  • Device agnostic
  • Input method agnostic
  • NUI + OCGM structure for developing flexible UX models
  • XML-based open standard, easy to post and share GML
  • Clear separation between touch input protocol and gesture definition
  • Simple method for describing a complete gesture library
  • Native transformation mapping
  • Ad-hoc blended interactions (cumulative transformations)
  • Manageable complexity (gesture block principle)

Proposed Expansion of Schema

  • Device based gesture set definitions
  • Map direct to system gesture commands
  • Map direct to key and mouse events
  • Upload user profiles, preferred interfaces
  • Direct UI/UX state integration
  • Direct algorithm scripting for gesture definitions (using JavaScript)
  • Direct gesture audio, haptic and visual feedback definitions
  • In-line support in CML

Related Resources

Frameworks & SDKs

Gestureworks Core: C++ framework for use with C++, C#.NET, Java and Python (Uses GML)
Gestureworks Flash: ActionScript3 framework for use with Flash and Air (Uses GML, CML and CSS)
OpenExhibits: ActionScript3 framework for use with Flash and Air (Uses GML, CML and CSS)


GestureKey: C++ based utility application that maps gestures to mouse events and keyboard shortcuts (Uses Gestureworks Core & GML)
