3D Object Manipulation

This tutorial will introduce you to using gesture and skeleton input to enrich the user interface of the games and applications you make with Unity. We will create a 3D cursor that is controlled by hand gestures. We will use this cursor to select objects in the scene and move them in 3D space.

This tutorial will take approximately 30 minutes to complete.

Download the Final Result

The final Unity project obtained in this tutorial can be found in our open-source samples repository on GitHub. After you clone the repository, follow these steps to run the application:

  1. Launch Unity and, in the Projects tab, select Open.
  2. Browse to the Unity\Tutorials\3D Object Manipulation directory within the cloned repository.
  3. Press the play button (or Ctrl+P) to run the scene.

Prerequisites

This tutorial assumes you have basic familiarity with the C# programming language and some experience with the Unity environment. We assume you know how to create Unity projects, scenes, game objects and scripts.

We recommend completing the Introduction tutorial before starting this one, because we assume you are familiar with the Project Prague toolkit for Unity prefabs discussed there.

Step 1 - Scene Preparation

  1. We start this tutorial with the scene obtained in the 3D Object Manipulation - Scene Preparation tutorial. This scene contains a mouse-controlled cursor that allows you to hover over objects, "grab" them and move them to a new location in space. In the current tutorial, we will replace the mouse with gestures and hand motion, so that your hand controls the cursor instead.

    Please either complete the 3D Object Manipulation - Scene Preparation tutorial, or follow these instructions to obtain its final product. Note that the 3D Object Manipulation - Scene Preparation tutorial does not cover any material related to the Project Prague toolkit for Unity and you should feel free to skip it.

  2. In Unity, load the final project created in the 3D Object Manipulation - Scene Preparation tutorial. Play the 3D Object Manipulation scene. Make sure you can grab and move objects using the mouse:

    Grabbing and moving instructions

    In step 3 we will learn how to drag an object (number 2 in the figure above) using the hand, and in step 4 we will learn how to move an object away from and towards the camera (number 3 in the figure above) using the hand.

Step 2 - Hand Cursor

  1. Add the GesturesManager and UIManager prefabs to the scene (from MicrosoftGesturesToolkit\Prefabs). We will use these prefabs to communicate with the Gestures Service. Refer to step 2 of the introduction tutorial for more details.

  2. To control the cursor with your hand, we first need access to the hand-skeleton information. The Gestures Service computes a hand-skeleton and communicates it to all subscribing clients on a frame-by-frame basis. The GesturesManager game object in our scene acts as a client of the Gestures Service. GesturesManager's RegisterToSkeleton() and UnregisterFromSkeleton() methods allow us to subscribe to and unsubscribe from the hand-skeleton stream.

    We would like to receive hand-skeleton information whenever the Cursor game object is active. Please add the following implementation for the OnEnable() and OnDisable() methods to the Cursor script:

    private void OnEnable()
    {
        // Register to skeleton events
        GesturesManager.Instance.RegisterToSkeleton();
    }
    
    private void OnDisable()
    {
        // Unregister from skeleton events
        GesturesManager.Instance.UnregisterFromSkeleton();
    }
    
  3. We will now extract the palm position, which arrives from the Gestures Service bundled with other skeleton information. We will use the palm position to compute the Cursor location on screen.

    The hand-skeleton is provided in units of millimeters, in the following left-handed coordinate system:

    Hand skeleton coordinate system

    We used a RealSense™ camera here to demonstrate the coordinate system axes. Note that the same coordinate system applies to all depth-cameras supported by Project Prague.

    Ideally, we would like the Main Camera in our scene to see your hand from the perspective of your eyes. If we can achieve this, the 3D cursor's projection onto the screen will follow your hand in a way that feels natural.

    To approximate the desired perspective, we will use the coefficients below to map the hand-skeleton (which is given in the depth-camera's view-space) to the 3D cursor (which we want to express in the Main Camera's view-space). Add these public members to Cursor.cs:

    [Tooltip("Scales the palm position vector to camera space.")]
    public Vector3 PalmUnitsScale = new Vector3(-.1f, .1f, -.1f);
    
    [Tooltip("Offsets the palm position vector in camera space.")]
    public Vector3 PalmUnitsOffset = new Vector3(0f, 0f, 70f); // if using Kinect, replace with: new Vector3(0f, 0f, 120f);
    

    Note that this mapping also performs a scale-down by a factor of 10, which in fact is a unit conversion from millimeters to centimeters. We do this because we want the dynamic range of the cursor position to be appropriate for the size of objects in our scene - on the order of magnitude of 1-10 Unity units.
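
    As a worked example of the mapping described above, the following sketch applies the same scale and offset to a made-up palm position. It is only an illustration: it uses System.Numerics.Vector3 as a stand-in for Unity's Vector3 so that it can run outside Unity, and the PalmMappingExample class and ToCameraSpace() method are hypothetical names that are not part of the tutorial project.

    ```csharp
    using System.Numerics;

    // Stand-in for the mapping used by the Cursor script, written against
    // System.Numerics so it can run outside Unity. In Unity, Vector3.Scale(a, b)
    // performs the component-wise multiplication written here as a * b.
    public static class PalmMappingExample
    {
        // Same values as PalmUnitsScale and PalmUnitsOffset in the tutorial.
        public static readonly Vector3 PalmUnitsScale = new Vector3(-0.1f, 0.1f, -0.1f);
        public static readonly Vector3 PalmUnitsOffset = new Vector3(0f, 0f, 70f);

        // Maps a palm position given in depth-camera space (millimeters)
        // to the Main Camera's view-space.
        public static Vector3 ToCameraSpace(Vector3 palmPositionMm)
        {
            return palmPositionMm * PalmUnitsScale + PalmUnitsOffset;
        }
    }
    ```

    For a hypothetical palm position of (100, -50, 400) millimeters, the scale gives (-10, -5, -40) and the offset then shifts the result to approximately (-10, -5, 30). The x and z components change sign, every component shrinks by a factor of 10, and only z is offset.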

    With this preparation, we are ready to compute the actual conversion of the palm position to a cursor position. Add the GetPalmCameraPosition() method to the Cursor script:

    private Vector3 GetPalmCameraPosition()
    {
        // Convert palm position from depth-camera space to Main-Camera space
        var skeleton = GesturesManager.Instance.SmoothDefaultSkeleton;
        if (skeleton == null)
        {
            return Vector3.zero;
        }
        return Vector3.Scale(skeleton.PalmPosition, PalmUnitsScale) + PalmUnitsOffset;
    }
    

    Note the use of the SmoothDefaultSkeleton property in GetPalmCameraPosition(). This property provides a smoothed version of the hand skeleton currently seen by the depth-camera. The smoothing is achieved by an averaging of the skeletons received over the last several frames. You can control the number of frames used for averaging - examine the GesturesManager in the Inspector window and modify the Smooth Moving Average Window Size field.
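
    The toolkit's own smoothing code is not shown in this tutorial. To illustrate the idea of averaging over a window of frames, here is a minimal sketch that assumes a plain, unweighted moving average of palm positions. PalmPositionSmoother is a hypothetical helper, not a toolkit class, and it uses System.Numerics.Vector3 so that it can run outside Unity.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Numerics;

    // Hypothetical helper, not part of the toolkit: an unweighted moving
    // average over the most recent palm positions.
    public sealed class PalmPositionSmoother
    {
        private readonly Queue<Vector3> _window = new Queue<Vector3>();
        private readonly int _windowSize;
        private Vector3 _sum = Vector3.Zero;

        public PalmPositionSmoother(int windowSize)
        {
            if (windowSize < 1) throw new ArgumentOutOfRangeException(nameof(windowSize));
            _windowSize = windowSize;
        }

        // Adds the newest sample and returns the mean of the samples
        // currently in the window (at most windowSize of them).
        public Vector3 Add(Vector3 sample)
        {
            _window.Enqueue(sample);
            _sum += sample;
            if (_window.Count > _windowSize)
            {
                // Drop the oldest sample once the window is full.
                _sum -= _window.Dequeue();
            }
            return _sum / _window.Count;
        }
    }
    ```

    With any averaging of this kind, a larger window suppresses more jitter but makes the cursor lag further behind the hand, so the window size is a trade-off between stability and responsiveness.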

    Note that the PalmPosition property of the skeleton corresponds to the location of the center of the hand:

    Palm position landmark

  4. Replace the GetCursorScreenPosition() method with the following:

    private Vector3 GetCursorScreenPosition()
    {
        // Replace mouse position with palm position.
        var palmCameraPosition = GetPalmCameraPosition();
        var palmWorldPosition = Camera.main.transform.TransformPoint(palmCameraPosition);
        var palmScreenPosition = (Vector2)Camera.main.WorldToScreenPoint(palmWorldPosition);
        return palmScreenPosition;
    }
    
  5. Make sure you have the Gestures Service running. Play the scene and bring either hand in front of the depth-camera. You should be able to control the cursor by moving your hand.

Step 3 - Grab and Move Object Using Gestures

We will now introduce a gesture and use it to trigger the cursor to enter and leave "grab mode". When in grab mode, the grabbed object follows the cursor (which follows your hand), allowing you to move it to a new location.

  1. In the Project window, locate the GestureTrigger prefab under MicrosoftGesturesToolkit\Prefabs. Drag and drop it to the Hierarchy window to create a new GestureTrigger game object in your scene.

  2. Examine the GestureTrigger game object in the Inspector window. Select the XAML Gesture radio button, expand the Gesture XAML section and paste in the following gesture definition:

    <Gesture Name="GrabReleaseGesture"
             xmlns="http://schemas.microsoft.com/gestures/2015/xaml">
        <Gesture.Segments>
            <IdleGestureSegment Name="Idle" />
            <HandPose Name="InitSpreadPose">
                <PalmPose Context="{AnyHand}" Direction="Forward|Down" />
                <FingerPose Context="Index, Middle, Ring, Pinky" Flexion="Open" />
            </HandPose>
            <HandPose Name="GrabPose">
                <PalmPose Context="{AnyHand}" />
                <FingerPose Context="Index, Middle, Ring, Pinky" Flexion="Folded" />
            </HandPose>
            <HandPose Name="FinalSpreadPose">
                <PalmPose Context="{AnyHand}" />
                <FingerPose Context="Index, Middle, Ring, Pinky" Flexion="Open" />
            </HandPose>
        </Gesture.Segments>
        <Gesture.SegmentsConnections>
            <SegmentConnections From="Idle" To="Idle, InitSpreadPose" />
            <SegmentConnections From="InitSpreadPose" To="GrabPose" />
            <SegmentConnections From="GrabPose" To="FinalSpreadPose" />
            <SegmentConnections From="FinalSpreadPose" To="Idle" />
        </Gesture.SegmentsConnections>
    </Gesture>
    

    When done, the GestureTrigger Inspector view should look like this:

    GrabReleaseGesture gesture definition

    Tip

    To generate a XAML representation of a gesture, create a C# gesture object and call its ToXaml() method. Visit our overview page to read about creating gestures in C#.

    The GrabReleaseGesture is made up of 3 poses as illustrated in the state-machine below:

    GrabReleaseGesture

    To learn more about the concept of a gesture as a state machine, please visit our overview page.

  3. We would like to use the GrabReleaseGesture in the following manner:

    • GrabPose detection will cause the cursor to enter grab mode, i.e., it should trigger StartGrab().
    • Idle detection will cause the cursor to leave grab mode, i.e., it should trigger StopGrab().

      Note

      The Idle state is the initial state in every gesture. Whenever the user either performs a gesture to completion or abandons a gesture in the middle of its execution, the state-machine falls back to the idle state.

      Examine the GestureTrigger game object in the Inspector window and press the Add Gesture Segment Event Button twice. This should generate two new UI (user interface) sections, Segment #1 and Segment #2.

      • In the Segment #1 drop down list, select the GrabPose (1), then click the + sign in the On Trigger () pane (2). Drag the Cursor object to the None (Object) box (3) and select the Cursor → StartGrab() method from the No Function drop-down list (4):

        GrabPose gesture trigger

      • In the Segment #2 drop down list, select the Idle (1), then click the + sign in the On Trigger () pane (2). Drag the Cursor object to the None (Object) box (3) and select the Cursor → StopGrab() method from the No Function drop-down list (4):

        Idle gesture trigger
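
    The behavior configured in this step can be pictured as a small state machine. The sketch below is a simplified model in plain C#, not the Gestures Service implementation. The GrabReleaseStateMachine class, its Observe() method and its SegmentEntered event are hypothetical names, and the rule that any observation without a matching connection sends the machine back to Idle is an assumption made for illustration.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Simplified model of the GrabReleaseGesture state machine.
    // This is an illustration only, not the Gestures Service implementation.
    public sealed class GrabReleaseStateMachine
    {
        // Allowed transitions, copied from the SegmentsConnections in the gesture XAML.
        private static readonly Dictionary<string, string[]> Connections =
            new Dictionary<string, string[]>
            {
                { "Idle", new[] { "Idle", "InitSpreadPose" } },
                { "InitSpreadPose", new[] { "GrabPose" } },
                { "GrabPose", new[] { "FinalSpreadPose" } },
                { "FinalSpreadPose", new[] { "Idle" } },
            };

        public string Current { get; private set; } = "Idle";

        // Raised when a segment is entered; stands in for the
        // On Trigger () events configured in the Inspector.
        public event Action<string> SegmentEntered;

        // Feeds the segment observed in the current frame (hypothetical input).
        // Assumption: an observation with no matching connection abandons
        // the gesture, so the machine falls back to Idle.
        public void Observe(string detected)
        {
            if (detected == Current) return; // still holding the same pose

            var allowed = Array.IndexOf(Connections[Current], detected) >= 0;
            var next = allowed ? detected : "Idle";
            if (next == Current) return;

            Current = next;
            SegmentEntered?.Invoke(next);
        }
    }
    ```

    In this model, a handler subscribed to SegmentEntered would call StartGrab() when the entered segment is GrabPose and StopGrab() when it is Idle, which mirrors the two Inspector entries above.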

  4. Run the scene. Test the feature we've added in this step:

    • Hover over an object with the cursor,
    • Grab it by clenching your hand into a fist,
    • Move the object to a new location,
    • Release the object by spreading your fingers apart.

Step 4 - Move Object Away or Towards Camera

In this step, we will enable the grabbed object to move in the radial direction as well, that is, towards or away from the camera.

  1. Add the following private member to Cursor.cs:

    private float _lastPalmDistance;
    

    This member needs to be initialized every time an object is grabbed. To do that, add the following line at the end of the StartGrab() method:

    _lastPalmDistance = GetPalmCameraPosition().magnitude;
    
  2. Replace the contents of the GetCursorDistanceCoefficient() method in Cursor.cs with the following:

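    private float GetCursorDistanceCoefficient()
    {
        // Ratio between the current and the previous palm distance.
        var currentPalmDistance = GetPalmCameraPosition().magnitude;
        if (currentPalmDistance <= Mathf.Epsilon || _lastPalmDistance <= Mathf.Epsilon)
        {
            // No valid skeleton (GetPalmCameraPosition() returned zero): leave the distance unchanged.
            return 1f;
        }
        var coefficient = currentPalmDistance / _lastPalmDistance;
        _lastPalmDistance = currentPalmDistance;
    
        return coefficient;
    }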
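
    To see why a per-frame ratio is enough, here is a minimal sketch in plain C#. It assumes that the caller multiplies the grabbed object's distance from the camera by the returned coefficient on every frame. That calling code belongs to the Scene Preparation tutorial and is not shown here, so RadialMoveExample and its members are hypothetical.

    ```csharp
    // Hypothetical stand-in for the grab logic in the Cursor script.
    public sealed class RadialMoveExample
    {
        private float _lastPalmDistance;

        // Distance of the grabbed object from the camera, in Unity units.
        public float ObjectDistance { get; private set; }

        public RadialMoveExample(float objectDistance, float palmDistanceAtGrab)
        {
            ObjectDistance = objectDistance;
            // Mirrors the line added at the end of StartGrab().
            _lastPalmDistance = palmDistanceAtGrab;
        }

        // Called once per frame with the current palm distance.
        // Assumption: the object's distance is scaled by the per-frame coefficient.
        public void Update(float currentPalmDistance)
        {
            var coefficient = currentPalmDistance / _lastPalmDistance;
            _lastPalmDistance = currentPalmDistance;
            ObjectDistance *= coefficient;
        }
    }
    ```

    Suppose the object starts at a distance of 10 and the palm is at a distance of 30 when the grab begins. If the palm then moves to 33 and to 36 on the next two frames, the coefficients are 1.1 and about 1.09, and the object ends up at a distance of about 12. The per-frame ratios multiply out to 36 / 30, so the object's distance always stays proportional to the palm's distance since the grab started.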
    
  3. Try running the scene. Grab an object and move your hand towards or away from the depth-camera. The object in the scene should follow your hand, moving towards or away from the Main Camera accordingly.

    Tip

    Don't hold your hand too close to the depth-camera. The camera has a frustum-shaped field of view, so as you bring your hand closer, the detectable area becomes smaller, leaving you with less range of motion to manipulate objects in the scene.