How to get raw (Euclidean) distance from Azure Kinect

Anonymous
2021-09-09T06:11:09.443+00:00

I wish to know if I can get the raw (Euclidean) distance to objects from the Azure Kinect. The Azure Kinect depth data gives the z-coordinate of each point, not the straight-line distance between the camera and that point. Is there a way to get the distance directly from the Azure Kinect, instead of having to calculate it from the 3D coordinates?

Azure Kinect DK
A Microsoft developer kit and peripheral device with advanced artificial intelligence sensors for sophisticated computer vision and speech models.

Accepted answer
  1. JAMES MORGENSTERN 176 Reputation points
    2021-09-17T16:22:30.58+00:00

    @QuantumCache @Quentin Miller Sorry, you are still wrong. You are not rigorously solving the geometry; you settle for approximations. The stupid part, of course, is that the Kinect does measure the range, but the range then gets transformed into depth and Microsoft refuses to give you the range. Part of that issue is that the actual range measurements are ambiguous and the errors are proportional to the true range. So Microsoft transforms the raw data to give you a depth image to go with the color values. The key point is that the inverse transform really is not available to you.

    Consider how to go from Kinect 2D data to the 3D local coordinate system: take the unit vector normal to the camera plane from the camera origin, rotate it for the angular change in x [the lens is angular!], rotate it again for the change in y, and then multiply by the range to get the pixel coordinates in 3D. But the Kinect won't give you the 3D coordinates, just the Z value [depth].

    So you say to take the difference between the x pixel coordinate and the x coordinate of the middle of the image for your Euclidean distance, but that is a fallacy. You do NOT know the actual 3D distance between the projection of the pixel at (x,y) and the projection of the image origin (0,0), and you assume that they are coplanar, which is not necessarily true. Do the rigorous math and you will see the errors. At this point one has to live with depth information, as range is unavailable.


3 additional answers

  1. Quentin Miller 351 Reputation points
    2021-09-17T22:21:13.273+00:00

    I passed this on to the expert (who is no longer working on Azure Kinect but kindly responded). Hope this helps.

    The Z depth is not measured from the “camera focal point” you mentioned, nor from the “XY plane’s pixel at (X,Y)” that JAMESMORGENSTERN-0766 mentioned. The Z depth is the distance from the camera center origin to the object, measured along the Z axis.

    The mathematics relating the radial depth (aka range, aka the Euclidean distance from the camera center origin to the object) to the X, Y, Z Cartesian coordinates is as follows:

    1. The depth engine computes the radial depth, R, internally.
    2. Using the depth camera intrinsics, you unproject each pixel (u, v) from normalized image space to a ray vector (x, y, 1).
    3. Normalize the ray vector (x, y, 1) to a unit vector so its length is 1. Then Z/R = 1 / sqrt(x^2 + y^2 + 1); we use this Z/R ratio to convert R to Z depth.
    4. Output the Z depth in AKDK. We provide k4a_calibration_2d_to_3d() so that users can get the XYZ 3D point in space relative to the camera origin. One can therefore compute the Euclidean distance (aka radial depth/range depth) by taking the output of that function, a float3 point3d (call it X, Y, Z), and computing R = sqrt(X^2 + Y^2 + Z^2).
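
    The steps above can be sketched in a few lines of Python. This is only an illustration of the Z/R relationship, not the SDK's implementation: it assumes an ideal pinhole camera with no lens distortion, and the function names and the intrinsic parameters fx, fy, cx, cy (focal lengths and principal point, in pixels) are hypothetical placeholders. With a real device, k4a_calibration_2d_to_3d() should be preferred, since it uses the actual calibration.

```python
import math


def z_depth_to_range(u, v, z, fx, fy, cx, cy):
    """Convert Z depth at pixel (u, v) to radial distance R from the camera origin.

    Assumption: ideal pinhole model, no lens distortion.
    fx, fy, cx, cy are hypothetical intrinsics in pixels.
    """
    # Step 2: unproject the pixel to the ray vector (x, y, 1).
    x = (u - cx) / fx
    y = (v - cy) / fy
    # Step 3 inverted: Z/R = 1 / sqrt(x^2 + y^2 + 1), so R = Z * sqrt(x^2 + y^2 + 1).
    return z * math.sqrt(x * x + y * y + 1.0)


def range_to_z_depth(u, v, r, fx, fy, cx, cy):
    """Convert radial distance R back to Z depth (the direction described in step 3)."""
    x = (u - cx) / fx
    y = (v - cy) / fy
    return r / math.sqrt(x * x + y * y + 1.0)


def range_from_point(X, Y, Z):
    """Step 4: Euclidean distance of a 3D point (X, Y, Z) from the camera origin."""
    return math.sqrt(X * X + Y * Y + Z * Z)
```

    At the principal point (u = cx, v = cy) the ray is (0, 0, 1), so R equals Z; the further a pixel lies from the image center, the more R exceeds Z.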

  2. Quentin Miller 351 Reputation points
    2021-09-15T23:21:06.013+00:00

    https://learn.microsoft.com/en-us/azure/kinect-dk/coordinate-systems

    The Z coordinate is the distance from the camera focal point (or, in other words, the depth of the pixel in 3D space).


  3. JAMES MORGENSTERN 176 Reputation points
    2021-09-17T14:12:54.483+00:00

    Sorry guys ... Sattish and QuentinMiller are both wrong!

    The Z coordinate is the "Depth", that is, the Euclidean distance from the XY plane's pixel at (X,Y) to the object. To get the Euclidean distance from the lens focal point to the point in space, you need the distance from the focal point to the (X,Y) point in the plane that is perpendicular to the depth vector (0,0,depth). For the distance in the XY plane to your virtual pixel, though, you need the range, which is what you are trying to find in the first place. I suggest you redo your model of the geometry and figure out how to use depth and not range.
