MinseokKang-8425 asked · QuentinMiller-3866 edited

How to get raw (Euclidean) distance from Azure Kinect

I wish to know if I can get the raw (Euclidean) distance to objects from the Azure Kinect. The depth data the Azure Kinect returns is the z-coordinate, not the straight-line distance between the camera and the point. Is there a way to get that distance directly from the Azure Kinect, instead of having to calculate it from the 3D coordinates?

azure-kinect-dk

Hello @MinseokKang-8425, is your query related to the Azure Kinect V2 or the Azure Kinect DK?


Hello @SatishBoddu-MSFT, the question I am asking is related to the Azure Kinect DK.


Hello @MinseokKang-8425,

A product group member responded to a similar question on GitHub. The URL to refer to is: How to get (Euclidean) distance data

The depth value is the distance in mm of the pixel from the depth camera.

Also sharing a similar GitHub issue below, where RoseFlunder commented:

Referenced GitHub issue: How to get/output the distance a person is away from the Kinect once their body is detected?

Choose a position on the body (e.g. the head) and just calculate the Euclidean distance to 0|0|0, because that's the position of the depth camera.

Example:
If the head position is at 2|1|1:
Distance = sqrt((2-0)² + (1-0)² + (1-0)²) = sqrt(6) ≈ 2.45

You could check out this example for how to get all the joint positions:
https://github.com/microsoft/Azure-Kinect-Samples/blob/master/body-tracking-samples/simple_cpp_sample/main.cpp#L15

So after line 15 you would calculate the joint's distance to the camera like this:
sqrt(position.v[0]*position.v[0] + position.v[1]*position.v[1] + position.v[2]*position.v[2])

For the "sqrt" function you would need to include "math.h"

Please comment in the section below if you need further help, and we would be glad to help you!




The depth value is the distance in mm of the pixel from the depth camera.

Distance is not depth. I tested it, and no, it is not the straight-line distance from the camera to the object. The depth value is the z-coordinate, which is different from the distance.

Also sharing a similar GitHub issue below, where RoseFlunder commented:

I was specifically asking for a method other than calculating distance from the xyz-coordinates.

Since the Azure Kinect uses a time-of-flight sensor to measure depth, it has to measure the Euclidean distance in order to compute the depth value. I was wondering if there was a way to get the distance data that the Azure Kinect uses to compute the depth value.
JAMESMORGENSTERN-0766 answered · SatishBoddu-MSFT commented

@SatishBoddu-MSFT @QuentinMiller-3866 Sorry, you are still wrong. You are not rigorously solving the geometry; you settle for approximations. The stupid part, of course, is that the Kinect does measure the range, but it gets transformed into depth, and Microsoft refuses to give you the range. Part of the issue is that the actual range measurements are ambiguous and the errors are proportional to the true range, so Microsoft transforms the raw data to give you a depth image to go with the color values. The key point is that the inverse transform really is not available to you.

Consider: going from Kinect 2D data to a 3D local coordinate system works like this. Take the unit vector normal to the camera plane from the camera origin, rotate it for the angular change in x [the lens is angular!], rotate it again for the change in y, and then multiply by the range to get the pixel coordinates in 3D. But the Kinect won't give you the 3D coordinates, just the Z value [depth].

So you say to take the difference between the x pixel coordinate and the x coordinate of the middle of the image for your Euclidean distance; but that is a fallacy. You do NOT know the actual 3D distance between the projection of the pixel at (x, y) and the projection of the image origin (0, 0), and you assume that they are coplanar, which is not necessarily true. Do the rigorous math and you will see the errors. At this point one has to live with depth information, as range is unavailable.


Thank you, James, for helping on this thread. Really appreciate your help!

QuentinMiller-3866 answered

https://docs.microsoft.com/en-us/azure/kinect-dk/coordinate-systems

The Z coordinate is the distance from the camera focal point (or, in other words, the depth of the pixel in 3D space).

JAMESMORGENSTERN-0766 answered

Sorry guys... Satish and QuentinMiller are both wrong!

The Z coordinate is the "Depth" ... that is the euclidean distance from the XY plane's pixel at (X,Y) to the object. To get euclidean distance from the lens focal point to the point in space you need the distances from the focal point to the (X,Y) point in the plane that is perpendicular to the Depth vector (0,0,depth). For the distance in the xy plane to your virtual pixel, though, you need the range which is what you are trying to find in the first place. I suggest you redo your model of the geometry and figure out how to use depth and not range


QuentinMiller-3866 answered · QuentinMiller-3866 edited

I passed this on to the expert (who is no longer working on Azure Kinect but kindly responded). Hope this helps.

The Z depth is not measured from the “camera focal point” you mentioned, nor from the “XY plane's pixel at (X,Y)” that JAMESMORGENSTERN-0766 mentioned. The Z depth is the distance from the camera center origin to the object, measured along the Z axis.

The mathematics relating the radial depth (aka range, aka the Euclidean distance from the camera center origin to the object) to the X, Y, Z Cartesian coordinates is as follows:

  1. The depth engine internally computes the radial depth, R.

  2. Using the depth camera intrinsics, unproject each pixel (u, v) from normalized image space to a ray vector (x, y, 1).

  3. Normalize the ray vector (x, y, 1) to a unit vector; its length is sqrt(x² + y² + 1), so Z/R = 1 / sqrt(x² + y² + 1). This Z/R ratio is used to convert the radial depth R to the Z depth.

  4. The AKDK outputs the Z depth. We provide k4a_calibration_2d_to_3d() to let users get the XYZ 3D point in space relative to the camera origin; therefore, one can compute the Euclidean distance (aka radial depth/range) by taking the output of that function, a float3 point3d (call its components X, Y, Z), and computing R = sqrt(X² + Y² + Z²). A sketch of this step follows the list.
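
A minimal sketch of step 4 (untested; radial_distance_mm is a hypothetical helper name, and the calibration is assumed to come from k4a_device_get_calibration()):

    #include <math.h>     // for sqrtf
    #include <k4a/k4a.h>  // Azure Kinect Sensor SDK

    // Radial (Euclidean) distance R in mm for the depth pixel (u, v),
    // given the Z depth value read from the depth image at that pixel.
    static float radial_distance_mm(const k4a_calibration_t *calibration,
                                    float u, float v, float depth_mm)
    {
        k4a_float2_t pixel = { { u, v } };
        k4a_float3_t point3d; // X, Y, Z in mm, relative to the camera origin
        int valid = 0;

        if (K4A_RESULT_SUCCEEDED != k4a_calibration_2d_to_3d(
                calibration, &pixel, depth_mm,
                K4A_CALIBRATION_TYPE_DEPTH, K4A_CALIBRATION_TYPE_DEPTH,
                &point3d, &valid) || !valid)
        {
            return 0.0f; // the pixel could not be unprojected
        }

        // R = sqrt(X² + Y² + Z²)
        return sqrtf(point3d.v[0] * point3d.v[0] +
                     point3d.v[1] * point3d.v[1] +
                     point3d.v[2] * point3d.v[2]);
    }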

