Anomalies in depth data acquisition by the Kinect DK camera: inconsistencies within captured measurements

Jiang, Wentao 0 Reputation points
2023-06-30T09:32:21.5433333+00:00

When previewing videos captured by Kinect DK using the Kinect viewer, I noticed inconsistent depth values for the same standing person. For example, the depth measured at the person's head is 2.247m, while the depth measured at their feet is 2.61m (the person's known height is 1.83m). After running the tests described below, I suspect that this inconsistency may be caused by errors in the field of view of the TOF lens. However, the test results did not match my expectations, so I am asking the community for guidance on the cause.

Test 1: I aligned the person's head, the Kinect DK camera, and the TOF lens along the same plane, while the person's feet remained within the FOV of the Kinect DK. The person stood straight, with his body plane perpendicular to both the ground and the optical axis. Viewed from the side, the head, the feet, and the camera form a right-angled triangle whose right angle is at the head. In the Kinect DK software, the depth measurement at the person's head was 2.247m, while the depth measurement at the feet was 2.655m. Calculating the person's height from these values with the Pythagorean theorem gives $ \sqrt{2.655^2 - 2.247^2} = 1.41m $, which deviates significantly from the actual height of 1.83m, with a difference of $ \Delta=|1.83-1.41|m=0.42m $.
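The Test 1 arithmetic can be reproduced with a short script. This is only a sketch of the calculation described above; the function and constant names are made up for illustration, and it assumes (as the test does) that both readings are straight-line distances from the camera.

```python
import math

# Values reported in Test 1 (metres).
ACTUAL_HEIGHT_M = 1.83
D_HEAD_M = 2.247   # reading at the head (assumed leg of the triangle)
D_FEET_M = 2.655   # reading at the feet (assumed hypotenuse)


def implied_height(hypotenuse: float, leg: float) -> float:
    """Return the remaining leg of a right-angled triangle."""
    return math.sqrt(hypotenuse**2 - leg**2)


height = implied_height(D_FEET_M, D_HEAD_M)
deviation = abs(ACTUAL_HEIGHT_M - height)

print(f"implied height: {height:.2f} m")
print(f"deviation from actual height: {deviation:.2f} m")
```

Run as written, this prints an implied height of 1.41 m and a deviation of 0.42 m, matching the figures above.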

Question 1: Why is there inconsistency in the depth information of the same person's head and feet in the videos captured by Kinect DK? Is this issue related to the FOV of the Kinect DK camera? If it is a field of view problem, could it be resolved by adjusting the camera's field of view to achieve consistent depth information between the person's head and feet?

Hypothesis 1: Suppose I mount the Kinect DK at 50% of the person's total height ($0.915m$; let's call this the "half-height point") and align the person's half-height point with the optical axis of the Kinect DK. In this setup, the head, the feet, and the Kinect DK form an isosceles triangle. If this hypothesis holds, the person's head, the half-height point, and the Kinect DK should form one right-angled triangle, and the person's feet, the half-height point, and the Kinect DK should form another, with the two triangles symmetric about the optical axis.
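Written out, the hypothesis predicts equal readings at the head and the feet. The symbols are introduced here only for illustration: $H$ is the person's height, $d_{mid}$ is the reading at the half-height point, and $d_{head}$ and $d_{feet}$ are the readings at the head and feet, all assumed to be straight-line distances from the camera.

$ d_{head} = d_{feet} = \sqrt{d_{mid}^2 + (H/2)^2} $, with $ H/2 = 0.915m $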

Test 2: The depth measurement at the person's head was $2.254m$, at the half-height point $2.379m$, and at the person's feet $2.677m$. Unfortunately, the results are not symmetric. The triangle formed by the person's head, the half-height point, and the Kinect DK camera cannot satisfy the Pythagorean theorem at all, because the head reading ($2.254m$), which should be the hypotenuse, is shorter than the half-height reading ($2.379m$). If we instead calculate from the right-angled triangle formed by the person's feet, the half-height point, and the Kinect DK camera, we get $ \sqrt{2.677^2 - 2.379^2} \times 2 = 1.2274 \times 2 = 2.4549m $, a difference of $ \Delta=|1.83 - 2.4549| = 0.62m $ from the expected value. This deviation is significant.
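The Test 2 check can be sketched the same way. The names below are hypothetical, and the script assumes the three readings are straight-line distances from the camera, which is the premise of Hypothesis 1 rather than a confirmed property of the sensor output.

```python
import math

# Values reported in Test 2 (metres).
ACTUAL_HEIGHT_M = 1.83
D_HEAD_M, D_MID_M, D_FEET_M = 2.254, 2.379, 2.677


def leg_squared(hypotenuse: float, leg: float) -> float:
    """Squared length of the remaining leg; negative means no such triangle exists."""
    return hypotenuse**2 - leg**2


# Reading that Hypothesis 1 predicts for both the head and the feet.
expected_edge = math.sqrt(D_MID_M**2 + (ACTUAL_HEIGHT_M / 2) ** 2)

head_sq = leg_squared(D_HEAD_M, D_MID_M)   # upper triangle
feet_sq = leg_squared(D_FEET_M, D_MID_M)   # lower triangle

print(f"predicted head/feet reading: {expected_edge:.3f} m")
if head_sq < 0:
    print("head triangle: hypotenuse shorter than leg, no real solution")

height_from_feet = 2 * math.sqrt(feet_sq)
deviation = abs(ACTUAL_HEIGHT_M - height_from_feet)
print(f"height from feet triangle: {height_from_feet:.4f} m")
print(f"deviation from actual height: {deviation:.2f} m")
```

Under that assumption, both the head and the feet should read about 2.549 m. The measured head value is well below that and the measured feet value well above it, which is the asymmetry described in Test 2.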

Question: Why, then, is the depth information for the same person's head and feet inconsistent in videos captured by Kinect DK? Based on the results of Test 2, the issue does not appear to be related to the field of view. Given this, I would appreciate help determining which point should be treated as the "true" reference for depth information when using Kinect DK for depth data acquisition.

Azure Kinect DK
A Microsoft developer kit and peripheral device with advanced artificial intelligence sensors for sophisticated computer vision and speech models.

1 answer

  1. LeelaRajeshSayana-MSFT 13,621 Reputation points
    2023-07-03T21:52:46.5733333+00:00

    Hi @Jiang, Wentao, greetings! Welcome to the Microsoft Q&A forum, and thank you for posting this question here. Our Community SMEs will review this request and get back to you as soon as possible.

    There are a number of factors that affect the consistency in the depth information captured by Kinect DK. These include field of view of the TOF lens, the angle of the sensor, the distance between the sensor and the person, and the lighting conditions in the environment among others.

    Based on the results of Test 1, it appears that the inconsistency may be related to the field of view of the Kinect DK camera. When the person's head and feet are aligned along the same plane, the depth measurements are inconsistent, which suggests that the field of view of the camera may not be wide enough to capture the entire person's body accurately. Adjusting the camera's field of view may help to achieve more consistent depth information between the person's head and feet.

    However, based on the results of Test 2, it appears that the issue may not be related to the field of view. In this scenario, it may be difficult to determine which point should be considered as the "true" standard for depth information when using Kinect DK for depth data acquisition. It is possible that the depth measurements are affected by other factors, such as the angle of the sensor or the distance between the sensor and the person. To overcome this issue, you may need to conduct further tests to isolate each factor and observe its effect on the depth measurements. Once you have identified the factor(s) that are contributing to the inconsistency, you can take steps to overcome them, such as adjusting the placement of the sensor or recalibrating the sensor.

    Hope this helps you with the next steps.
