Solved: Mapping a Pixel to a 3D Point by Deprojecting

ARafi4 · 02-25-2018

I am trying to map from a pixel to the point where it is located in 3D space, and by looking at librealsense on GitHub, it seems that the function rs2_deproject_pixel_to_point() is exactly what I need. However, I am unsure of how I can get a specific pixel from an image as well as the depth data for that image.

From the hello-world example, I can see that you can get the distance to the center of an object by using get_distance() with half of the width and height of the frame as follows:

// Try to get a frame of a depth image
depth_frame depth = frames.get_depth_frame();

// The frameset might not contain a depth frame, if so continue until it does
if (!depth) continue;

// Get the depth frame's dimensions
float width = depth.get_width();
float height = depth.get_height();

// Query the distance from the camera to the object in the center of the image
float dist_to_center = depth.get_distance(width / 2, height / 2);

Is this the best way to do it? What if I want the pixel located at [0][0] in the frame? I think this would be easier for me if I can convert a frame to a 2D array so I can address each pixel specifically rather than using a fraction of the frame's height and width. Thanks for any help.

jb455 · 02-26-2018

It'll probably be easiest for you to use the Point Cloud class. There is an example here: https://github.com/IntelRealSense/librealsense/tree/master/examples/pointcloud

This generates a 1D array of 3D points, so to access a specific pixel (x,y), take the (x + y * depth.get_width())th element of the vertices array.

If that's too much overhead and you really only want a single pixel each time, the get_distance method works with (x,y) coordinates as inputs (so get_distance(0,0) will be the first pixel). There is a warning in one of the samples that calling get_distance many times might not be very efficient, so if you do want more than a few pixels it's probably best to use the point cloud.
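
For the single-pixel route, the distance from get_distance(x, y) can be turned into a 3D point using the camera intrinsics, which is what rs2_deproject_pixel_to_point() is for. Below is a rough sketch of the underlying pinhole math, assuming no lens distortion. The Intrinsics struct and deproject() function are hypothetical stand-ins for illustration, not the librealsense API; the field names are borrowed from rs2_intrinsics.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for the rs2_intrinsics fields used here.
struct Intrinsics {
    float ppx, ppy; // principal point (image center), in pixels
    float fx, fy;   // focal lengths, in pixels
};

// Map pixel (px, py) with a known depth to a 3D point {X, Y, Z}.
// Assumption: plain pinhole model, lens distortion ignored.
std::array<float, 3> deproject(const Intrinsics& intr, float px, float py, float depth) {
    assert(intr.fx != 0.0f && intr.fy != 0.0f);
    const float x = (px - intr.ppx) / intr.fx; // normalized image coordinates
    const float y = (py - intr.ppy) / intr.fy;
    return {{ depth * x, depth * y, depth }};
}
```

In a real program the intrinsics would presumably come from the depth stream's profile and the depth from get_distance(x, y). The library function also takes the distortion model into account, which this sketch does not.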

ARafi4 · 02-26-2018

Yes, I eventually want to do this for every point in a certain object and average them to get the 3D center of the object, so I will use Point Cloud as you suggest. However, I tried to run this example and it didn't seem to work, as there were a few compile-time errors; perhaps I messed it up by accident. But that doesn't matter for what I am doing anyway.

What does depth.get_width() do? Is it getting the z coordinate for a pixel in the depth frame of width w? Does frames.get_depth_frame() return a 1D array each with a specific width? I was imagining that it would return a 2D array which would mean that there would be multiple pixels with the same width.

Thanks for all the help. I will try this out.
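
The averaging step mentioned above could look roughly like this. It is a minimal sketch, assuming the object's points have already been collected into a vector and that a z of 0 means "no depth data" for that pixel (the pointcloud example skips such vertices when drawing). Point3 and centroid() are hypothetical names, not part of librealsense.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one 3D vertex from the point cloud.
struct Point3 { float x, y, z; };

// Average all valid points into one centroid.
// Assumption: points with z <= 0 carry no depth data and are skipped.
// Returns false if there was no valid point to average.
bool centroid(const std::vector<Point3>& pts, Point3& out) {
    double sx = 0.0, sy = 0.0, sz = 0.0;
    std::size_t n = 0;
    for (const Point3& p : pts) {
        if (p.z <= 0.0f) continue; // skip pixels without depth
        sx += p.x;
        sy += p.y;
        sz += p.z;
        ++n;
    }
    if (n == 0) return false;
    out = Point3{ static_cast<float>(sx / n),
                  static_cast<float>(sy / n),
                  static_cast<float>(sz / n) };
    return true;
}
```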

jb455 · 02-26-2018

depth.get_width() should return the pixel width of the depth image (I use C# not C++, so I'm going by the samples!).

The data is stored row-by-row, so the first 640 elements (assuming a 640*480 image for example - in this case get_width should return 640) represent each pixel in the first row, then the next 640 are the second row down etc. So if you want a pixel in the third row (y=2), you need to skip the first 640*2 elements to get to the start of that row, then add whatever your x value is to the index to move across the image. Hence [x + (y * width)].
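
The indexing rule above can be written out as a small helper. This is only a sketch: pixel_index() is a hypothetical name, and width stands for whatever depth.get_width() returns.

```cpp
#include <cassert>
#include <cstddef>

// Convert (x, y) pixel coordinates into an index into a row-major 1D array.
// Skipping y full rows of `width` elements lands on the start of row y;
// adding x then moves across that row.
inline std::size_t pixel_index(int x, int y, int width) {
    assert(x >= 0 && x < width && y >= 0);
    return static_cast<std::size_t>(y) * static_cast<std::size_t>(width)
         + static_cast<std::size_t>(x);
}
```

For a 640-wide image, pixel_index(5, 2, 640) gives 1285, i.e. 640*2 + 5, and pixel_index(0, 0, 640) is the first element.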