The Intel RealSense cameras have been gaining in popularity for the past few years for use as a 3D camera and for visual odometry. I had the chance to hear a presentation from Daniel Piro about using the Intel RealSense cameras generally and for SLAM (Simultaneous Localization and Mapping). The following post is based on his talk.
Depth Camera (D400 series)
Depth information is important since that gives us the information needed to understand shapes, sizes, and distance. This lets us (or a robot) know how far it is from items to avoid running into things and to plan path around obstacles in the image field of view. Traditionally this information has come from RADAR or LIDAR, however in some applications we can also get that from cameras. In cameras we often get depth from using 2 cameras for stereo vision.
The Intel RealSense Depth camera (D400 series) uses stereoscopic depth sensing to determine the range to an item. So essentially it has two cameras and can do triangulation from them for stereo. This sensor uses two infrared cameras for the stereo and then also has an RGB camera onboard. So you can get 4 data products from the sensor; RGB image, depth image, left infrared image, and right infrared image. Think of each image frame as a 3D snapshot of the environment, where each color (RGB) pixel also has a range value (depth) to the item that is in the image. The farther the items are from the camera the greater the range/depth error will be.
The D400 cameras have an infrared projector for getting better features on surfaces with minimal texture in the infrared camera for computing the stereo reconstruction. This projector can be turned on and off if you want. Disabling the projector is often useful for tracking applications (since the projected dots don’t move with the items being tracked).
One thing to be aware is that the infrared images are rectified (to make the images look the same in a common plane) in the camera, however the RGB camera image is not rectified. This means that if you want the depth and RGB images to line up well, you need to manually rectify the RGB image.
Pro Tip 1: The driver has a UV map to help map from the depth pixel to the RGB image to help account for the difference in image sizes. This lets you match the depth to RGB image data points better.
Pro Tip 2: If using the D435i (the IMU version), use the timestamps from the images and IMU to synchronize the two sensors. If you use system time (from your computer) there will be more error (partially due to weird USB timing).
Pro Tip 3: The cameras have Depth Presets. These are profiles that let you optimize various settings for various conditions. Such as high density, high accuracy, etc..
Pro Tip 4: Make sure your exposure is set correctly. If you are using auto exposure try changing the Mean Intensity Set Point (the setting is not where exposure is, it is under AE control, not the most obvious).
If you want to use manual exposure you can play with are Exposure setpoint and Gain constant. Start with the exposure setpoint, then adjust the gain.
You might also want to see this whitepaper for more methods of tuning the cameras for better performance.
Visual Odometry & SLAM (T265)
Visual odometry is the generic term for figuring out how far you have moved using a camera. This is as opposed to “standard” odometry using things such as wheel encoders, or inertial odometry with a IMU.
RealSense T265 is a tracking camera that is designed to be more optimal for Visual Odometry and SLAM (wider field of view and not using infrared light). It can do SLAM onboard as well as loop closure. However, this camera is not able to return RGB images (since it does not have a RGB camera onboard) and the depth returned is not as good as the D400 series (and can be a little trickier to get).
Using both a RealSense D435i sensor and a RealSense T265 sensor can provide both the maps and the better quality visual odometry for developing a full SLAM system. The D435i used for the mapping, and the T265 for the tracking.
Software
Intel provides the RealSense SDK2.0 library for using the RealSense cameras. It is Open Source and work on Mac, Windows, Linux, and Android. There are also ROS and OpenCV wrappers.
Click here for the developers page with the SDK.
Within the SDK (software development kit) it includes a viewer to let you view images, record images, change settings, or update the firmware.
Pro Tip 5: Spend some time with the viewer looking at the camera and depth images when designing your system so you can compare various mounting angles, heights, etc.. for the cameras.
The RealSense SDK (software development kit) has a few filters that can run on your computer to try and improve the returned depth map. You can play with turning these on and off in the viewer. Some of these include:
- Decimation Filter
- Spatial Edge-Preserving Filter
- Temporal Filter
- Holes Filling Filter
Within ROS there is a realsense-ros package that provides a wrapper for working with the cameras in ROS and lets you view images and other data in RVIZ.
ROS RealSense Occupancy Map package is available as an experimental feature in a separate branch of the RealSense git repo. This uses both the D400 and T265 cameras for creating the map.
For SLAM with just the D435i sensor, see here.
Pro Tip 6: You can use multiple T265 sensors for better accuracy. For example, if one sensor is pointed forward and another backwards; you can use the confidence values from each sensor to feed into a filter.
I know this is starting to sound like a sales pitch
Pro Tip 7: If you have multiple cameras you can connect and query for a serial number to know which cameras is which.
Also if you remove the little connector at the top of the camera you can wire and chain multiple cameras together to synchronize them. (This should work in the SDK 2.0)
See this whitepaper for working with multiple camera configurations. https://simplecore.intel.com/realsensehub/wp-content/uploads/sites/63/Multiple_Camera_WhitePaper04.pdf
Pro Tip 8: Infinite depth points (such as points to close or to far from the sensor) have a depth value of 0. This is good to know for filtering.
Here are two code snippet’s for using the cameras provided by Daniel, to share with you. The first one is for basic displaying of images using python. The second code snippet is using OpenCV to also detect blobs, also using python.
## License: Apache 2.0. See LICENSE file in root directory.
## Copyright(c) 2015-2017 Intel Corporation. All Rights Reserved.
###############################################
## Open CV and Numpy integration ##
###############################################
import pyrealsense2 as rs
import numpy as np
import cv2
cv2.namedWindow('RealSense', cv2.WINDOW_AUTOSIZE)
# Configure depth and color streams
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
# Start streaming
pipeline.start(config)
try:
while True:
# Wait for a coherent pair of frames: depth and color
frames = pipeline.wait_for_frames()
depth_frame = frames.get_depth_frame()
color_frame = frames.get_color_frame()
if not depth_frame or not color_frame:
continue
# Convert images to numpy arrays
depth_image = np.asanyarray(depth_frame.get_data())
color_image = np.asanyarray(color_frame.get_data())
# Apply colormap on depth image (image must be converted to 8-bit per pixel first)
scaled_depth=cv2.convertScaleAbs(depth_image, alpha=0.08)
depth_colormap = cv2.applyColorMap(scaled_depth, cv2.COLORMAP_JET)
# Stack both images horizontally
images = np.hstack((color_image, depth_colormap))
# Show images
cv2.imshow('RealSense', images)
k = cv2.waitKey(1) & 0xFF
if k == 27:
break
finally:
# Stop streaming
pipeline.stop()
## License: Apache 2.0. See LICENSE file in root directory.
## Copyright(c) 2015-2017 Intel Corporation. All Rights Reserved.
###############################################
## Open CV and Numpy integration ##
###############################################
import pyrealsense2 as rs
import numpy as np
import cv2
def nothing(args):
pass
def detectBlobs(mask):
# Set up the SimpleBlobdetector with default parameters.
params = cv2.SimpleBlobDetector_Params()
# Change thresholds
params.minThreshold = 1;
params.maxThreshold = 255;
# Filter by Area.
params.filterByArea = True
params.maxArea = 4000
params.minArea = 300
# Filter by Circularity
params.filterByCircularity = True
params.minCircularity = 0.1
# Filter by Convexity
params.filterByConvexity = True
params.minConvexity = 0.5
# Filter by Inertia
params.filterByInertia = True
params.minInertiaRatio = 0.1
detector = cv2.SimpleBlobDetector_create(params)
# Detect blobs.
reversemask= mask
keypoints = detector.detect(reversemask)
im_with_keypoints = cv2.drawKeypoints(mask, keypoints, np.array([]),
(0,0,255), cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
return im_with_keypoints
def thresholdDepth(depth):
depth[depth==0] = 255 #set all invalid depth pixels to 255
threshold_value = cv2.getTrackbarPos('Threshold','Truncated Depth')
# Zero if dist>TH
ret,truncated_depth=cv2.threshold(scaled_depth,threshold_value,255,cv2.THRESH_BINARY_INV)
return truncated_depth
cv2.namedWindow('RealSense', cv2.WINDOW_AUTOSIZE)
cv2.namedWindow('Truncated Depth', cv2.WINDOW_AUTOSIZE)
cv2.createTrackbar('Threshold','Truncated Depth',30,255,nothing)
# Configure depth and color streams
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
# Start streaming
pipeline.start(config)
try:
while True:
# Wait for a coherent pair of frames: depth and color
frames = pipeline.wait_for_frames()
depth_frame = frames.get_depth_frame()
color_frame = frames.get_color_frame()
if not depth_frame or not color_frame:
continue
# Convert images to numpy arrays
depth_image = np.asanyarray(depth_frame.get_data())
color_image = np.asanyarray(color_frame.get_data())
# Apply colormap on depth image (image must be converted to 8-bit per pixel first)
scaled_depth=cv2.convertScaleAbs(depth_image, alpha=0.08)
depth_colormap = cv2.applyColorMap(scaled_depth, cv2.COLORMAP_JET)
# Stack both images horizontally
images = np.hstack((color_image, depth_colormap))
# Show images
cv2.imshow('RealSense', images)
truncated_depth=thresholdDepth(scaled_depth)
truncated_depth=detectBlobs(truncated_depth)
cv2.imshow('Truncated Depth', truncated_depth)
k = cv2.waitKey(1) & 0xFF
if k == 27:
break
finally:
# Stop streaming
pipeline.stop()
I hope you found this informative and can make use of the Pro Tips.
Thank you to Daniel for presenting this information and allowing me to share it. This content is based on his talk. Daniel has also provided the full slide set that can be accessed by clicking here.
Disclaimer: I have not received any funding or free items from Intel.
Liked this article? Take a second to support me on Patreon! This post appeared first on Robots For Roboticists.