Self-supervised learning of visual appearance solves fundamental problems of optical flow

Flying insects as inspiration to AI for small drones

How do honeybees land on flowers or avoid obstacles? One would expect such questions to be mostly of interest to biologists. However, the rise of small electronics and robotic systems has also made them relevant to robotics and Artificial Intelligence (AI). For example, small flying robots are extremely restricted in terms of the sensors and processing that they can carry onboard. If these robots are to be as autonomous as the much larger self-driving cars, they will have to use an extremely efficient type of artificial intelligence – similar to the highly developed intelligence possessed by flying insects.

Optical flow

One of the main tricks up the insect’s sleeve is the extensive use of ‘optical flow’: the way in which objects move in their view. They use it to land on flowers and avoid obstacles or predators. Insects use surprisingly simple and elegant optical flow strategies to tackle complex tasks. For example, for landing, honeybees keep the optical flow divergence (how quickly things get bigger in view) constant when going down. By following this simple rule, they automatically make smooth, soft landings.
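To make this concrete, below is a minimal simulation sketch of such a constant-divergence landing rule. It is illustrative only (the gain, setpoint and time step are made-up values, and the controller on a real drone is more involved), but it shows how regulating the divergence D = v/h towards a constant value automatically produces a descent that slows down near the ground.

```python
# Minimal sketch of constant-divergence landing control (illustrative, not the
# exact controller used on our drones). Divergence D = v / h: how fast things
# expand in view. Keeping D at a constant setpoint makes the descent slow down
# smoothly as the ground gets closer.

def simulate_constant_divergence_landing(h0=10.0, v0=0.0, D_setpoint=0.3,
                                          gain=5.0, dt=0.02, t_end=20.0):
    h, v = h0, v0          # height [m] and descent speed [m/s], positive downward
    trajectory = []
    t = 0.0
    while t < t_end and h > 0.05:
        D = v / h                        # observed optical flow divergence [1/s]
        a = gain * (D_setpoint - D)      # proportional control of the descent acceleration
        v += a * dt
        h -= v * dt
        trajectory.append((t, h, v))
        t += dt
    return trajectory

if __name__ == "__main__":
    for t, h, v in simulate_constant_divergence_landing()[::100]:
        print(f"t={t:5.1f} s  h={h:5.2f} m  v={v:5.2f} m/s")
```

Because keeping the divergence constant makes the descent speed proportional to the remaining height, the height decays roughly exponentially, which is exactly the smooth, soft landing described above.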

I started my work on optical flow control from enthusiasm about such elegant, simple strategies. However, developing the control methods to actually implement these strategies in flying robots turned out to be far from trivial. For example, when I first worked on optical flow landing, my flying robots would not actually land; instead, they started to oscillate, continuously moving up and down just above the landing surface.

Bees on the left, a drone on the right
Honeybees are a fertile source of inspiration for the AI of small drones. They are able to perform an impressive repertoire of complex behaviors with very limited processing (~960,000 neurons). Drones, in turn, are very interesting “models” for biology. Testing out hypotheses from biology on drones can bring novel insights into the problems faced and solved by flying insects like honeybees.

Fundamental problems

Optical flow has two fundamental problems that have been widely described in the growing literature on bio-inspired robotics. The first problem is that optical flow only provides mixed information on distances and velocities – and not on distance or velocity separately. To illustrate, if one of two landing drones flies twice as high and twice as fast as the other, then both experience exactly the same optical flow. However, for good control these two drones should actually react differently to deviations in the optical flow divergence. If a drone does not adapt its reactions to its height when landing, it will never touch down but will instead start to oscillate above the landing surface.
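A small numerical illustration of this first problem, with made-up heights and velocities: divergence is the ratio of descent speed to height, so scaling both by the same factor leaves it unchanged.

```python
# Illustration of problem 1: optical flow divergence D = v / h mixes height and velocity.
# Two drones with different heights and speeds can observe exactly the same divergence.

drone_low  = {"h": 2.0, "v": 0.6}   # 2 m high, descending at 0.6 m/s
drone_high = {"h": 4.0, "v": 1.2}   # twice as high, descending twice as fast

for name, d in [("low", drone_low), ("high", drone_high)]:
    D = d["v"] / d["h"]
    print(f"{name:4s} drone: h={d['h']} m, v={d['v']} m/s  ->  divergence D={D:.2f} 1/s")

# Both print D = 0.30 1/s: from divergence alone the drones cannot tell their height,
# yet the appropriate control gain depends on that height, so the two drones should
# react differently to the very same divergence error.
```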

The second problem is that optical flow is very small and carries little information in the direction in which a robot is moving. This is very unfortunate for obstacle avoidance, because it means that the obstacles straight ahead of the robot are the hardest ones to detect! Both problems are illustrated in the figures below.

Illustration of the two problems
Left: Problem 1: The white drone is twice as high and descends twice as fast as the red drone. However, they both see the same optical flow divergence, as in both cases the object in view gets twice as big. This can be seen from the colored triangles: at the highest position the triangle spans the full landing platform, while at the lowest position the same field of view covers only half of it.
Right: Problem 2: The drone moves straight forward, in the direction of its forward-looking camera. Hence, the focus of expansion lies straight ahead. Objects close to this direction, like the red obstacle, generate very little flow. This is illustrated by the red lines in the figure: the angles of these lines with respect to the camera are very similar. Objects further from this direction, like the green obstacle, generate considerable flow. Indeed, the green lines show that the angle grows quickly as the drone moves forward.
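The second problem can also be made concrete with the standard relation for translational optical flow: a point at distance d and at angle β from the flight direction moves at an angular rate of roughly (V/d)·sin(β) for forward speed V. The short computation below, with illustrative numbers, shows how quickly the flow vanishes near the focus of expansion.

```python
# Illustration of problem 2: for pure forward motion at speed V, a point at distance d
# and at angle beta from the flight direction has angular flow omega = (V / d) * sin(beta),
# which vanishes at the focus of expansion (beta = 0).

import math

V = 2.0   # forward speed [m/s] (illustrative)
d = 5.0   # distance to the obstacle [m] (illustrative)

for beta_deg in [0, 2, 5, 15, 45, 90]:
    omega = (V / d) * math.sin(math.radians(beta_deg))      # [rad/s]
    print(f"obstacle at {beta_deg:2d} deg from the flight direction: "
          f"flow = {math.degrees(omega):5.2f} deg/s")

# An obstacle straight ahead (0 deg) produces no flow at all, and obstacles only a few
# degrees off the flight direction produce a small fraction of the flow of obstacles
# off to the side.
```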

Learning visual appearance as the solution

In an article published in Nature Machine Intelligence today [1], we propose a solution to both problems. The main idea was that both problems of optical flow would disappear if the robots were able to interpret not only optical flow, but also the visual appearance of objects in their environment. This solution becomes evident from the figures above. The rectangular insets show the images captured by the drones. For the first problem, the image perfectly captures the difference in height between the white and red drone: the landing platform is simply larger in the red drone’s image. For the second problem, the red obstacle appears just as large as the green one in the drone’s image. Since the two obstacles are physically the same size, this means they are equally close to the drone.

Exploiting visual appearance as captured by an image would allow robots to see distances to objects in the scene similarly to how we humans can estimate distances in a still picture. This would allow drones to immediately pick the right control gain for optical flow control and it would allow them to see obstacles in the flight direction. The only question was: How can a flying robot learn to see distances like that?

The key to this question lay in a theory I devised a few years ago [2], which showed that flying robots can actively induce optical flow oscillations to perceive distances to objects in the scene. In the approach proposed in the Nature Machine Intelligence article, the robots use such oscillations to learn what the objects in their environment look like at different distances. In this way, the robot can, for example, learn how fine the texture of grass is when looking at it from different heights during landing, or how thick tree trunks appear at different distances when navigating in a forest.
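To give a flavour of this idea, here is a heavily simplified, self-contained sketch (not the method from the article itself): the oscillation-based distance cue acts as a self-generated training label, and a simple regressor learns to predict it from basic appearance features of synthetic image patches.

```python
# Simplified sketch of the self-supervised idea: the drone's own oscillation-based
# distance cue provides the training target, and a regressor learns to predict it
# from visual appearance. All features and data below are synthetic placeholders.

import numpy as np

def appearance_features(patch):
    """Illustrative features only: a bias term, a crude texture strength (mean absolute
    difference between neighbouring pixels), and its inverse, which for a fixed surface
    tends to grow with viewing distance."""
    p = patch.astype(float)
    texture = np.mean(np.abs(np.diff(p, axis=0))) + np.mean(np.abs(np.diff(p, axis=1)))
    return np.array([1.0, texture, 1.0 / (texture + 1e-6)])

# Self-generated training set: while oscillating, each image patch is paired with the
# distance cue obtained from the oscillation (here drawn at random for synthetic data).
rng = np.random.default_rng(0)
X, y = [], []
for distance_cue in rng.uniform(0.5, 5.0, size=200):
    # Fake "image" patch: apparent texture contrast decreases with viewing distance.
    patch = rng.normal(128.0, 40.0 / distance_cue, size=(24, 24))
    X.append(appearance_features(patch))
    y.append(distance_cue)

w, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)   # least-squares regression

# After learning, appearance alone yields a scaled distance estimate, no oscillation needed.
test_patch = rng.normal(128.0, 40.0 / 2.0, size=(24, 24))       # generated as if 2.0 "units" away
print("predicted distance cue:", appearance_features(test_patch) @ w)
```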

Relevance to robotics and applications

Implementing this learning process on flying robots led to much faster, smoother optical flow landings than we ever achieved before. Moreover, for obstacle avoidance, the robots were now also able to see obstacles in the flight direction very clearly. This not only improved obstacle detection performance, but also allowed our robots to speed up. We believe that the proposed methods will be very relevant to resource-constrained flying robots, especially when they operate in rather confined environments, such as flying in greenhouses to monitor crops or keeping track of stock in warehouses.

It is interesting to compare our way of distance learning with recent methods in the computer vision domain for single camera (monocular) distance perception. In the field of computer vision, self-supervised learning of monocular distance perception is done with the help of projective geometry and the reconstruction of images. This results in impressively accurate, dense distance maps. However, these maps are still “unscaled” – they can show that one object is twice as far as another one but cannot convey distances in an absolute sense.

In contrast, our proposed method provides “scaled” distance estimates. Interestingly, the scaling is not in terms of meters but in terms of the control gains that would lead the drone to oscillate, which makes it very relevant for control. This resembles the way in which we humans perceive distances: for us, too, it may be more natural to reason in terms of actions (“Is an object within reach?”, “How many steps do I roughly need to get to a place?”) than in terms of meters. It is hence strongly reminiscent of the perception of “affordances”, a concept put forward by Gibson, who also introduced the concept of optical flow [3].

A drone flying
Picture of our drone during the obstacle avoidance experiments. At first, the drone flies forward while trying to keep its height constant by keeping the vertical optical flow equal to zero. If it gets close to an obstacle, it starts to oscillate. This increasing oscillation when approaching an obstacle is used to learn how to see distance by means of visual appearance. After learning, the drone can fly faster and more safely. For the experiments we used a Parrot Bebop 2 drone, replacing its onboard software with the Paparazzi open-source autopilot and performing all computations onboard, on the drone’s native processor.

Relevance to biology

The findings are not only relevant to robotics, but also provide a new hypothesis for insect intelligence. Typical honeybee experiments start with a learning phase, in which honeybees exhibit various oscillatory behaviors when they get acquainted with a new environment and related novel cues like artificial flowers. The final measurements presented in articles typically take place after this learning phase has finished and focus predominantly on the role of optical flow. The learning process presented here forms a novel hypothesis on how flying insects improve their navigational skills over their lifetime, and suggests that we should set up more studies to investigate and report on this learning phase.

On behalf of all authors,
Guido de Croon, Christophe De Wagter, and Tobias Seidl.

References

A swarm of autonomous tiny flying robots

By K.N. McGuire, C. De Wagter, K. Tuyls, H.J. Kappen, G.C.H.E. de Croon

Greenhouses, search-and-rescue teams, and warehouses are all looking for new methods of surveillance that are quick and safe for the objects and people around them. Many have already turned to robotics, but wheeled, ground-bound systems have limited maneuverability. Ideally, flying robots, also known as micro aerial vehicles (MAVs), would exploit the third dimension to perform such surveillance, and a group, or swarm, of them would be ideal to cover as much ground as possible. Full autonomy, however, usually comes at the cost of weight and size. This is problematic in greenhouses and warehouses in particular, as few of us would be comfortable working with 3-kilogram MAVs flying over our heads.

Guus Schoonewille, TU Delft

There have already been many advances in artificial intelligence and navigation techniques to make MAVs fully autonomous navigators in unknown areas. However, these techniques are computationally expensive and therefore need substantial onboard computing power. A smaller MAV could rely on the aid of an external computer, but this becomes a problem as soon as the number of MAVs and the size of the environment grow. Communication through the inner structure of a building degrades quickly, which would doom any search-and-rescue scenario. Ideally, a group of tiny MAVs would be independent and only use their onboard sensing and computing capabilities. Imagine making such tiny drones navigate a GPS-denied environment without the help of an external positioning system.

At the MAVLab of TU Delft, we have tried to tackle this difficult problem by thinking of creative ways to solve the issues surrounding autonomous indoor navigation. We look to nature for inspiration, which has proved successful in the past with the DelFly and with optical-flow-based control. So, if a swarm of bees can do it, why not a group of tiny MAVs? This turned out to be less easy than it initially seemed, as we needed to solve the intermediate problems from the ground up. An individual bee is very talented: it can already avoid trees, bushes, and its peers as soon as it has a chance to spread its wings. These capabilities do not come naturally to an MAV, and they have to be implemented in a very robust manner, because a crashed MAV usually cannot recover by itself. A bee can go out and explore a field of flowers and return to its hive once it is done. This is something that an MAV weighing less than 50 grams had never done autonomously before, let alone a swarm.

Guus Schoonewille, TU Delft

Since many navigation strategies are out of the question because of their complexity, we turned to another source of inspiration with an insect-themed name: a path-planning technique called ‘bug algorithms’. Although the name suggests otherwise, bug algorithms are not really inspired by biology. They originated as mathematical path-planning techniques whose core principles are that they require very little memory and very simple computations. The idea is that an agent has to go from A to B and deals with the obstacles in between on the fly by following their border (also known as wall-following). Once the path towards the goal is clear, the agent resumes its way towards the goal. By remembering some places along the way, it can make different decisions at locations it encounters again, so that it does not get stuck in an endless loop. With a set of very simple rules and a small memory, which take only a fraction of the available computing power, an agent can in most cases find its way to its goal (a toy sketch of this idea is given below). However, the main problem of most existing bug algorithms is that they rely on a known global location (as with GPS) or on perfect odometry. Hence, most existing bug algorithms are actually not well suited for tiny robots navigating in the real world.
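As a toy illustration of this flavour of algorithm (greedy motion towards the goal plus a small memory of visited cells, rather than the exact wall-following bug algorithms from the literature, let alone SGBA itself), the sketch below lets an agent reach a goal on a small grid while skirting around an obstacle:

```python
# Toy bug-style planner on a grid: head for the goal when possible, otherwise slide
# around the obstacle, and remember visited cells so the agent does not loop forever.

GRID = [
    "..........",
    "....####..",
    "....#..#..",
    "....#..#..",
    "....####..",
    "..........",
]
START, GOAL = (2, 0), (2, 9)   # (row, col): the straight path is blocked by the box

def free(cell):
    r, c = cell
    return 0 <= r < len(GRID) and 0 <= c < len(GRID[0]) and GRID[r][c] == "."

def neighbours(cell):
    r, c = cell
    return [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]

def bug_step(pos, goal, visited):
    """Prefer the free neighbour closest to the goal; penalise already-visited cells.
    That small memory is what keeps the agent from getting stuck in an endless loop."""
    def score(cell):
        dist = abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])
        return dist + (100 if cell in visited else 0)
    return min((n for n in neighbours(pos) if free(n)), key=score)

pos, visited, path = START, {START}, [START]
while pos != GOAL and len(path) < 200:
    pos = bug_step(pos, GOAL, visited)
    visited.add(pos)
    path.append(pos)
print("reached goal:", pos == GOAL, "in", len(path) - 1, "steps")
```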

Guus Schoonewille, TU Delft

We developed our own bug algorithm and extended it to make it suitable for a swarm of MAVs navigating in the real world. The swarm is leveraged by having the MAVs travel in different directions. The MAVs first fly away from the base station, exploring their environment. They are able to come back with the help of a wireless beacon located at the base station: based on the signal strength of their connection with the beacon, they can estimate the right direction to head back in (a toy sketch of this homing idea is given below). The MAVs also communicate with each other to share their exploration preferences and to avoid one another. We called this technique the Swarm Gradient Bug Algorithm (SGBA).
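The homing idea can be illustrated with a toy “run-and-turn” sketch: keep the current heading as long as the beacon’s signal strength improves, and turn when it gets weaker. This is only a caricature of SGBA’s gradient-based homing (the signal model, step size and turn angle below are made up), but it conveys why a single beacon suffices to find the way back.

```python
# Toy sketch of beacon homing from signal strength (illustrative only, not SGBA itself).

import math
import random

BEACON = (0.0, 0.0)
random.seed(1)

def rssi(pos):
    """Toy signal model: strength falls off with the log of distance, plus a bit of noise."""
    d = max(math.dist(pos, BEACON), 0.1)
    return -40.0 - 20.0 * math.log10(d) + random.gauss(0.0, 0.2)

def fly_home(pos, heading_deg=0.0, step=1.0, turn_deg=30.0, max_steps=400):
    previous = rssi(pos)
    for _ in range(max_steps):
        if math.dist(pos, BEACON) < 2.0:
            break                                   # close enough to the beacon: done
        h = math.radians(heading_deg)
        pos = (pos[0] + step * math.cos(h), pos[1] + step * math.sin(h))
        current = rssi(pos)
        if current < previous:                      # signal got weaker: try another heading
            heading_deg += turn_deg
        previous = current
    return pos

home = fly_home((20.0, 15.0))
print(f"final distance to beacon: {math.dist(home, BEACON):.1f} m")
```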

At the Faculty of Aerospace Engineering of Delft University of Technology, on the 11th floor of the high-rise building, we used an empty office space for a full system test of SGBA. We acquired 6 off-the-shelf MAVs, namely Crazyflie 2.0s from Bitcraze AB, equipped with the Flow deck and Multi-ranger deck. The six drones were sent out in different directions to cover as much ground as possible, while coordinating with and avoiding each other at the same time. Once their battery level dropped to 60%, they navigated back to the home beacon. In this setup, the Crazyflies explored 80% of the open rooms, and on average 4 of the 6 returned across the 5 flights performed. In addition, we experimented with configurations of 2 and 4 Crazyflies, which resulted in 15 test flights in the real-world environment in total, backed up by hundreds of flights in simulated environments with larger numbers of drones.

Although we are very happy with the autonomous exploration by our swarm of MAVs, we see ample room for improvement, ranging from more robust obstacle detection and avoidance to more accurate intra-swarm distance sensing. In the long run, we also envisage vision-based navigation for small MAVs without a beacon at the base station. Still, for the time being, the proposed navigation method is already suitable for practical swarm navigation of small robots, as a single beacon at the base station is not a very limiting factor. Moreover, we hope that the main concept behind SGBA – trading the accuracy of highly detailed navigation maps for substantially lower computational requirements – will inspire others to tackle similarly complex missions with swarms of tiny robots.

For details on the experiments and SGBA, please check the paper:

Minimal navigation solution for a swarm of tiny flying robots to explore an unknown environment, K.N. McGuire, C. De Wagter, K. Tuyls, H.J. Kappen, G.C.H.E. de Croon,
Science Robotics, 23 October 2019.

The world’s smallest autonomous racing drone

Racing team 2018-2019: Christophe De Wagter, Guido de Croon, Shuo Li, Philipp Dürnay, Jiahao Lin, Simon Spronk

Autonomous drone racing
Drone racing is becoming a major e-sport. Enthusiasts – and now also professionals – transform drones into seriously fast racing platforms. Expert drone racers can reach speeds of up to 190 km/h. They fly by looking at a first-person view (FPV) from a camera mounted on the front of their drone, which transmits images in real time.

In recent years, advances in areas such as artificial intelligence, computer vision, and control have raised the question of whether drones could fly faster than human pilots. The advantage for the drone could be that it can sense much more than the human pilot (such as accelerations and rotation rates with its inertial sensors) and can process all image data more quickly, directly onboard the drone. Moreover, its intelligence could be shaped purely for one goal: racing as fast as possible.

In the quest for a fast-flying, autonomous racing drone, multiple autonomous drone racing competitions have been organized in the academic community. These “IROS” drone races (where IROS stands for one of the most well-known worldwide robotics conferences) have been held since 2016. Over the years, the speed of the drones has gradually improved, with the fastest drones in the competition now moving at ~2 m/s.

Smaller
Most autonomous racing drones are equipped with high-performance processors, multiple high-quality cameras, and sometimes even laser scanners. This allows these drones to use state-of-the-art solutions for visual perception, such as building maps of the environment or accurately tracking how the drone moves over time. However, it also makes the drones relatively heavy and expensive.

At the Micro Air Vehicle laboratory (MAVLab) of TU Delft, our aim is to make lightweight and cheap autonomous racing drones. Such drones could be used by many drone racing enthusiasts to train with or fly against. If the drone becomes small enough, it could even be used for racing at home. Aiming for “small” means serious limitations on the sensors and processing that can be carried onboard. This is why in the IROS drone races we have always focused on monocular vision (a single camera) and on software algorithms for vision, state estimation, and control that are computationally highly efficient.


With its 72 grams and 10 cm diameter, the modified “Trashcan” drone is currently the smallest autonomous racing drone in the world. In the background, Shuo Li, PhD student working on autonomous drone racing at the MAVLab.

A 72-gram autonomous racing drone
Here, we report on how we made a tiny autonomous racing drone fly through a racing track at an average speed of 2 m/s, which is competitive with other, larger state-of-the-art autonomous racing drones.

The drone, a modified Eachine “Trashcan”, is 10 cm in diameter and weighs 72 grams. This weight includes a 17-gram JeVois smart camera, which consists of a single rolling-shutter CMOS camera, a 4-core ARM v7 1.34 GHz processor with 256 MB RAM, and a 2-core Mali GPU. Although limited compared to the processors used on other drones, we consider it more than powerful enough: with the algorithms we explain below, the drone actually uses only a single CPU core. The JeVois camera communicates with a 4.2-gram Crazybee F4 Pro flight controller running the Paparazzi autopilot via the MAVLink communication protocol. Both the JeVois code and the Paparazzi code are open source and available to the community.

An important characteristic of our approach to drone racing is that we do not rely on accurate but computationally expensive methods for visual Simultaneous Localization And Mapping (SLAM) or Visual Inertial Odometry (VIO). Instead, we focus on having the drone predict its motion as well as possible with an efficient prediction model, and on correcting any drift of the model with vision-based gate detections.

Prediction
A typical prediction model would involve integrating the accelerometer readings. However, on small drones the Inertial Measurement Unit (IMU) is subject to a lot of vibration, leading to very noisy accelerometer readings. Integrating such noisy measurements quickly leads to an enormous drift in both the velocity and position estimates of the drone. Therefore, we have opted for a simpler solution, in which the IMU is only used to determine the attitude of the drone. This attitude can then be used to predict the forward acceleration, as illustrated in the figure below. If one assumes the drone to fly at a constant height, the vertical force component has to equal the gravity force. Given a specific pitch angle, this relation leads to a specific forward force due to the thrust. The prediction model then updates the velocity based on this predicted forward force and the expected drag force given the estimated velocity.

Prediction model for the tiny drone. The drone has an estimate of its attitude, including the pitch angle (θ). Assuming the drone flies at a constant height, the force straight up (Tz) should equal gravity (g). Together, these two pieces allow us to calculate the thrust force that should be delivered by the drone’s four propellers (T) and, consequently, the force exerted forwards (Tx). The model uses this forward force, and the resulting (backward) drag force (Dx), to update the velocity (vx) and position of the drone when not seeing gates.
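In code, the prediction step described in the caption above can be sketched as follows (per unit mass; the drag coefficient is an illustrative placeholder for a value that would be identified from flight data):

```python
# Sketch of the constant-height prediction model: from the estimated pitch angle, derive
# the forward thrust component and integrate velocity and position between gate detections.

import math

G = 9.81          # gravitational acceleration [m/s^2]
K_DRAG = 0.5      # illustrative linear drag coefficient [1/s]

def predict(x, vx, pitch_rad, dt):
    """One prediction step. Assuming constant height, the vertical thrust component must
    cancel gravity, so T = g / cos(pitch) and the forward component is
    Tx = T * sin(pitch) = g * tan(pitch)."""
    tx = G * math.tan(pitch_rad)      # forward force from tilting the thrust vector
    dx = K_DRAG * vx                  # drag force opposing the motion
    vx += (tx - dx) * dt
    x += vx * dt
    return x, vx

# Example: pitching 10 degrees forward for 2 seconds from standstill.
x, vx = 0.0, 0.0
for _ in range(200):
    x, vx = predict(x, vx, math.radians(10.0), dt=0.01)
print(f"predicted position: {x:.2f} m, velocity: {vx:.2f} m/s")
```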

Vision-based corrections
The prediction model is corrected with the help of vision-based position measurements. First, a snake-gate algorithm is used to detect the colored gate in the image. This algorithm is extremely efficient, as it only processes a small portion of the image’s pixels. It samples random image locations and when it finds the right color, it starts following it around to determine the shape. After a detection, the known size of the gate is used to determine the drone’s relative position to the gate (see the figure below). This is a standard perspective-N-point problem. The output of this process is a relative position to a gate. Subsequently, we figure out which gate on the racing track is most likely in our view, and transform the relative position to the gate to a global position measurement. Since our vision process often outputs quite precise position estimates but sometimes also produces significant outliers, we do not use a Kalman filter but a Moving Horizon Estimator for the state estimation. This leads to much more robust position and velocity estimates in the presence of outliers.
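The sketch below illustrates the sampling idea behind such a snake-gate detector in strongly simplified form (a filled square of “gate colour” instead of a real gate frame, and a placeholder colour check); the point is that only a few hundred random samples are needed to find and trace the gate, instead of processing every pixel.

```python
# Simplified sketch of the snake-gate idea: sample random pixels, and when one matches
# the gate colour, walk along the connected coloured pixels to estimate the gate's extent.

import random

def is_gate_colour(image, x, y):
    """Placeholder colour check: 1 marks the gate colour. On the real drone this would
    be a threshold in a suitable colour space."""
    return 0 <= y < len(image) and 0 <= x < len(image[0]) and image[y][x] == 1

def follow(image, x, y, dx, dy):
    """Walk in direction (dx, dy) as long as the gate colour continues."""
    while is_gate_colour(image, x + dx, y + dy):
        x, y = x + dx, y + dy
    return x, y

def snake_gate(image, n_samples=200):
    rng = random.Random(0)
    h, w = len(image), len(image[0])
    for _ in range(n_samples):                       # only a tiny fraction of all pixels
        x, y = rng.randrange(w), rng.randrange(h)
        if is_gate_colour(image, x, y):
            left, _ = follow(image, x, y, -1, 0)
            right, _ = follow(image, x, y, +1, 0)
            _, top = follow(image, x, y, 0, -1)
            _, bottom = follow(image, x, y, 0, +1)
            return left, top, right, bottom          # rough bounding box of the gate
    return None

# Toy binary "image" with a coloured square (1 = gate colour, 0 = background).
img = [[1 if 3 <= x <= 8 and 2 <= y <= 7 else 0 for x in range(16)] for y in range(12)]
print("detected bounding box (left, top, right, bottom):", snake_gate(img))
```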

The gates and their sizes are known. When the drone detects a gate, it can use this knowledge to calculate its relative position to that gate. The global layout of the track and the drone’s currently estimated position are then used to determine which gate the drone is most likely looking at. In this way, the relative position can be transformed into a global position estimate.
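For intuition, a simplified pinhole-camera version of this relative-position step is sketched below. The real system solves a full perspective-N-point problem with the gate corners; here only the apparent width and image position of the gate are used, with illustrative camera parameters.

```python
# Simplified pinhole-model version of the relative-position step (illustrative values,
# not the actual perspective-N-point solution used on the drone).

GATE_SIZE_M = 1.4        # known physical gate width [m] (illustrative value)
FOCAL_PX = 300.0         # camera focal length in pixels (illustrative value)

def relative_position(gate_width_px, gate_center_x_px, image_center_x_px=160.0):
    """Distance follows from how large the known-size gate appears; the lateral offset
    follows from where its centre lies relative to the image centre."""
    distance = FOCAL_PX * GATE_SIZE_M / gate_width_px
    lateral = (gate_center_x_px - image_center_x_px) * distance / FOCAL_PX
    return distance, lateral

d, y = relative_position(gate_width_px=70.0, gate_center_x_px=200.0)
print(f"gate is {d:.1f} m ahead and {y:.2f} m to the side")
```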

Racing performance and future steps
The drone used the newly developed algorithms to race along a 4-gate race track in TU Delft’s Cyberzoo. It can fly multiple laps at an average speed of 2 m/s, which is competitive with larger, state-of-the-art autonomous racing drones (see the video at the top). Thanks to the central role of gate detections in the drone’s algorithms, the drone can cope with moderate displacements of the gates.
Possible future directions of research are to make the drone smaller and fly faster. In principle, being small is an advantage, since the gates are relatively larger. This allows the drone to choose its trajectory more freely than a larger drone, which may allow for faster trajectories. To better exploit this characteristic, we would have to fit optimal control algorithms into the onboard processing. Moreover, we want to make the vision algorithms more robust, as the current color-based snake-gate algorithm is quite dependent on lighting conditions. An obvious option here is to start using deep neural networks, which would have to fit within the dual-core Mali GPU on the JeVois.

arXiv article: Visual Model-predictive Localization for Computationally Efficient Autonomous Racing of a 72-gram Drone, Shuo Li, Erik van der Horst, Philipp Duernay, Christophe De Wagter, Guido C.H.E. de Croon.