Introduction to behavior trees
Abstraction in programming has evolved our use of computers from basic arithmetic operations to representing complex real-world phenomena using models. More specific to robotics, abstraction has moved us from low-level actuator control and basic sensing to reasoning about higher-level concepts such as behaviors, as I define in my Anatomy of a Robotic System post.
In autonomous systems, we have seen an entire host of abstractions beyond “plain” programming for behavior modeling and execution. Some common ones you may find in the literature include teleo-reactive programs, Petri nets, finite-state machines (FSMs), and behavior trees (BTs). In my experience, FSMs and BTs are the two abstractions you see most often today.
In this post, I will introduce behavior trees with all their terminology, contrast them with finite-state machines, share some examples and software libraries, and as always leave you with some resources if you want to learn more.
What is a behavior tree?
As we introduced above, there are several abstractions to help design complex behaviors for an autonomous agent. Generally, these consist of a finite set of entities that map to particular behaviors or operating modes within our system, e.g., “move forward”, “close gripper”, “blink the warning lights”, “go to the charging station”. Each model class has some set of rules that describe when an agent should execute each of these behaviors, and more importantly how the agent should switch between them.
Behavior trees (BTs) are one such abstraction, which I will define by the following characteristics:
- Behavior trees are trees (duh): They start at a root node and are designed to be traversed in a specific order until a terminal state is reached (success or failure).
- Leaf nodes are executable behaviors: Each leaf will do something, whether it’s a simple check or a complex action, and will output a status (success, failure, or running). In other words, leaf nodes are where you connect a BT to the lower-level code for your specific application.
- Internal nodes control tree traversal: The internal (non-leaf) nodes of the tree will accept the resulting status of their children and apply their own rules to dictate which node should be expanded next.
Behavior trees actually began in the videogame industry to define behaviors for non-player characters (NPCs): Both Unreal Engine and Unity (two major forces in this space) have dedicated tools for authoring BTs. This is no surprise; a big advantage of BTs is that they are easy to compose and modify, even at runtime. However, this sacrifices the ease of designing reactive behaviors (for example, mode switches) compared to some of the other abstractions, as you will see later in this post.
Since then, BTs have also made it into the robotics domain as robots have become increasingly capable of doing more than simple repetitive tasks. Easily the best resource here is the textbook “Behavior Trees in Robotics and AI: An Introduction” by Michele Colledanchise and Petter Ögren. In fact, if you really want to learn the material you should stop reading this post and go directly to the book … but please stick around?
Behavior tree terminology
Let’s dig into the terminology in behavior trees. While the language is not standard across the literature and various software libraries, I will largely follow the definitions in Behavior Trees in Robotics and AI.
At a glance, these are the types of nodes that make up behavior trees and how they are represented graphically:
Behavior trees execute in discrete update steps known as ticks. When a BT is ticked, usually at some specified rate, its child nodes recursively tick based on how the tree is constructed. After a node ticks, it returns a status to its parent, which can be Success, Failure, or Running.
Execution nodes, which are leaves of the BT, can either be Action or Condition nodes. The only difference is that condition nodes can only return Success or Failure within a single tick, whereas action nodes can span multiple ticks and can return Running until they reach a terminal state. Generally, condition nodes represent simple checks (e.g., “is the gripper open?”) while action nodes represent complex actions (e.g., “open the door”).
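To make the tick and status vocabulary concrete, here is a minimal sketch in plain Python. It does not use any BT library; the `Status` enum, the `GripperOpen` and `OpenDoor` classes, and the `robot` dictionary are all hypothetical names invented for this illustration. The condition resolves within one tick, while the action returns Running for a few ticks before succeeding.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


class GripperOpen:
    """Condition node: a quick check that resolves within a single tick."""

    def __init__(self, robot):
        self.robot = robot

    def tick(self):
        return Status.SUCCESS if self.robot["gripper_open"] else Status.FAILURE


class OpenDoor:
    """Action node: needs several ticks, returning RUNNING until it is done."""

    def __init__(self, ticks_needed):
        self.remaining = ticks_needed

    def tick(self):
        if self.remaining > 0:
            self.remaining -= 1
        return Status.RUNNING if self.remaining > 0 else Status.SUCCESS


robot = {"gripper_open": True}  # hypothetical robot state
condition_result = GripperOpen(robot).tick()

action = OpenDoor(ticks_needed=3)
action_history = [action.tick() for _ in range(4)]
```

Here `action_history` holds two Running results followed by Success, which is the pattern a parent node would see while the action is in progress.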
Control nodes are internal nodes and define how to traverse the BT given the status of their children. Importantly, children of control nodes can be execution nodes or control nodes themselves. Sequence, Fallback, and Parallel nodes can have any number of children, but differ in how they process said children. Decorator nodes necessarily have one child, and modify its behavior with some custom defined policy.
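In brief, the control nodes work as follows:
- Sequence: Ticks its children in order and returns Failure or Running as soon as a child does; it returns Success only if every child succeeds.
- Fallback: Ticks its children in order and returns Success or Running as soon as a child does; it returns Failure only if every child fails.
- Parallel: Ticks all of its children and returns Success once a chosen number of them have succeeded.
- Decorator: Ticks its single child and transforms the result according to its policy, e.g., inverting it or retrying on failure.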
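The control-node rules can also be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions, not the API of any particular library: the class names are hypothetical, the leaves return a fixed status, and the question of whether a node remembers which child was running is left aside because no leaf here returns Running. The `Inverter` is just one example of a decorator policy.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


class Leaf:
    """Stand-in execution node that always returns a fixed status."""

    def __init__(self, name, status, log):
        self.name, self.status, self.log = name, status, log

    def tick(self):
        self.log.append(self.name)  # record the traversal order
        return self.status


class Sequence:
    """Ticks children in order; stops at the first one that is not SUCCESS."""

    def __init__(self, children):
        self.children = children

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != Status.SUCCESS:
                return status
        return Status.SUCCESS


class Fallback:
    """Ticks children in order; stops at the first one that is not FAILURE."""

    def __init__(self, children):
        self.children = children

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != Status.FAILURE:
                return status
        return Status.FAILURE


class Inverter:
    """Decorator: wraps exactly one child and swaps SUCCESS and FAILURE."""

    def __init__(self, child):
        self.child = child

    def tick(self):
        status = self.child.tick()
        if status == Status.SUCCESS:
            return Status.FAILURE
        if status == Status.FAILURE:
            return Status.SUCCESS
        return status


log = []
tree = Fallback([
    Sequence([
        Leaf("a", Status.SUCCESS, log),
        Leaf("b", Status.FAILURE, log),
        Leaf("c", Status.SUCCESS, log),  # never ticked: "b" fails first
    ]),
    Inverter(Leaf("d", Status.FAILURE, log)),
])
result = tree.tick()
```

The Sequence fails at "b", so the Fallback moves on to its second child, whose inverted Failure makes the whole tree succeed.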
Robot example: searching for objects
The best way to understand all the terms and graphics in the previous section is through an example. Suppose we have a mobile robot that must search for specific objects in a home environment. Assume the robot knows all the search locations beforehand; in other words, it already has a world model to operate in.
Let’s start simple. If there is only one location (we’ll call it A), then the BT is a simple sequence of the necessary actions: Go to the location and then look for the object.
We’ve chosen to represent navigation as an action node, as it may take some time for the robot to move (returning Running in the process). On the other hand, we represent vision as a condition node, assuming the robot can detect the object from a single image once it arrives at its destination. I admit, this is totally contrived for the purpose of showing one of each execution node.
One very common design principle you should know is defined in the book as explicit success conditions. In simpler terms, you should almost always check before you act. For example, why not check whether the robot is already at the target location before starting a navigation action?
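One way to sketch the check-before-you-act idea is a Fallback whose first child is the condition and whose second child is the action. Everything below is hypothetical and simplified for illustration: the `AtLocation` and `GoTo` classes, the `robot` dictionary, and a navigation action that finishes instantly.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


class AtLocation:
    """Condition: is the robot already at the requested location?"""

    def __init__(self, robot, location):
        self.robot, self.location = robot, location

    def tick(self):
        at_goal = self.robot["location"] == self.location
        return Status.SUCCESS if at_goal else Status.FAILURE


class GoTo:
    """Action: navigation that, for brevity, completes in a single tick."""

    def __init__(self, robot, location):
        self.robot, self.location = robot, location

    def tick(self):
        self.robot["nav_calls"] += 1  # count how often we really navigate
        self.robot["location"] = self.location
        return Status.SUCCESS


class Fallback:
    """Ticks children in order; stops at the first one that is not FAILURE."""

    def __init__(self, children):
        self.children = children

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != Status.FAILURE:
                return status
        return Status.FAILURE


def go_to_with_check(robot, location):
    # The condition comes first, so the action only runs when it is needed.
    return Fallback([AtLocation(robot, location), GoTo(robot, location)])


robot = {"location": "A", "nav_calls": 0}
first = go_to_with_check(robot, "A").tick()   # already there: no navigation
second = go_to_with_check(robot, "B").tick()  # not there: navigate
```

Both ticks succeed, but only the second one triggers the navigation action.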
Our robot likely operates in an environment with multiple locations, and the idea is to look in all possible locations until we find the object of interest. This can be done by introducing a root-level Fallback node and repeating the above behavior for each location in some specified order.
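The multi-location search might be sketched as follows, again in plain Python with hypothetical names throughout (`GoTo`, `FoundObject`, the `world` dictionary, and so on). One assumption deserves a loud label: the Sequence and Fallback nodes here keep memory, meaning they resume at the child that was Running rather than restarting from the first child on every tick. Without that, the robot would keep trying to drive back to location A.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


class MemoryComposite:
    """Shared logic for control nodes that resume at the running child."""

    stop_on = None    # child status that ends the traversal early
    exhausted = None  # status returned once every child has been passed

    def __init__(self, children):
        self.children = children
        self.index = 0

    def tick(self):
        while self.index < len(self.children):
            status = self.children[self.index].tick()
            if status == Status.RUNNING:
                return status
            if status == self.stop_on:
                self.index = 0
                return status
            self.index += 1
        self.index = 0
        return self.exhausted


class Sequence(MemoryComposite):
    stop_on, exhausted = Status.FAILURE, Status.SUCCESS


class Fallback(MemoryComposite):
    stop_on, exhausted = Status.SUCCESS, Status.FAILURE


class GoTo:
    """Action: pretend navigation that needs `duration` ticks to arrive."""

    def __init__(self, robot, location, duration=2):
        self.robot, self.location, self.duration = robot, location, duration
        self.elapsed = 0

    def tick(self):
        self.elapsed += 1
        if self.elapsed < self.duration:
            return Status.RUNNING
        self.elapsed = 0
        self.robot["location"] = self.location
        self.robot["visited"].append(self.location)
        return Status.SUCCESS


class FoundObject:
    """Condition: is the target object at the robot's current location?"""

    def __init__(self, robot, world, target):
        self.robot, self.world, self.target = robot, world, target

    def tick(self):
        seen = self.world.get(self.robot["location"], [])
        return Status.SUCCESS if self.target in seen else Status.FAILURE


world = {"A": ["banana"], "B": [], "C": ["apple"]}  # hypothetical world model
robot = {"location": None, "visited": []}

# Root-level Fallback with one "go there, then look" Sequence per location.
tree = Fallback([
    Sequence([GoTo(robot, loc), FoundObject(robot, world, "apple")])
    for loc in ["A", "B", "C"]
])

status, ticks = Status.RUNNING, 0
while status == Status.RUNNING:
    status = tree.tick()
    ticks += 1
```

The robot visits A and B without luck, then finds the apple at C, at which point the root Fallback returns Success.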
Finally, suppose that instead of looking for a single object, we want to consider several objects — let’s say apples and oranges. This use case of composing conditions can be done with Parallel nodes as shown below.
- If we accept either an apple or an orange (“OR” condition), then we succeed if one node returns Success.
- If we require both an apple and an orange (“AND” condition), then we succeed if both nodes return Success.
- If we care about the order of objects, e.g., you must find an apple before finding an orange, then this could be done with a Sequence node instead.
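The "OR" and "AND" cases above can be sketched with a Parallel node that takes a success threshold. This is a plain-Python illustration with hypothetical names (`Parallel`, `Found`, `success_threshold`, `detections`); real libraries expose similar policies, but with their own names and options.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


class Found:
    """Condition: was the named object detected?"""

    def __init__(self, detections, name):
        self.detections, self.name = detections, name

    def tick(self):
        return Status.SUCCESS if self.name in self.detections else Status.FAILURE


class Parallel:
    """Ticks every child; succeeds once `success_threshold` children succeed."""

    def __init__(self, children, success_threshold):
        self.children = children
        self.success_threshold = success_threshold

    def tick(self):
        statuses = [child.tick() for child in self.children]
        successes = statuses.count(Status.SUCCESS)
        failures = statuses.count(Status.FAILURE)
        if successes >= self.success_threshold:
            return Status.SUCCESS
        # Too many failures: the threshold can no longer be reached.
        if failures > len(self.children) - self.success_threshold:
            return Status.FAILURE
        return Status.RUNNING


detections = {"apple"}  # hypothetical detector output: only an apple is seen


def fruit_conditions():
    return [Found(detections, "apple"), Found(detections, "orange")]


either = Parallel(fruit_conditions(), success_threshold=1).tick()  # "OR"
both = Parallel(fruit_conditions(), success_threshold=2).tick()    # "AND"
```

With only an apple in view, the "OR" tree succeeds and the "AND" tree fails.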
Of course, you can also compose actions in parallel — for example, turning in place until a person is detected for 5 consecutive ticks. While my example is hopefully simple enough to get the basics across, I highly recommend looking at the literature for more complex examples that really show off the power of BTs.
Robot example revisited: decorators and blackboards
I don’t know about you, but looking at the BT above leaves me somewhat uneasy. It’s just the same behavior copied and pasted multiple times underneath a Fallback node. What if you had 20 different locations, and the behavior at each location involved more than just two simplified execution nodes? Things could quickly get messy.
In most software libraries geared for BTs you can define these execution nodes as parametric behaviors that share resources (for example, the same ROS action client for navigation, or object detector for vision). Similarly, you can write code to build complex trees automatically and compose them from a ready-made library of subtrees. So the issue isn’t so much efficiency, but readability.
There is an alternative implementation for this BT, which can extend to many other applications. Here’s the basic idea:
- Introduce decorators: Instead of duplicating the same subtree for each location, have a single subtree and decorate it with a policy that repeats the behavior until successful.
- Update the target location at each iteration: Suppose you now have a “queue” of target locations to visit, so at each iteration of the subtree you pop an element from that queue. If the queue eventually ends up empty, then our BT fails.
In most BTs, we often need some notion of shared data like the location queue we’re discussing. This is where the concept of a blackboard comes in: you’ll find blackboard constructs in most BT libraries out there, and all they really are is a common storage area where individual behaviors can read or write data.
Our example BT could now be refactored as follows. We introduce a “GetLoc” action that pops a location from our queue of known locations and writes it to the blackboard as some parameter target_location. If the queue is empty, this returns Failure; otherwise it returns Success. Then, downstream nodes that deal with navigation can use this target_location parameter, which changes every time the subtree repeats.
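Here is a minimal sketch of that refactoring in plain Python. The blackboard is just a dictionary, and every class name is hypothetical. Two simplifications are assumed: navigation completes in one tick, and the repeat policy is a retry decorator capped at one attempt per known location, which is only one possible way to make the tree fail once the locations run out.

```python
from collections import deque
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


class Sequence:
    """Ticks children in order; stops at the first one that is not SUCCESS."""

    def __init__(self, children):
        self.children = children

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != Status.SUCCESS:
                return status
        return Status.SUCCESS


class RetryUntilSuccessful:
    """Decorator: re-runs its single child after a failure, up to a limit."""

    def __init__(self, child, max_attempts):
        self.child, self.max_attempts = child, max_attempts
        self.attempts = 0

    def tick(self):
        status = self.child.tick()
        if status != Status.FAILURE:
            return status
        self.attempts += 1
        return Status.RUNNING if self.attempts < self.max_attempts else Status.FAILURE


class GetLoc:
    """Action: pops the next location and writes it to the blackboard."""

    def __init__(self, blackboard, queue):
        self.blackboard, self.queue = blackboard, queue

    def tick(self):
        if not self.queue:
            return Status.FAILURE
        self.blackboard["target_location"] = self.queue.popleft()
        return Status.SUCCESS


class GoToTarget:
    """Action: reads target_location from the blackboard (instant here)."""

    def __init__(self, blackboard):
        self.blackboard = blackboard

    def tick(self):
        self.blackboard["robot_location"] = self.blackboard["target_location"]
        return Status.SUCCESS


class FoundApple:
    """Condition: is there an apple at the robot's current location?"""

    def __init__(self, blackboard, world):
        self.blackboard, self.world = blackboard, world

    def tick(self):
        here = self.blackboard["robot_location"]
        return Status.SUCCESS if "apple" in self.world[here] else Status.FAILURE


def search(world, locations):
    blackboard = {}  # shared storage that every node can read and write
    queue = deque(locations)
    subtree = Sequence([
        GetLoc(blackboard, queue),
        GoToTarget(blackboard),
        FoundApple(blackboard, world),
    ])
    tree = RetryUntilSuccessful(subtree, max_attempts=len(locations))
    status = Status.RUNNING
    while status == Status.RUNNING:
        status = tree.tick()
    return status, blackboard


found_status, found_bb = search({"A": [], "B": ["orange"], "C": ["apple"]}, ["A", "B", "C"])
missing_status, _ = search({"A": [], "B": []}, ["A", "B"])
```

The tree now contains a single subtree no matter how many locations there are; only the queue contents change.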
You can use blackboards for many other tasks. Here’s another extension of our example: Suppose that after finding an object, the robot should announce which object it detected, if any. So, the “FoundApple” and “FoundOrange” conditions could write to a located_objects parameter in the blackboard and a subsequent “Speak” action would read it accordingly. A similar solution could be applied, for instance, if the robot needs to pick up the detected object and has different manipulation policies depending on the type of object.
Fun fact: This section actually came from a real discussion with Davide Faconti, in which… he essentially schooled me. It brings me great joy to turn my humiliation into an educational experience for you all.
Behavior tree software libraries
Let’s talk about how to program behavior trees! There are quite a few libraries dedicated to BTs, but my two highlights in the robotics space are py_trees and BehaviorTree.CPP.
py_trees is a Python library created by Daniel Stonier.
- Because it uses an interpreted language like Python, the interface is very flexible and you can basically do what you want… which has its pros and cons. I personally think this is a good choice if you plan on automatically modifying behavior trees at run time.
- It is being actively developed and with every release you will find new features. However, many of the new developments — not just additional decorators and policy options, but the visualization and logging tools — are already full-steam-ahead with ROS 2. So if you’re still using ROS 1 you will find yourself missing a lot of new things. Check out the PyTrees-ROS Ecosystem page for more details.
- Some of the terminology and design paradigms are a little bit different from the Behavior Trees in Robotics book. For example, instead of Fallback nodes this library uses Selector nodes, and these behave slightly differently.
BehaviorTree.CPP is a C++ library developed by Davide Faconti and Michele Colledanchise (yes, one of the book authors). It should therefore be no surprise that this library follows the book notation much more faithfully.
- This library is quickly gaining traction as the behavior tree library of the ROS developers’ ecosystem, because C++ is similarly the language of production quality development for robotics. In fact, the official ROS 2 navigation stack uses this library in its BT Navigator feature.
- It heavily relies on an XML based workflow, meaning that the recommended way to author a BT is through XML files. In your code, you register node types with user-defined classes (which can inherit from a rich library of existing classes), and your BT is automatically synthesized!
- It is paired with a great tool named Groot which is not only a visualizer, but a graphical interface for editing behavior trees. The XML design principle basically means that you can draw a BT and export it as an XML file that plugs into your code.
- This all works wonderfully if you know the structure of your BT beforehand, but leaves a little to be desired if you plan to modify your trees at runtime. Granted, you can also achieve this using the programmatic approach rather than XML, but this workflow is not documented/recommended, and doesn’t yet play well with the visualization tools.
So how should you choose between these two libraries? They’re both mature, contain a rich set of tools, and integrate well with the ROS ecosystem. It ultimately boils down to whether you want to use C++ or Python for your development. In my example GitHub repo I tried them both out, so you can decide for yourself!
Behavior trees vs. finite-state machines
In my time at MathWorks, I was immersed in designing state machines for robotic behavior using Stateflow; in fact, I even did a YouTube livestream on this topic. However, robotics folks often asked me if there were similar tools for modeling behavior trees, which I had never heard of at the time. Fast forward to my first day at CSAIL, when my colleague at the time (Daehyung Park) showed me one of his repositories and I finally saw my first behavior tree. It wasn’t long until I was working with them in my project as a layer between planning and execution, which I describe in my 2020 recap blog post.
As someone who has given a lot of thought to “how is a BT different from a FSM?”, I wanted to reaffirm that they both have their strengths and weaknesses, and the best thing you can do is learn when a problem is better suited for one or the other (or both).
The Behavior Trees in Robotics and AI book expands on these thoughts with far more rigor, but here is my attempt to summarize the key ideas:
- In theory, it is possible to express anything as a BT, FSM, one of the other abstractions, or as plain code. However, each model has its own advantages and disadvantages when it comes to aiding design at larger scale.
- Specific to BTs vs. FSMs, there is a tradeoff between modularity and reactivity. Generally, BTs are easier to compose and modify while FSMs have their strength in designing reactive behaviors.
Let’s use another robotics example to go deeper into these comparisons. Suppose we have a picking task where a robot must move to an object, grab it by closing its gripper, and then move back to its home position. A side-by-side BT and FSM comparison can be found below. For a simple design like this, both implementations are relatively clean and easy to follow.
Now, what happens if we want to modify this behavior? Say we first want to check whether the pre-grasp position is valid, and correct if necessary before closing the gripper. With a BT, we can directly insert a subtree along our desired sequence of actions, whereas with a FSM we must rewire multiple transitions. This is what we mean when we claim BTs are great for modularity.
On the other hand, there is the issue of reactivity. Suppose our robot is running on a finite power source, so if the battery is low it must return to the charging station before returning to its task. You can implement something like this with BTs, but a fully reactive behavior (that is, the battery state causes the robot to go charge no matter where it is) is easier to implement with a FSM… even if it looks a bit messy.
On the note of “messy”, behavior tree zealots tend to make the argument of “spaghetti state machines” as reasons why you should never use FSMs. I believe that is not a fair comparison. The notion of a hierarchical finite-state machine (HFSM) has been around for a long time and helps avoid this issue if you follow good design practices, as you can see below. However, it is true that managing transitions in a HFSM is still more difficult than adding or removing subtrees in a BT.
There have been specific constructs defined to make BTs more reactive for exactly these applications. For example, there is the notion of a “Reactive Sequence” that can still tick previous children in a sequence even after they have returned Success. In our example, this would allow us to terminate a subtree with Failure if the battery levels are low at any point during that action sequence, which may be what we want.
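A reactive sequence can be sketched as a Sequence that restarts from its first child on every tick, so an earlier condition keeps being checked while a later action is still Running. The plain-Python example below is an illustration with hypothetical names (`ReactiveSequence`, `BatteryOK`, `PickTask`) and made-up battery numbers; the single `PickTask` action stands in for the whole move, grab, and return sequence.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


class ReactiveSequence:
    """Re-ticks every child from the first one on each tick (no memory)."""

    def __init__(self, children):
        self.children = children

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != Status.SUCCESS:
                return status
        return Status.SUCCESS


class BatteryOK:
    """Condition: is the battery at or above the minimum level?"""

    def __init__(self, robot, minimum=20):
        self.robot, self.minimum = robot, minimum

    def tick(self):
        ok = self.robot["battery"] >= self.minimum
        return Status.SUCCESS if ok else Status.FAILURE


class PickTask:
    """Action standing in for the picking sequence; drains the battery."""

    def __init__(self, robot, duration, drain):
        self.robot, self.duration, self.drain = robot, duration, drain

    def tick(self):
        self.robot["battery"] -= self.drain
        self.robot["steps_done"] += 1
        done = self.robot["steps_done"] >= self.duration
        return Status.SUCCESS if done else Status.RUNNING


def run(battery, duration=5, drain=10):
    robot = {"battery": battery, "steps_done": 0}
    tree = ReactiveSequence([BatteryOK(robot), PickTask(robot, duration, drain)])
    status = Status.RUNNING
    while status == Status.RUNNING:
        status = tree.tick()
    return status, robot


full_status, _ = run(battery=100)        # enough charge: task completes
low_status, low_robot = run(battery=40)  # battery dips mid-task: tree fails
```

In the second run the battery check fails partway through, so the tree returns Failure before the task finishes; a parent node could then trigger a charging behavior.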
Because of this modularity / reactivity tradeoff, I like to think that FSMs are good at managing higher-level operating modes (such as normal operation vs. charging), and BTs are good at building complex sequences of behaviors that are excellent at handling recoveries from failure. So, if this design were up to me, it might be a hybrid that looks something like this:
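In code, one possible shape for such a hybrid is a small state machine that owns the operating modes and delegates to a behavior tree while in the normal mode. The sketch below is plain Python with hypothetical names and made-up battery numbers. `PickTask` is a single action standing in for a full task tree; anything with a `tick()` method would slot in.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


class Mode(Enum):
    NORMAL = "normal"
    CHARGING = "charging"


class PickTask:
    """Stand-in for the task behavior tree: drains battery on each tick."""

    def __init__(self, robot, duration, drain):
        self.robot, self.duration, self.drain = robot, duration, drain
        self.steps_done = 0

    def tick(self):
        self.robot["battery"] -= self.drain
        self.steps_done += 1
        done = self.steps_done >= self.duration
        return Status.SUCCESS if done else Status.RUNNING


class HybridController:
    """FSM over operating modes; the NORMAL mode delegates to a BT."""

    def __init__(self, robot, task_tree, low=30, full=100, charge_rate=40):
        self.robot, self.task_tree = robot, task_tree
        self.low, self.full, self.charge_rate = low, full, charge_rate
        self.mode = Mode.NORMAL

    def step(self):
        battery = self.robot["battery"]
        # FSM transitions: the battery level alone decides the mode,
        # regardless of where the task tree currently is.
        if self.mode == Mode.NORMAL and battery < self.low:
            self.mode = Mode.CHARGING
        elif self.mode == Mode.CHARGING and battery >= self.full:
            self.mode = Mode.NORMAL

        if self.mode == Mode.CHARGING:
            self.robot["battery"] = min(self.full, battery + self.charge_rate)
            return Status.RUNNING
        return self.task_tree.tick()  # the tree resumes where it left off


robot = {"battery": 50}
controller = HybridController(robot, PickTask(robot, duration=4, drain=20))

modes, status = [], Status.RUNNING
while status == Status.RUNNING:
    status = controller.step()
    modes.append(controller.mode.value)
```

The mode switch lives in one place (the FSM), while the task logic stays in a tree that can be edited without touching any transitions.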
Conclusion
Thank you for reading through this introductory post, and I look forward to your comments, questions, and suggestions. If you want to try the code examples, check out my example GitHub repository.
To learn more about behavior trees, here are some good resources that I’ve relied on over the past year and a bit.
- Behavior Trees in Robotics and AI: An Introduction — textbook by Michele Colledanchise and Petter Ögren. I cannot recommend this one enough!
- py_trees and BehaviorTree.CPP software libraries.
- Behavior Trees for AI: How They Work — blog post by Chris Simpson.
- Slides comparing Hierarchical FSMs and BTs, by Kyong-Sok Chang and David Zhu.
See you next time!
Three new helper robots at the Hsinchu National Taiwan University Hospital
ADATA Technology has collaborated with researchers at Hsinchu National Taiwan University Hospital (NTUH) to introduce the C-Rob Autonomous Mobile Robots. These robots use Artificial Intelligence (AI) to reduce the workload of healthcare workers as Taiwan continues to combat the Covid-19 pandemic.
Recently, an outbreak of Covid-19 struck Taiwan, and hospitals are prone to becoming hotspots for transmission. When Covid-infected patients enter hospitals, whether for testing or much-needed medical care, hospital staff will often prioritize these patients and devote less time to those visiting the hospital for non-Covid related reasons. On top of this, a clean environment must be maintained, with frequent disinfection to reduce the risk of transmission.
After realizing this problem, researchers at ADATA Technology and NTUH created the C-Rob robots hoping that they will assist hospital staff and help fight the pandemic. As of now, the C-Rob robots serve different purposes: two for disinfection and one for transporting and carrying goods. However, all three are equipped with smart navigation, obstacle avoidance, and the ability to move accurately to a desired location.
The first robot utilizes large UV lights to disinfect large rooms quickly, while the second robot is equipped with two nozzles that can spray disinfectant as needed.
The load-carrying robot consists of small shelving and a tablet stand, allowing hospital staff to have quick access to medical supplies or services.
The first priority of ADATA Technology and NTUH was to officially introduce the C-Rob robots into clinics to reduce the heavy workload of current healthcare workers working in Taiwan to combat Covid-19. The second phase of development will prioritize the optimization of AI algorithms, and the ability to observe behavior patterns of nursing staff (using smart detection) to make smarter decisions using AI.
The C-Rob robots can operate in hospitals and clinics and assist staff by carrying loads, reducing the burden on nursing staff so that more energy can be devoted to what should be prioritized: patients’ medical and physiological needs. The disinfectant-spraying and ultraviolet-sterilization robots effectively clean healthcare facilities; without the need to devote staff to disinfection, hospitals can reduce the number of staff members to further minimize transmission risk.
ADATA and Hsinchu NTUH operate near the heart of Taiwan’s science and technology industry. Yu Zhong-Ren, Dean of Hsinchu NTUH, says that there are plans in place to “apply the technology of smart autonomous mobile robots to medical services through the connection of the medical industry and the technological industry.” ADATA and Hsinchu NTUH hope to use cross-field cooperation to drive the transformation and development of Taiwan’s medical industry. In the future, ADATA and Hsinchu NTUH plan on using the C-Rob robots to create hospital beds that can autonomously navigate and avoid obstacles, further decreasing the burden on hospital staff.
Swimming robot gives fresh insight into locomotion and neuroscience
Scientists at the Biorobotics Laboratory (BioRob) in EPFL’s School of Engineering are developing innovative robots in order to study locomotion in animals and, ultimately, gain a better understanding of the neuroscience behind the generation of movement. One such robot is AgnathaX, a swimming robot employed in an international study with researchers from EPFL as well as Tohoku University in Japan, Institut Mines-Télécom Atlantique in Nantes, France, and Université de Sherbrooke in Canada. The study has just been published in Science Robotics.
A long, undulating swimming robot
“Our goal with this robot was to examine how the nervous system processes sensory information so as to produce a given kind of movement,” says Prof. Auke Ijspeert, the head of BioRob and a member of the Rescue Robotics Grand Challenge at NCCR Robotics. “This mechanism is hard to study in living organisms because the different components of the central and peripheral nervous systems* are highly interconnected within the spinal cord. That makes it hard to understand their dynamics and the influence they have on each other.”
AgnathaX is a long, undulating swimming robot designed to mimic a lamprey, which is a primitive eel-like fish. It contains a series of motors that actuate the robot’s ten segments, which replicate the muscles along a lamprey’s body. The robot also has force sensors distributed laterally along its segments that work like the pressure-sensitive cells on a lamprey’s skin and detect the force of the water against the animal.
The research team ran mathematical models with their robot to simulate the different components of the nervous system and better understand its intricate dynamics. “We had AgnathaX swim in a pool equipped with a motion tracking system so that we could measure the robot’s movements,” says Laura Paez, a PhD student at BioRob. “As it swam, we selectively activated and deactivated the central and peripheral inputs and outputs of the nervous system at each segment, so that we could test our hypotheses about the neuroscience involved.”
Two systems working in tandem
The scientists found that both the central and peripheral nervous systems contribute to the generation of robust locomotion. The benefit of having the two systems work in tandem is that it provides increased resilience against neural disruptions, such as failures in the communication between body segments or muted sensing mechanisms. “In other words, by drawing on a combination of central and peripheral components, the robot could resist a larger number of neural disruptions and keep swimming at high speeds, as opposed to robots with only one kind of component,” says Kamilo Melo, a co-author of the study. “We also found that the force sensors in the skin of the robot, along with the physical interactions of the robot’s body and the water, provide useful signals for generating and synchronizing the rhythmic muscle activity necessary for locomotion.” As a result, when the scientists cut communication between the different segments of the robot to simulate a spinal cord lesion, the signals from the pressure sensors measuring the pressure of the water pushing against the robot’s body were enough to maintain its undulating motion.
These findings can be used to design more effective swimming robots for search and rescue missions and environmental monitoring. For instance, the controllers and force sensors developed by the scientists can help such robots navigate through flow perturbations and better withstand damage to their technical components. The study also has ramifications in the field of neuroscience. It confirms that peripheral mechanisms provide an important function which is possibly being overshadowed by the well-known central mechanisms. “These peripheral mechanisms could play an important role in the recovery of motor function after spinal cord injury, because, in principle, no connections between different parts of the spinal cord are needed to maintain a traveling wave along the body,” says Robin Thandiackal, a co-author of the study. “That could explain why some vertebrates are able to retain their locomotor capabilities after a spinal cord lesion.”
Literature