Archive 12.04.2023

Page 46 of 63

Exploring movement optimization for a cyborg cockroach with machine learning

Have you ever wondered why some insects, such as cockroaches, prefer to stay put or move less in darkness? Some may tell you it's called photophobia, a habit deeply coded in their genes. A further question is whether we can correct this habit, that is, get cockroaches to move in darkness just as they do in bright surroundings.

Toward tactile and proximity sensing in large soft robots

In recent years, robots have become incredibly sophisticated machines capable of performing a wide range of tasks or assisting humans with them. The days of robots functioning behind a security barrier are long gone, and today we may anticipate robots working alongside people in close contact.

Robots are everywhere—improving how they communicate with people could advance human-robot collaboration

Robots are machines that can sense the environment and use that information to perform an action. You can find them nearly everywhere in industrialized societies today. There are household robots that vacuum floors and warehouse robots that pack and ship goods. Lab robots test hundreds of clinical samples a day. Education robots support teachers by acting as one-on-one tutors, assistants and discussion facilitators. And medical robots such as prosthetic limbs can enable someone to grasp and pick up objects with their thoughts.

Interactive fleet learning

Commercial and industrial deployments of robot fleets: package delivery (top left), food delivery (bottom left), e-commerce order fulfillment at Ambi Robotics (top right), autonomous taxis at Waymo (bottom right).

In the last few years we have seen an exciting development in robotics and artificial intelligence: large fleets of robots have left the lab and entered the real world. Waymo, for example, has over 700 self-driving cars operating in Phoenix and San Francisco and is currently expanding to Los Angeles. Other industrial deployments of robot fleets include applications like e-commerce order fulfillment at Amazon and Ambi Robotics as well as food delivery at Nuro and Kiwibot.

Figure 1: “Interactive Fleet Learning” (IFL) refers to robot fleets in industry and academia that fall back on human teleoperators when necessary and continually learn from them over time.

These robots use recent advances in deep learning to operate autonomously in unstructured environments. By pooling data from all robots in the fleet, the entire fleet can efficiently learn from the experience of each individual robot. Furthermore, due to advances in cloud robotics, the fleet can offload data, memory, and computation (e.g., training of large models) to the cloud via the Internet. This approach is known as “Fleet Learning,” a term popularized by Elon Musk in 2016 press releases about Tesla Autopilot and used in press communications by Toyota Research Institute, Wayve AI, and others. A robot fleet is a modern analogue of a fleet of ships, where the word fleet has an etymology tracing back to flēot (‘ship’) and flēotan (‘float’) in Old English.

Data-driven approaches like fleet learning, however, face the problem of the “long tail”: the robots inevitably encounter new scenarios and edge cases that are not represented in the dataset. Naturally, we can’t expect the future to be the same as the past! How, then, can these robotics companies ensure sufficient reliability for their services?

One answer is to fall back on remote humans over the Internet, who can interactively take control and “tele-operate” the system when the robot policy is unreliable during task execution. Teleoperation has a rich history in robotics: the world’s first robots were teleoperated during WWII to handle radioactive materials, and the Telegarden pioneered robot control over the Internet in 1994. With continual learning, the human teleoperation data from these interventions can iteratively improve the robot policy and reduce the robots’ reliance on their human supervisors over time. Rather than a discrete jump to full robot autonomy, this strategy offers a continuous alternative that approaches full autonomy over time while simultaneously enabling reliability in robot systems today.
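As a rough illustration of this feedback loop, the toy Python sketch below lets a small fleet pool human corrections into one shared dataset and counts how many interventions each round requires. Everything in it is an assumption made for illustration: the lookup-table "policy", the `human_policy` stand-in, the 20-state toy environment, and the rule of requesting help whenever a state has not been seen before. It is not the method used in the paper or by any of the companies mentioned here.

```python
import random


def human_policy(state):
    # Hypothetical stand-in for a human teleoperator who always acts correctly.
    return state % 3


def run_round(dataset, num_robots, steps, rng):
    """Run one round of execution and return the number of human interventions."""
    interventions = 0
    for _ in range(num_robots):
        for _ in range(steps):
            state = rng.randrange(20)  # each robot encounters its own random state
            if state not in dataset:
                # Unfamiliar state: cede control and record the human's action.
                dataset[state] = human_policy(state)
                interventions += 1
            # Otherwise the robot acts autonomously using dataset[state].
    return interventions


rng = random.Random(0)
shared_dataset = {}  # pooled across the whole fleet
history = [
    run_round(shared_dataset, num_robots=4, steps=10, rng=rng) for _ in range(5)
]
print(history)  # interventions requested in each of the 5 rounds
```

Because the dataset is shared, a state corrected once for any robot never triggers another request, so in this toy setting the total number of interventions across all rounds cannot exceed the number of distinct states, however large the fleet is.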

The use of human teleoperation as a fallback mechanism is increasingly popular in modern robotics companies: Waymo calls it “fleet response,” Zoox calls it “TeleGuidance,” and Amazon calls it “continual learning.” Last year, a software platform for remote driving called Phantom Auto was recognized by Time Magazine as one of their Top 10 Inventions of 2022. And just last month, John Deere acquired SparkAI, a startup that develops software for resolving edge cases with humans in the loop.

A remote human teleoperator at Phantom Auto, a software platform for enabling remote driving over the Internet.

Despite this growing trend in industry, however, there has been comparatively little focus on this topic in academia. As a result, robotics companies have had to rely on ad hoc solutions for determining when their robots should cede control. The closest analogue in academia is interactive imitation learning (IIL), a paradigm in which a robot intermittently cedes control to a human supervisor and learns from these interventions over time. There have been a number of IIL algorithms in recent years for the single-robot, single-human setting including DAgger and variants such as HG-DAgger, SafeDAgger, EnsembleDAgger, and ThriftyDAgger; nevertheless, when and how to switch between robot and human control is still an open problem. This is even less understood when the notion is generalized to robot fleets, with multiple robots and multiple human supervisors.

IFL Formalism and Algorithms

To this end, in a recent paper at the Conference on Robot Learning we introduced the paradigm of Interactive Fleet Learning (IFL), the first formalism in the literature for interactive learning with multiple robots and multiple humans. Since, as we have seen, this phenomenon already occurs in industry, we can now use the phrase “interactive fleet learning” as unified terminology for robot fleet learning that falls back on human control, rather than keeping track of the names of every individual corporate solution (“fleet response”, “TeleGuidance”, etc.). IFL scales up robot learning with four key components:

  1. On-demand supervision. Since humans cannot effectively monitor the execution of multiple robots at once and are prone to fatigue, the allocation of robots to humans in IFL is automated by some allocation policy \omega. Supervision is requested “on-demand” by the robots rather than placing the burden of continuous monitoring on the humans.
  2. Fleet supervision. On-demand supervision enables effective allocation of limited human attention to large robot fleets. IFL allows the number of robots to significantly exceed the number of humans (e.g., by a factor of 10:1 or more).
  3. Continual learning. Each robot in the fleet can learn from its own mistakes as well as the mistakes of the other robots, allowing the amount of required human supervision to taper off over time.
  4. The Internet. Thanks to mature and ever-improving Internet technology, the human supervisors do not need to be physically present. Modern computer networks enable real-time remote teleoperation at vast distances.

In the Interactive Fleet Learning (IFL) paradigm, M humans are allocated to the robots that need the most help in a fleet of N robots (where N can be much larger than M). The robots share policy \pi_{\theta_t} and learn from human interventions over time.

We assume that the robots share a common control policy \pi_{\theta_t} and that the humans share a common control policy \pi_H. We also assume that the robots operate in independent environments with identical state and action spaces (but not identical states). Unlike a robot swarm of typically low-cost robots that coordinate to achieve a common objective in a shared environment, a robot fleet simultaneously executes a shared policy in distinct parallel environments (e.g., different bins on an assembly line).

The goal in IFL is to find an optimal supervisor allocation policy \omega, a mapping from \mathbf{s}^t (the state of all robots at time t) and the shared policy \pi_{\theta_t} to a binary matrix that indicates which human will be assigned to which robot at time t. The IFL objective is a novel metric we call the “return on human effort” (ROHE):

    \[\max_{\omega \in \Omega} \mathbb{E}_{\tau \sim p_{\omega, \theta_0}(\tau)} \left[\frac{M}{N} \cdot \frac{\sum_{t=0}^T \bar{r}( \mathbf{s}^t, \mathbf{a}^t)}{1+\sum_{t=0}^T \|\omega(\mathbf{s}^t, \pi_{\theta_t}, \cdot) \|^2 _F} \right]\]

where the numerator is the total reward across robots and timesteps and the denominator is the total number of human actions across robots and timesteps. Intuitively, the ROHE measures the performance of the fleet normalized by the total human supervision required. See the paper for more of the mathematical details.
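To make the quantity inside the expectation concrete, here is a minimal Python sketch that evaluates it for a single recorded trajectory. The function name `rohe`, the nested-list data layout, and the example numbers are assumptions for illustration and are not taken from the IFL Benchmark; the sketch also assumes that \bar{r} sums the per-robot rewards at each timestep, as the description of the numerator suggests. Since each allocation matrix is binary, its squared Frobenius norm is simply the number of human-robot assignments at that timestep.

```python
def rohe(rewards, allocations, num_humans):
    """Return on human effort for one trajectory.

    rewards[t][i]        -- reward of robot i at timestep t
    allocations[t][i][j] -- 1 if human j is assigned to robot i at timestep t, else 0
    """
    num_robots = len(rewards[0])
    # Numerator: total reward across robots and timesteps.
    total_reward = sum(sum(step) for step in rewards)
    # Denominator term: sum of squared Frobenius norms of the allocation matrices.
    human_effort = sum(a * a for step in allocations for row in step for a in row)
    return (num_humans / num_robots) * total_reward / (1 + human_effort)


# Hypothetical example: N = 4 robots, M = 2 humans, 2 timesteps.
rewards = [
    [1, 0, 1, 1],
    [1, 1, 1, 0],
]
allocations = [
    [[0, 0], [1, 0], [0, 0], [0, 0]],  # t = 0: human 0 helps robot 1
    [[0, 0], [0, 0], [0, 0], [0, 1]],  # t = 1: human 1 helps robot 3
]
print(rohe(rewards, allocations, num_humans=2))  # (2/4) * 6 / (1 + 2) = 1.0
```

The objective in the formula is the expectation of this value over trajectories induced by the allocation policy, so in practice it would be averaged over many rollouts.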

Using this formalism, we can now instantiate and compare IFL algorithms (i.e., allocation policies) in a principled way. We propose a family of IFL algorithms called Fleet-DAgger, where the policy learning algorithm is interactive imitation learning and each Fleet-DAgger algorithm is parameterized by a unique priority function \hat p: (s, \pi_{\theta_t}) \rightarrow [0, \infty) that each robot in the fleet uses to assign itself a priority score. Similar to scheduling theory, higher priority robots are more likely to receive human attention. Fleet-DAgger is general enough to model a wide range of IFL algorithms, including IFL adaptations of existing single-robot, single-human IIL algorithms such as EnsembleDAgger and ThriftyDAgger. Note, however, that the IFL formalism isn’t limited to Fleet-DAgger: policy learning could be performed with a reinforcement learning algorithm like PPO, for instance.
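One simple way to turn priority scores into the binary allocation matrix described earlier is to give the M humans to the M highest-priority robots. The Python sketch below does only that; the function name `allocate` and the example scores are hypothetical, and the actual Fleet-DAgger algorithms in the paper include details this sketch leaves out.

```python
def allocate(priorities, num_humans):
    """Assign humans to the highest-priority robots.

    priorities[i] is the non-negative priority score of robot i.
    Returns an N x M binary matrix whose entry [i][j] is 1 if human j
    is assigned to robot i.
    """
    num_robots = len(priorities)
    # Rank robots from highest to lowest priority.
    ranked = sorted(range(num_robots), key=lambda i: priorities[i], reverse=True)
    # Only robots with positive priority request help; keep at most M of them.
    needy = [i for i in ranked if priorities[i] > 0][:num_humans]
    matrix = [[0] * num_humans for _ in range(num_robots)]
    for human, robot in enumerate(needy):
        matrix[robot][human] = 1
    return matrix


# Hypothetical scores for N = 5 robots with M = 2 humans available.
print(allocate([0.1, 0.0, 0.9, 0.4, 0.0], num_humans=2))
# [[0, 0], [0, 0], [1, 0], [0, 1], [0, 0]]
```

Each human is assigned to at most one robot and each robot receives at most one human, so every row and column of the matrix contains at most a single 1.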

IFL Benchmark and Experiments

To determine how to best allocate limited human attention to large robot fleets, we need to be able to empirically evaluate and compare different IFL algorithms. To this end, we introduce the IFL Benchmark, an open-source Python toolkit available on GitHub to facilitate the development and standardized evaluation of new IFL algorithms. We extend NVIDIA Isaac Gym, a highly optimized software library for end-to-end GPU-accelerated robot learning released in 2021, without which the simulation of hundreds or thousands of learning robots would be computationally intractable. Using the IFL Benchmark, we run large-scale simulation experiments with N = 100 robots, M = 10 algorithmic humans, 5 IFL algorithms, and 3 high-dimensional continuous control environments (Figure 1, left).

We also evaluate IFL algorithms in a real-world image-based block pushing task with N = 4 robot arms and M = 2 remote human teleoperators (Figure 1, right). The 4 arms belong to 2 bimanual ABB YuMi robots operating simultaneously in 2 separate labs about 1 kilometer apart, and remote humans in a third physical location perform teleoperation through a keyboard interface when requested. Each robot pushes a cube toward a unique goal position randomly sampled in the workspace; the goals are programmatically generated in the robots’ overhead image observations and automatically resampled when the previous goals are reached. Physical experiment results suggest trends that are approximately consistent with those observed in the benchmark environments.

Takeaways and Future Directions

To address the gap between the theory and practice of robot fleet learning as well as facilitate future research, we introduce new formalisms, algorithms, and benchmarks for Interactive Fleet Learning. Since IFL does not dictate a specific form or architecture for the shared robot control policy, it can be flexibly synthesized with other promising research directions. For instance, diffusion policies, recently demonstrated to gracefully handle multimodal data, can be used in IFL to allow heterogeneous human supervisor policies. Alternatively, multi-task language-conditioned Transformers like RT-1 and PerAct can be effective “data sponges” that enable the robots in the fleet to perform heterogeneous tasks despite sharing a single policy. The systems aspect of IFL is another compelling research direction: recent developments in cloud and fog robotics enable robot fleets to offload all supervisor allocation, model training, and crowdsourced teleoperation to centralized servers in the cloud with minimal network latency.

While Moravec’s Paradox has so far prevented robotics and embodied AI from fully enjoying the recent spectacular success that Large Language Models (LLMs) like GPT-4 have demonstrated, the “bitter lesson” of LLMs is that supervised learning at unprecedented scale is what ultimately leads to the emergent properties we observe. Since we don’t yet have a supply of robot control data nearly as plentiful as all the text and image data on the Internet, the IFL paradigm offers one path forward for scaling up supervised robot learning and deploying robot fleets reliably in today’s world.

This post is based on the paper “Fleet-DAgger: Interactive Robot Fleet Learning with Scalable Human Supervision” by Ryan Hoque, Lawrence Chen, Satvik Sharma, Karthik Dharmarajan, Brijen Thananjeyan, Pieter Abbeel, and Ken Goldberg, presented at the Conference on Robot Learning (CoRL) 2022. For more details, see the paper on arXiv, CoRL presentation video on YouTube, open-source codebase on GitHub, high-level summary on Twitter, and project website.

If you would like to cite this article, please use the following bibtex:

@article{ifl_blog,
    title={Interactive Fleet Learning},
    author={Hoque, Ryan},
    url={https://bair.berkeley.edu/blog/2023/04/06/ifl/},
    journal={Berkeley Artificial Intelligence Research Blog},
    year={2023} 
}

For Next-generation Surgical Robots, Minimize the Axial Length of Your Robotic Joints

Standalone arms provide much greater flexibility in positioning compared to the conventional design, allowing multiple arms to be aligned in a plane much closer to parallel. To further approach the parallel ideal, the bulk of each arm must be minimized.

Innocence over utilitarianism: Heightened moral standards for robots in rescue dilemmas

The Moralities of Intelligent Machines research group headed by Michael Laakasuo investigates people's moral views on imaginary rescue situations where the rescuer is either a human or a robot specifically designed for the task. The rescuer has to decide whether to save, for example, one innocent victim of a boating accident or two individuals whose irresponsible behavior caused the accident.

Brain-inspired intelligent robotics: Theoretical analysis and systematic application

Robots have become a crucial indicator for measuring the competitive strength of a country in science and technology. Robotic systems have made advancements in fields such as mechanical engineering, control and artificial intelligence technologies. However, the performance of current robotic systems is still limited and cannot satisfy the demands of an increasing number of applications. To address these problems, researchers have constructed a brain-inspired intelligent robotic system.

A framework to enable touch-enhanced robotic grasping using tactile sensors

To successfully cooperate with humans on manual tasks, robots should be able to grasp and manipulate a variety of objects without dropping or damaging them. Recent research efforts in the field of robotics have thus focused on developing tactile sensors and controllers that could provide robots with the sense of touch and bring their object manipulation capabilities closer to those of humans.

Robotic hand can identify objects with just one grasp

MIT researchers developed a soft-rigid robotic finger that incorporates powerful sensors along its entire length, enabling them to produce a robotic hand that could accurately identify objects after only one grasp. Image: Courtesy of the researchers

By Adam Zewe | MIT News Office

Inspired by the human finger, MIT researchers have developed a robotic hand that uses high-resolution touch sensing to accurately identify an object after grasping it just one time.

Many robotic hands pack all their powerful sensors into the fingertips, so an object must be in full contact with those fingertips to be identified, which can take multiple grasps. Other designs use lower-resolution sensors spread along the entire finger, but these don’t capture as much detail, so multiple regrasps are often required.

Instead, the MIT team built a robotic finger with a rigid skeleton encased in a soft outer layer that has multiple high-resolution sensors incorporated under its transparent “skin.” The sensors, which use a camera and LEDs to gather visual information about an object’s shape, provide continuous sensing along the finger’s entire length. Each finger captures rich data on many parts of an object simultaneously.

Using this design, the researchers built a three-fingered robotic hand that could identify objects after only one grasp, with about 85 percent accuracy. The rigid skeleton makes the fingers strong enough to pick up a heavy item, such as a drill, while the soft skin enables them to securely grasp a pliable item, like an empty plastic water bottle, without crushing it.

These soft-rigid fingers could be especially useful in an at-home-care robot designed to interact with an elderly individual. The robot could lift a heavy item off a shelf with the same hand it uses to help the individual take a bath.

“Having both soft and rigid elements is very important in any hand, but so is being able to perform great sensing over a really large area, especially if we want to consider doing very complicated manipulation tasks like what our own hands can do. Our goal with this work was to combine all the things that make our human hands so good into a robotic finger that can do tasks other robotic fingers can’t currently do,” says mechanical engineering graduate student Sandra Liu, co-lead author of a research paper on the robotic finger.

Liu wrote the paper with co-lead author and mechanical engineering undergraduate student Leonardo Zamora Yañez and her advisor, Edward Adelson, the John and Dorothy Wilson Professor of Vision Science in the Department of Brain and Cognitive Sciences and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL). The research will be presented at the RoboSoft Conference.

A human-inspired finger

The robotic finger is composed of a rigid, 3D-printed endoskeleton that is placed in a mold and encased in a transparent silicone “skin.” Making the finger in a mold removes the need for fasteners or adhesives to hold the silicone in place.

The researchers designed the mold with a curved shape so the robotic fingers are slightly curved when at rest, just like human fingers.

“Silicone will wrinkle when it bends, so we thought that if we have the finger molded in this curved position, when you curve it more to grasp an object, you won’t induce as many wrinkles. Wrinkles are good in some ways — they can help the finger slide along surfaces very smoothly and easily — but we didn’t want wrinkles that we couldn’t control,” Liu says.

The endoskeleton of each finger contains a pair of detailed touch sensors, known as GelSight sensors, embedded into the top and middle sections, underneath the transparent skin. The sensors are placed so the range of the cameras overlaps slightly, giving the finger continuous sensing along its entire length.

The GelSight sensor, based on technology pioneered in the Adelson group, is composed of a camera and three colored LEDs. When the finger grasps an object, the camera captures images as the colored LEDs illuminate the skin from the inside.


Using the illuminated contours that appear in the soft skin, an algorithm performs backward calculations to map the contours on the grasped object’s surface. The researchers trained a machine-learning model to identify objects using raw camera image data.

As they fine-tuned the finger fabrication process, the researchers ran into several obstacles.

First, silicone has a tendency to peel off surfaces over time. Liu and her collaborators found they could limit this peeling by adding small curves along the hinges between the joints in the endoskeleton.

When the finger bends, the bending of the silicone is distributed along the tiny curves, which reduces stress and prevents peeling. They also added creases to the joints so the silicone is not squashed as much when the finger bends.

While troubleshooting their design, the researchers realized wrinkles in the silicone prevent the skin from ripping.

“The usefulness of the wrinkles was an accidental discovery on our part. When we synthesized them on the surface, we found that they actually made the finger more durable than we expected,” she says.

Getting a good grasp

Once they had perfected the design, the researchers built a robotic hand using two fingers arranged in a Y pattern with a third finger as an opposing thumb. The hand captures six images when it grasps an object (two from each finger) and sends those images to a machine-learning algorithm which uses them as inputs to identify the object.

Because the hand has tactile sensing covering all of its fingers, it can gather rich tactile data from a single grasp.

“Although we have a lot of sensing in the fingers, maybe adding a palm with sensing would help it make tactile distinctions even better,” Liu says.

In the future, the researchers also want to improve the hardware to reduce the amount of wear and tear in the silicone over time and add more actuation to the thumb so it can perform a wider variety of tasks.


This work was supported, in part, by the Toyota Research Institute, the Office of Naval Research, and the SINTEF BIFROST project.
