
Collaborative Robot Market

Technology overview

Collaborative robots (also known as co-bots) are an emerging automation technology designed to work alongside humans with precision, strength, and speed to achieve high efficiency in production. Collaborative robots differ from industrial robots in several ways, such as the absence of a “safety fence” while working alongside humans, simplified programming and reduced setup time, integration of automatic speed reduction and distance monitoring or proximity sensors, and the ability to reduce motor power and force when a worker comes too close to the co-bot. Co-bots are safe and easy to operate, and they yield a quick return on investment (ROI) owing to numerous advantages: easy installation, simple programming, no additional costs (such as those incurred to build and maintain the safety fencing required for traditional industrial robots), less costly tools and accessories (such as grippers, cameras, and laser systems), and acceptance by the workforce. Co-bot operations are primarily governed by the ISO 15066:2016 standard, which specifies various safety requirements; ISO 10218-1:2011, RIA/ANSI R15.06-2012, ISO 12100:2010, and ISO 13849-1:2008 are a few other standards governing the safe operation of co-bots. Continuous advancements in sensor technologies, software, and end-of-arm tooling (EOAT) are further expanding collaborative robot capabilities and applications. Co-bots can perform a myriad of tasks that are not feasible with traditional industrial robots. Small and medium-sized enterprises (SMEs) are the major potential customers of co-bots, as their operations involve light-duty tasks, such as assembly, pick and place, machine tending, and quality inspection, that are low in value and riskier to automate with traditional industrial robots. As per MarketsandMarkets’ analysis, the overall collaborative robot market is expected to be worth over USD 8 billion by 2025, growing at a CAGR of ~55% from 2018 to 2025.

The market for collaborative robots is expected to expand phenomenally in the near future because of significant traction from SMEs worldwide, driven by the recognizable benefits of co-bots, a lower total cost of ownership (compared with other low-payload industrial robots), and a one-stop solution for a safe working environment while offering advanced industrial automation to end users. Recent geopolitical developments bolstering the manufacturing sector in developed nations across North America and Europe; rising factory floor automation in China, Japan, and India; growing integration of artificial intelligence (AI) into existing robotics infrastructure; enhancements in 3D machine vision technology; and the potential of cloud robotics to realize connected workspaces are some of the key trends to watch in the collaborative robotics ecosystem. The shortage of skilled workforce for low-value tasks (such as assembly and pick and place) and the limitations of traditional industrial robots in agile and flexible operations that must comply with certain safety standards are opening new growth avenues for co-bots, not only in manufacturing but also in sectors such as food, consumer packaging, and healthcare.

 

FIGURE 1

Industrial Robot market vs. Collaborative Robot market (2017–2023)

Image Source: www.marketsandmarkets.com

Source: Industry Journals, Company Websites, Annual Reports, Press Releases, Expert Interviews, and MarketsandMarkets’ Analysis

Collaborative robots accounted for ~2% of the global industrial robot market in 2017, when the co-bot market was valued at USD 286.8 million. The share of collaborative robots is expected to increase significantly from 2017 to 2023; they are expected to account for ~17% of the global industrial robot market by 2023, with a market value of USD 4,284.4 million.
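For readers who want to verify the growth rate implied by these figures, a minimal arithmetic sketch (using only the values quoted above) is:

```python
# Illustrative check of the implied CAGR from the figures cited above
# (USD 286.8 million in 2017 growing to USD 4,284.4 million by 2023).
start_value = 286.8      # market size in 2017, USD million
end_value = 4284.4       # projected market size in 2023, USD million
years = 2023 - 2017      # number of compounding periods

cagr = (end_value / start_value) ** (1 / years) - 1
print(f"Implied CAGR 2017-2023: {cagr:.1%}")   # roughly 57%
```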

 

Key Drivers for Collaborative Robots

High ROI and low price of collaborative robots attracting SMEs

Depending upon the working capacity and use, most co-bots offer ROI within 6–12 months, and a few co-bots offer ROI within 3–6 months. The collaborative robot market is highly competitive, wherein some players offer higher-payload co-bots and enjoy a competitive advantage over others. FANUC’s CR-35iA (~USD 80,000) is one of the earliest collaborative robots with a payload capacity of up to 35 kg. KUKA’s LBR iiwa is another co-bot priced at ~USD 100,000. ABB’s YuMi and Kawada’s NEXTAGE are other high-priced dual-arm co-bots priced at ~USD 40,000 and ~USD 60,000, respectively, because of the complexity involved in operating these co-bots. Co-bots from other leading manufacturers — UR3 and UR5 from Universal Robots (Denmark), Baxter and Sawyer from Rethink Robotics (US), TM5 Series from Techman Robot (Taiwan), and OB7 from Productive Robotics, Inc. (US) — are available in a price range of USD 20,000–40,000. Rising demand for low-cost co-bots, such as AUBO-i5 from AUBO Robotics (US), Eva from Automata Technologies Limited (UK), and Panda from Franka Emika GmbH (Germany), is further expected to fuel competition in the collaborative robotics market, which in turn will drive down the prices of co-bots in the upcoming years. Considering the additional cost of robot peripheral integration (i.e., the total cost of installing one robot along with the complete set of necessary peripherals and equipment), the cost of integrating a co-bot could reach up to USD 100,000, which is still far less than the total cost incurred for conventional industrial robots.
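As an illustration of how such payback periods arise, a simple estimate is sketched below. The savings and cost figures are hypothetical assumptions for illustration only, not data from the report; only the USD 100,000 upper bound on integration cost comes from the paragraph above.

```python
# Hypothetical payback-period estimate for a co-bot deployment.
# The monthly figures below are illustrative assumptions, not report data.
total_investment = 100_000        # robot + peripherals, USD (upper bound cited above)
monthly_labor_savings = 9_000     # assumed labor cost offset per month, USD
monthly_operating_cost = 500      # assumed power/maintenance per month, USD

payback_months = total_investment / (monthly_labor_savings - monthly_operating_cost)
print(f"Estimated payback period: {payback_months:.1f} months")  # ~11.8 months
```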

Co-bots come as a packaged solution, comprising all the necessary accessories from manufacturers. The depreciation rate of co-bots is lower than that of traditional heavyweight industrial robots. These robots can simply be plugged in to start functioning, which reduces time, manpower, and floor space, and increases productivity when deployed on a large scale. Co-bots also enable agile manufacturing and, if required, can be easily redeployed from one production line to another without the hassle of changing the programming — i.e., co-bots can be trained by simply hand-guiding their arms — thereby eliminating the need for an expert software programmer to rewrite the robot’s entire code.

Increasing investments in automation by industries

The robot revolution has made automated machines a part of daily life and has gone beyond industrial applications. Companies with expertise in different fields of automation and robotics are investing heavily in robotic ventures. For instance, KUKA AG (Germany), through its investment in (acquisition of) Reis Robotics, gained access to new sectors such as foundry, solar energy, and battery production, which helped KUKA bundle its competence in automated systems.

Through mergers and acquisitions in the robotics sector, companies are gaining access to the fast-growing robotics industry. The use of industrial robotics has accelerated automation in an intelligent way, offering functional benefits such as production flexibility with enhanced product quality, improved production output with lower operating costs, and freed-up floor space for other uses. As a result, the automation industry is driving the market for industrial robotics.

The economics of robotics companies is improving, with more units sold in recent years, giving automation a new face. Drivers for this are co-bots with better technology as well as developments in many regions worldwide. The collective benefit of adopting robotics is increased production through human–robot collaboration on the same floor. Industrial robotics automation offers the most viable opportunities for investment in sectors such as logistics and e-commerce, healthcare, defense, and agriculture. Automation and technological advancements are further favoring the adoption of collaborative robots by several end-user industries.

 

Current market scenario

FIGURE 2

AUTOMOTIVE INDUSTRY TO LEAD THE MARKET, IN TERMS OF VALUE, BY 2023

Image Source: www.marketsandmarkets.com

Source: Industry Journals, Company Websites, Annual Reports, Press Releases, Expert Interviews, and MarketsandMarkets’ Analysis

Primary industrial verticals for co-bots are electronics; automotive; metals and machining; plastics and polymers; food and agriculture; furniture and equipment; healthcare; scientific research; logistics; education; consumer goods; and die cast and foundry. The automotive industry accounted for the largest share of the collaborative robot market in 2017. In this industry, co-bots are used to perform diverse assembly tasks. Other crucial tasks involving co-bots in the automotive production line are pick and place, quality inspection, packaging and palletizing, machine tending, and material handling.

In 2017, Europe led the collaborative robot market in terms of market share, followed by Asia Pacific and North America. Germany held the largest share of the European collaborative robot market, followed by the UK and France. The growth of the market in Europe was driven by strong government support to promote factory automation solutions, thus supporting the Industry 4.0 drive. Many authorities in Europe have collectively or independently developed human–robot collaboration and protection guidelines, which are driving this market.

Universal Robots A/S (Denmark) and ABB Ltd. (Switzerland) are two of the key players in the collaborative robot market. Universal Robots offers the UR3, UR5, and UR10 collaborative robots, named after their payloads in kilograms. The company launched its first robot in December 2008 and currently leads the collaborative robot sector. In June 2016, Universal Robots launched Universal Robots+, a user-friendly online platform allowing users to easily customize robots as per their unique requirements. The company expanded its business operations in the US and India during 2016 to further increase its market reach in these countries.

ABB Ltd. (Switzerland) offers IRB 14000 YuMi, a dual-arm collaborative robot used in a range of light-duty applications, such as small parts assembly, testing and packaging, electronic parts and components assembly, and consumer products assembly. YuMi has a payload capacity of 500 g per arm and an overall weight of 38 kg; it can move at a maximum velocity of 1,500 mm/s. ABB also provides software support (under the SISTEMA brand) for its YuMi robot. The company mainly focuses on expanding the reach of its co-bot into new geographies and acquiring other market players. For instance, in June 2017, ABB expanded the presence of its YuMi collaborative robot in the Netherlands by installing three YuMi co-bots on a production line at its electrification products plant in Ede. In July 2017, ABB further strengthened software support for its robotics business with the acquisition of Bernecker + Rainer Industrie-Elektronik GmbH (Germany). Rising competition among existing co-bot manufacturers and the entry of new players into the collaborative robot market (such as Techman Robot (a subsidiary of Quanta Storage, Inc., Taiwan) in 2012, Productive Robotics, Inc. (US) in 2013, and Franka Emika GmbH (Germany) in 2016) will further drive down the prices of collaborative robots in the near future. Collaborative robots, however, are less powerful and slower than traditional industrial robots, which becomes a challenge in handling a wide range of industrial-grade operations.

 

Conclusion

Collaborative robots are used in several industrial and service applications. The market is growing at a rapid pace owing to technological advancements and the integration of automation tools. The applications of collaborative robots have been increasing from automotive, packaging, distribution, and metalworking to inspection, mobile platforms, and human–machine interaction. However, making collaborative robots safe for routine interaction with humans is a major technological challenge that the robotics industry has been facing for a long time.

In industrial applications, operating with more than six axes (degrees of freedom) requires complex programming and compact assembly for a co-bot to perform an activity. As current trends in robotics require the merging of mechanical and software engineering, many companies are capitalizing on this to enhance their expertise in mechatronics and software.

Complex programming and multiple sensors are required to build a collaborative platform that can operate autonomously, providing a degree of intelligence and reliable mechanisms while performing a particular task. Apart from multiple-axis control, precise motion control has been a major technological challenge for operating alongside humans.

Furthermore, robot manufacturers face challenges in pricing co-bots and increasing their life expectancy. Robot operating capabilities need to be scaled so that systems respond instantly to real-world conditions and perform a defined set of steps precisely. For example, even modest changes in lighting conditions may result in considerably divergent robot behaviors. These technological challenges need to be addressed effectively to make humans confident enough to work alongside robots.

 

References:

  1. https://www.marketsandmarkets.com/Market-Reports/collaborative-robot-market-194541294.html
  2. https://www.marketsandmarkets.com/Market-Reports/Industrial-Robotics-Market-643.html

 

Author Details

Image: Marketsandmarkets.com

Name: Amritpal Singh Bedi

Designation: Research Analyst

Key Areas of Interest: Service Robotics, Collaborative Robotics, 3D Printing, Roll-to-Roll Printing, and Industrial Refrigeration, among others

Academic Qualification: MBA (Marketing), B. Tech (Electronics and Communication)

 


Why it’s so hard to reach an international agreement on killer robots

For several years, civil society groups have been calling for a ban on what they call "killer robots". Scores of technologists have lent their voice to the cause. Some two dozen governments now support a ban and several others would like to see some kind of international regulation.

A new spin for soft micro-actuators

Credit: Wyss Institute Harvard

By Benjamin Boettner

Manipulating delicate tissues such as blood vessels during difficult surgeries, or gripping fragile organisms in the deep sea presents a challenge to surgeons and researchers alike. Roboticists have made inroads into this problem by developing soft actuators on the microscale that are made of elastic materials and, through the expansion or contraction of embedded active components, can change their shapes to gently handle objects without damaging them. However, the specific designs and materials used for their fabrication so far still limit their range of motion and the strength they can exert at scales on which surgeons and researchers would like to use them.

“The fabrication of soft micro-actuators is currently facing two main limitations: the reinforcement layers that direct bending often increase the overall size of the device,” explained Nina Sinatra, a Graduate Student working with Wyss Institute Core Faculty members Robert Wood, Ph.D., and Kevin Kit Parker, Ph.D., at Harvard’s Wyss Institute for Biologically Inspired Engineering and the Harvard Paulson School of Engineering and Applied Sciences (SEAS). “Also, bending micro-actuators in different ways, such as with twisting motions, thus far required that the geometry of the actuator had to be re-designed.”

Now, researchers led by Wood and Parker at the Wyss Institute and SEAS, including Sinatra, have joined forces to create a new type of soft micro-actuator that can be fabricated using simple shapes and geometries, exhibits greater toughness, and can perform different bend or bend-twist motions. As reported in the Journal of Micromechanics and Microengineering, the team achieved this feat for the first time by combining two very different fabrication approaches and materials with complementary functions.

This video explains how two fabrication techniques, soft lithography and rotary jet spinning of nanofibers, are combined to create a new type of micro-actuator for the manipulation of small fragile objects in challenging environments. Credit: Wyss Institute at Harvard University

“At the start of the project, we hypothesized that we could extend the toughness and bending capabilities of soft micro-actuators simply by reinforcing them with a thin layer of a nanofiber network,” said Sinatra, the study’s first author.

To test this idea, the team first used soft lithography to fabricate the micro-actuator’s elastic component as a thin, elongated shape with a rectangular cross-section. The flexible polymer polydimethylsiloxane (PDMS) was used for this component because of its elastic and chemical properties and compatibility with many biomedical and industrial applications. Soft lithography is a layered fabrication technique that is widely deployed to create diverse micro-scale structures from soft elastomeric materials with the help of stamps, molds, or masks. Choosing to manufacture the micro-actuator layer-by-layer also allowed them to embed a hollow channel along one side, which, in the finished design, can be pressurized with air or liquid to trigger the micro-actuator’s motions.

For the micro-actuator’s nanofiber component, the team took advantage of a rotary jet spinning (RJS) technique previously developed by Parker’s group to create nanofibers from biological materials in an ongoing effort to manufacture regenerating heart valves for patients in need. In RJS, a rotating nozzle extrudes a solution of proteins or polymers that solidify as thin nanofibers; to fabricate nanofabrics for soft micro-actuators, the team used solutions of strong and stretchable synthetic polymers such as nylon and polyurethane. They collected the extruded nanofibers in aligned sheets and bonded laser-cut sections of the sheets to the side opposite of the channel in the soft PDMS component. As the channel is pressurized, the nanofabric will reinforce one side of the actuator and preferentially induce bending toward this direction. This step completed the final composite micro-actuator.

With one end of the actuator free and the other end fixed to a pressure source, the team found their original hypothesis confirmed: the composite soft micro-actuators were 2.3 times tougher than control micro-actuators made of PDMS alone and capable of withstanding pressures almost 26% higher. “By varying the nanofiber orientation and incising angled patterns into nanofiber sheets, we could program bend-twist motions into the soft micro-actuators, which widens our design space and the usability of these micro-devices significantly,” said Sinatra. The resulting micro-actuators are the slimmest soft pneumatic devices to achieve a more complex bending-twisting motion, which points to the potential for unprecedented delicate manipulation on the microscale.

“This new prototype of nanofiber-enforced soft micro-actuators is a compelling starting point for real-world applications that, interfacing this new actuator design with other tools, can facilitate micromanipulation of various fragile objects in challenging environments,” said Wood, who also is the Charles River Professor of Engineering and Applied Sciences at SEAS.

Robots can now pick up any object after inspecting it

With the DON system, a robot can do novel tasks like look at a shoe it has never seen before and successfully grab it by its tongue.
Photo: Tom Buehler/CSAIL

Humans have long been masters of dexterity, a skill that can largely be credited to the help of our eyes. Robots, meanwhile, are still catching up.

Certainly there’s been some progress: For decades, robots in controlled environments like assembly lines have been able to pick up the same object over and over again. More recently, breakthroughs in computer vision have enabled robots to make basic distinctions between objects. Even then, though, the systems don’t truly understand objects’ shapes, so there’s little the robots can do after a quick pick-up.  

In a new paper, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) say that they’ve made a key development in this area of work: a system that lets robots inspect random objects, and visually understand them enough to accomplish specific tasks without ever having seen them before.

The system, called Dense Object Nets (DON), looks at objects as collections of points that serve as sort of visual roadmaps. This approach lets robots better understand and manipulate items, and, most importantly, allows them to even pick up a specific object among a clutter of similar objects — a valuable skill for the kinds of machines that companies like Amazon and Walmart use in their warehouses.

For example, someone might use DON to get a robot to grab onto a specific spot on an object, say, the tongue of a shoe. From that, it can look at a shoe it has never seen before, and successfully grab its tongue.

“Many approaches to manipulation can’t identify specific parts of an object across the many orientations that object may encounter,” says PhD student Lucas Manuelli, who wrote a new paper about the system with lead author and fellow PhD student Pete Florence, alongside MIT Professor Russ Tedrake. “For example, existing algorithms would be unable to grasp a mug by its handle, especially if the mug could be in multiple orientations, like upright, or on its side.”

The team views potential applications not just in manufacturing settings, but also in homes. Imagine giving the system an image of a tidy house, and letting it clean while you’re at work, or using an image of dishes so that the system puts your plates away while you’re on vacation.

What’s also noteworthy is that none of the data was actually labeled by humans. Instead, the system is what the team calls “self-supervised,” not requiring any human annotations.

Two common approaches to robot grasping involve either task-specific learning, or creating a general grasping algorithm. These techniques both have obstacles: Task-specific methods are difficult to generalize to other tasks, and general grasping doesn’t get specific enough to deal with the nuances of particular tasks, like putting objects in specific spots.

The DON system, however, essentially creates a series of coordinates on a given object, which serve as a kind of visual roadmap, to give the robot a better understanding of what it needs to grasp, and where.

The team trained the system to look at objects as a series of points that make up a larger coordinate system. It can then map different points together to visualize an object’s 3-D shape, similar to how panoramic photos are stitched together from multiple photos. After training, if a person specifies a point on an object, the robot can take a photo of that object, and identify and match points to be able to then pick up the object at that specified point.
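To make the matching step concrete, here is a minimal sketch of the nearest-neighbour lookup in descriptor space. The dense per-pixel descriptor maps would come from a trained network; the function and variable names are illustrative assumptions, not the authors’ released code.

```python
import numpy as np

def match_point(ref_descriptors, ref_pixel, new_descriptors):
    """Find the pixel in a new image whose descriptor best matches
    the descriptor at ref_pixel in the reference image.

    ref_descriptors: (H, W, D) dense descriptor map of the reference image
    ref_pixel:       (row, col) point specified by the user
    new_descriptors: (H, W, D) dense descriptor map of the new image
    """
    target = ref_descriptors[ref_pixel[0], ref_pixel[1]]          # (D,) descriptor
    dists = np.linalg.norm(new_descriptors - target, axis=-1)     # (H, W) distances
    row, col = np.unravel_index(np.argmin(dists), dists.shape)    # best match
    return (row, col), dists[row, col]

# Usage (descriptor maps would come from the trained network):
# best_pixel, distance = match_point(desc_ref, (120, 85), desc_new)
# The robot can then grasp the object at best_pixel in the new view.
```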

This is different from systems like UC-Berkeley’s DexNet, which can grasp many different items, but can’t satisfy a specific request. Imagine a child at 18 months old, who doesn’t understand which toy you want it to play with but can still grab lots of items, versus a four-year-old who can respond to “go grab your truck by the red end of it.”

In one set of tests done on a soft caterpillar toy, a Kuka robotic arm powered by DON could grasp the toy’s right ear from a range of different configurations. This showed that, among other things, the system has the ability to distinguish left from right on symmetrical objects.

When testing on a bin of different baseball hats, DON could pick out a specific target hat despite all of the hats having very similar designs — and having never seen pictures of the hats in training data before.

“In factories robots often need complex part feeders to work reliably,” says Florence. “But a system like this that can understand objects’ orientations could just take a picture and be able to grasp and adjust the object accordingly.”

In the future, the team hopes to improve the system to a place where it can perform specific tasks with a deeper understanding of the corresponding objects, like learning how to grasp an object and move it with the ultimate goal of say, cleaning a desk.

The team will present their paper on the system next month at the Conference on Robot Learning in Zürich, Switzerland.

European robot network helps nurses and home builders

Four knowledge institutes across Europe – the Danish Technological Institute (DTI, DK), Fraunhofer IPA (DE), Tecnalia (ES) and the Manufacturing Technology Centre (MTC, UK) – teamed up to offer highly qualified consulting services at no cost to companies that either wanted to use robot technology in their production or wanted to develop new robot technology to sell.

It’s been two years since the European project ROBOTT-NET kicked off; more than 60 projects have been executed successfully and 1,500 people have visited ROBOTT-NET’s Open Labs. Among other things, ROBOTT-NET has helped nurses through F&P Robotics’ voucher project. The project “Robot Assistant for Nurses” (RAN) aimed to automate frequent and labor-intensive tasks of material preparation before the taking of blood samples.

ROBOTT-NET also helped make home builders’ lives easier with the Robot at Works voucher project. The goal of the project was to use digital data in building projects to automate production directly on the building site.

The robot network has also worked with the Swastebot – a specialised waste sorting robot. The company Sadako developed a disruptive system for the recovery of valuable materials such as cans, bottles, and bricks in waste treatment plants. The system is based on artificial intelligence for material detection.

In total, ROBOTT-NET has worked on 62 projects with companies such as Danfoss, Kärcher, Air Liquide and Nissan.

Now, eight projects have been selected for a Pilot to strengthen robotics development and the competitiveness of European manufacturing even further. The projects will be teamed up with real-world use cases, and ROBOTT-NET will take them all the way to market – to make even more robotics ideas in industrial and professional service robotics a reality.

You can check out what ROBOTT-NET has achieved so far in this video:

If you want to stay updated on ROBOTT-NET and the events in the projects, you can follow ROBOTT-NET on Twitter and LinkedIn.

Dexterous manipulation with reinforcement learning: Efficient, general, and low-cost


In this post, we demonstrate how deep reinforcement learning (deep RL) can be used to learn how to control dexterous hands for a variety of manipulation tasks. We discuss how such methods can learn to make use of low-cost hardware, can be implemented efficiently, and how they can be complemented with techniques such as demonstrations and simulation to accelerate learning.

Why Dexterous Hands?

A majority of robots in use today use simple parallel jaw grippers as manipulators, which are sufficient for structured settings like factories. However, manipulators that are capable of performing a wide array of tasks are essential for unstructured human-centric environments like the home. Multi-fingered hands are among the most versatile manipulators, and enable a wide variety of skills we use in our everyday life such as moving objects, opening doors, typing, and painting.


Unfortunately, controlling dexterous hands is extremely difficult, which limits their use. High-end hands can also be extremely expensive, due to delicate sensing and actuation. Deep reinforcement learning offers the promise of automating complex control tasks even with cheap hardware, but many applications of deep RL use huge amounts of simulated data, making them expensive to deploy in terms of both cost and engineering effort. Humans can learn motor skills efficiently, without a simulator and millions of simulations.

We will first show that deep RL can in fact be used to learn complex manipulation behaviors by training directly in the real world, with modest computation and low-cost robotic hardware, and without any model or simulator. We then describe how learning can be further accelerated by incorporating additional sources of supervision, including demonstrations and simulation. We demonstrate learning on two separate hardware platforms: an inexpensive custom-built 3-fingered hand (the Dynamixel Claw), which costs under $2,500, and the higher-end Allegro hand, which costs about $15,000.

Left: Dynamixel Claw. Right: Allegro Hand.

Model-free Reinforcement Learning in the Real World

Deep RL algorithms learn by trial and error, maximizing a user-specified reward function from experience. We’ll use a valve rotation task as a working example, where the hand must open a valve or faucet by rotating it 180 degrees.

Illustration of valve rotation task.

The reward function simply consists of the negative distance between the current and desired valve orientation, and the hand must figure out on its own how to move to rotate it. A central challenge in deep RL is using this weak reward signal to find a complex and coordinated behavior strategy (a policy) that succeeds at the task. The policy is represented by a multilayer neural network. This typically requires a large number of trials, so much so that the community is split on whether deep RL methods can be used for training outside of simulation. However, relying on simulation imposes major limitations on applicability: learning directly in the real world makes it possible to learn any task from experience, while using simulators requires designing a suitable simulation, modeling the task and the robot, and carefully adjusting their parameters to achieve good results. We will show later that simulation can accelerate learning substantially, but we first demonstrate that existing RL algorithms can in fact learn this task directly on real hardware.
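As a concrete illustration of the reward described above, a minimal sketch is shown below; the function and variable names are illustrative, not the authors’ actual code.

```python
import numpy as np

def valve_rotation_reward(current_angle, target_angle):
    """Negative distance between current and desired valve orientation.

    Angles are in radians; the difference is wrapped to [-pi, pi] so the
    reward is largest (zero) when the valve reaches the target orientation.
    """
    error = np.arctan2(np.sin(target_angle - current_angle),
                       np.cos(target_angle - current_angle))
    return -abs(error)

# Example: valve currently at 0 rad, target is a 180-degree rotation.
print(valve_rotation_reward(0.0, np.pi))   # -> approximately -pi
```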

A variety of algorithms should be suitable. We use Truncated Natural Policy Gradient to learn the task, which requires about 9 hours on real hardware.

Learning progress of the dynamixel claw on valve rotation.

Final Trained Policy on valve rotation.

The direct RL approach is appealing for a number of reasons. It requires minimal assumptions, and is thus well suited to autonomously acquire a large repertoire of skills. Since this approach assumes no information other than access to a reward function, it is easy to relearn the skill in a modified environment, for example when using a different object or a different hand – in this case, the Allegro hand.

360° valve rotation with Allegro Hand.

The same exact method can learn to rotate the valve when we use a different material. We can learn how to rotate a valve made out of foam. This can be quite difficult to simulate accurately, and training directly in the real world allows us to learn without needing accurate simulations.

Dynamixel claw rotating a foam screw.

Without any modification, the same approach takes 8 hours to solve a different task, which requires flipping an object 180 degrees around the horizontal axis.

Dynamixel claw flipping a block.

These behaviors were learned with low-cost hardware (under $2,500) and a single consumer desktop computer.

Accelerating Learning with Human Demonstrations

While model-free RL is extremely general, incorporating supervision from human experts can help accelerate learning further. One way to do this, which we describe in our paper on Demonstration Augmented Policy Gradient (DAPG), is to incorporate human demonstrations into the reinforcement learning process. Related approaches have been proposed in the context of off-policy RL, Q-learning, and other robotic tasks. The key idea behind DAPG is that demonstrations can be used to accelerate RL in two ways:


  1. Provide a good initialization for the policy via behavior cloning.

  2. Provide an auxiliary learning signal throughout the learning process to guide exploration using a trajectory tracking auxiliary reward.

The auxiliary objective prevents the policy from diverging from the demonstrations during the RL process. Pure behavior cloning with limited data is often ineffective in training successful policies due to distribution drift and limited data support. RL is crucial for robustness and generalization, and the use of demonstrations can substantially accelerate the learning process. We previously validated this algorithm in simulation on a variety of tasks, shown below, where each task used only 25 human demonstrations collected in virtual reality. DAPG enables a speedup of up to 30x on these tasks, while also learning natural and robust behaviors.
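As a rough sketch of how the two ingredients combine, the standard policy gradient can be augmented with a demonstration term whose weight decays over training. The names and the exact weighting below are illustrative simplifications; the DAPG paper’s formulation differs in detail.

```python
import numpy as np

def dapg_gradient(grad_logp_rl, advantages, grad_logp_demo,
                  lam0=0.1, lam1=0.95, iteration=0):
    """Sketch of a demo-augmented policy gradient.

    grad_logp_rl:   (N, P) gradients of log pi(a|s) for on-policy samples
    advantages:     (N,)   advantage estimates for those samples
    grad_logp_demo: (M, P) gradients of log pi(a|s) for demonstration pairs
    The demonstration term is weighted by a factor that decays with the
    training iteration, so demos guide exploration early and matter less later.
    """
    rl_term = grad_logp_rl.T @ advantages                     # standard policy gradient
    demo_weight = lam0 * (lam1 ** iteration) * np.max(np.abs(advantages))
    demo_term = demo_weight * grad_logp_demo.sum(axis=0)      # pull toward demo actions
    return rl_term + demo_term

# Toy usage with random data (shapes only, for illustration):
rng = np.random.default_rng(0)
g = dapg_gradient(rng.normal(size=(64, 10)), rng.normal(size=64),
                  rng.normal(size=(25, 10)), iteration=5)
print(g.shape)   # (10,)
```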

Behaviors learned in simulation with DAPG: object pickup, tool use, in-hand, door opening.

Behaviors robust to size and shape variations; natural and smooth behavior.

In the real world, we can use this algorithm with the Dynamixel claw to significantly accelerate learning. The demonstrations are collected with kinesthetic teaching, where a human teacher moves the fingers of the robot directly in the real world. This brings down the training time on both tasks to under 4 hours.

Left: Valve rotation policy with DAPG. Right: Flipping policy with DAPG.

Learning Curves of RL from scratch on hardware vs DAPG.

Demonstrations provide a natural way to incorporate human priors and accelerate the learning process. Where high-quality successful demonstrations are available, augmenting RL with them has the potential to substantially accelerate learning. However, obtaining demonstrations may not be possible for all tasks or robot morphologies, so alternate acceleration schemes must also be pursued.

Accelerating Learning with Simulation

A simulated model of the task can help augment the real-world data with large amounts of simulated data to accelerate the learning process. For the simulated data to be representative of the complexities of the real world, randomization of various simulation parameters is often necessary. This kind of randomization has previously been observed to produce robust policies, and can facilitate transfer in the face of both visual and physical discrepancies. Our experiments also suggest that simulation-to-reality transfer with randomization can be effective.
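To illustrate the idea, here is a minimal sketch of per-episode parameter randomization. The parameter names and ranges are assumptions for illustration, not the setup used in this work.

```python
import random

def sample_randomized_sim_params():
    """Sample physical parameters for one simulated training episode.

    The specific parameters and ranges are illustrative assumptions; the
    point is that a policy trained across many such variations is more
    likely to transfer to the real hardware.
    """
    return {
        "valve_friction":   random.uniform(0.5, 1.5),
        "valve_mass_kg":    random.uniform(0.05, 0.25),
        "joint_damping":    random.uniform(0.8, 1.2),
        "actuator_delay_s": random.uniform(0.0, 0.04),
    }

# At the start of each episode, the simulator would be reconfigured:
# sim.reset(**sample_randomized_sim_params())   # hypothetical simulator API
```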

Policy for valve rotation transferred from simulation using randomization.

Transfer from simulation has also been explored in concurrent work for dexterous manipulation to learn impressive behaviors, and in a number of prior works for tasks such as picking and placing, visual servoing, and locomotion. While simulation-to-real transfer enabled by randomization is an appealing option, especially for fragile robots, it has a number of limitations. First, the resulting policies can end up being overly conservative due to the randomization, a phenomenon that has been widely observed in the field of robust control. Second, the particular choice of parameters to randomize is crucial for good results, and insights from one task or problem domain may not transfer to others. Third, increasing the amount of randomization results in more complex models, tremendously increasing the training time and required computational resources (Andrychowicz et al. report 100 years of simulated experience, training in 50 hours on thousands of CPU cores). Directly training in the real world may be more efficient and lead to better policies. Finally, and perhaps most importantly, an accurate simulator must be constructed manually, with each new task modeled by hand in the simulation, which requires substantial time and expertise. However, leveraging simulations appropriately can accelerate learning, and more systematic transfer methods are an important direction for future work.

Accelerating Learning with Learned Models

In some of our previous work, we also studied how learned dynamics models can accelerate real-world reinforcement learning without access to manually engineered simulators. In this approach, local derivatives of the dynamics are approximated by fitting time-varying linear systems, which can then be used to locally and iteratively improve a policy. This approach can acquire a variety of in-hand manipulation strategies from scratch in the real world. Furthermore, we see that the same algorithm can even learn to control a pneumatic soft robotic hand to perform a number of dexterous behaviors.
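The local-model idea can be sketched as a least-squares fit of time-varying linear dynamics at each time step. This is a simplification of the method referenced above (which fits linear-Gaussian models and uses them for iterative policy improvement); names and shapes are illustrative.

```python
import numpy as np

def fit_local_linear_dynamics(states, actions, next_states):
    """Fit x' ~ A x + B u + c by least squares for one time step,
    using samples collected at that step across several rollouts.

    states:      (N, dx) current states
    actions:     (N, du) actions taken
    next_states: (N, dx) observed next states
    Returns A (dx, dx), B (dx, du), c (dx,).
    """
    N, dx = states.shape
    du = actions.shape[1]
    X = np.hstack([states, actions, np.ones((N, 1))])        # (N, dx + du + 1)
    W, *_ = np.linalg.lstsq(X, next_states, rcond=None)      # (dx + du + 1, dx)
    A, B, c = W[:dx].T, W[dx:dx + du].T, W[-1]
    return A, B, c
```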

Left: Adroit robotic hand performing in-hand manipulation. Right: Pneumatic Soft RBO Hand performing dexterous tasks.

However, the performance of methods with learned models is limited by the quality of the model that can be learned, and in practice asymptotic performance is often still higher with the best model-free algorithms. Further study of model-based reinforcement learning for efficient and effective real-world learning is a promising research direction.

Takeaways and Challenges

While training in the real world is general and broadly applicable, it has several challenges of its own.

  1. Due to the requirement to take a large number of exploratory actions, we observed that the hands often heat up quickly, which requires pauses to avoid damage.

  2. Since the hands must attempt the task multiple times, we had to build an automatic reset mechanism. In the future, a promising direction to remove this requirement is to automatically learn reset policies.

  3. Reinforcement learning methods require rewards to be provided, and this reward must still be designed manually. Some of our recent work has looked at automating reward specification.

However, enabling robots to learn complex skills directly in the real world is one of the best paths forward to developing truly generalist robots. In the same way that humans can learn directly from experience in the real world, robots that can acquire skills simply by trial and error can explore novel solutions to difficult manipulation problems and discover them with minimal human intervention. At the same time, the availability of demonstrations, simulators, and other prior knowledge can further reduce training times.


This article was initially published on the BAIR blog, and appears here with the authors’ permission. The work in this post is based on these papers:

A complete paper on the new robotic experiments will be released soon. The research was conducted by Henry Zhu, Abhishek Gupta, Vikash Kumar, Aravind Rajeswaran, and Sergey Levine. Collaborators on earlier projects include Emo Todorov, John Schulman, Giulia Vezzani, Pieter Abbeel, Clemens Eppner.

The pedestrian experiment

Followers of this blog will know that I have been working for some years on simulation-based internal models – demonstrating their potential for ethical robots, safer robots and imitating robots. But pretty much all of our experiments so far have involved only one robot with a simulation-based internal model while the other robots it interacts with have no internal model at all.

But some time ago we wondered what would happen if two robots, each with a simulation-based internal model, interacted with each other. Imagine two such robots approaching each other in the same way that two pedestrians approach each other on the sidewalk. Is it possible that these ‘pedestrian’ robots might, from time to time, engage in the kind of ‘dance’ that human pedestrians do when one steps to their left and the other to their right only to compound the problem of avoiding a collision with a stranger? The answer, it turns out, is yes!

The idea was taken up by Mathias Schmerling at the Humboldt University of Berlin, adapting the code developed by Christian Blum for the Corridor experiment. Chen Yang, one of my masters students, has now updated Mathias’ code and has produced some very nice new results.

Most of the time the pedestrian robots pass each other without fuss but in something between 1 in 5 and 1 in 10 trials we do indeed see an interesting dance. Here are a couple of examples of the majority of trials, when the robots pass each other normally, showing the robots’ trajectories. In each trial blue starts from the left and green from the right. Note that there is an element of randomness in the initial directions of each robot (which almost certainly explains the relative occurrence of normal and dance behaviours).

And here is a gif animation showing what’s going on in a normal trial. The faint straight lines from each robot show the target directions for each next possible action modelled in each robot’s simulation-based internal model (consequence engine); the various dotted lines show the predicted paths (and possible collisions) and the solid blue and green lines show which next action is actually selected following the internal modelling.
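For readers unfamiliar with consequence engines, the selection step described above can be sketched as a small loop over candidate actions, each evaluated in the robot’s internal simulation. All names here are placeholders for illustration, not the project’s actual code.

```python
def consequence_engine(robot, candidate_actions, others, simulate, score):
    """Pick the next action whose simulated outcome looks best.

    For each candidate action, the robot runs its internal simulation of
    itself and the other robots it can see, then scores the predicted
    outcome (e.g. penalising predicted collisions and rewarding progress
    toward the goal). The best-scoring action is selected and executed.
    """
    best_action, best_score = None, float("-inf")
    for action in candidate_actions:
        predicted = simulate(robot, action, others)   # predicted paths / collisions
        s = score(predicted)
        if s > best_score:
            best_action, best_score = action, s
    return best_action

# Toy usage: three heading changes, scored by a stand-in scoring function.
actions = [-20, 0, +20]                      # degrees relative to current heading
pick = consequence_engine("robot_A", actions, ["robot_B"],
                          simulate=lambda r, a, o: {"collision": a == 0},
                          score=lambda p: -1.0 if p["collision"] else 1.0)
print(pick)   # -20 or +20, never the colliding straight-ahead option
```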

Here is a beautiful example of a ‘dance’, again showing the robot trajectories. Note that the impasse resolves itself after a while. We’re still trying to figure out exactly what mechanism enables this resolution.

And here is the gif animation of the same trial:

Notice that the impasse is not resolved until the fifth turn of each robot.

Is this the first time that pedestrians passing each other – and in particular the occasional dance that ensues – has been computationally modelled?

All of the results above were obtained in simulation (yes there really are simulations within a simulation going on here), but within the past week Chen Yang has got this experiment working with real e-puck robots. Videos will follow shortly.


Acknowledgements.

I am indebted to the brilliant experimental work of first Christian Blum (supported by Wenguo Liu), then Mathias Schmerling who adapted Christian’s code for this experiment, and now Chen Yang who has developed the code further and obtained these results.
