
Estimating manipulation intentions to ease teleoperation

child with VR headset and image of blocks on a table

Teleoperation is one of the longest-standing application fields in robotics. While full autonomy is still work in progress, the possibility to remotely operate a robot has already opened scenarios where humans can act in risky environments without endangering their own safety, such as when defusing explosives or decommissioning nuclear waste. It also allows one to be present and act even at great distance: underwater, in space, or inside a patient miles away from the surgeon. These are all critical applications, where skilled and qualified operators control the robot after receiving specific training to learn to use the system safely.

Teleoperation for everyone?

The recent pandemic has made the need for immersive telepresence and remote action even more apparent, for non-expert users as well: not only could teleoperated robots take vitals or bring drugs to infectious patients, but we could also assist elderly relatives living far away with chores such as moving heavy items or cooking. Numerous physical jobs could also be carried out from home.

The recent ANA-Xprize finals have shown how far teleoperation can go (see this impressive video of the winning team), but in such situations both the perceptual and control load lie entirely on the operator. This can be quite taxing on a cognitive level: both perception and action are mediated, by cameras and robotic arms respectively, reducing the user’s situation awareness and natural eye-hand coordination. While robot sensing capabilities and actuators have made considerable technological progress, the user interface still lacks intuitive solutions that ease the operator’s job (Rea & Seo, 2022).

Human and robot joining forces

Shared control has gained popularity in recent years as an approach championing human-machine cooperation: low-level motor control is carried out by the robot, while the human focuses on high-level action planning. To achieve such a blend, the robotic system still needs to infer the operator’s intention in a timely manner, so that it can assist with the execution. Usually, motor intentions are inferred by tracking arm movements or motion control commands (if the robot is operated by means of a joystick), but the hand closely follows information collected by the gaze, especially during object manipulation. In recent decades, increasing evidence from eye-hand coordination studies has shown that gaze reliably anticipates the hand movement target (Hayhoe et al., 2012), providing an early cue about human intention.

Gaze and motion features to estimate intentions

In a contribution presented at IROS 2022 last month (Belardinelli et al., 2022), we introduced an intention estimation model that relies on both gaze and motion features. We collected pick-and-place sequences in a virtual environment, where participants could operate two robotic grippers to grasp objects on a cluttered table. Motion controllers were used to track arm motions and to grasp objects by button press. Eye movements were tracked by the eye-tracker embedded in the virtual reality headset.

Gaze features were computed by defining a Gaussian distribution centered at the gaze position and taking, for each object, the likelihood of it being the target of visual attention; this likelihood was given by the probability mass of the distribution falling within the object’s bounding box. For the motion features, the hand pose and velocity were used to estimate the hand’s current trajectory, which was compared to an estimated optimal trajectory to each object. The normalized similarity between the two trajectories defined the likelihood of each object being the target of the current movement.
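To make the two feature types more concrete, here is a minimal Python sketch. It is not the implementation from the paper. It assumes a 2D workspace, an isotropic Gaussian with a hypothetical standard deviation `sigma`, and axis-aligned bounding boxes. For the motion part it swaps in a much simpler stand-in: the "optimal trajectory" is taken to be a straight line to each object, and similarity is the cosine between the hand velocity and that line. All function names are illustrative.

```python
import math


def normal_cdf(x, mean, sigma):
    """Cumulative distribution function of a 1-D normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def gaze_features(gaze, boxes, sigma=0.05):
    """Gaussian probability mass inside each object's bounding box.

    gaze  -- (x, y) gaze position
    boxes -- {name: (x_min, y_min, x_max, y_max)}, axis-aligned (assumption)
    sigma -- standard deviation of the isotropic Gaussian (hypothetical value)
    """
    gx, gy = gaze
    features = {}
    for name, (x_min, y_min, x_max, y_max) in boxes.items():
        # For an isotropic Gaussian the mass over a rectangle factorizes
        # into a product of two 1-D interval masses.
        mass_x = normal_cdf(x_max, gx, sigma) - normal_cdf(x_min, gx, sigma)
        mass_y = normal_cdf(y_max, gy, sigma) - normal_cdf(y_min, gy, sigma)
        features[name] = mass_x * mass_y
    return features


def motion_features(hand_pos, hand_vel, object_positions):
    """Normalized directional similarity between hand motion and each object.

    Stand-in for the trajectory comparison: cosine between the velocity
    vector and the straight line from the hand to the object, clipped at 0
    and normalized so the scores sum to 1.
    """
    speed = math.hypot(hand_vel[0], hand_vel[1])
    scores = {}
    for name, (ox, oy) in object_positions.items():
        dx, dy = ox - hand_pos[0], oy - hand_pos[1]
        dist = math.hypot(dx, dy)
        if speed == 0.0 or dist == 0.0:
            scores[name] = 0.0
            continue
        cosine = (hand_vel[0] * dx + hand_vel[1] * dy) / (speed * dist)
        scores[name] = max(0.0, cosine)
    total = sum(scores.values())
    if total == 0.0:
        # No object lies in the direction of motion: fall back to uniform.
        return {name: 1.0 / len(scores) for name in scores}
    return {name: score / total for name, score in scores.items()}


if __name__ == "__main__":
    boxes = {"cup": (0.0, 0.0, 0.2, 0.2), "bowl": (0.5, 0.5, 0.7, 0.7)}
    centers = {"cup": (0.1, 0.1), "bowl": (0.6, 0.6)}
    print(gaze_features((0.12, 0.08), boxes))
    print(motion_features((0.3, 0.0), (-1.0, 0.5), centers))
```

In this toy version, the gaze feature peaks as soon as the gaze lands on an object, whereas the motion feature only separates candidates once the hand is clearly heading in one direction.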

Figure 1: Gaze features (top) and motion features (bottom) used for intention estimation. In both videos the object highlighted in green is the most likely target of visual attention and of hand movement, respectively.

These features, along with the binary grasping state, were used to train two Gaussian Hidden Markov Models, one on pick sequences and one on place sequences. For 12 different intentions (picking 6 different objects and placing at 6 different locations), the overall accuracy (F1 score) was above 80%, even for occluded objects. Importantly, for both actions, a prediction with over 90% accuracy was already available 0.5 seconds before the end of the movement for at least 70% of the observations. This would allow an assisting plan to be instantiated and executed by the robot.
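As a rough illustration of how a Gaussian HMM can turn a stream of such features into an online intention estimate, the sketch below runs the standard forward (filtering) recursion. It does not reproduce the trained models: the hidden states, transition matrix, and emission parameters are hand-set, hypothetical values, the emissions are assumed to be independent per feature (diagonal covariance), and the grasping state is left out for brevity. In practice these parameters would be fitted to recorded sequences.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def forward_filter(observations, start_prob, trans_prob, means, stds):
    """Filtered posteriors P(state_t | obs_1..t) for a Gaussian HMM.

    observations -- list of feature vectors (tuples of floats)
    start_prob   -- initial state distribution
    trans_prob   -- trans_prob[i][j] = P(state j at t | state i at t-1)
    means, stds  -- per-state, per-feature emission parameters
    """
    n_states = len(start_prob)
    belief = list(start_prob)
    posteriors = []
    for t, obs in enumerate(observations):
        if t > 0:
            # Prediction step: propagate the belief through the transitions.
            belief = [
                sum(belief[i] * trans_prob[i][j] for i in range(n_states))
                for j in range(n_states)
            ]
        # Update step: weight each state by its emission likelihood.
        weighted = []
        for s in range(n_states):
            likelihood = 1.0
            for k, value in enumerate(obs):
                likelihood *= gaussian_pdf(value, means[s][k], stds[s][k])
            weighted.append(belief[s] * likelihood)
        total = sum(weighted)
        if total > 0.0:
            belief = [w / total for w in weighted]
        posteriors.append(list(belief))
    return posteriors


# Hypothetical two-target example: each observation holds the gaze
# likelihood of the cup and of the bowl at one time step.
STATES = ["pick cup", "pick bowl"]
START = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.1, 0.9]]          # intentions tend to persist
MEANS = [(0.8, 0.1), (0.1, 0.8)]          # expected features per intention
STDS = [(0.2, 0.2), (0.2, 0.2)]

if __name__ == "__main__":
    sequence = [(0.4, 0.4), (0.7, 0.2), (0.9, 0.05)]
    for step, post in enumerate(forward_filter(sequence, START, TRANS, MEANS, STDS)):
        best = max(range(len(STATES)), key=lambda s: post[s])
        print(step, STATES[best], round(post[best], 3))
```

Because the posterior is updated at every time step, a confidence threshold on it could be used to decide when the estimate is reliable enough to trigger assistance before the movement ends.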

We also conducted an ablation study to determine the contribution of different feature combinations. While the models with gaze, motion, and grasping features performed better in cross-validation, the improvement over using only gaze and grasping state was minimal. In fact, even when participants first checked nearby obstacles, their gaze was already on the target before the hand trajectory became sufficiently discriminative.

We also ascertained that our models could generalize from one hand to the other (when fed the corresponding hand motion features), so the same models could be used to concurrently estimate each hand’s intention. By feeding each hand’s prediction to a simple rule-based framework, basic bimanual intentions could also be recognized. For example, reaching for an object with the left hand while the right hand is going to place the same object on the left hand is considered a bimanual handover.
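The handover example above can be expressed as a small rule over the two per-hand predictions. The following Python sketch is only an illustration of that idea, not the framework used in the study: the `HandIntention` structure, the location names `"left_hand"` and `"right_hand"`, and the `"NONE"` label are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HandIntention:
    """Per-hand prediction (hypothetical structure)."""
    action: str                     # "pick" or "place"
    obj: str                        # object reached for or carried
    location: Optional[str] = None  # place target, e.g. "tray" or "left_hand"


def is_handover(giver, receiver, receiver_name):
    """True if `giver` places an object on the hand that is reaching for it."""
    return (
        giver.action == "place"
        and receiver.action == "pick"
        and giver.obj == receiver.obj
        and giver.location == receiver_name
    )


def bimanual_intention(left, right):
    """Combine the two single-hand predictions into a bimanual label."""
    if is_handover(giver=right, receiver=left, receiver_name="left_hand"):
        return "HANDOVER"
    if is_handover(giver=left, receiver=right, receiver_name="right_hand"):
        return "HANDOVER"
    return "NONE"  # no bimanual rule matched


if __name__ == "__main__":
    left = HandIntention(action="pick", obj="cup")
    right = HandIntention(action="place", obj="cup", location="left_hand")
    print(bimanual_intention(left, right))
```

Further bimanual intentions could be added as extra rules of the same shape, each checking a specific combination of the two single-hand predictions.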

Figure 2: Online intention estimation: the red frame denotes the current right-hand intention prediction, the green frame the left-hand prediction. Above the scene, the bimanual intention is shown in capital letters.

An intention estimation model of this kind could help an operator execute these manipulations without having to select the parameters for the exact motor execution of the pick and place, something we don’t usually do consciously in natural eye-hand coordination, since we have automated such cognitive processes. For example, once a grasping intention is estimated with enough confidence, the robot could autonomously select the best grasp and grasping position and execute the grasp, relieving the operator of carefully monitoring a grasp without tactile feedback and possibly with inaccurate depth estimation.

Further, even if in our setup motion features were not decisive for early intention prediction, they might play a larger role in more complex settings and when extending the spectrum of bimanual manipulations.

Combined with suitable shared control policies and feedback visualizations, such systems could also enable untrained operators to control robotic manipulators transparently and effectively for longer periods, reducing the overall mental workload of remote operation.


Belardinelli, A., Kondapally, A. R., Ruiken, D., Tanneberg, D., & Watabe, T. (2022). Intention estimation from gaze and motion features for human-robot shared-control object manipulation. 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

Hayhoe, M. M., McKinney, T., Chajka, K., & Pelz, J. B. (2012). Predictive eye movements in natural vision. Experimental Brain Research, 217(1), 125-136.

Rea, D. J., & Seo, S. H. (2022). Still Not Solved: A Call for Renewed Focus on User-Centered Teleoperation Interfaces. Frontiers in Robotics and AI, 9.

Robot helps researchers achieve a new record at the world’s deepest cave pit

A small city in the east of Czechia called Hranice harbors a unique natural wonder that is not (yet) trending on Instagram. The Hranice Abyss is the deepest underwater cave pit in the world, and a team of researchers recently beat a record by revealing more about its depth.

Countering Luddite politicians with life (and cost) saving machines

Earlier this month, Candy Crush celebrated its tenth birthday by hosting a free party in lower Manhattan. The event culminated in a drone light display, with 500 Unmanned Aerial Vehicles (UAVs) illustrating the whimsical characters of the popular mobile game over the Hudson. Rather than applauding the decision, New York lawmakers banished the avionic wonders to Jersey. In the words of Democratic State Senator Brad Hoylman, “Nobody owns New York City’s skyline – it is a public good and to allow a private company to reap profits off it is in itself offensive.” The complimentary event followed the model of Macy’s New York fireworks that have illuminated the Hudson skies since 1958. Unlike the department store’s pyrotechnics, which release dangerous greenhouse gases into the atmosphere, drones are a quiet, climate-friendly choice. Still, Luddite politicians plan to introduce legislation to ban the technology as a public nuisance, citing its impact on migratory birds, which are often more spooked by skyscrapers in Hoylman’s district.

Beyond aerial tricks, drones are now being deployed in novel ways to fill the labor gap of menial jobs that have not returned since the pandemic. Founded in 2018, Andrew Ashur’s Lucid Drones has been power-washing buildings throughout the United States for close to five years. As the founder told me: “I saw window washers hanging off the side of the building on a swing stage and it was a mildly windy day. You saw this platform get caught in the wind and all of a sudden the platform starts slamming against the side of the building. The workers were up there, hanging on for dear life, and I remember having two profound thoughts in this moment. The first one, thank goodness that’s not me up there. And then the second one was how can we leverage technology to make this a safer, more efficient job?” At the time, Ashur was a junior at Davidson College playing baseball. The self-starter knew he was on to a big market opportunity.

Each year in the United States, falls from ladders cause more than 160,000 emergency room injuries and 300 deaths. Entrepreneurs like Ashur understood that drones were uniquely qualified to free humans from such dangerous work. This first required building a sturdy tethered quadcopter, capable of spraying at 300 psi and connected to a tank for power and cleaning fluid, all for less than the annual salary of one window cleaner. After overcoming the technical hurdle, the even harder task was gaining sales traction. Unlike many hardware companies that set out to disrupt the market and sell directly to end customers, Lucid partnered with existing building maintenance operators. “Our primary focus is on existing cleaning companies. And the way to think about it is we’re now the shiniest tool in their toolkit that helps them do more jobs with less time and less liability to make more revenue,” explains Ashur. This relationship was further enhanced this past month with the announcement of a partnership with Sunbelt Rentals, servicing its 1,000 locations throughout California, Florida, and Texas. Lucid’s drones are now within driving distance of the majority of the 86,000 facade cleaning companies in America.

According to the Commercial Buildings Energy Consumption Survey, there are 5.9 million commercial office buildings in the United States, with an average height of 16 floors. This means there is room for many robot cleaning providers. Competing directly with Lucid are several other drone operators, including Apellix, Aquiline Drones, Alpha Drones, and a handful of local upstarts. In addition, there are several winch-powered companies, such as Skyline Robotics, HyCleaner, Serbot, Erlyon, Kite Robotics, and SkyPro. Facade cleaning is ripe for automation as it is a dangerous, costly, repetitive task that can be safely accomplished by an uncrewed system. As Ashur boasts, “You improve that overall profitability because it’s fewer labor hours. You’ve got lower insurance on ground cleaner versus an above ground cleaner as well as the other equipment.” His tethered, ground-operated, ladder-free system is the safest way to power wash a multistory office building. He elaborated further on the cost savings, “It lowers insurance cost, especially when you look at how workers comp is calculated… we had a customer, one of their workers missed the bottom rung of the ladder, the bottom rung, he shattered his ankle. OSHA classifies it as a hazardous workplace injury. Their workers comp rates are projected to increase by an annual $25,000 over the next five years. So it’s a six-figure expense for just that one business from missing one single bottom rung of the ladder and unfortunately, you hear stories of people falling off a roof or other terrible accidents that are life changing or in some cases life lost. So that’s the number one thing you get to eliminate with the drone by having people on the ground.”

With older construction workers retiring in alarming numbers and the younger population of skilled laborers declining, I pressed Ashur on the future of Lucid in expanding to other areas. He replied, “Cleaning drones, that’s just chapter one of our story here at Lucid. We look all around us at service industries that are being crippled by labor shortages.” He went on to suggest that robots could inspire a younger, more creative workforce, “When it comes to the future of work, we really believe that robotics is the answer because what makes us distinctly human isn’t our ability to do a physical task in a repetitive fashion. It’s our ability to be creative and problem solve… And that’s the direction that the younger populations are showing they’re gravitating towards.” He hinted further at some immediate areas of revenue growth, “Since we launched a website many years ago, about 50% of our requests come from international opportunities. So it is very much so a global problem.” In New York, buildings taller than six stories are required to have their facades inspected and repaired every five years (Local Law 11). Rather than shunning drones, State Senator Hoylman should be contacting companies like Lucid for ideas to automate facade work and create a new Manhattan-launched industry.

Five ways drones will change the way buildings are designed

Drones are already shaping the face of our cities, where they are used for building planning, heritage, construction and safety enhancement. But, as studies by the UK's Department for Transport have found, swathes of the public have a limited understanding of how drones might be practically applied.

Six-dimensional force and torque sensor helps robots feel and touch

As an integral part of modern life, sensors are now almost ubiquitous. They measure physical input from the environment and convert it into data that can be interpreted by humans or machines. A new six-dimensional force/torque sensor has been developed by a research team from the Hefei Institutes of Physical Science of the Chinese Academy of Sciences.

Tiny swimming robots can restructure materials on a microscopic level

Controlling microscopic processes is inherently challenging. The everyday tools we use to manipulate matter on the macroscale can't simply be shrunk down to the size of a cell, and even if they could, the physical forces they rely on work differently when their targets are measured in nanometers.

Lightweight robotic leg prosthesis replicates the biomechanics of the knee, ankle and toe joint

The Utah Bionic Leg, a motorized prosthetic for lower-limb amputees developed by University of Utah mechanical engineering associate professor Tommaso Lenzi and his students in the HGN Lab, is on the cover of the newest issue of Science Robotics.

The Utah Bionic Leg: A motorized prosthetic for lower-limb amputees

The Utah Bionic Leg is a motorized prosthetic for lower-limb amputees developed by University of Utah mechanical engineering associate professor Tommaso Lenzi and his students in the HGN Lab.

Lenzi’s Utah Bionic Leg uses motors, processors, and advanced artificial intelligence that all work together to give amputees more power to walk, stand up, sit down, and ascend and descend stairs and ramps. The extra power from the prosthesis makes these activities easier and less stressful for amputees, who normally need to overuse their upper body and intact leg to compensate for the lack of assistance from their prescribed prosthetics. The Utah Bionic Leg will help people with amputations, particularly elderly individuals, to walk much longer and attain new levels of mobility.

“If you walk faster, it will walk faster for you and give you more energy. Or it adapts automatically to the height of the steps in a staircase. Or it can help you cross over obstacles,” Lenzi says.

The Utah Bionic Leg uses custom-designed force and torque sensors as well as accelerometers and gyroscopes to help determine the leg’s position in space. Those sensors are connected to a computer processor that translates the sensor inputs into movements of the prosthetic joints. Based on that real-time data, the leg provides power to the motors in the joints to assist in walking, standing up, walking up and down stairs, or maneuvering around obstacles. The leg’s “smart transmission system” connects the electrical motors to the robotic joints of the prosthetic. This optimized system automatically adapts the joint behaviors for each activity, like shifting gears on a bike.

Finally, in addition to the robotic knee joint and robotic ankle joint, the Utah Bionic Leg has a robotic toe joint to provide more stability and comfort while walking. The sensors, processors, motors, transmission system, and robotic joints enable users to control the prosthetic intuitively and continuously, as if it were an intact biological leg.

Details of the leg’s newest technologies are described in a paper published in the journal Science Robotics. The paper was authored by University of Utah mechanical engineering graduate students Minh Tran, Lukas Grabert, and Sarah Hood, together with Lenzi. You can read the paper here.

Lenzi and the university recently forged a new partnership with the worldwide leader in the prosthetics industry, Ottobock, to license the technology behind the Utah Bionic Leg and bring it to individuals with lower-limb amputations.

This article was originally published here.
