
Robot takes contact-free measurements of patients’ vital signs

By Anne Trafton

During the current coronavirus pandemic, one of the riskiest parts of a health care worker’s job is assessing people who have symptoms of Covid-19. Researchers from MIT, Boston Dynamics, and Brigham and Women’s Hospital hope to reduce that risk by using robots to remotely measure patients’ vital signs.

The robots, which are controlled by a handheld device, can also carry a tablet that allows doctors to ask patients about their symptoms without being in the same room.

“In robotics, one of our goals is to use automation and robotic technology to remove people from dangerous jobs,” says Henwei Huang, an MIT postdoc. “We thought it should be possible for us to use a robot to remove the health care worker from the risk of directly exposing themselves to the patient.”

Using four cameras mounted on a dog-like robot developed by Boston Dynamics, the researchers have shown that they can measure skin temperature, breathing rate, pulse rate, and blood oxygen saturation in healthy patients, from a distance of 2 meters. They are now making plans to test it in patients with Covid-19 symptoms.

“We are thrilled to have forged this industry-academia partnership in which scientists with engineering and robotics expertise worked with clinical teams at the hospital to bring sophisticated technologies to the bedside,” says Giovanni Traverso, an MIT assistant professor of mechanical engineering, a gastroenterologist at Brigham and Women’s Hospital, and the senior author of the study.

The researchers have posted a paper on their system on the preprint server techRxiv, and have submitted it to a peer-reviewed journal. Huang is one of the lead authors of the study, along with Peter Chai, an assistant professor of emergency medicine at Brigham and Women’s Hospital, and Claas Ehmke, a visiting scholar from ETH Zurich.

Measuring vital signs

When Covid-19 cases began surging in Boston in March, many hospitals, including Brigham and Women’s, set up triage tents outside their emergency departments to evaluate people with Covid-19 symptoms. One major component of this initial evaluation is measuring vital signs, including body temperature.

The MIT and BWH researchers came up with the idea to use robotics to enable contactless monitoring of vital signs, to allow health care workers to minimize their exposure to potentially infectious patients. They decided to use existing computer vision technologies that can measure temperature, breathing rate, pulse, and blood oxygen saturation, and worked to make them mobile.

To achieve that, they used a robot known as Spot, which can walk on four legs, similarly to a dog. Health care workers can maneuver the robot to wherever patients are sitting, using a handheld controller. The researchers mounted four different cameras onto the robot — an infrared camera plus three monochrome cameras that filter different wavelengths of light.

The researchers developed algorithms that allow them to use the infrared camera to measure both elevated skin temperature and breathing rate. For body temperature, the camera measures skin temperature on the face, and the algorithm correlates that temperature with core body temperature. The algorithm also takes into account the ambient temperature and the distance between the camera and the patient, so that measurements can be taken from different distances, under different weather conditions, and still be accurate.
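The paper's calibration details are not reproduced here, but the general shape of such a correction can be sketched as a simple regression from skin temperature, ambient temperature, and camera distance to an estimated core temperature. The coefficients below are placeholders, not values from the study.

```python
def estimate_core_temp(skin_temp_c, ambient_temp_c, distance_m,
                       coeffs=(0.9, 0.12, 0.25), offset=6.0):
    """Illustrative correction only: map a facial skin-temperature reading to an
    estimated core body temperature, compensating for ambient temperature and
    camera-to-patient distance. Coefficients are placeholders, not the
    calibration reported by the researchers."""
    a_skin, a_ambient, a_dist = coeffs
    return (a_skin * skin_temp_c
            + a_ambient * (ambient_temp_c - 23.0)  # drift relative to room temperature
            + a_dist * distance_m                  # apparent signal loss with distance
            + offset)

# Example: a 34.5 C facial reading taken from 2 meters away in a 20 C triage tent
print(round(estimate_core_temp(34.5, 20.0, 2.0), 1))
```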

Measurements from the infrared camera can also be used to calculate the patient’s breathing rate. As the patient breathes in and out, wearing a mask, their breath changes the temperature of the mask. Measuring this temperature change allows the researchers to calculate how rapidly the patient is breathing.
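As a rough illustration of how a periodic thermal signal on the mask could be turned into a breathing rate, the sketch below picks the dominant frequency of the temperature time series. The frame rate and breathing band are assumptions, and this is not the researchers' algorithm.

```python
import numpy as np

def breathing_rate_bpm(mask_temps, fps=30.0):
    """Estimate breaths per minute from a series of average mask temperatures
    (one sample per thermal frame). Illustrative only: remove the mean, take an
    FFT, and pick the strongest peak in a plausible breathing band
    (0.1-0.7 Hz, i.e. 6-42 breaths per minute)."""
    x = np.asarray(mask_temps, dtype=float)
    x = x - x.mean()                              # remove the DC offset
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)  # frequency axis in Hz
    power = np.abs(np.fft.rfft(x)) ** 2
    band = (freqs >= 0.1) & (freqs <= 0.7)
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz

# Example: a synthetic 0.3 Hz oscillation (18 breaths per minute) over 30 seconds
t = np.arange(0, 30, 1 / 30.0)
signal = 0.3 * np.sin(2 * np.pi * 0.3 * t) + 0.02 * np.random.randn(t.size)
print(round(breathing_rate_bpm(signal), 1))  # approximately 18.0
```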

The three monochrome cameras each filter a different wavelength of light — 670, 810, and 880 nanometers. These wavelengths allow the researchers to measure the slight color changes that result when hemoglobin in blood cells binds to oxygen and flows through blood vessels. The researchers’ algorithm uses these measurements to calculate both pulse rate and blood oxygen saturation.
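The team's own processing pipeline is not reproduced here; the snippet below only sketches the standard remote-photoplethysmography idea the paragraph alludes to, with the pulse taken from the dominant cardiac-band frequency of one channel and oxygen saturation from a pulse-oximetry-style ratio of ratios between two channels. The calibration constants are placeholders.

```python
import numpy as np

def pulse_and_spo2(i_red, i_ir, fps=30.0):
    """Illustrative remote-PPG sketch (not the study's algorithm). `i_red` and
    `i_ir` are per-frame average intensities from two wavelength channels."""
    i_red = np.asarray(i_red, dtype=float)
    i_ir = np.asarray(i_ir, dtype=float)

    # Pulse rate: strongest frequency between 0.7 and 3.0 Hz (42-180 beats/min).
    x = i_ir - i_ir.mean()
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    power = np.abs(np.fft.rfft(x)) ** 2
    band = (freqs >= 0.7) & (freqs <= 3.0)
    pulse_bpm = 60.0 * freqs[band][np.argmax(power[band])]

    # SpO2: ratio of pulsatile (AC) to baseline (DC) intensity in each channel,
    # mapped through a placeholder linear calibration.
    ratio = (i_red.std() / i_red.mean()) / (i_ir.std() / i_ir.mean())
    spo2_percent = 110.0 - 25.0 * ratio
    return pulse_bpm, spo2_percent
```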

“We didn’t really develop new technology to do the measurements,” Huang says. “What we did is integrate them together very specifically for the Covid application, to analyze different vital signs at the same time.”

Continuous monitoring

In this study, the researchers performed the measurements on healthy volunteers, and they are now making plans to test their robotic approach in people who are showing symptoms of Covid-19, in a hospital emergency department.

While in the near term, the researchers plan to focus on triage applications, in the longer term, they envision that the robots could be deployed in patients’ hospital rooms. This would allow the robots to continuously monitor patients and also allow doctors to check on them, via tablet, without having to enter the room. Both applications would require approval from the U.S. Food and Drug Administration.

The research was funded by the MIT Department of Mechanical Engineering and the Karl van Tassel (1925) Career Development Professorship, and Boston Dynamics.

Data systems that learn to be better

One of the biggest challenges in computing is handling a staggering onslaught of information while still being able to efficiently store and process it.

By Adam Conner-Simons

Big data has gotten really, really big: By 2025, all the world’s data will add up to an estimated 175 trillion gigabytes. For a visual, if you stored that amount of data on DVDs, it would stack up tall enough to circle the Earth 222 times. 

One of the biggest challenges in computing is handling this onslaught of information while still being able to efficiently store and process it. A team from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) believes that the answer rests with something called “instance-optimized systems.”

Traditional storage and database systems are designed to work for a wide range of applications because of how long it can take to build them — months or, often, several years. As a result, for any given workload such systems provide performance that is good, but usually not the best. Even worse, they sometimes require administrators to painstakingly tune the system by hand to provide even reasonable performance. 

In contrast, the goal of instance-optimized systems is to build systems that optimize and partially re-organize themselves for the data they store and the workload they serve. 

“It’s like building a database system for every application from scratch, which is not economically feasible with traditional system designs,” says MIT Professor Tim Kraska. 

As a first step toward this vision, Kraska and colleagues developed Tsunami and Bao. Tsunami uses machine learning to automatically re-organize a dataset’s storage layout based on the types of queries that its users make. Tests show that it can run queries up to 10 times faster than state-of-the-art systems. What’s more, its datasets can be organized via a series of “learned indexes” that are up to 100 times smaller than the indexes used in traditional systems. 

Kraska has been exploring the topic of learned indexes for several years, going back to his influential work with colleagues at Google in 2017. 
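The core idea behind a learned index, as introduced in that 2017 work, is to replace a tree-structured index with a model that predicts where a key sits in sorted storage and then corrects the guess with a bounded local search. The class below is a minimal single-model sketch of that idea, not Tsunami's implementation.

```python
import bisect
import numpy as np

class LearnedIndex:
    """Minimal learned index: fit a linear model mapping key -> position in a
    sorted array, record the worst prediction error, and answer lookups with a
    binary search restricted to [prediction - err, prediction + err]."""

    def __init__(self, keys):
        self.keys = np.sort(np.asarray(keys, dtype=float))
        positions = np.arange(len(self.keys))
        self.slope, self.intercept = np.polyfit(self.keys, positions, 1)
        predicted = np.round(self.slope * self.keys + self.intercept).astype(int)
        self.max_err = int(np.max(np.abs(predicted - positions)))

    def lookup(self, key):
        guess = int(round(self.slope * key + self.intercept))
        lo = max(0, guess - self.max_err)
        hi = min(len(self.keys), guess + self.max_err + 1)
        i = lo + bisect.bisect_left(self.keys[lo:hi].tolist(), key)
        return i if i < len(self.keys) and self.keys[i] == key else None

# Toy usage: 100,000 random integer keys, then look one of them up.
keys = np.random.default_rng(0).integers(0, 10**6, size=100_000)
idx = LearnedIndex(keys)
print(idx.lookup(float(np.sort(keys)[500])))  # index of an existing key
```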

Harvard University Professor Stratos Idreos, who was not involved in the Tsunami project, says that a unique advantage of learned indexes is their small size, which, in addition to space savings, brings substantial performance improvements.

“I think this line of work is a paradigm shift that’s going to impact system design long-term,” says Idreos. “I expect approaches based on models will be one of the core components at the heart of a new wave of adaptive systems.”

Bao, meanwhile, focuses on improving the efficiency of query optimization through machine learning. A query optimizer rewrites a high-level declarative query into a query plan, which can actually be executed over the data to compute the result of the query. However, there is often more than one query plan that can answer a given query; picking the wrong one can cause a query to take days to compute the answer, rather than seconds.

Traditional query optimizers take years to build, are very hard to maintain, and, most importantly, do not learn from their mistakes. Bao is the first learning-based approach to query optimization that has been fully integrated into the popular database management system PostgreSQL. Lead author Ryan Marcus, a postdoc in Kraska’s group, says that Bao produces query plans that run up to 50 percent faster than those created by the PostgreSQL optimizer, meaning that it could help to significantly reduce the cost of cloud services, like Amazon’s Redshift, that are based on PostgreSQL.

By fusing the two systems together, Kraska hopes to build the first instance-optimized database system that can provide the best possible performance for each individual application without any manual tuning. 

The goal is to not only relieve developers from the daunting and laborious process of tuning database systems, but to also provide performance and cost benefits that are not possible with traditional systems.

Traditionally, the systems we use to store data are limited to only a few storage options, and as a result they cannot provide the best possible performance for a given application. What Tsunami can do is dynamically change the structure of the data storage based on the kinds of queries it receives and create new ways to store data, which are not feasible with more traditional approaches.

Johannes Gehrke, a managing director at Microsoft Research who also heads up machine learning efforts for Microsoft Teams, says that this work opens up many interesting applications, such as doing so-called “multidimensional queries” in main-memory data warehouses. Harvard’s Idreos also expects the project to spur further work on how to maintain the good performance of such systems when new data and new kinds of queries arrive.

Bao is short for “bandit optimizer,” a play on words related to the so-called “multi-armed bandit” analogy where a gambler tries to maximize their winnings at multiple slot machines that have different rates of return. The multi-armed bandit problem is commonly found in any situation that has tradeoffs between exploring multiple different options, versus exploiting a single option — from risk optimization to A/B testing.
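Bao's internals are not reproduced here, but the bandit framing can be illustrated with a toy epsilon-greedy selector that learns, from observed runtimes, which of several candidate query plans to prefer. (Bao itself learns over optimizer hints with a neural model; the plan names and runtimes below are invented.)

```python
import random
from collections import defaultdict

class EpsilonGreedyPlanPicker:
    """Toy multi-armed bandit over candidate query plans: mostly exploit the
    plan with the best average observed runtime, occasionally explore."""

    def __init__(self, plans, epsilon=0.1):
        self.plans = list(plans)
        self.epsilon = epsilon
        self.totals = defaultdict(float)   # summed runtimes per plan
        self.counts = defaultdict(int)     # times each plan was tried

    def choose(self):
        untried = [p for p in self.plans if self.counts[p] == 0]
        if untried:
            return untried[0]
        if random.random() < self.epsilon:
            return random.choice(self.plans)                       # explore
        return min(self.plans,
                   key=lambda p: self.totals[p] / self.counts[p])  # exploit

    def record(self, plan, runtime_s):
        self.totals[plan] += runtime_s
        self.counts[plan] += 1

# Hypothetical usage: three alternative plans for the same query.
picker = EpsilonGreedyPlanPicker(["hash_join", "merge_join", "nested_loop"])
for _ in range(100):
    plan = picker.choose()
    runtime = {"hash_join": 1.2, "merge_join": 0.8, "nested_loop": 5.0}[plan]
    picker.record(plan, runtime + random.uniform(-0.1, 0.1))
print(picker.choose())  # usually "merge_join"
```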

“Query optimizers have been around for years, but they often make mistakes, and usually they don’t learn from them,” says Kraska. “That’s where we feel that our system can make key breakthroughs, as it can quickly learn for the given data and workload what query plans to use and which ones to avoid.”

Kraska says that in contrast to other learning-based approaches to query optimization, Bao learns much faster and can outperform open-source and commercial optimizers with as little as one hour of training time. In the future, his team aims to integrate Bao into cloud systems to improve resource utilization in environments where disk, RAM, and CPU time are scarce resources.

“Our hope is that a system like this will enable much faster query times, and that people will be able to answer questions they hadn’t been able to answer before,” says Kraska.

A related paper about Tsunami was co-written by Kraska, PhD students Jialin Ding and Vikram Nathan, and MIT Professor Mohammad Alizadeh. A paper about Bao was co-written by Kraska, Marcus, PhD students Parimarjan Negi and Hongzi Mao, visiting scientist Nesime Tatbul, and Alizadeh.

The work was done as part of the Data System and AI Lab (DSAIL@CSAIL), which is sponsored by Intel, Google, Microsoft, and the U.S. National Science Foundation. 

Letting robots manipulate cables


By Rachel Gordon
The opposing fingers are lightweight and quick moving, allowing nimble, real-time adjustments of force and position.
Photo courtesy of MIT CSAIL.

For humans, it can be challenging to manipulate thin flexible objects like ropes, wires, or cables. But if these problems are hard for humans, they are nearly impossible for robots. As a cable slides between the fingers, its shape is constantly changing, and the robot’s fingers must be constantly sensing and adjusting the cable’s position and motion.

Standard approaches have used a series of slow and incremental deformations, as well as mechanical fixtures, to get the job done. Recently, a group of researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) pursued the task from a different angle, in a manner that more closely mimics us humans. The team’s new system uses a pair of soft robotic grippers with high-resolution tactile sensors (and no added mechanical constraints) to successfully manipulate freely moving cables.

One could imagine using a system like this for both industrial and household tasks, to one day enable robots to help us with things like tying knots, wire shaping, or even surgical suturing. 

The team’s first step was to build a novel two-fingered gripper. The opposing fingers are lightweight and quick moving, allowing nimble, real-time adjustments of force and position. On the tips of the fingers are vision-based “GelSight” sensors, built from soft rubber with embedded cameras. The gripper is mounted on a robot arm, which can move as part of the control system.

The team’s second step was to create a perception-and-control framework to allow cable manipulation. For perception, they used the GelSight sensors to estimate the pose of the cable between the fingers, and to measure the frictional forces as the cable slides. Two controllers run in parallel: one modulates grip strength, while the other adjusts the gripper pose to keep the cable within the gripper.
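The paper's gains and sensor interfaces are not given here; the function below only sketches the structure just described, with one proportional term adjusting grip force toward a target friction level and another nudging the gripper pose to keep the cable centered. All numbers are hypothetical.

```python
def cable_following_step(friction_force, cable_offset, cable_angle,
                         target_friction=0.5, k_force=0.2, k_pose=0.8):
    """One iteration of the two parallel controllers described above, with
    illustrative proportional gains (not the paper's values).

    Inputs come from the tactile sensors: the friction force measured as the
    cable slides, and the cable's offset and angle between the fingertips.
    Returns a grip-force increment and a relative pose command."""
    # Controller 1: modulate grip strength so the cable keeps sliding smoothly.
    grip_force_delta = k_force * (target_friction - friction_force)

    # Controller 2: nudge the gripper pose to keep the cable centered.
    pose_command = {"dy": -k_pose * cable_offset, "dtheta": -k_pose * cable_angle}
    return grip_force_delta, pose_command

# Example: cable drifting 4 mm off-center while friction is slightly too high.
print(cable_following_step(friction_force=0.6, cable_offset=0.004, cable_angle=0.1))
```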

When mounted on the arm, the gripper could reliably follow a USB cable starting from a random grasp position. Then, in combination with a second gripper, the robot could move the cable “hand over hand” (as a human would) in order to find the end of the cable. It could also adapt to cables of different materials and thicknesses.

As a further demo of its prowess, the robot performed an action that humans routinely do when plugging earbuds into a cell phone. Starting with a free-floating earbud cable, the robot was able to slide the cable between its fingers, stop when it felt the plug touch its fingers, adjust the plug’s pose, and finally insert the plug into the jack. 

“Manipulating soft objects is so common in our daily lives, like cable manipulation, cloth folding, and string knotting,” says Yu She, MIT postdoc and lead author on a new paper about the system. “In many cases, we would like to have robots help humans do this kind of work, especially when the tasks are repetitive, dull, or unsafe.” 

String me along 

Cable following is challenging for two reasons. First, it requires controlling the “grasp force” (to enable smooth sliding), and the “grasp pose” (to prevent the cable from falling from the gripper’s fingers).  

This information is hard to capture from conventional vision systems during continuous manipulation, because it’s usually occluded, expensive to interpret, and sometimes inaccurate. 

What’s more, this information can’t be directly observed with just vision sensors, hence the team’s use of tactile sensors. The gripper’s joints are also flexible — protecting them from potential impact. 

The algorithms can also be generalized to different cables with various physical properties like material, stiffness, and diameter, and also to those at different speeds. 

When the team compared different controllers applied to their gripper, their control policy retained the cable in hand over longer distances than three alternatives. The “open-loop” controller, for example, followed only 36 percent of the total cable length, easily lost the cable when it curved, and needed many regrasps to finish the task.

Looking ahead 

The team observed that it was difficult to pull the cable back when it reached the edge of the finger, because of the convex surface of the GelSight sensor. Therefore, they hope to improve the finger-sensor shape to enhance the overall performance. 

In the future, they plan to study more complex cable manipulation tasks such as cable routing and cable inserting through obstacles, and they want to eventually explore autonomous cable manipulation tasks in the auto industry.

Yu She wrote the paper alongside MIT PhD students Shaoxiong Wang, Siyuan Dong, and Neha Sunil; Alberto Rodriguez, MIT associate professor of mechanical engineering; and Edward Adelson, the John and Dorothy Wilson Professor in the MIT Department of Brain and Cognitive Sciences.

Robots help some firms, even while workers across industries struggle

A new study co-authored by an MIT professor shows firms that move quickly to use robots tend to add workers to their payroll, while industry job losses are more concentrated in firms that make this change more slowly.
Image: Stock photo

This is part 2 of a three-part series examining the effects of robots and automation on employment, based on new research from economist and Institute Professor Daron Acemoglu. 

By Peter Dizikes

Overall, adding robots to manufacturing reduces jobs — by more than three per robot, in fact. But a new study co-authored by an MIT professor reveals an important pattern: Firms that move quickly to use robots tend to add workers to their payroll, while industry job losses are more concentrated in firms that make this change more slowly.

The study, by MIT economist Daron Acemoglu, examines the introduction of robots to French manufacturing in recent decades, illuminating the business dynamics and labor implications in granular detail.

“When you look at use of robots at the firm level, it is really interesting because there is an additional dimension,” says Acemoglu. “We know firms are adopting robots in order to reduce their costs, so it is quite plausible that firms adopting robots early are going to expand at the expense of their competitors whose costs are not going down. And that’s exactly what we find.”

Indeed, as the study shows, a 20 percentage point increase in robot use in manufacturing from 2010 to 2015 led to a 3.2 percent decline in industry-wide employment. And yet, for firms adopting robots during that timespan, employee hours worked rose by 10.9 percent, and wages rose modestly as well.

A new paper detailing the study, “Competing with Robots: Firm-Level Evidence from France,” will appear in the May issue of the American Economic Association: Papers and Proceedings. The authors are Acemoglu, who is an Institute Professor at MIT; Clair Lelarge, a senior research economist at the Banque de France and the Center for Economic Policy Research; and Pascual Restrepo PhD ’16, an assistant professor of economics at Boston University.

A French robot census

To conduct the study, the scholars examined 55,390 French manufacturing firms, of which 598 purchased robots during the period from 2010 to 2015. The study uses data provided by France’s Ministry of Industry, client data from French robot suppliers, customs data about imported robots, and firm-level financial data concerning sales, employment, and wages, among other things.

The 598 firms that did purchase robots, while comprising just 1 percent of manufacturing firms, accounted for about 20 percent of manufacturing production during that five-year period.

“Our paper is unique in that we have an almost comprehensive [view] of robot adoption,” Acemoglu says.

The manufacturing industries most heavily adding robots to their production lines in France were pharmaceutical companies, chemicals and plastic manufacturers, food and beverage producers, metal and machinery manufacturers, and automakers.

The industries investing least in robots from 2010 to 2015 included paper and printing, textiles and apparel manufacturing, appliance manufacturers, furniture makers, and minerals companies.

The firms that did add robots to their manufacturing processes became more productive and profitable, and the use of automation lowered their labor share — the part of their income going to workers — between roughly 4 and 6 percentage points. However, because their investments in technology fueled more growth and more market share, they added more workers overall.

By contrast, the firms that did not add robots saw no change in the labor share, and for every 10 percentage point increase in robot adoption by their competitors, these firms saw their own employment drop 2.5 percent. Essentially, the firms not investing in technology were losing ground to their competitors.

This dynamic — job growth at robot-adopting firms, but job losses overall — fits with another finding Acemoglu and Restrepo made in a separate paper about the effects of robots on employment in the U.S. There, the economists found that each robot added to the work force essentially eliminated 3.3 jobs nationally.

“Looking at the result, you might think [at first] it’s the opposite of the U.S. result, where the robot adoption goes hand in hand with destruction of jobs, whereas in France, robot-adopting firms are expanding their employment,” Acemoglu says. “But that’s only because they’re expanding at the expense of their competitors. What we show is that when we add the indirect effect on those competitors, the overall effect is negative and comparable to what we find in the U.S.”

Superstar firms and the labor share issue

The competitive dynamics the researchers found in France resemble those in another high-profile piece of economics research recently published by MIT professors. In a recent paper, MIT economists David Autor and John Van Reenen, along with three co-authors, published evidence indicating the decline in the labor share in the U.S. as a whole was driven by gains made by “superstar firms,” which find ways to lower their labor share and gain market power.

While those elite firms may hire more workers and even pay relatively well as they grow, labor share declines in their industries, overall.

“It’s very complementary,” Acemoglu observes about the work of Autor and Van Reenen. However, he notes, “A slight difference is that superstar firms [in the work of Autor and Van Reenen, in the U.S.] could come from many different sources. By having this individual firm-level technology data, we are able to show that a lot of this is about automation.”

So, while economists have offered many possible explanations for the decline of the labor share generally — including technology, tax policy, changes in labor market institutions, and more — Acemoglu suspects technology, and automation specifically, is the prime candidate, certainly in France.

“A big part of the [economic] literature now on technology, globalization, labor market institutions, is turning to the question of what explains the decline in the labor share,” Acemoglu says. “Many of those are reasonably interesting hypotheses, but in France it’s only the firms that adopt robots — and they are very large firms — that are reducing their labor share, and that’s what accounts for the entirety of the decline in the labor share in French manufacturing. This really emphasizes that automation, and in particular robots, is a critical part in understanding what’s going on.”

How many jobs do robots really replace?

MIT professor Daron Acemoglu is co-author of a new study showing that each robot added to the workforce has the effect of replacing 3.3 jobs across the U.S.
Image: Stock image edited by MIT News
By Peter Dizikes

This is part 1 of a three-part series examining the effects of robots and automation on employment, based on new research from economist and Institute Professor Daron Acemoglu.  

In many parts of the U.S., robots have been replacing workers over the last few decades. But to what extent, really? Some technologists have forecast that automation will lead to a future without work, while other observers have been more skeptical about such scenarios.

Now a study co-authored by an MIT professor puts firm numbers on the trend, finding a very real impact — although one that falls well short of a robot takeover. The study also finds that in the U.S., the impact of robots varies widely by industry and region, and may play a notable role in exacerbating income inequality.

“We find fairly major negative employment effects,” MIT economist Daron Acemoglu says, although he notes that the impact of the trend can be overstated.

From 1990 to 2007, the study shows, adding one additional robot per 1,000 workers reduced the national employment-to-population ratio by about 0.2 percent, with some areas of the U.S. affected far more than others.

This means each additional robot added in manufacturing replaced about 3.3 workers nationally, on average.

That increased use of robots in the workplace also lowered wages by roughly 0.4 percent during the same time period.

“We find negative wage effects, that workers are losing in terms of real wages in more affected areas, because robots are pretty good at competing against them,” Acemoglu says.

The paper, “Robots and Jobs: Evidence from U.S. Labor Markets,” appears in advance online form in the Journal of Political Economy. The authors are Acemoglu and Pascual Restrepo PhD ’16, an assistant professor of economics at Boston University.

Displaced in Detroit

To conduct the study, Acemoglu and Restrepo used data on 19 industries, compiled by the International Federation of Robotics (IFR), a Frankfurt-based industry group that keeps detailed statistics on robot deployments worldwide. The scholars combined that with U.S.-based data on population, employment, business, and wages, from the U.S. Census Bureau, the Bureau of Economic Analysis, and the Bureau of Labor Statistics, among other sources.

The researchers also compared robot deployment in the U.S. to that of other countries, finding it lags behind that of Europe. From 1993 to 2007, U.S. firms introduced almost exactly one new robot per 1,000 workers; in Europe, firms introduced 1.6 new robots per 1,000 workers.

“Even though the U.S. is a technologically very advanced economy, in terms of industrial robots’ production and usage and innovation, it’s behind many other advanced economies,” Acemoglu says.

In the U.S., four manufacturing industries account for 70 percent of robots: automakers (38 percent of robots in use), electronics (15 percent), the plastics and chemical industry (10 percent), and metals manufacturers (7 percent).

The study analyzed the impact of robots in 722 commuting zones in the continental U.S. — essentially metropolitan areas — and found considerable geographic variation in how intensively robots are utilized.

Given industry trends in robot deployment, the area of the country most affected is the seat of the automobile industry. Michigan has the highest concentration of robots in the workplace, with employment in Detroit, Lansing, and Saginaw affected more than anywhere else in the country.

“Different industries have different footprints in different places in the U.S.,” Acemoglu observes. “The place where the robot issue is most apparent is Detroit. Whatever happens to automobile manufacturing has a much greater impact on the Detroit area [than elsewhere].”

In commuting zones where robots were added to the workforce, each robot replaced about 6.6 jobs locally, the researchers found. However, in a subtle twist, adding robots in manufacturing benefits people in other industries and other areas of the country — by lowering the cost of goods, among other things. These national economic benefits are the reason the researchers calculated that adding one robot replaces 3.3 jobs for the country as a whole.

The inequality issue

In conducting the study, Acemoglu and Restrepo went to considerable lengths to see if the employment trends in robot-heavy areas might have been caused by other factors, such as trade policy, but they found no complicating empirical effects.

The study does suggest, however, that robots have a direct influence on income inequality. The manufacturing jobs they replace come from parts of the workforce without many other good employment options; as a result, there is a direct connection between automation in robot-using industries and sagging incomes among blue-collar workers.

“There are major distributional implications,” Acemoglu says. When robots are added to manufacturing plants, “The burden falls on the low-skill and especially middle-skill workers. That’s really an important part of our overall research [on robots], that automation actually is a much bigger part of the technological factors that have contributed to rising inequality over the last 30 years.”

So while claims about machines wiping out human work entirely may be overstated, the research by Acemoglu and Restrepo shows that the robot effect is a very real one in manufacturing, with significant social implications.

“It certainly won’t give any support to those who think robots are going to take all of our jobs,” Acemoglu says. “But it does imply that automation is a real force to be grappled with.”

Study finds stronger links between automation and inequality


By Peter Dizikes

This is part 3 of a three-part series examining the effects of robots and automation on employment, based on new research from economist and Institute Professor Daron Acemoglu. 

Modern technology affects different workers in different ways. In some white-collar jobs — designer, engineer — people become more productive with sophisticated software at their side. In other cases, forms of automation, from robots to phone-answering systems, have simply replaced factory workers, receptionists, and many other kinds of employees.

Now a new study co-authored by an MIT economist suggests automation has a bigger impact on the labor market and income inequality than previous research would indicate — and identifies the year 1987 as a key inflection point in this process, the moment when jobs lost to automation stopped being replaced by an equal number of similar workplace opportunities.

“Automation is critical for understanding inequality dynamics,” says MIT economist Daron Acemoglu, co-author of a newly published paper detailing the findings.

Within industries adopting automation, the study shows, the average “displacement” (or job loss) from 1947-1987 was 17 percent of jobs, while the average “reinstatement” (new opportunities) was 19 percent. But from 1987-2016, displacement was 16 percent, while reinstatement was just 10 percent. In short, those factory positions or phone-answering jobs are not coming back.

“A lot of the new job opportunities that technology brought from the 1960s to the 1980s benefitted low-skill workers,” Acemoglu adds. “But from the 1980s, and especially in the 1990s and 2000s, there’s a double whammy for low-skill workers: They’re hurt by displacement, and the new tasks that are coming, are coming slower and benefitting high-skill workers.”

The new paper, “Unpacking Skill Bias: Automation and New Tasks,” will appear in the May issue of the American Economic Association: Papers and Proceedings. The authors are Acemoglu, who is an Institute Professor at MIT, and Pascual Restrepo PhD ’16, an assistant professor of economics at Boston University.

Low-skill workers: Moving backward

The new paper is one of several studies Acemoglu and Restrepo have conducted recently examining the effects of robots and automation in the workplace. In a just-published paper, they concluded that across the U.S. from 1993 to 2007, each new robot replaced 3.3 jobs.

In still another new paper, Acemoglu and Restrepo examined French industry from 2010 to 2015. They found that firms that quickly adopted robots became more productive and hired more workers, while their competitors fell behind and shed workers — with jobs again being reduced overall.

In the current study, Acemoglu and Restrepo construct a model of technology’s effects on the labor market, while testing the model’s strength by using empirical data from 44 relevant industries. (The study uses U.S. Census statistics on employment and wages, as well as economic data from the Bureau of Economic Analysis and the Bureau of Labor Statistics, among other sources.)

The result is an alternative to the standard economic modeling in the field, which has emphasized the idea of “skill-biased” technological change — meaning that technology tends to benefit select high-skilled workers more than low-skill workers, helping the wages of high-skilled workers more, while the value of other workers stagnates. Think again of highly trained engineers who use new software to finish more projects more quickly: They become more productive and valuable, while workers lacking synergy with new technology are comparatively less valued.  

However, Acemoglu and Restrepo think even this scenario, with the prosperity gap it implies, is still too benign. Where automation occurs, lower-skill workers are not just failing to make gains; they are actively pushed backward financially. Moreover,  Acemoglu and Restrepo note, the standard model of skill-biased change does not fully account for this dynamic; it estimates that productivity gains and real (inflation-adjusted) wages of workers should be higher than they actually are.

More specifically, the standard model implies an estimate of about 2 percent annual growth in productivity since 1963, whereas annual productivity gains have been about 1.2 percent; it also estimates wage growth for low-skill workers of about 1 percent per year, whereas real wages for low-skill workers have actually dropped since the 1970s.

“Productivity growth has been lackluster, and real wages have fallen,” Acemoglu says. “Automation accounts for both of those.” Moreover, he adds, “Demand for skills has gone down almost exclusively in industries that have seen a lot of automation.”

Why “so-so technologies” are so, so bad

Indeed, Acemoglu says, automation is a special case within the larger set of technological changes in the workplace. As he puts it, automation “is different than garden-variety skill-biased technological change,” because it can replace jobs without adding much productivity to the economy.

Think of a self-checkout system in your supermarket or pharmacy: It reduces labor costs without making the task more efficient; the work is simply shifted from paid employees to you, the customer. These kinds of systems are what Acemoglu and Restrepo have termed “so-so technologies,” because of the minimal value they offer.

“So-so technologies are not really doing a fantastic job, nobody’s enthusiastic about going one-by-one through their items at checkout, and nobody likes it when the airline they’re calling puts them through automated menus,” Acemoglu says. “So-so technologies are cost-saving devices for firms that just reduce their costs a little bit but don’t increase productivity by much. They create the usual displacement effect but don’t benefit other workers that much, and firms have no reason to hire more workers or pay other workers more.”

To be sure, not all automation resembles self-checkout systems, which were not around in 1987. Automation at that time consisted more of printed office records being converted into databases, or machinery being added to sectors like textiles and furniture-making. Robots became more common in heavy industrial manufacturing in the 1990s. Automation is a suite of technologies that continues today with software and AI, which are inherently worker-displacing.

“Displacement is really the center of our theory,” Acemoglu says. “And it has grimmer implications, because wage inequality is associated with disruptive changes for workers. It’s a much more Luddite explanation.”

After all, the Luddites — British textile mill workers who destroyed machinery in the 1810s — may be synonymous with technophobia, but their actions were motivated by economic concerns; they knew machines were replacing their jobs. That same displacement continues today, although, Acemoglu contends, the net negative consequences of technology on jobs are not inevitable. We could, perhaps, find more ways to produce job-enhancing technologies, rather than job-replacing innovations.

“It’s not all doom and gloom,” says Acemoglu. “There is nothing that says technology is all bad for workers. It is the choice we make about the direction to develop technology that is critical.”

System trains driverless cars in simulation before they hit the road


A simulation system invented at MIT to train driverless cars creates a photorealistic world with infinite steering possibilities, helping the cars learn to navigate a host of worst-case scenarios before cruising down real streets.

By Rob Matheson

A simulation system invented at MIT to train driverless cars creates a photorealistic world with infinite steering possibilities, helping the cars learn to navigate a host of worst-case scenarios before cruising down real streets.

Control systems, or “controllers,” for autonomous vehicles largely rely on real-world datasets of driving trajectories from human drivers. From these data, they learn how to emulate safe steering controls in a variety of situations. But real-world data from hazardous “edge cases,” such as nearly crashing or being forced off the road or into other lanes, are — fortunately — rare.

Some computer programs, called “simulation engines,” aim to imitate these situations by rendering detailed virtual roads to help train the controllers to recover. But the learned control from simulation has never been shown to transfer to reality on a full-scale vehicle.

The MIT researchers tackle the problem with their photorealistic simulator, called Virtual Image Synthesis and Transformation for Autonomy (VISTA). It uses only a small dataset, captured by humans driving on a road, to synthesize a practically infinite number of new viewpoints from trajectories that the vehicle could take in the real world. The controller is rewarded for the distance it travels without crashing, so it must learn by itself how to reach a destination safely. In doing so, the vehicle learns to safely navigate any situation it encounters, including regaining control after swerving between lanes or recovering from near-crashes.  

In tests, a controller trained within the VISTA simulator was able to be safely deployed onto a full-scale driverless car and to navigate through previously unseen streets. When the car was positioned at off-road orientations that mimicked various near-crash situations, the controller was also able to successfully steer it back into a safe driving trajectory within a few seconds. A paper describing the system has been published in IEEE Robotics and Automation Letters and will be presented at the upcoming ICRA conference in May.

“It’s tough to collect data in these edge cases that humans don’t experience on the road,” says first author Alexander Amini, a PhD student in the Computer Science and Artificial Intelligence Laboratory (CSAIL). “In our simulation, however, control systems can experience those situations, learn for themselves to recover from them, and remain robust when deployed onto vehicles in the real world.”

The work was done in collaboration with the Toyota Research Institute. Joining Amini on the paper are Igor Gilitschenski, a postdoc in CSAIL; Jacob Phillips, Julia Moseyko, and Rohan Banerjee, all undergraduates in CSAIL and the Department of Electrical Engineering and Computer Science; Sertac Karaman, an associate professor of aeronautics and astronautics; and Daniela Rus, director of CSAIL and the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science.

Data-driven simulation

Historically, building simulation engines for training and testing autonomous vehicles has been largely a manual task. Companies and universities often employ teams of artists and engineers to sketch virtual environments, with accurate road markings, lanes, and even detailed leaves on trees. Some engines may also incorporate the physics of a car’s interaction with its environment, based on complex mathematical models.

But since there are so many different things to consider in complex real-world environments, it’s practically impossible to incorporate everything into the simulator. For that reason, there’s usually a mismatch between what controllers learn in simulation and how they operate in the real world.

Instead, the MIT researchers created what they call a “data-driven” simulation engine that synthesizes, from real data, new trajectories consistent with road appearance, as well as the distance and motion of all objects in the scene.

They first collect video data from a human driving down a few roads and feed that into the engine. For each frame, the engine projects every pixel into a type of 3D point cloud. Then, they place a virtual vehicle inside that world. When the vehicle makes a steering command, the engine synthesizes a new trajectory through the point cloud, based on the steering curve and the vehicle’s orientation and velocity.

Then, the engine uses that new trajectory to render a photorealistic scene. To do so, it uses a convolutional neural network — commonly used for image-processing tasks — to estimate a depth map, which contains information relating to the distance of objects from the controller’s viewpoint. It then combines the depth map with a technique that estimates the camera’s orientation within a 3D scene. That all helps pinpoint the vehicle’s location and relative distance from everything within the virtual simulator.

Based on that information, it reorients the original pixels to recreate a 3D representation of the world from the vehicle’s new viewpoint. It also tracks the motion of the pixels to capture the movement of cars, people, and other moving objects in the scene. “This is equivalent to providing the vehicle with an infinite number of possible trajectories,” Rus says. “Because when we collect physical data, we get data from the specific trajectory the car will follow. But we can modify that trajectory to cover all possible ways and environments of driving. That’s really powerful.”
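The full VISTA pipeline involves a learned depth estimator and motion compensation; the sketch below shows only the core geometric step implied above, lifting pixels through a depth map into 3D and reprojecting them into a virtual camera at a new pose. The pinhole intrinsics matrix and rigid transform are assumed inputs.

```python
import numpy as np

def reproject_view(image, depth, K, T_new_from_old):
    """Geometric core of data-driven view synthesis (illustrative only):
    lift each pixel to 3D with its estimated depth, transform the points into
    a virtual camera placed along a new trajectory, and splat them back into
    that camera's image plane. `image` is (h, w, 3), `depth` is (h, w),
    `K` is a 3x3 pinhole intrinsics matrix, `T_new_from_old` is a 4x4 rigid
    transform from the original camera frame to the new one."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N

    # Back-project to 3D points in the original camera frame.
    points = np.linalg.inv(K) @ pixels * depth.reshape(1, -1)
    points_h = np.vstack([points, np.ones((1, points.shape[1]))])         # 4 x N

    # Move the points into the new camera frame and project them.
    points_new = (T_new_from_old @ points_h)[:3]
    proj = K @ points_new
    uv = (proj[:2] / np.maximum(proj[2], 1e-6)).round().astype(int)

    # Scatter source colors into the new image (nearest-pixel splatting).
    out = np.zeros_like(image)
    valid = (uv[0] >= 0) & (uv[0] < w) & (uv[1] >= 0) & (uv[1] < h) & (proj[2] > 0)
    out[uv[1, valid], uv[0, valid]] = image.reshape(-1, image.shape[-1])[valid]
    return out
```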

Reinforcement learning from scratch

Traditionally, researchers have trained autonomous vehicles either to follow human-defined driving rules or to imitate human drivers. But the researchers make their controller learn entirely from scratch under an “end-to-end” framework, meaning it takes as input only raw sensor data — such as visual observations of the road — and, from that data, predicts steering commands as outputs.

“We basically say, ‘Here’s an environment. You can do whatever you want. Just don’t crash into vehicles, and stay inside the lanes,’” Amini says.

This requires “reinforcement learning” (RL), a trial-and-error machine-learning technique that provides feedback signals whenever the car makes an error. In the researchers’ simulation engine, the controller begins by knowing nothing about how to drive, what a lane marker is, or even what other vehicles look like, so it starts executing random steering angles. It gets a feedback signal only when it crashes. At that point, it gets teleported to a new simulated location and has to execute a better set of steering angles to avoid crashing again. Over 10 to 15 hours of training, it uses these sparse feedback signals to learn to travel greater and greater distances without crashing.
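To make the sparse-feedback setup concrete, here is a toy stand-in: a one-dimensional lane-keeping environment where the only signal is a crash, and a simple random-search policy update in place of the paper's actual reinforcement-learning algorithm. Everything here is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def rollout(weights, max_steps=500):
    """Toy stand-in for the simulator: the state is the car's lateral offset in
    the lane, a steering action nudges it, drift pushes it around, and the only
    feedback is a crash when the car leaves the lane."""
    offset, distance = 0.0, 0
    for _ in range(max_steps):
        steering = np.tanh(weights[0] * offset + weights[1])  # tiny linear policy
        offset += -0.3 * steering + rng.normal(0, 0.05)       # dynamics plus drift
        distance += 1
        if abs(offset) > 1.0:            # crash: the only feedback signal
            break
    return distance                      # reward is distance travelled

# Simple random-search training loop: keep policy perturbations that let the
# car travel farther before crashing (a crude substitute for real RL updates).
weights = rng.normal(0, 0.1, size=2)
best = rollout(weights)
for _ in range(200):
    candidate = weights + rng.normal(0, 0.1, size=2)
    score = rollout(candidate)
    if score > best:
        weights, best = candidate, score
print(best)   # distance survived by the best policy found
```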

After successfully driving 10,000 kilometers in simulation, the authors apply that learned controller onto their full-scale autonomous vehicle in the real world. The researchers say this is the first time a controller trained using end-to-end reinforcement learning in simulation has successfully been deployed onto a full-scale autonomous car. “That was surprising to us. Not only has the controller never been on a real car before, but it’s also never even seen the roads before and has no prior knowledge on how humans drive,” Amini says.

Forcing the controller to run through all types of driving scenarios enabled it to regain control from disorienting positions — such as being half off the road or in another lane — and steer back into the correct lane within several seconds. “And other state-of-the-art controllers all tragically failed at that, because they never saw any data like this in training,” Amini says.

Next, the researchers hope to simulate all types of road conditions from a single driving trajectory, such as night and day, and sunny and rainy weather. They also hope to simulate more complex interactions with other vehicles on the road. “What if other cars start moving and jump in front of the vehicle?” Rus says. “Those are complex, real-world interactions we want to start testing.”

Showing robots how to do your chores

Roboticists are developing automated robots that can learn new tasks solely by observing humans. At home, you might someday show a domestic robot how to do routine chores.
Image: Christine Daniloff, MIT

By Rob Matheson

Training interactive robots may one day be an easy job for everyone, even those without programming expertise. Roboticists are developing automated robots that can learn new tasks solely by observing humans. At home, you might someday show a domestic robot how to do routine chores. In the workplace, you could train robots like new employees, showing them how to perform many duties.

Making progress on that vision, MIT researchers have designed a system that lets these types of robots learn complicated tasks that would otherwise stymie them with too many confusing rules. One such task is setting a dinner table under certain conditions.  

At its core, the researchers’ “Planning with Uncertain Specifications” (PUnS) system gives robots the humanlike planning ability to simultaneously weigh many ambiguous — and potentially contradictory — requirements to reach an end goal. In doing so, the system always chooses the most likely action to take, based on a “belief” about some probable specifications for the task it is supposed to perform.

In their work, the researchers compiled a dataset with information about how eight objects — a mug, glass, spoon, fork, knife, dinner plate, small plate, and bowl — could be placed on a table in various configurations. A robotic arm first observed randomly selected human demonstrations of setting the table with the objects. Then, the researchers tasked the arm with automatically setting a table in a specific configuration, in real-world experiments and in simulation, based on what it had seen.

To succeed, the robot had to weigh many possible placement orderings, even when items were purposely removed, stacked, or hidden. Normally, all of that would confuse robots too much. But the researchers’ robot made no mistakes over several real-world experiments, and only a handful of mistakes over tens of thousands of simulated test runs.  

“The vision is to put programming in the hands of domain experts, who can program robots through intuitive ways, rather than describing orders to an engineer to add to their code,” says first author Ankit Shah, a graduate student in the Department of Aeronautics and Astronautics (AeroAstro) and the Interactive Robotics Group, who emphasizes that their work is just one step in fulfilling that vision. “That way, robots won’t have to perform preprogrammed tasks anymore. Factory workers can teach a robot to do multiple complex assembly tasks. Domestic robots can learn how to stack cabinets, load the dishwasher, or set the table from people at home.”

Joining Shah on the paper are AeroAstro and Interactive Robotics Group graduate student Shen Li and Interactive Robotics Group leader Julie Shah, an associate professor in AeroAstro and the Computer Science and Artificial Intelligence Laboratory.

Bots hedging bets

Robots are fine planners in tasks with clear “specifications,” which help describe the task the robot needs to fulfill, considering its actions, environment, and end goal. Learning to set a table by observing demonstrations, however, is full of uncertain specifications. Items must be placed in certain spots, depending on the menu and where guests are seated, and in certain orders, depending on an item’s immediate availability or social conventions. Present approaches to planning are not capable of dealing with such uncertain specifications.

A popular approach to planning is “reinforcement learning,” a trial-and-error machine-learning technique that rewards and penalizes a robot for its actions as it works to complete a task. But for tasks with uncertain specifications, it’s difficult to define clear rewards and penalties. In short, robots never fully learn right from wrong.

The researchers’ PUnS system, by contrast, enables a robot to hold a “belief” over a range of possible specifications. The belief itself can then be used to dish out rewards and penalties. “The robot is essentially hedging its bets in terms of what’s intended in a task, and takes actions that satisfy its belief, instead of us giving it a clear specification,” Ankit Shah says.

The system is built on “linear temporal logic” (LTL), an expressive language that enables robotic reasoning about current and future outcomes. The researchers defined templates in LTL that model various time-based conditions, such as what must happen now, must eventually happen, and must happen until something else occurs. The robot’s observations of 30 human demonstrations for setting the table yielded a probability distribution over 25 different LTL formulas. Each formula encoded a slightly different preference — or specification — for setting the table. That probability distribution becomes its belief.

“Each formula encodes something different, but when the robot considers various combinations of all the templates, and tries to satisfy everything together, it ends up doing the right thing eventually,” Ankit Shah says.

Following criteria

The researchers also developed several criteria that guide the robot toward satisfying the entire belief over those candidate formulas. One, for instance, satisfies only the most likely formula, discarding everything apart from the template with the highest probability. Others satisfy the largest number of unique formulas, without considering their overall probability, or satisfy several formulas that represent the highest total probability. Another simply minimizes error, so the system ignores formulas with a high probability of failure.
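The LTL machinery itself is not reproduced here; the snippet below only illustrates, with made-up formula names and probabilities, how criteria like those described above could choose which specifications from a belief to try to satisfy. The threshold values are arbitrary.

```python
# A "belief": hypothetical probabilities over candidate task specifications
# (stand-ins for the LTL formulas inferred from demonstrations).
belief = {
    "plate_before_fork": 0.45,
    "fork_left_of_plate": 0.30,
    "mug_placed_last": 0.15,
    "any_order_allowed": 0.10,
}

def most_likely(belief):
    """Loose version of criterion 1: satisfy only the single most probable specification."""
    return [max(belief, key=belief.get)]

def top_mass(belief, mass=0.8):
    """Loose version of a total-probability criterion: satisfy the smallest set
    of specifications whose combined probability reaches a threshold."""
    chosen, total = [], 0.0
    for formula, prob in sorted(belief.items(), key=lambda kv: -kv[1]):
        chosen.append(formula)
        total += prob
        if total >= mass:
            break
    return chosen

def minimize_risk(belief, min_prob=0.2):
    """Loose version of the error-minimizing criterion: ignore low-probability
    specifications that carry a high chance of being the wrong thing to satisfy."""
    return [f for f, prob in belief.items() if prob >= min_prob]

print(most_likely(belief), top_mass(belief), minimize_risk(belief))
```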

Designers can choose any one of the four criteria to preset before training and testing. Each has its own tradeoff between flexibility and risk aversion. The choice of criteria depends entirely on the task. In safety-critical situations, for instance, a designer may choose to limit the possibility of failure. But where the consequences of failure are not as severe, designers can give robots greater flexibility to try different approaches.

With the criteria in place, the researchers developed an algorithm to convert the robot’s belief — the probability distribution pointing to the desired formula — into an equivalent reinforcement learning problem. This model will ping the robot with a reward or penalty for an action it takes, based on the specification it’s decided to follow.

In simulations asking the robot to set the table in different configurations, it made only six mistakes out of 20,000 tries. In real-world demonstrations, it showed behavior similar to how a human would perform the task. If an item wasn’t initially visible, for instance, the robot would finish setting the rest of the table without the item. Then, when the missing fork was revealed, it would set the fork in the proper place. “That’s where flexibility is very important,” Ankit Shah says. “Otherwise it would get stuck when it expects to place a fork and not finish the rest of the table setup.”

Next, the researchers hope to modify the system to help robots change their behavior based on verbal instructions, corrections, or a user’s assessment of the robot’s performance. “Say a person demonstrates to a robot how to set a table at only one spot. The person may say, ‘do the same thing for all other spots,’ or, ‘place the knife before the fork here instead,’” Ankit Shah says. “We want to develop methods for the system to naturally adapt to handle those verbal commands, without needing additional demonstrations.”  

Using artificial intelligence to enrich digital maps

An AI model developed at MIT and Qatar Computing Research Institute that uses only satellite imagery to automatically tag road features in digital maps could improve GPS navigation, especially in countries with limited map data.
Image: Google Maps/MIT News

By Rob Matheson

A model invented by researchers at MIT and Qatar Computing Research Institute (QCRI) that uses satellite imagery to tag road features in digital maps could help improve GPS navigation.  

Showing drivers more details about their routes can often help them navigate in unfamiliar locations. Lane counts, for instance, can enable a GPS system to warn drivers of diverging or merging lanes. Incorporating information about parking spots can help drivers plan ahead, while mapping bicycle lanes can help cyclists negotiate busy city streets. Providing updated information on road conditions can also improve planning for disaster relief.

But creating detailed maps is an expensive, time-consuming process done mostly by big companies, such as Google, which sends vehicles around with cameras strapped to their hoods to capture video and images of an area’s roads. Combining that with other data can create accurate, up-to-date maps. Because this process is expensive, however, some parts of the world are ignored.

A solution is to unleash machine-learning models on satellite images — which are easier to obtain and updated fairly regularly — to automatically tag road features. But roads can be occluded by, say, trees and buildings, making it a challenging task. In a paper being presented at the Association for the Advancement of Artificial Intelligence conference, the MIT and QCRI researchers describe “RoadTagger,” which uses a combination of neural network architectures to automatically predict the number of lanes and road types (residential or highway) behind obstructions.

In testing RoadTagger on occluded roads from digital maps of 20 U.S. cities, the model counted lane numbers with 77 percent accuracy and inferred road types with 93 percent accuracy. The researchers are also planning to enable RoadTagger to predict other features, such as parking spots and bike lanes.

“Most updated digital maps are from places that big companies care the most about. If you’re in places they don’t care about much, you’re at a disadvantage with respect to the quality of the map,” says co-author Sam Madden, a professor in the Department of Electrical Engineering and Computer Science (EECS) and a researcher in the Computer Science and Artificial Intelligence Laboratory (CSAIL). “Our goal is to automate the process of generating high-quality digital maps, so they can be available in any country.”

The paper’s co-authors are CSAIL graduate students Songtao He, Favyen Bastani, and Edward Park; EECS undergraduate student Satvat Jagwani; CSAIL professors Mohammad Alizadeh and Hari Balakrishnan; and QCRI researchers Sanjay Chawla, Sofiane Abbar, and Mohammad Amin Sadeghi.

Combining CNN and GNN

Qatar, where QCRI is based, is “not a priority for the large companies building digital maps,” Madden says. Yet it’s constantly building new roads and improving old ones, especially in preparation for hosting the 2022 FIFA World Cup.

“While visiting Qatar, we’ve had experiences where our Uber driver can’t figure out how to get where he’s going, because the map is so off,” Madden says. “If navigation apps don’t have the right information, for things such as lane merging, this could be frustrating or worse.”

RoadTagger relies on a novel combination of a convolutional neural network (CNN) — commonly used for image-processing tasks — and a graph neural network (GNN). GNNs model relationships between connected nodes in a graph and have become popular for analyzing things like social networks and molecular dynamics. The model is “end-to-end,” meaning it’s fed only raw data and automatically produces output, without human intervention.

The CNN takes as input raw satellite images of target roads. The GNN breaks the road into roughly 20-meter segments, or “tiles.” Each tile is a separate graph node, connected by lines along the road. For each node, the CNN extracts road features and shares that information with its immediate neighbors. Road information propagates along the whole graph, with each node receiving some information about road attributes in every other node. If a certain tile is occluded in an image, RoadTagger uses information from all tiles along the road to predict what’s behind the occlusion.
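RoadTagger's exact architecture is described in the paper; the sketch below, written in PyTorch as an assumption, only mirrors the overall shape described above: a small CNN encodes each tile image, a simple message-passing loop averages information between neighboring tiles along the road graph, and a per-tile head predicts the lane count.

```python
import torch
import torch.nn as nn

class TileCNN(nn.Module):
    """Encode one satellite tile (assumed 3 x 64 x 64) into a feature vector."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, dim))

    def forward(self, tiles):                      # tiles: (N, 3, 64, 64)
        return self.net(tiles)                     # (N, dim)

class RoadGNN(nn.Module):
    """Propagate tile features along the road graph, then classify lane count."""
    def __init__(self, dim=64, steps=4, num_lane_classes=6):
        super().__init__()
        self.update = nn.GRUCell(dim, dim)         # combines neighbor messages
        self.steps = steps
        self.head = nn.Linear(dim, num_lane_classes)

    def forward(self, feats, adjacency):           # adjacency: (N, N) 0/1 matrix
        deg = adjacency.sum(dim=1, keepdim=True).clamp(min=1)
        h = feats
        for _ in range(self.steps):                # each step spreads info one hop
            messages = adjacency @ h / deg         # mean over neighboring tiles
            h = self.update(messages, h)
        return self.head(h)                        # per-tile lane-count logits

# Hypothetical usage: a straight road split into 10 consecutive tiles.
n = 10
tiles = torch.randn(n, 3, 64, 64)
adj = torch.zeros(n, n)
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0            # chain graph along the road
logits = RoadGNN()(TileCNN()(tiles), adj)
print(logits.shape)                                # torch.Size([10, 6])
```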

This combined architecture represents a more human-like intuition, the researchers say. Say part of a four-lane road is occluded by trees, so certain tiles show only two lanes. Humans can easily surmise that a couple lanes are hidden behind the trees. Traditional machine-learning models — say, just a CNN — extract features only of individual tiles and most likely predict the occluded tile is a two-lane road.

“Humans can use information from adjacent tiles to guess the number of lanes in the occluded tiles, but networks can’t do that,” He says. “Our approach tries to mimic the natural behavior of humans, where we capture local information from the CNN and global information from the GNN to make better predictions.”
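
To make the tile-graph idea concrete, the following is a minimal Python sketch in the spirit of RoadTagger, not the team's implementation: a small CNN encodes each satellite tile, a few rounds of message passing share features between neighboring tiles, and two output heads predict lane count and road type. The tile size, feature dimensions, and network shapes are illustrative assumptions.

```python
# Sketch of a RoadTagger-style model: a CNN encodes each road tile, and simple
# message passing along the road graph lets an occluded tile borrow evidence
# from its neighbors. All shapes and sizes here are illustrative.
import torch
import torch.nn as nn

class TileEncoder(nn.Module):
    """CNN mapping one satellite tile (3 x 64 x 64) to a feature vector."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, feat_dim)

    def forward(self, tiles):                     # tiles: (num_tiles, 3, 64, 64)
        return self.fc(self.conv(tiles).flatten(1))

class ChainGNN(nn.Module):
    """A few rounds of message passing over the tile graph (a chain for one road)."""
    def __init__(self, feat_dim=64, steps=4):
        super().__init__()
        self.steps = steps
        self.update = nn.GRUCell(feat_dim, feat_dim)

    def forward(self, h, edges):                  # edges: list of directed (i, j) pairs
        src = torch.tensor([i for i, _ in edges])
        dst = torch.tensor([j for _, j in edges])
        deg = torch.zeros(h.size(0)).index_add(0, dst, torch.ones(len(edges))).clamp(min=1)
        for _ in range(self.steps):
            msg = torch.zeros_like(h).index_add(0, dst, h[src])   # sum neighbor features
            h = self.update(msg / deg.unsqueeze(1), h)            # update each node
        return h

class RoadTaggerSketch(nn.Module):
    def __init__(self, feat_dim=64, max_lanes=6):
        super().__init__()
        self.encoder = TileEncoder(feat_dim)
        self.gnn = ChainGNN(feat_dim)
        self.lane_head = nn.Linear(feat_dim, max_lanes)   # lane-count classifier
        self.type_head = nn.Linear(feat_dim, 2)           # residential vs. highway

    def forward(self, tiles, edges):
        h = self.gnn(self.encoder(tiles), edges)
        return self.lane_head(h), self.type_head(h)

# A toy road of 10 consecutive ~20-meter tiles, connected in both directions.
tiles = torch.randn(10, 3, 64, 64)
edges = [(i, i + 1) for i in range(9)] + [(i + 1, i) for i in range(9)]
lane_logits, type_logits = RoadTaggerSketch()(tiles, edges)
print(lane_logits.shape, type_logits.shape)       # (10, 6) and (10, 2)
```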

Learning weights   

To train and test RoadTagger, the researchers used a real-world map dataset, called OpenStreetMap, which lets users edit and curate digital maps around the globe. From that dataset, they collected confirmed road attributes from 688 square kilometers of maps of 20 U.S. cities — including Boston, Chicago, Washington, and Seattle. Then, they gathered the corresponding satellite images from a Google Maps dataset.

In training, RoadTagger learns weights — which assign varying degrees of importance to features and node connections — of the CNN and GNN. The CNN extracts features from pixel patterns of tiles and the GNN propagates the learned features along the graph. From randomly selected subgraphs of the road, the system learns to predict the road features at each tile. In doing so, it automatically learns which image features are useful and how to propagate those features along the graph. For instance, if a target tile has unclear lane markings, but its neighbor tile has four lanes with clear lane markings and shares the same road width, then the target tile is likely to also have four lanes. In this case, the model automatically learns that the road width is a useful image feature, so if two adjacent tiles share the same road width, they’re likely to have the same lane count.
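
A hedged sketch of one training step under the same assumptions, reusing the RoadTaggerSketch class from the sketch above; the random tensors stand in for satellite tiles and OpenStreetMap labels.

```python
# One end-to-end training step: both heads are trained jointly, so the CNN and
# GNN weights are learned from the same lane-count and road-type labels.
import torch
import torch.nn.functional as F

model = RoadTaggerSketch()                         # defined in the earlier sketch
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(tiles, edges, lane_labels, type_labels):
    lane_logits, type_logits = model(tiles, edges)
    loss = F.cross_entropy(lane_logits, lane_labels) + \
           F.cross_entropy(type_logits, type_labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy batch: one 10-tile road with random stand-in labels.
tiles = torch.randn(10, 3, 64, 64)
edges = [(i, i + 1) for i in range(9)] + [(i + 1, i) for i in range(9)]
print(train_step(tiles, edges, torch.randint(0, 6, (10,)), torch.randint(0, 2, (10,))))
```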

Given a road not seen in training from OpenStreetMap, the model breaks the road into tiles and uses its learned weights to make predictions. Tasked with predicting a number of lanes in an occluded tile, the model notes that neighboring tiles have matching pixel patterns and, therefore, a high likelihood to share information. So, if those tiles have four lanes, the occluded tile must also have four.

In another result, RoadTagger accurately predicted lane numbers in a dataset of synthesized, highly challenging road disruptions. As one example, an overpass with two lanes covered a few tiles of a target road with four lanes. The model detected mismatched pixel patterns of the overpass, so it ignored the two lanes over the covered tiles, accurately predicting four lanes were underneath.

The researchers hope to use RoadTagger to help humans rapidly validate and approve continuous modifications to infrastructure in datasets such as OpenStreetMap, where many maps don’t contain lane counts or other details. A specific area of interest is Thailand, Bastani says, where roads are constantly changing, but there are few if any updates in the dataset.

“Roads that were once labeled as dirt roads have been paved over so are better to drive on, and some intersections have been completely built over. There are changes every year, but digital maps are out of date,” he says. “We want to constantly update such road attributes based on the most recent imagery.”

Intelligent Towing Tank propels human-robot-computer research

A researcher’s hand hovers over the water’s surface in the Intelligent Towing Tank (ITT), an automated experimental facility guided by active learning to explore vortex-induced vibrations (VIVs), revealing a path to accelerated scientific discovery.
Image: Dixia Fan and Lily Keyes/MIT Sea Grant

By Lily Keyes/MIT Sea Grant

In its first year of operation, the Intelligent Towing Tank (ITT) conducted about 100,000 total experiments, essentially completing the equivalent of a PhD student’s five years’ worth of experiments in a matter of weeks.

The automated experimental facility, developed in the MIT Sea Grant Hydrodynamics Laboratory, automatically and adaptively performs, analyzes, and designs experiments exploring vortex-induced vibrations (VIVs). Important for engineering offshore ocean structures like marine drilling risers that connect underwater oil wells to the surface, VIVs remain something of a mystery to researchers due to the high number of parameters involved.

Guided by active learning, the ITT conducts a series of experiments wherein the parameters of each next experiment are selected by a computer. Using an “explore-and-exploit” methodology, the system dramatically reduces the number of experiments required to explore and map the complex forces governing VIVs.

What began as then-PhD candidate Dixia Fan’s quest to cut back on conducting a thousand or so laborious experiments — by hand — led to the design of the innovative system and a paper recently published in the journal Science Robotics.

Fan, now a postdoc, and a team of researchers from the MIT Sea Grant College Program and MIT’s Department of Mechanical Engineering, École Normale Supérieure de Rennes, and Brown University, reveal a potential paradigm shift in experimental research, where humans, computers, and robots can collaborate more effectively to accelerate scientific discovery.

The 33-foot whale of a tank comes alive, working without interruption or supervision on the venture at hand — in this case, exploring a canonical problem in the field of fluid-structure interactions. But the researchers envision applications of the active learning and automation approach to experimental research across disciplines, potentially leading to new insights and models in multi-input/multi-output nonlinear systems.

VIVs are inherently nonlinear motions induced on a structure in an oncoming irregular cross-stream, which prove vexing to study. The researchers report that the number of experiments completed by the ITT is already comparable to the total number of experiments done to date worldwide on the subject of VIVs.

The reason for this is the large number of independent parameters, from flow velocity to pressure, involved in studying the complex forces at play. According to Fan, a systematic brute-force approach — blindly conducting 10 measurements per parameter in an eight-dimensional parametric space — would require 100 million experiments.

With the ITT, Fan and his collaborators have taken the problem into a wider parametric space than previously practicable to explore. “If we performed traditional techniques on the problem we studied,” he explains, “it would take 950 years to finish the experiment.” That is clearly infeasible, so Fan and the team integrated a Gaussian process regression learning algorithm into the ITT. In doing so, the researchers reduced the experimental burden by several orders of magnitude, requiring only a few thousand experiments.

The robotic system automatically conducts an initial sequence of experiments, periodically towing a submerged structure along the length of the tank at a constant velocity. Then, the ITT takes partial control over the parameters of each next experiment by minimizing suitable acquisition functions of quantified uncertainties and adapting to achieve a range of objectives, like reduced drag.
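
The flavor of that explore-and-exploit loop can be sketched with an off-the-shelf Gaussian process and an upper-confidence-bound acquisition function. The two-parameter space and the run_experiment stand-in below are purely illustrative; they are not the ITT's actual parameters or objective.

```python
# Active selection of the next experiment with Gaussian process regression:
# fit a surrogate to the measurements so far, then pick the candidate that
# maximizes a predicted-value-plus-uncertainty acquisition.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)

def run_experiment(x):
    """Stand-in for one towing-tank run: a noisy scalar measurement."""
    return np.sin(3 * x[0]) * np.cos(2 * x[1]) + 0.05 * rng.normal()

# Candidate experiments: a grid over two normalized parameters.
grid = np.array([[a, b] for a in np.linspace(0, 1, 25) for b in np.linspace(0, 1, 25)])

# Seed with a few random experiments, then let the acquisition choose the rest.
X = grid[rng.choice(len(grid), size=5, replace=False)]
y = np.array([run_experiment(x) for x in X])

gp = GaussianProcessRegressor(kernel=RBF(0.2) + WhiteKernel(1e-3), normalize_y=True)

for _ in range(40):                              # 40 adaptively chosen experiments
    gp.fit(X, y)
    mu, sigma = gp.predict(grid, return_std=True)
    acquisition = mu + 1.5 * sigma               # exploit the mean, explore the uncertainty
    x_next = grid[np.argmax(acquisition)]
    X = np.vstack([X, x_next])
    y = np.append(y, run_experiment(x_next))

print("Best measured value after 45 runs:", round(y.max(), 3))
```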

Earlier this year, Fan was awarded an MIT Mechanical Engineering de Florez Award for “Outstanding Ingenuity and Creative Judgment” in the development of the ITT. “Dixia’s design of the Intelligent Towing Tank is an outstanding example of using novel methods to reinvigorate mature fields,” says Michael Triantafyllou, Henry L. and Grace Doherty Professor in Ocean Science and Engineering, who acted as Fan’s doctoral advisor.

Triantafyllou, a co-author on this paper and the director of the MIT Sea Grant College Program, says, “MIT Sea Grant has committed resources and funded projects using deep-learning methods in ocean-related problems for several years that are already paying off.” Funded by the National Oceanic and Atmospheric Administration and administered by the National Sea Grant Program, MIT Sea Grant is a federal-Institute partnership that brings the research and engineering core of MIT to bear on ocean-related challenges.

Fan’s research points to a number of others utilizing automation and artificial intelligence in science: At Caltech, a robot scientist named “Adam” generates and tests hypotheses; at the Defense Advanced Research Projects Agency, the Big Mechanism program reads tens of thousands of research papers to generate new models.

Similarly, the ITT applies human-computer-robot collaboration to accelerate experimental efforts. The system demonstrates a potential paradigm shift in conducting research, where automation and uncertainty quantification can considerably accelerate scientific discovery. The researchers assert that the machine learning methodology described in this paper can be adapted and applied in and beyond fluid mechanics, to other experimental fields.

Other contributors to the paper include George Karniadakis from Brown University, who is also affiliated with MIT Sea Grant; Gurvan Jodin from ENS Rennes; MIT PhD candidate in mechanical engineering Yu Ma; and Thomas Consi, Luca Bonfiglio, and Lily Keyes from MIT Sea Grant.

This work was supported by DARPA, Fariba Fahroo, and Jan Vandenbrande through an EQUiPS (Enabling Quantification of Uncertainty in Physical Systems) grant, as well as Shell, Subsea 7, and the MIT Sea Grant College Program.

Helping machines perceive some laws of physics

An MIT-invented model demonstrates an understanding of some basic “intuitive physics” by registering “surprise” when objects in simulations move in unexpected ways, such as rolling behind a wall and not reappearing on the other side.
Image: Christine Daniloff, MIT

By Rob Matheson

Humans have an early understanding of the laws of physical reality. Infants, for instance, hold expectations for how objects should move and interact with each other, and will show surprise when they do something unexpected, such as disappearing in a sleight-of-hand magic trick.

Now MIT researchers have designed a model that demonstrates an understanding of some basic “intuitive physics” about how objects should behave. The model could be used to help build smarter artificial intelligence and, in turn, provide information to help scientists understand infant cognition.

The model, called ADEPT, observes objects moving around a scene and makes predictions about how the objects should behave, based on their underlying physics. While tracking the objects, the model outputs a signal at each video frame that correlates to a level of “surprise” — the bigger the signal, the greater the surprise. If an object ever dramatically mismatches the model’s predictions — by, say, vanishing or teleporting across a scene — its surprise levels will spike.

In response to videos showing objects moving in physically plausible and implausible ways, the model registered levels of surprise that matched levels reported by humans who had watched the same videos.  

“By the time infants are 3 months old, they have some notion that objects don’t wink in and out of existence, and can’t move through each other or teleport,” says first author Kevin A. Smith, a research scientist in the Department of Brain and Cognitive Sciences (BCS) and a member of the Center for Brains, Minds, and Machines (CBMM). “We wanted to capture and formalize that knowledge to build infant cognition into artificial-intelligence agents. We’re now getting near human-like in the way models can pick apart basic implausible or plausible scenes.”

Joining Smith on the paper are co-first authors Lingjie Mei, an undergraduate in the Department of Electrical Engineering and Computer Science, and BCS research scientist Shunyu Yao; Jiajun Wu PhD ’19; CBMM investigator Elizabeth Spelke; Joshua B. Tenenbaum, a professor of computational cognitive science, and researcher in CBMM, BCS, and the Computer Science and Artificial Intelligence Laboratory (CSAIL); and CBMM investigator Tomer D. Ullman PhD ’15.

Mismatched realities

ADEPT relies on two modules: an “inverse graphics” module that captures object representations from raw images, and a “physics engine” that predicts the objects’ future representations from a distribution of possibilities.

Inverse graphics basically extracts information about objects — such as shape, pose, and velocity — from pixel inputs. This module captures frames of video as images and uses inverse graphics to extract this information from objects in the scene. But it doesn’t get bogged down in the details. ADEPT requires only some approximate geometry of each shape to function. In part, this helps the model generalize predictions to new objects, not just those it’s trained on.

“It doesn’t matter if an object is a rectangle or a circle, or if it’s a truck or a duck. ADEPT just sees there’s an object with some position, moving in a certain way, to make predictions,” Smith says. “Similarly, young infants also don’t seem to care much about some properties like shape when making physical predictions.”

These coarse object descriptions are fed into a physics engine — software that simulates behavior of physical systems, such as rigid or fluidic bodies, and is commonly used for films, video games, and computer graphics. The researchers’ physics engine “pushes the objects forward in time,” Ullman says. This creates a range of predictions, or a “belief distribution,” for what will happen to those objects in the next frame.

Next, the model observes the actual next frame. Once again, it captures the object representations, which it then aligns to one of the predicted object representations from its belief distribution. If the object obeyed the laws of physics, there won’t be much mismatch between the two representations. On the other hand, if the object did something implausible — say, it vanished from behind a wall — there will be a major mismatch.

ADEPT then resamples from its belief distribution and notes a very low probability that the object had simply vanished. If there’s a low enough probability, the model registers great “surprise” as a signal spike. Basically, surprise is inversely proportional to the probability of an event occurring. If the probability is very low, the signal spike is very high.  

“If an object goes behind a wall, your physics engine maintains a belief that the object is still behind the wall. If the wall goes down, and nothing is there, there’s a mismatch,” Ullman says. “Then, the model says, ‘There’s an object in my prediction, but I see nothing. The only explanation is that it disappeared, so that’s surprising.’”
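
One simple way to formalize that surprise signal, assuming the belief distribution is represented by a set of predicted object positions, is to score the observed position's likelihood under the belief and take its negative logarithm. The positions and noise scale below are made up for illustration; this is not ADEPT's actual probabilistic model.

```python
# Surprise as negative log-probability of the observation under the belief
# distribution: low-probability events (vanishing, teleporting) spike the signal.
import numpy as np

def surprise(predicted_positions, observed_position, noise_std=0.1):
    """Return -log p(observation | belief), averaging over belief samples."""
    pred = np.asarray(predicted_positions, dtype=float)
    obs = np.asarray(observed_position, dtype=float)
    sq_dist = np.sum((pred - obs) ** 2, axis=1)
    dim = pred.shape[1]
    norm = (2 * np.pi * noise_std ** 2) ** (dim / 2)
    likelihoods = np.exp(-sq_dist / (2 * noise_std ** 2)) / norm   # isotropic Gaussian
    p = likelihoods.mean() + 1e-12                # avoid log(0) when nothing matches
    return -np.log(p)

# A ball predicted to emerge from behind a wall near x = 1.0:
rng = np.random.default_rng(0)
belief = [[1.0 + 0.05 * rng.normal(), 0.0] for _ in range(100)]

print(surprise(belief, [1.02, 0.0]))   # small: the ball appears where expected
print(surprise(belief, [5.00, 0.0]))   # large: the ball "teleported", so big surprise
```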

Violation of expectations

In developmental psychology, researchers run “violation of expectations” tests in which infants are shown pairs of videos. One video shows a plausible event, with objects adhering to their expected notions of how the world works. The other video is the same in every way, except the objects behave in a way that violates expectations. Researchers will often use these tests to measure how long the infant looks at a scene after an implausible action has occurred. The longer they stare, researchers hypothesize, the more they may be surprised or interested in what just happened.

For their experiments, the researchers created several scenarios based on classical developmental research to examine the model’s core object knowledge. They employed 60 adults to watch 64 videos of known physically plausible and physically implausible scenarios. Objects, for instance, will move behind a wall and, when the wall drops, they’ll still be there or they’ll be gone. The participants rated their surprise at various moments on an increasing scale of 0 to 100. Then, the researchers showed the same videos to the model. Specifically, the scenarios examined the model’s ability to capture notions of permanence (objects do not appear or disappear for no reason), continuity (objects move along connected trajectories), and solidity (objects cannot move through one another).

ADEPT matched humans particularly well on videos where objects moved behind walls and disappeared when the wall was removed. Interestingly, the model also matched surprise levels on videos that humans weren’t surprised by but maybe should have been. For example, in a video where an object moving at a certain speed disappears behind a wall and immediately comes out the other side, the object might have sped up dramatically when it went behind the wall or it might have teleported to the other side. In general, humans and ADEPT were both less certain about whether that event was or wasn’t surprising. The researchers also found traditional neural networks that learn physics from observations — but don’t explicitly represent objects — are far less accurate at differentiating surprising from unsurprising scenes, and their picks for surprising scenes don’t often align with humans.

Next, the researchers plan to delve further into how infants observe and learn about the world, with aims of incorporating any new findings into their model. Studies, for example, show that infants up until a certain age actually aren’t very surprised when objects completely change in some ways — such as if a truck disappears behind a wall, but reemerges as a duck.

“We want to see what else needs to be built in to understand the world more like infants, and formalize what we know about psychology to build better AI agents,” Smith says.

How to design and control robots with stretchy, flexible bodies

MIT researchers have invented a way to efficiently optimize the control and design of soft robots for target tasks, which has traditionally been a monumental undertaking in computation.

Soft robots have springy, flexible, stretchy bodies that can essentially move an infinite number of ways at any given moment. Computationally, this represents a highly complex “state representation,” which describes how each part of the robot is moving. State representations for soft robots can have potentially millions of dimensions, making it difficult to calculate the optimal way to make a robot complete complex tasks.

At the Conference on Neural Information Processing Systems next month, the MIT researchers will present a model that learns a compact, or “low-dimensional,” yet detailed state representation, based on the underlying physics of the robot and its environment, among other factors. This helps the model iteratively co-optimize movement control and material design parameters catered to specific tasks.

“Soft robots are infinite-dimensional creatures that bend in a billion different ways at any given moment,” says first author Andrew Spielberg, a graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL). “But, in truth, there are natural ways soft objects are likely to bend. We find the natural states of soft robots can be described very compactly in a low-dimensional description. We optimize control and design of soft robots by learning a good description of the likely states.”

In simulations, the model enabled 2D and 3D soft robots to complete tasks — such as moving certain distances or reaching a target spot — more quickly and accurately than current state-of-the-art methods. The researchers next plan to implement the model in real soft robots.

Joining Spielberg on the paper are CSAIL graduate students Allan Zhao, Tao Du, and Yuanming Hu; Daniela Rus, director of CSAIL and the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science; and Wojciech Matusik, an MIT associate professor in electrical engineering and computer science and head of the Computational Fabrication Group.

“Learning-in-the-loop”

Soft robotics is a relatively new field of research, but it holds promise for advanced robotics. For instance, flexible bodies could offer safer interaction with humans, better object manipulation, and more maneuverability, among other benefits.

Control of robots in simulations relies on an “observer,” a program that computes variables that describe how the soft robot is moving to complete a task. In previous work, the researchers decomposed the soft robot into hand-designed clusters of simulated particles. Particles contain important information that helps narrow down the robot’s possible movements. If a robot attempts to bend a certain way, for instance, actuators may resist that movement enough that it can be ignored. But, for such complex robots, manually choosing which clusters to track during simulations can be tricky.

Building off that work, the researchers designed a “learning-in-the-loop optimization” method, where all optimized parameters are learned during a single feedback loop over many simulations. And, at the same time as learning optimization — or “in the loop” — the method also learns the state representation.

The model employs a technique called a material point method (MPM), which simulates the behavior of particles of continuum materials, such as foams and liquids, surrounded by a background grid. In doing so, it captures the particles of the robot and its observable environment into pixels or 3D pixels, known as voxels, without the need for any additional computation.

In a learning phase, this raw particle grid information is fed into a machine-learning component that learns to input an image, compress it to a low-dimensional representation, and decompress the representation back into the input image. If this “autoencoder” retains enough detail while compressing the input image, it can accurately recreate the input image from the compression.
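
As a rough illustration of that step, the sketch below trains a small fully connected autoencoder on rasterized particle grids; the grid resolution, latent size, and architecture are assumptions for illustration, not the paper's actual network.

```python
# Autoencoder that compresses a rasterized particle/voxel grid into a compact
# state vector and reconstructs it; the latent vector serves as the
# low-dimensional state representation.
import torch
import torch.nn as nn

class StateAutoencoder(nn.Module):
    def __init__(self, grid=(1, 32, 32), latent_dim=16):
        super().__init__()
        n = grid[0] * grid[1] * grid[2]
        self.grid = grid
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(n, 256), nn.ReLU(),
                                     nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, n))

    def forward(self, x):                          # x: (batch, 1, 32, 32) occupancy grid
        z = self.encoder(x)                        # compact state representation
        recon = self.decoder(z).view(-1, *self.grid)
        return z, recon

ae = StateAutoencoder()
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
frames = torch.rand(8, 1, 32, 32)                  # a batch of rasterized simulation frames

for _ in range(100):                               # reconstruction training
    z, recon = ae(frames)
    loss = nn.functional.mse_loss(recon, frames)
    opt.zero_grad()
    loss.backward()
    opt.step()

print(z.shape, round(loss.item(), 4))              # torch.Size([8, 16]) and a small loss
```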

In the researchers’ work, the autoencoder’s learned compressed representations serve as the robot’s low-dimensional state representation. In an optimization phase, that compressed representation loops back into the controller, which outputs a calculated actuation for how each particle of the robot should move in the next MPM-simulated step.

Simultaneously, the controller uses that information to adjust the optimal stiffness for each particle to achieve its desired movement. In the future, that material information can be useful for 3D-printing soft robots, where each particle spot may be printed with slightly different stiffness. “This allows for creating robot designs catered to the robot motions that will be relevant to specific tasks,” Spielberg says. “By learning these parameters together, you keep everything as synchronized as possible to make that design process easier.”

Faster optimization

All optimization information is, in turn, fed back into the start of the loop to train the autoencoder. Over many simulations, the controller learns the optimal movement and material design, while the autoencoder learns an increasingly detailed state representation. “The key is we want that low-dimensional state to be very descriptive,” Spielberg says.

After the robot gets to its simulated final state over a set period of time — say, as close as possible to the target destination — it updates a “loss function.” That’s a critical component of machine learning, which tries to minimize some error. In this case, it minimizes, say, how far away the robot stopped from the target. That loss function flows back to the controller, which uses the error signal to tune all the optimized parameters to best complete the task.
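
The overall loop can be caricatured as follows: a controller acting on the compressed state and a per-particle stiffness vector are both updated from a single task loss measuring how far the robot ends up from its target. The toy one-dimensional "simulator" inside the loop is purely illustrative and merely stands in for the differentiable MPM simulation; the gains, sizes, and dynamics are assumptions.

```python
# Co-optimizing control and material design from one task loss: gradients flow
# through the (toy) simulation into both the controller and the stiffness values.
import torch
import torch.nn as nn

latent_dim, n_particles, steps = 16, 50, 40
controller = nn.Sequential(nn.Linear(latent_dim, 64), nn.Tanh(),
                           nn.Linear(64, n_particles))          # actuation per particle
log_stiffness = torch.zeros(n_particles, requires_grad=True)    # material design parameter
opt = torch.optim.Adam(list(controller.parameters()) + [log_stiffness], lr=1e-2)
target = torch.tensor(1.0)                                      # desired final position

for it in range(200):
    pos = torch.zeros(n_particles)                # toy 1-D particle positions
    vel = torch.zeros(n_particles)
    for _ in range(steps):
        z = pos.mean() * torch.ones(latent_dim)   # stand-in for the autoencoder's latent state
        act = controller(z)                       # actuation computed from compressed state
        vel = vel + 0.01 * act * torch.exp(log_stiffness)   # stiffer particles respond more
        pos = pos + 0.01 * vel
    loss = (pos.mean() - target) ** 2             # how far the robot stopped from the target
    opt.zero_grad()
    loss.backward()
    opt.step()

print(round(float(pos.mean()), 3))                # moves toward the target of 1.0
```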

If the researchers tried to directly feed all the raw particles of the simulation into the controller, without the compression step, “running and optimization time would explode,” Spielberg says. Using the compressed representation, the researchers were able to decrease the running time for each optimization iteration from several minutes down to about 10 seconds.

The researchers validated their model on simulations of various 2D and 3D biped and quadruped robots. They also found that, while robots using traditional methods can take up to 30,000 simulations to optimize these parameters, robots trained on their model took only about 400 simulations.

Deploying the model into real soft robots means tackling issues with real-world noise and uncertainty that may decrease the model’s efficiency and accuracy. But, in the future, the researchers hope to design a full pipeline, from simulation to fabrication, for soft robots.

Flexible yet sturdy robot is designed to “grow” like a plant


The new “growing robot” can be programmed to grow, or extend, in different directions, based on the sequence of chain units that are locked and fed out from the “growing tip,” or gearbox.
Image courtesy of researchers, edited by MIT News

In today’s factories and warehouses, it’s not uncommon to see robots whizzing about, shuttling items or tools from one station to another. For the most part, robots navigate pretty easily across open layouts. But they have a much harder time winding through narrow spaces to carry out tasks such as reaching for a product at the back of a cluttered shelf, or snaking around a car’s engine parts to unscrew an oil cap.

Now MIT engineers have developed a robot designed to extend a chain-like appendage flexible enough to twist and turn in any necessary configuration, yet rigid enough to support heavy loads or apply torque to assemble parts in tight spaces. When the task is complete, the robot can retract the appendage and extend it again, at a different length and shape, to suit the next task.

The appendage design is inspired by the way plants grow, which involves the transport of nutrients, in a fluidized form, up to the plant’s tip. There, they are converted into solid material to produce, bit by bit, a supportive stem.

Likewise, the robot consists of a “growing point,” or gearbox, that pulls a loose chain of interlocking blocks into the box. Gears in the box then lock the chain units together and feed the chain out, unit by unit, as a rigid appendage.

The researchers presented the plant-inspired “growing robot” this week at the IEEE International Conference on Intelligent Robots and Systems (IROS) in Macau. They envision that grippers, cameras, and other sensors could be mounted onto the robot’s gearbox, enabling it to meander through an aircraft’s propulsion system and tighten a loose screw, or to reach into a shelf and grab a product without disturbing the organization of surrounding inventory, among other tasks.

“Think about changing the oil in your car,” says Harry Asada, professor of mechanical engineering at MIT. “After you open the engine roof, you have to be flexible enough to make sharp turns, left and right, to get to the oil filter, and then you have to be strong enough to twist the oil filter cap to remove it.”

“Now we have a robot that can potentially accomplish such tasks,” says Tongxi Yan, a former graduate student in Asada’s lab, who led the work. “It can grow, retract, and grow again to a different shape, to adapt to its environment.”

The team also includes MIT graduate student Emily Kamienski and visiting scholar Seiichi Teshigawara, who presented the results at the conference.

The last foot

The design of the new robot is an offshoot of Asada’s work in addressing the “last one-foot problem” — an engineering term referring to the last step, or foot, of a robot’s task or exploratory mission. While a robot may spend most of its time traversing open space, the last foot of its mission may involve more nimble navigation through tighter, more complex spaces to complete a task.

Engineers have devised various concepts and prototypes to address the last one-foot problem, including robots made from soft, balloon-like materials that grow like vines to squeeze through narrow crevices. But Asada says such soft extendable robots aren’t sturdy enough to support “end effectors,” or add-ons such as grippers, cameras, and other sensors that would be necessary in carrying out a task, once the robot has wormed its way to its destination.

“Our solution is not actually soft, but a clever use of rigid materials,” says Asada, who is the Ford Foundation Professor of Engineering.

Chain links

Once the team defined the general functional elements of plant growth, they looked to mimic this in a general sense, in an extendable robot.

“The realization of the robot is totally different from a real plant, but it exhibits the same kind of functionality, at a certain abstract level,” Asada says.

The researchers designed a gearbox to represent the robot’s “growing tip,” akin to the bud of a plant, where, as more nutrients flow up to the site, the tip feeds out more rigid stem. Within the box, they fit a system of gears and motors, which works to pull up a fluidized material — in this case, a bendy sequence of 3-D-printed plastic units interlocked with each other, similar to a bicycle chain.

As the chain is fed into the box, it turns around a winch, which feeds it through a second set of motors programmed to lock certain units in the chain to their neighboring units, creating a rigid appendage as it is fed out of the box.

The researchers can program the robot to lock certain units together while leaving others unlocked, to form specific shapes, or to “grow” in certain directions. In experiments, they were able to program the robot to turn around an obstacle as it extended or grew out from its base.

“It can be locked in different places to be curved in different ways, and have a wide range of motions,” Yan says.

When the chain is locked and rigid, it is strong enough to support a heavy, one-pound weight. If a gripper were attached to the robot’s growing tip, or gearbox, the researchers say the robot could potentially grow long enough to meander through a narrow space, then apply enough torque to loosen a bolt or unscrew a cap.

Auto maintenance is a good example of tasks the robot could assist with, according to Kamienski. “The space under the hood is relatively open, but it’s that last bit where you have to navigate around an engine block or something to get to the oil filter, that a fixed arm wouldn’t be able to navigate around. This robot could do something like that.”

This research was funded, in part, by NSK Ltd.

Predicting people’s driving personalities

In lane-merging scenarios, a system developed at MIT could distinguish between altruistic and egoistic driving behavior.
Image courtesy of the researchers.

Self-driving cars are coming. But for all their fancy sensors and intricate data-crunching abilities, even the most cutting-edge cars lack something that (almost) every 16-year-old with a learner’s permit has: social awareness.


While autonomous technologies have improved substantially, they still ultimately view the drivers around them as obstacles made up of ones and zeros, rather than human beings with specific intentions, motivations, and personalities.

But recently a team led by researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) has been exploring whether self-driving cars can be programmed to classify the social personalities of other drivers, so that they can better predict what different cars will do — and, therefore, be able to drive more safely among them.

In a new paper, the scientists integrated tools from social psychology to classify driving behavior with respect to how selfish or selfless a particular driver is.

Specifically, they used something called social value orientation (SVO), which represents the degree to which someone is selfish (“egoistic”) versus altruistic or cooperative (“prosocial”). The system then estimates drivers’ SVOs to create real-time driving trajectories for self-driving cars.

Testing their algorithm on the tasks of merging lanes and making unprotected left turns, the team showed that they could predict the behavior of other cars 25 percent more accurately. For example, in the left-turn simulations their car knew to wait when the approaching car had a more egoistic driver, and then to make the turn when the other car was more prosocial.

While not yet robust enough to be implemented on real roads, the system could have some intriguing use cases, and not just for the cars that drive themselves. Say you’re a human driving along and a car suddenly enters your blind spot — the system could give you a warning in the rear-view mirror that the car has an aggressive driver, allowing you to adjust accordingly. It could also allow self-driving cars to actually learn to exhibit more human-like behavior that will be easier for human drivers to understand.

“Working with and around humans means figuring out their intentions to better understand their behavior,” says graduate student Wilko Schwarting, who was lead author on the new paper that will be published this week in the latest issue of the Proceedings of the National Academy of Sciences. “People’s tendencies to be collaborative or competitive often spills over into how they behave as drivers. In this paper, we sought to understand if this was something we could actually quantify.”

Schwarting’s co-authors include MIT professors Sertac Karaman and Daniela Rus, as well as research scientist Alyssa Pierson and former CSAIL postdoc Javier Alonso-Mora.

A central issue with today’s self-driving cars is that they’re programmed to assume that all humans act the same way. This means that, among other things, they’re quite conservative in their decision-making at four-way stops and other intersections.

While this caution reduces the chance of fatal accidents, it also creates bottlenecks that can be frustrating for other drivers, not to mention hard for them to understand. (This may be why the majority of traffic incidents involving self-driving cars have come from their getting rear-ended by impatient human drivers.)

“Creating more human-like behavior in autonomous vehicles (AVs) is fundamental for the safety of passengers and surrounding vehicles, since behaving in a predictable manner enables humans to understand and appropriately respond to the AV’s actions,” says Schwarting.

To try to expand the car’s social awareness, the CSAIL team combined methods from social psychology with game theory, a theoretical framework for conceiving social situations among competing players.

The team modeled road scenarios where each driver tried to maximize their own utility and analyzed their “best responses” given the decisions of all other agents. Based on that small snippet of motion from other cars, the team’s algorithm could then predict the surrounding cars’ behavior as cooperative, altruistic, or egoistic — grouping the first two as “prosocial.” People’s scores for these qualities rest on a continuum with respect to how much a person demonstrates care for themselves versus care for others.
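
A common way to parameterize SVO is as an angle that trades a driver's own reward against the reward they grant to others; the sketch below estimates that angle from a few observed choices and then labels the driver prosocial or egoistic. The reward numbers, the candidate-angle grid, and the estimator itself are illustrative, not the values or method reported in the paper.

```python
# Estimate an SVO angle from observed driving choices: pick the angle under
# which the driver's chosen actions look most rational.
import numpy as np

def utility(phi, reward_self, reward_other):
    """Weighted utility: cos(phi) on one's own reward, sin(phi) on others'."""
    return np.cos(phi) * reward_self + np.sin(phi) * reward_other

def estimate_svo(observed_choices, candidate_phis=np.linspace(0, np.pi / 2, 91)):
    scores = []
    for phi in candidate_phis:
        score = 0.0
        for chosen, passed_up in observed_choices:
            # Each observation pairs the (reward_self, reward_other) of the chosen
            # action with the best alternative the driver passed up.
            score += utility(phi, *chosen) - utility(phi, *passed_up)
        scores.append(score)
    return candidate_phis[int(np.argmax(scores))]

# A driver who twice gave up some progress to let another car merge:
yielding = [((0.4, 0.9), (1.0, 0.1)), ((0.5, 0.8), (1.0, 0.2))]
phi = estimate_svo(yielding)
print(round(np.degrees(phi), 1))                      # closer to 90 degrees
print("prosocial" if phi > np.pi / 4 else "egoistic") # prosocial
```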

In the merging and left-turn scenarios, the two outcome options were to either let somebody merge into your lane (“prosocial”) or not (“egoistic”). The team’s results showed that, not surprisingly, merging cars are deemed more competitive than non-merging cars.

The system was trained to try to better understand when it’s appropriate to exhibit different behaviors. For example, even the most deferential of human drivers knows that certain types of actions — like making a lane change in heavy traffic — require a moment of being more assertive and decisive.

For the next phase of the research, the team plans to work to apply their model to pedestrians, bicycles, and other agents in driving environments. In addition, they will be investigating other robotic systems acting among humans, such as household robots, and integrating SVO into their prediction and decision-making algorithms. Pierson says that the ability to estimate SVO distributions directly from observed motion, instead of in laboratory conditions, will be important for fields far beyond autonomous driving.

“By modeling driving personalities and incorporating the models mathematically using the SVO in the decision-making module of a robot car, this work opens the door to safer and more seamless road-sharing between human-driven and robot-driven cars,” says Rus.

The research was supported by the Toyota Research Institute for the MIT team. The Netherlands Organization for Scientific Research provided support for the specific participation of Alonso-Mora.

Two-legged robot mimics human balance while running and jumping

Joao Ramos (center), co-inventor of HERMES (left), and Little HERMES (right)
Photo: Tony Pulsone

By Jennifer Chu

Rescuing victims from a burning building, a chemical spill, or any disaster that is inaccessible to human responders could one day be a mission for resilient, adaptable robots. Imagine, for instance, rescue-bots that can bound through rubble on all fours, then rise up on two legs to push aside a heavy obstacle or break through a locked door.

Engineers are making strides on the design of four-legged robots and their ability to run, jump and even do backflips. But getting two-legged, humanoid robots to exert force or push against something without falling has been a significant stumbling block.

Now engineers at MIT and the University of Illinois at Urbana-Champaign have developed a method to control balance in a two-legged, teleoperated robot — an essential step toward enabling a humanoid to carry out high-impact tasks in challenging environments.

The team’s robot, physically resembling a machined torso and two legs, is controlled remotely by a human operator wearing a vest that transmits information about the human’s motion and ground reaction forces to the robot.

Through the vest, the human operator can both direct the robot’s locomotion and feel the robot’s motions. If the robot is starting to tip over, the human feels a corresponding pull on the vest and can adjust in a way to rebalance both herself and, synchronously, the robot.

In experiments with the robot to test this new “balance feedback” approach, the researchers were able to remotely maintain the robot’s balance as it jumped and walked in place in sync with its human operator.

“It’s like running with a heavy backpack — you can feel how the dynamics of the backpack move around you, and you can compensate properly,” says Joao Ramos, who developed the approach as an MIT postdoc. “Now if you want to open a heavy door, the human can command the robot to throw its body at the door and push it open, without losing balance.”

Ramos, who is now an assistant professor at the University of Illinois at Urbana-Champaign, has detailed the approach in a study appearing today in Science Robotics. His co-author on the study is Sangbae Kim, associate professor of mechanical engineering at MIT.

More than motion

Previously, Kim and Ramos built the two-legged robot HERMES (for Highly Efficient Robotic Mechanisms and Electromechanical System) and developed methods for it to mimic the motions of an operator via teleoperation, an approach that the researchers say comes with certain humanistic advantages.

“Because you have a person who can learn and adapt on the fly, a robot can perform motions that it’s never practiced before [via teleoperation],” Ramos says.

In demonstrations, HERMES has poured coffee into a cup, wielded an ax to chop wood, and handled an extinguisher to put out a fire.

All these tasks have involved the robot’s upper body and algorithms to match the robot’s limb positioning with that of its operator’s. HERMES was able to carry out high-impact motions because the robot  was rooted in place. Balance, in these cases, was much simpler to maintain. If the robot were required to take any steps, however, it would have likely tipped over in attempting to mimic the operator’s motions.

“We realized in order to generate high forces or move heavy objects, just copying motions wouldn’t be enough, because the robot would fall easily,” Kim says. “We needed to copy the operator’s dynamic balance.”

Enter Little HERMES, a miniature version of HERMES that is about a third the size of an average human adult. The team engineered the robot as simply a torso and two legs, and designed the system specifically to test lower-body tasks, such as locomotion and balance. As with its full-body counterpart, Little HERMES is designed for teleoperation, with an operator suited up in a vest to control the robot’s actions.

For the robot to copy the operator’s balance rather than just their motions, the team had to first find a simple way to represent balance. Ramos eventually realized that balance could be stripped down to two main ingredients: a person’s center of mass and their center of pressure — basically, a point on the ground where a force equivalent to all supporting forces is exerted.

The location of the center of mass in relation to the center of pressure, Ramos found, relates directly to how balanced a person is at any given time. He also found that the position of these two ingredients could be physically represented as an inverted  pendulum. Imagine swaying from side to side while staying rooted to the same spot. The effect is similar to the swaying of an upside-down pendulum, the top end representing a human’s center of mass (usually in the torso) and the bottom representing their center of pressure on the ground. 

Heavy lifting

To define how center of mass relates to center of pressure, Ramos gathered human motion data, including measurements in the lab, where he swayed back and forth, walked in place, and jumped on a force plate that measured the forces he exerted on the ground, as the position of his feet and torso were recorded. He then condensed this data into measurements of the center of mass and the center of pressure, and developed a model to represent each in relation to the other, as an inverted pendulum.

He then developed a second model, similar to the model for human balance but scaled to the dimensions of the smaller, lighter robot, and he developed a control algorithm to link and enable feedback between the two models.
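
A minimal sketch of that linkage, assuming both the human and the robot are reduced to a linear inverted pendulum (center of mass balancing above the center of pressure): the human's balance state is scaled down to the robot, the robot shifts its center of pressure to track it, and the remaining mismatch is returned as a force on the operator's vest. The gains, heights, and the simple control law here are assumptions, not the published controller.

```python
# Balance feedback between two inverted-pendulum models: the robot tracks the
# operator's scaled balance state, and the mismatch is fed back through the vest.
G = 9.81

def pendulum_step(x_com, v_com, x_cop, height, dt=0.002):
    """Linear inverted pendulum: the CoM accelerates away from the CoP."""
    a = (G / height) * (x_com - x_cop)
    return x_com + v_com * dt, v_com + a * dt

def scale_to_robot(x_human, human_height, robot_height):
    """Map the human's CoM offset to the smaller robot (simple length scaling)."""
    return x_human * (robot_height / human_height)

def control_tick(human_state, robot_state, human_h=1.0, robot_h=0.35,
                 k_track=1.5, k_feedback=80.0):
    """One teleoperation tick: track the operator's balance, return the vest force."""
    xh, vh, _ = human_state
    xr, vr, _ = robot_state
    x_ref = scale_to_robot(xh, human_h, robot_h)
    # The robot shifts its center of pressure (by modulating foot forces) past its
    # CoM error, so the pendulum is pushed back toward the operator's reference.
    copr = xr + k_track * (xr - x_ref)
    xr, vr = pendulum_step(xr, vr, copr, robot_h)
    vest_force = k_feedback * (xr / robot_h - xh / human_h)   # mismatch felt by operator
    return (xr, vr, copr), vest_force

robot = (0.02, 0.0, 0.0)          # robot CoM starts slightly off-center; operator is centered
for _ in range(5):
    robot, force = control_tick((0.0, 0.0, 0.0), robot)
    print(round(robot[0], 5), round(force, 2))
```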

The researchers tested this balance feedback model, first on a simple inverted pendulum that they built in the lab, in the form of a beam about the same height as Little HERMES. They connected the beam to their teleoperation system, and it swayed back and forth along a track in response to an operator’s movements. As the operator swayed to one side, the beam did likewise — a movement that the operator could also feel through the vest. If the beam swayed too far, the operator, feeling the pull, could lean the other way to compensate, and keep the beam balanced.

The experiments showed that the new feedback model could work to maintain balance on the beam, so the researchers then tried the model on Little HERMES. They also developed an algorithm for the robot to automatically translate the simple model of balance to the forces that each of its feet would have to generate, to copy the operator’s feet.

In the lab, Ramos found that as he wore the vest, he could not only control the robot’s motions and balance, but he also could feel the robot’s movements. When the robot was struck with a hammer from various directions, Ramos felt the vest jerk in the direction the robot moved. Ramos instinctively resisted the tug, which the robot registered as a subtle shift in the center of mass in relation to center of pressure, which it in turn mimicked. The result was that the robot was able to keep from tipping over, even amidst repeated blows to its body.

Little HERMES also mimicked Ramos in other exercises, including running and jumping in place, and walking on uneven ground, all while maintaining its balance without the aid of tethers or supports.

“Balance feedback is a difficult thing to define because it’s something we do without thinking,” Kim says. “This is the first time balance feedback is properly defined for the dynamic actions. This will change how we control a teleoperated humanoid.”

Kim and Ramos will continue to work on developing a full-body humanoid with similar balance control, to one day be able to gallop through a disaster zone and rise up to push away barriers as part of rescue or salvage missions.

“Now we can do heavy door opening or lifting or throwing heavy objects, with proper balance communication,” Kim says.

This research was supported, in part, by Hon Hai Precision Industry Co. Ltd. and Naver Labs Corporation.
