How to control robots with brainwaves and hand gestures

A system developed at MIT allows a human supervisor to correct a robot’s mistakes using gestures and brainwaves.
Photo: Joseph DelPreto/MIT CSAIL
By Adam Conner-Simons

Getting robots to do things isn’t easy: Usually, scientists have to either explicitly program them or get them to understand how humans communicate via language.

But what if we could control robots more intuitively, using just hand gestures and brainwaves?

A new system spearheaded by researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) aims to do exactly that, allowing users to instantly correct robot mistakes with nothing more than brain signals and the flick of a finger.

Building off the team’s past work focused on simple binary-choice activities, the new work expands the scope to multiple-choice tasks, opening up new possibilities for how human workers could manage teams of robots.

By monitoring brain activity, the system can detect in real-time if a person notices an error as a robot does a task. Using an interface that measures muscle activity, the person can then make hand gestures to scroll through and select the correct option for the robot to execute.

The team demonstrated the system on a task in which a robot moves a power drill to one of three possible targets on the body of a mock plane. Importantly, they showed that the system works on people it’s never seen before, meaning that organizations could deploy it in real-world settings without needing to train it on users.

“This work combining EEG and EMG feedback enables natural human-robot interactions for a broader set of applications than we’ve been able to do before using only EEG feedback,” says CSAIL Director Daniela Rus, who supervised the work. “By including muscle feedback, we can use gestures to command the robot spatially, with much more nuance and specificity.”

PhD candidate Joseph DelPreto was lead author on a paper about the project alongside Rus, former CSAIL postdoc Andres F. Salazar-Gomez, former CSAIL research scientist Stephanie Gil, research scholar Ramin M. Hasani, and Boston University Professor Frank H. Guenther. The paper will be presented at the Robotics: Science and Systems (RSS) conference taking place in Pittsburgh next week.

In most previous work, systems could generally only recognize brain signals when people trained themselves to “think” in very specific but arbitrary ways and when the system was trained on such signals. For instance, a human operator might have to look at different light displays that correspond to different robot tasks during a training session.

Not surprisingly, such approaches are difficult for people to handle reliably, especially if they work in fields like construction or navigation that already require intense concentration.

Meanwhile, Rus’ team harnessed the power of brain signals called “error-related potentials” (ErrPs), which researchers have found to naturally occur when people notice mistakes. If there’s an ErrP, the system stops so the user can correct it; if not, it carries on.

“What’s great about this approach is that there’s no need to train users to think in a prescribed way,” says DelPreto. “The machine adapts to you, and not the other way around.”

For the project, the team used “Baxter,” a humanoid robot from Rethink Robotics. With human supervision, the robot went from choosing the correct target 70 percent of the time to more than 97 percent of the time.

To create the system, the team used electroencephalography (EEG) to monitor brain activity and electromyography (EMG) to monitor muscle activity, placing a series of electrodes on the users’ scalps and forearms.

Both metrics have some individual shortcomings: EEG signals are not always reliably detectable, while EMG signals can sometimes be difficult to map to motions that are any more specific than “move left or right.” Merging the two, however, allows for more robust bio-sensing and makes it possible for the system to work on new users without training.
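In rough pseudocode, that hybrid pipeline amounts to a simple supervision loop: watch the EEG for an error signal while the robot acts, and if one appears, pause the robot and let EMG-decoded gestures pick the correct target. The sketch below is a hypothetical illustration only; the classifier functions, gesture names, and window lengths are assumptions rather than the team’s published code.

```python
# Hypothetical sketch of the hybrid EEG/EMG supervision loop (illustrative only).
# classify_errp and decode_gesture stand in for trained classifiers; window
# lengths, gesture names, and the robot/sensor interfaces are assumptions.

def supervise(robot, targets, eeg, emg, classify_errp, decode_gesture):
    """Let a human supervisor veto and correct the robot's target choice."""
    choice = robot.propose_target(targets)        # the robot picks a target on its own
    robot.begin_move(choice)

    while not robot.done():
        window = eeg.read_window(ms=800)          # EEG window time-locked to the robot's action
        if classify_errp(window):                 # error-related potential: the person noticed a mistake
            robot.pause()
            idx = targets.index(choice)
            while True:                           # scroll through the options with hand gestures (EMG)
                gesture = decode_gesture(emg.read_window(ms=300))
                if gesture == "flick_left":
                    idx = (idx - 1) % len(targets)
                elif gesture == "flick_right":
                    idx = (idx + 1) % len(targets)
                elif gesture == "confirm":
                    break
            choice = targets[idx]
            robot.begin_move(choice)              # resume with the corrected target
    return choice
```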

“By looking at both muscle and brain signals, we can start to pick up on a person’s natural gestures along with their snap decisions about whether something is going wrong,” says DelPreto. “This helps make communicating with a robot more like communicating with another person.”

The team says that they could imagine the system one day being useful for the elderly, or workers with language disorders or limited mobility.

“We’d like to move away from a world where people have to adapt to the constraints of machines,” says Rus. “Approaches like this show that it’s very much possible to develop robotic systems that are a more natural and intuitive extension of us.”

Teaching robots how to move objects

Doctoral student Maria Bauza has been exploring the notion of uncertainty when robots pick up, grasp, or push an object. “If the robot could touch the object, have a notion of tactile information, and be able to react to that information, it will have much more success,” she says.
Photo: Tony Pulsone

By Mary Beth O’Leary
With the push of a button, months of hard work were about to be put to the test. Sixteen teams of engineers convened in a cavernous exhibit hall in Nagoya, Japan, for the 2017 Amazon Robotics Challenge. The robotic systems they built were tasked with removing items from bins and placing them into boxes. For graduate student Maria Bauza, who served as task-planning lead for the MIT-Princeton Team, the moment was particularly nerve-wracking.

“It was super stressful when the competition started,” recalls Bauza. “You just press play and the robot is autonomous. It’s going to do whatever you code it for, but you have no control. If something is broken, then that’s it.”

Robotics has been a major focus for Bauza since her undergraduate career. She studied mathematics and engineering physics at the Polytechnic University of Catalonia in Barcelona. During a year as a visiting student at MIT, Bauza was able to put her interest in computer science and artificial intelligence into practice. “When I came to MIT for that year, I started applying the tools I had learned in machine learning to real problems in robotics,” she adds.

Two creative undergraduate projects gave her even more practice in this area. In one project, she hacked the controller of a toy remote-control car to make it drive in a straight line. In another, she developed a portable robot that could draw on the blackboard for teachers. The robot was given an image of the Mona Lisa and, after running it through an algorithm, drew that image on the blackboard. “That was the first small success in my robotics career,” says Bauza.

After graduating with her bachelor’s degree in 2016, she joined the Manipulation and Mechanisms Laboratory at MIT (known as MCube Lab) under Assistant Professor Alberto Rodriguez’s guidance. “Maria brings together experience in machine learning and a strong background in mathematics, computer science, and mechanics, which makes her a great candidate to grow into a leader in the fields of machine learning and robotics,” says Rodriguez.

For her PhD thesis, Bauza is developing machine-learning algorithms and software to improve how robots interact with the world. MCube’s multidisciplinary team provides the support needed to pursue this goal.

“In the end, machine learning can’t work if you don’t have good data,” Bauza explains. “Good data comes from good hardware, good sensors, good cameras — so in MCube we all collaborate to make sure the systems we build are powerful enough to be autonomous.”

To create these robust autonomous systems, Bauza has been exploring the notion of uncertainty when robots pick up, grasp, or push an object. “If the robot could touch the object, have a notion of tactile information, and be able to react to that information, it will have much more success,” explains Bauza.

Improving how robots interact with the world and reason to find the best possible outcome was crucial to the Amazon Robotics Challenge. Bauza built the code that helped the MIT-Princeton Team robot understand what object it was interacting with, and where to place that object. “Maria was in charge of developing the software for high-level decision making,” explains Rodriguez. “She did it without having prior experience in big robotic systems and it worked out fantastic.”

Bauza’s mind was at ease within a few minutes of the 2017 Amazon Robotics Challenge. “After a few objects that you do well, you start to relax,” she remembers. “You realize the system is working. By the end it was such a good feeling!”

Bauza and the rest of the MCube team walked away with first place in the “stow task” portion of the challenge. They will continue to work with Amazon on perfecting the technology they developed.

While Bauza tackles the challenge of developing software to help robots interact with their environments, she has her own personal challenge to tackle: surviving winter in Boston. “I’m from the island of Menorca off the coast of Spain, so Boston winters have definitely been an adjustment,” she adds. “Every year I buy warmer clothes. But I’m really lucky to be here and be able to collaborate with Professor Rodriguez and the MCube team on developing smart robots that interact with their environment.”

Magnetic 3-D-printed structures crawl, roll, jump, and play catch

MIT engineers have created soft, 3-D-printed structures whose movements can be controlled with a wave of a magnet, much like marionettes without the strings.
Photo: Felice Frankel

By Jennifer Chu
MIT engineers have created soft, 3-D-printed structures whose movements can be controlled with a wave of a magnet, much like marionettes without the strings.

The menagerie of structures that can be magnetically manipulated includes a smooth ring that wrinkles up, a long tube that squeezes shut, a sheet that folds itself, and a spider-like “grabber” that can crawl, roll, jump, and snap together fast enough to catch a passing ball. It can even be directed to wrap itself around a small pill and carry it across a table.

The researchers fabricated each structure from a new type of 3-D-printable ink that they infused with tiny magnetic particles. They fitted an electromagnet around the nozzle of a 3-D printer, which caused the magnetic particles to swing into a single orientation as the ink was fed through the nozzle. By controlling the magnetic orientation of individual sections in the structure, the researchers can produce structures and devices that can almost instantaneously shift into intricate formations, and even move about, as the various sections respond to an external magnetic field.

Xuanhe Zhao, the Noyce Career Development Professor in MIT’s Department of Mechanical Engineering and Department of Civil and Environmental Engineering, says the group’s technique may be used to fabricate magnetically controlled biomedical devices.

“We think in biomedicine this technique will find promising applications,” Zhao says. “For example, we could put a structure around a blood vessel to control the pumping of blood, or use a magnet to guide a device through the GI tract to take images, extract tissue samples, clear a blockage, or deliver certain drugs to a specific location. You can design, simulate, and then just print to achieve various functions.”

Zhao and his colleagues have published their results today in the journal Nature. His co-authors include Yoonho Kim, Hyunwoo Yuk, and Ruike Zhao of MIT, and Shawn Chester of the New Jersey Institute of Technology.

A shifting field

The team’s magnetically activated structures fall under the general category of soft actuated devices — squishy, moldable materials that are designed to shape-shift or move about through a variety of mechanical means. For instance, hydrogel devices swell when temperature or pH changes; shape-memory polymers and liquid crystal elastomers deform with sufficient stimuli such as heat or light; pneumatic and hydraulic devices can be actuated by air or water pumped into them; and dielectric elastomers stretch under electric voltages.

But hydrogels, shape-memory polymers, and liquid crystal elastomers are slow to respond, and change shape over the course of minutes to hours. Air- and water-driven devices require tubes that connect them to pumps, making them inefficient for remotely controlled applications. Dielectric elastomers require high voltages, usually above a thousand volts.

“There is no ideal candidate for a soft robot that can perform in an enclosed space like a human body, where you’d want to carry out certain tasks untethered,” Kim says. “That’s why we think there’s great promise in this idea of magnetic actuation, because it is fast, forceful, body-benign, and can be remotely controlled.”

Other groups have fabricated magnetically activated materials, though the movements they have achieved have been relatively simple. For the most part, researchers mix a polymer solution with magnetic beads, and pour the mixture into a mold. Once the material cures, they apply a magnetic field to uniformly magnetize the beads, before removing the structure from the mold.

“People have only made structures that elongate, shrink, or bend,” Yuk says. “The challenge is, how do you design a structure or robot that can perform much more complicated tasks?”

Domain game

Instead of making structures with magnetic particles of the same, uniform orientation, the team looked for ways to create magnetic “domains” — individual sections of a structure, each with a distinct orientation of magnetic particles. When exposed to an external magnetic field, each section should move in a distinct way, depending on the direction its particles move in response to the field. In this way, the group surmised, structures should be able to carry out more complex articulations and movements.

With their new 3-D-printing platform, the researchers can print sections, or domains, of a structure, and tune the orientation of magnetic particles in a particular domain by changing the direction of the electromagnet encircling the printer’s nozzle, as the domain is printed.  

The team also developed a physical model that predicts how a printed structure will deform under a magnetic field. Given the elasticity of the printed material, the pattern of domains in a structure, and the way in which an external magnetic field is applied, the model can predict the way an overall structure will deform or move. Ruike found that the model’s predictions closely matched with experiments the team carried out with a number of different printed structures.
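The key quantity behind those predictions is the magnetic body torque each printed domain feels when the applied field is misaligned with its magnetization. The snippet below illustrates just that piece of the physics; the magnetization and field values are made-up examples, and the team’s actual model goes further by coupling these torques to the elasticity of the printed material.

```python
import numpy as np

def domain_torques(magnetizations, b_field):
    """Magnetic body torque (per unit volume) for each printed domain: tau = M x B.

    magnetizations: (n, 3) array, one magnetization vector per domain (A/m)
    b_field:        (3,) applied magnetic flux density (T)
    A domain whose magnetization is misaligned with the field feels a torque that
    tries to rotate it into alignment; the printed pattern of domains turns those
    local torques into a programmed overall shape change.
    """
    M = np.asarray(magnetizations, dtype=float)
    B = np.asarray(b_field, dtype=float)
    return np.cross(M, B)   # (n, 3) torque densities, N*m per m^3

# Example: two domains magnetized in opposite directions bend opposite ways
# under the same vertical field (all values are illustrative only).
print(domain_torques([[1e5, 0, 0], [-1e5, 0, 0]], [0, 0, 0.05]))
# [[    0. -5000.     0.]
#  [    0.  5000.     0.]]
```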

In addition to a rippling ring, a self-squeezing tube, and a spider-like grabber, the team printed other complex structures, such as a set of “auxetic” structures that rapidly shrink or expand along two directions. Zhao and his colleagues also printed a ring embedded with electrical circuits and red and green LED lights. Depending on the orientation of an external magnetic field, the ring deforms to light up either red or green, in a programmed manner.

“We have developed a printing platform and a predictive model for others to use. People can design their own structures and domain patterns, validate them with the model, and print them to actuate various functions,” Zhao says. “By programming complex information of structure, domain, and magnetic field, one can even print intelligent machines such as robots.”

Jerry Qi, professor of mechanical engineering at Georgia Tech, says the group’s design can enable a range of fast, remotely controlled soft robotics, particularly in the biomedical field.

“This work is very novel,” says Qi, who was not involved in the research. “One could use a soft robot inside a human body or somewhere that is not easily accessible. With this technology reported in this paper, one can apply a magnetic field outside the human body, without using any wiring. Because of its fast responsive speed, the soft robot can fulfill many actions in a short time. These are important for practical applications.”

This research was supported, in part, by the National Science Foundation, the Office of Naval Research, and the MIT Institute for Soldier Nanotechnologies.

Teaching chores to an artificial agent

A system developed at MIT aims to teach artificial agents a range of chores, including setting the table and making coffee.
Image: MIT CSAIL

By Adam Conner-Simons | Rachel Gordon

For many people, household chores are a dreaded, inescapable part of life that we often put off or do with little care. But what if a robot assistant could help lighten the load?

Recently, computer scientists have been working on teaching machines to do a wider range of tasks around the house. In a new paper spearheaded by MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the University of Toronto, researchers demonstrate “VirtualHome,” a system that can simulate detailed household tasks and then have artificial “agents” execute them, opening up the possibility of one day teaching robots to do such tasks.

The team trained the system using nearly 3,000 programs of various activities, which are further broken down into subtasks for the computer to understand. A simple task like “making coffee,” for example, would also include the step “grabbing a cup.” The researchers demonstrated VirtualHome in a 3-D world inspired by the Sims video game.

The team’s artificial agent can execute 1,000 of these interactions in the Sims-style world, with eight different scenes including a living room, kitchen, dining room, bedroom, and home office.

“Describing actions as computer programs has the advantage of providing clear and unambiguous descriptions of all the steps needed to complete a task,” says MIT PhD student Xavier Puig, who was lead author on the paper. “These programs can instruct a robot or a virtual character, and can also be used as a representation for complex tasks with simpler actions.”

The project was co-developed by CSAIL and the University of Toronto alongside researchers from McGill University and the University of Ljubljana. It will be presented at the Computer Vision and Pattern Recognition (CVPR) conference, which takes place this month in Salt Lake City.

Unlike humans, robots need more explicit instructions to complete easy tasks; they can’t just infer and reason with ease.

For example, one might tell a human to “switch on the TV and watch it from the sofa.” Here, actions like “grab the remote control” and “sit/lie on sofa” have been omitted, since they’re part of the commonsense knowledge that humans have.

To better demonstrate these kinds of tasks to robots, the descriptions for actions needed to be much more detailed. To do so, the team first collected verbal descriptions of household activities, and then translated them into simple code. A program like this might include steps like: walk to the television, switch on the television, walk to the sofa, sit on the sofa, and watch television.
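As a concrete illustration, the “watch TV” activity might be written out as the kind of short, explicit program described above. The snippet below is a simplified, hypothetical rendering of the idea; the actual VirtualHome program format differs.

```python
# Simplified, hypothetical rendering of a "watch TV" program as explicit steps.
# The actual VirtualHome program format differs; this only illustrates spelling
# out every commonsense step that an agent must be told.
watch_tv = [
    ("walk",      "television"),
    ("switch_on", "television"),
    ("walk",      "sofa"),
    ("sit",       "sofa"),
    ("watch",     "television"),
]

def execute(agent, program):
    """Feed each (action, object) step to a simulated agent in order."""
    for action, obj in program:
        agent.perform(action, obj)   # the simulator resolves the concrete motion
```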

Once the programs were created, the team fed them to the VirtualHome 3-D simulator to be turned into videos. Then, a virtual agent would execute the tasks defined by the programs, whether it was watching television, placing a pot on the stove, or turning a toaster on and off.

The end result is not just a system for training robots to do chores, but also a large database of household tasks described using natural language. Companies like Amazon that are working to develop Alexa-like robotic systems at home could eventually use data like these to train their models to do more complex tasks.

The team’s model successfully demonstrated that their agents could learn to reconstruct a program, and therefore perform a task, given either a description, such as “pour milk into glass,” or a video demonstration of the activity.

“This line of work could facilitate true robotic personal assistants in the future,” says Qiao Wang, a research assistant in arts, media, and engineering at Arizona State University. “Instead of each task programmed by the manufacturer, the robot can learn tasks just by listening to or watching the specific person it accompanies. This allows the robot to do tasks in a personalized way, or even some day invoke an emotional connection as a result of this personalized learning process.”

In the future, the team hopes to train the robots using actual videos instead of Sims-style simulation videos, which would enable a robot to learn simply by watching a YouTube video. The team is also working on implementing a reward-learning system in which the agent gets positive feedback when it does tasks correctly.

“You can imagine a setting where robots are assisting with chores at home and can eventually anticipate personalized wants and needs, or impending action,” says Puig. “This could be especially helpful as an assistive technology for the elderly, or those who may have limited mobility.”

Surgical technique improves sensation, control of prosthetic limb

Two agonist-antagonist myoneural interface devices (AMIs) were surgically created in the patient’s residual limb: One was electrically linked to the robotic ankle joint, and the other to the robotic subtalar joint.
Image: MIT Media Lab/Biomechatronics group. Original artwork by Stephanie Ku.

By Helen Knight

Humans can accurately sense the position, speed, and torque of their limbs, even with their eyes shut. This sense, known as proprioception, allows humans to precisely control their body movements.

Despite significant improvements to prosthetic devices in recent years, researchers have been unable to provide this essential sensation to people with artificial limbs, limiting their ability to accurately control their movements.

Researchers at the Center for Extreme Bionics at the MIT Media Lab have invented a new neural interface and communication paradigm that is able to send movement commands from the central nervous system to a robotic prosthesis, and relay proprioceptive feedback describing movement of the joint back to the central nervous system in return.

This new paradigm, known as the agonist-antagonist myoneural interface (AMI), involves a novel surgical approach to limb amputation in which dynamic muscle relationships are preserved within the amputated limb. The AMI was validated in extensive preclinical experimentation at MIT prior to its first surgical implementation in a human patient at Brigham and Women’s Faulkner Hospital.

In a paper published today in Science Translational Medicine, the researchers describe the first human implementation of the AMI in a person with below-knee amputation.

The paper represents the first time information on joint position, speed, and torque has been fed from a prosthetic limb into the nervous system, according to senior author and project director Hugh Herr, a professor of media arts and sciences at the MIT Media Lab.

“Our goal is to close the loop between the peripheral nervous system’s muscles and nerves, and the bionic appendage,” says Herr.

To do this, the researchers used the same biological sensors that create the body’s natural proprioceptive sensations.

The AMI consists of two opposing muscle-tendons, known as an agonist and an antagonist, which are surgically connected in series so that when one muscle contracts and shortens — upon either volitional or electrical activation — the other stretches, and vice versa.

This coupled movement enables natural biological sensors within the muscle-tendon to transmit electrical signals to the central nervous system, communicating muscle length, speed, and force information, which is interpreted by the brain as natural joint proprioception. 

This is how muscle-tendon proprioception works naturally in human joints, Herr says.

“Because the muscles have a natural nerve supply, when this agonist-antagonist muscle movement occurs information is sent through the nerve to the brain, enabling the person to feel those muscles moving, both their position, speed, and load,” he says.

By connecting the AMI with electrodes, the researchers can detect electrical pulses from the muscle, or apply electricity to the muscle to cause it to contract.
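In control terms, those electrodes close a loop in both directions: recorded muscle activity drives the prosthetic joint, and the joint’s measured state is pushed back to the body by electrically stimulating the paired muscle. The sketch below is a deliberately simplified, hypothetical version of one tick of that loop; the mapping from EMG to torque and from joint angle to stimulation is an assumption made for clarity, and the actual signal processing and calibration are far more involved.

```python
# Hypothetical, highly simplified sketch of one AMI control tick (illustrative only).
# The EMG-to-torque and angle-to-stimulation mappings are assumptions for clarity,
# not the published control strategy.

def ami_control_step(agonist_emg, antagonist_emg, ankle, stimulator, gain=1.0):
    """Efferent path: EMG in -> joint torque out. Afferent path: joint angle -> stimulation."""
    # Efferent: the difference between agonist and antagonist activation sets the
    # commanded ankle torque, so co-contracting both muscles stiffens the joint
    # rather than moving it.
    torque_cmd = gain * (agonist_emg - antagonist_emg)
    ankle.apply_torque(torque_cmd)

    # Afferent: stimulate the coupled muscle in proportion to the measured joint
    # angle, so the stretch of its partner is picked up by natural muscle receptors
    # and reported to the brain as the position of the bionic joint.
    stimulator.set_level(ankle.angle())
    return torque_cmd
```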

“When a person is thinking about moving their phantom ankle, the AMI that maps to that bionic ankle is moving back and forth, sending signals through the nerves to the brain, enabling the person with an amputation to actually feel their bionic ankle moving throughout the whole angular range,” Herr says.

Decoding the electrical language of proprioception within nerves is extremely difficult, according to Tyler Clites, first author of the paper and graduate student lead on the project.

“Using this approach, rather than needing to speak that electrical language ourselves, we use these biological sensors to speak the language for us,” Clites says. “These sensors translate mechanical stretch into electrical signals that can be interpreted by the brain as sensations of position, speed, and force.”

The AMI was first implemented surgically in a human patient at Brigham and Women’s Faulkner Hospital, Boston, by Matthew Carty, one of the paper’s authors, a surgeon in the Division of Plastic and Reconstructive Surgery, and an MIT research scientist.

In this operation, two AMIs were constructed in the residual limb at the time of primary below-knee amputation, with one AMI to control the prosthetic ankle joint, and the other to control the prosthetic subtalar joint.

“We knew that in order for us to validate the success of this new approach to amputation, we would need to couple the procedure with a novel prosthesis that could take advantage of the additional capabilities of this new type of residual limb,” Carty says. “Collaboration was critical, as the design of the procedure informed the design of the robotic limb, and vice versa.”

Toward this end, an advanced prosthetic limb was built at MIT and electrically linked to the patient’s peripheral nervous system using electrodes placed over each AMI muscle following the amputation surgery.

The researchers then compared the movement of the AMI patient with that of four people who had undergone a traditional below-knee amputation procedure, using the same advanced prosthetic limb.

They found that the AMI patient had more stable control over movement of the prosthetic device and was able to move more efficiently than those with the conventional amputation. They also found that the AMI patient quickly displayed natural, reflexive behaviors such as extending the toes toward the next step when walking down a set of stairs.

These behaviors are essential to natural human movement and were absent in all of the people who had undergone a traditional amputation.

What’s more, while the patients with conventional amputation reported feeling disconnected from the prosthesis, the AMI patient quickly described feeling that the bionic ankle and foot had become a part of their own body.

“This is pretty significant evidence that the brain and the spinal cord in this patient adopted the prosthetic leg as if it were their biological limb, enabling those biological pathways to become active once again,” Clites says. “We believe proprioception is fundamental to that adoption.”

It is difficult for an individual with a lower limb amputation to gain a sense of embodiment with their artificial limb, according to Daniel Ferris, the Robert W. Adenbaum Professor of Engineering Innovation at the University of Florida, who was not involved in the research.

“This is groundbreaking. The increased sense of embodiment by the amputee subject is a powerful result of having better control of and feedback from the bionic limb,” Ferris says. “I expect that we will see individuals with traumatic amputations start to seek out this type of surgery and interface for their prostheses — it could provide a much greater quality of life for amputees.”

The researchers have since carried out the AMI procedure on nine other below-knee amputees and are planning to adapt the technique for those needing above-knee, below-elbow, and above-elbow amputations.

“Previously, humans have used technology in a tool-like fashion,” Herr says. “We are now starting to see a new era of human-device interaction, of full neurological embodiment, in which what we design becomes truly part of us, part of our identity.”

Fleet of autonomous boats could service some cities, reducing road traffic

Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Senseable City Lab have designed a fleet of autonomous boats that offer high maneuverability and precise control.
Courtesy of the researchers
By Rob Matheson

The future of transportation in waterway-rich cities such as Amsterdam, Bangkok, and Venice — where canals run alongside and under bustling streets and bridges — may include autonomous boats that ferry goods and people, helping clear up road congestion.

Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Senseable City Lab in the Department of Urban Studies and Planning (DUSP) have taken a step toward that future by designing a fleet of autonomous boats that offer high maneuverability and precise control. The boats can also be rapidly 3-D printed using a low-cost printer, making mass manufacturing more feasible.

The boats could be used to taxi people around and to deliver goods, easing street traffic. In the future, the researchers also envision the driverless boats being adapted to perform city services overnight, instead of during busy daylight hours, further reducing congestion on both roads and canals.

“Imagine shifting some of the infrastructure services that usually take place during the day on the road — deliveries, garbage management, waste management — to the middle of the night, on the water, using a fleet of autonomous boats,” says CSAIL Director Daniela Rus, co-author on a paper describing the technology that’s being presented at this week’s IEEE International Conference on Robotics and Automation.

Moreover, the boats — rectangular 4-by-2-meter hulls equipped with sensors, microcontrollers, GPS modules, and other hardware — could be programmed to self-assemble into floating bridges, concert stages, platforms for food markets, and other structures in a matter of hours. “Again, some of the activities that are usually taking place on land, and that cause disturbance in how the city moves, can be done on a temporary basis on the water,” says Rus, who is the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science.

The boats could also be equipped with environmental sensors to monitor a city’s waters and gain insight into urban and human health.

Co-authors on the paper are: first author Wei Wang, a joint postdoc in CSAIL and the Senseable City Lab; Luis A. Mateos and Shinkyu Park, both DUSP postdocs; Pietro Leoni, a research fellow, and Fábio Duarte, a research scientist, both in DUSP and the Senseable City Lab; Banti Gheneti, a graduate student in the Department of Electrical Engineering and Computer Science; and Carlo Ratti, a principal investigator and professor of the practice in DUSP and director of the MIT Senseable City Lab.

Better design and control

The work was conducted as part of the “Roboat” project, a collaboration between the MIT Senseable City Lab and the Amsterdam Institute for Advanced Metropolitan Solutions (AMS). In 2016, as part of the project, the researchers tested a prototype that cruised around the city’s canals, moving forward, backward, and laterally along a preprogrammed path.

The ICRA paper details several important new innovations: a rapid fabrication technique, a more efficient and agile design, and advanced trajectory-tracking algorithms that improve control, precision docking and latching, and other tasks. 

To make the boats, the researchers 3-D-printed a rectangular hull with a commercial printer, producing 16 separate sections that were spliced together. Printing took around 60 hours. The completed hull was then sealed by adhering several layers of fiberglass.

Integrated onto the hull are a power supply, Wi-Fi antenna, GPS, and a minicomputer and microcontroller. For precise positioning, the researchers incorporated an indoor ultrasound beacon system and outdoor real-time kinematic GPS modules, which allow for centimeter-level localization, as well as an inertial measurement unit (IMU) module that monitors the boat’s yaw and angular velocity, among other metrics.

The boat is rectangular, instead of the traditional kayak or catamaran shape, to allow the vessel to move sideways and to attach itself to other boats when assembling larger structures. Another simple yet effective design element was thruster placement. Four thrusters are positioned in the center of each side, instead of at the four corners, generating forward and backward forces. This makes the boat more agile and efficient, the researchers say.

The team also developed a method that enables the boat to track its position and orientation more quickly and accurately. To do so, they developed an efficient version of a nonlinear model predictive control (NMPC) algorithm, generally used to control and navigate robots within various constraints.

NMPC and similar algorithms have been used to control autonomous boats before, but typically they are tested only in simulation or don't account for the boat's dynamics. The researchers instead incorporated into the algorithm simplified nonlinear mathematical models that capture a few known parameters, such as the boat's drag, centrifugal and Coriolis forces, and the added mass from accelerating or decelerating in water. They also used an identification algorithm that estimates any remaining unknown parameters as the boat is trained on a path.

Finally, the researchers used an efficient predictive-control platform to run their algorithm, which can rapidly determine upcoming actions and increases the algorithm’s speed by two orders of magnitude over similar systems. While other algorithms execute in about 100 milliseconds, the researchers’ algorithm takes less than 1 millisecond.

Testing the waters

To demonstrate the control algorithm’s efficacy, the researchers deployed a smaller prototype of the boat along preplanned paths in a swimming pool and in the Charles River. Over the course of 10 test runs, the researchers observed average tracking errors — in positioning and orientation — smaller than tracking errors of traditional control algorithms.

That accuracy is thanks, in part, to the boat’s onboard GPS and IMU modules, which determine position and direction, respectively, down to the centimeter. The NMPC algorithm crunches the data from those modules and weighs various metrics to steer the boat true. The algorithm is implemented in a controller computer and regulates each thruster individually, updating every 0.2 seconds.

“The controller considers the boat dynamics, current state of the boat, thrust constraints, and reference position for the coming several seconds, to optimize how the boat drives on the path,” Wang says. “We can then find optimal force for the thrusters that can take the boat back to the path and minimize errors.”
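To make Wang's description concrete, here is a minimal sketch of what a receding-horizon thruster optimization can look like for a simple planar boat with linear drag. The mass, drag, thrust limits, helper names, and generic solver below are illustrative assumptions for exposition only; the team's actual NMPC formulation includes richer dynamics (yaw, Coriolis and centrifugal terms, added mass) and a much faster custom solver.

```python
# Hypothetical sketch of receding-horizon thrust optimization for a planar boat.
# Illustrative only: the Roboat NMPC uses richer dynamics and runs in under a
# millisecond, unlike a generic solver like the one used here.
import numpy as np
from scipy.optimize import minimize

DT = 0.2          # control update period, s (matches the 0.2 s update in the article)
HORIZON = 10      # steps to look ahead (~2 s)
MASS = 30.0       # assumed boat mass, kg
DRAG = 6.0        # assumed linear drag coefficient, N*s/m

def simulate(state, thrusts):
    """Roll the simplified dynamics forward over the horizon.
    state = [x, y, vx, vy]; thrusts = flattened (HORIZON, 2) array of forces in N."""
    positions = []
    x, y, vx, vy = state
    for fx, fy in thrusts.reshape(HORIZON, 2):
        ax = (fx - DRAG * vx) / MASS
        ay = (fy - DRAG * vy) / MASS
        vx, vy = vx + ax * DT, vy + ay * DT
        x, y = x + vx * DT, y + vy * DT
        positions.append((x, y))
    return np.array(positions)

def nmpc_step(state, reference):
    """Choose thruster forces that keep the boat on the reference path.
    reference: (HORIZON, 2) array of desired positions over the coming seconds."""
    def cost(thrusts):
        pred = simulate(state, thrusts)
        tracking = np.sum((pred - reference) ** 2)   # stay on the path
        effort = 1e-4 * np.sum(thrusts ** 2)         # penalize thrust use
        return tracking + effort

    bounds = [(-25.0, 25.0)] * (HORIZON * 2)         # assumed thrust limits per axis, N
    res = minimize(cost, np.zeros(HORIZON * 2), bounds=bounds, method="L-BFGS-B")
    return res.x[:2]   # apply only the first step, then re-plan (receding horizon)

# Example: boat at rest at the origin, asked to follow a straight line along x.
ref = np.column_stack([np.linspace(0.1, 1.0, HORIZON), np.zeros(HORIZON)])
print(nmpc_step(np.array([0.0, 0.0, 0.0, 0.0]), ref))
```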

The innovations in design and fabrication, along with faster and more precise control algorithms, point toward feasible driverless boats that could be used for transportation, docking, and self-assembly into platforms, the researchers say.

A next step for the work is developing adaptive controllers to account for changes in mass and drag of the boat when transporting people and goods. The researchers are also refining the controller to account for wave disturbances and stronger currents.

“We actually found that the Charles River has much more current than in the canals in Amsterdam,” Wang says. “But there will be a lot of boats moving around, and big boats will bring big currents, so we still have to consider this.”

The work was supported by a grant from AMS.

Making driverless cars change lanes more like human drivers do

At the International Conference on Robotics and Automation tomorrow, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) will present a new lane-change algorithm.
By Larry Hardesty

In the field of self-driving cars, algorithms for controlling lane changes are an important topic of study. But most existing lane-change algorithms have one of two drawbacks: Either they rely on detailed statistical models of the driving environment, which are difficult to assemble and too complex to analyze on the fly; or they’re so simple that they can lead to impractically conservative decisions, such as never changing lanes at all.

At the International Conference on Robotics and Automation tomorrow, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) will present a new lane-change algorithm that splits the difference. It allows for more aggressive lane changes than the simple models do but relies only on immediate information about other vehicles’ directions and velocities to make decisions.

“The motivation is, ‘What can we do with as little information as possible?’” says Alyssa Pierson, a postdoc at CSAIL and first author on the new paper. “How can we have an autonomous vehicle behave as a human driver might behave? What is the minimum amount of information the car needs to elicit that human-like behavior?”

Pierson is joined on the paper by Daniela Rus, the Viterbi Professor of Electrical Engineering and Computer Science; Sertac Karaman, associate professor of aeronautics and astronautics; and Wilko Schwarting, a graduate student in electrical engineering and computer science.

“The optimization solution will ensure navigation with lane changes that can model an entire range of driving styles, from conservative to aggressive, with safety guarantees,” says Rus, who is the director of CSAIL.

One standard way for autonomous vehicles to avoid collisions is to calculate buffer zones around the other vehicles in the environment. The buffer zones describe not only the vehicles’ current positions but their likely future positions within some time frame. Planning lane changes then becomes a matter of simply staying out of other vehicles’ buffer zones.

For any given method of computing buffer zones, algorithm designers must prove that it guarantees collision avoidance, within the context of the mathematical model used to describe traffic patterns. That proof can be complex, so the optimal buffer zones are usually computed in advance. During operation, the autonomous vehicle then calls up the precomputed buffer zones that correspond to its situation.

The problem is that if traffic is fast enough and dense enough, precomputed buffer zones may be too restrictive. An autonomous vehicle will fail to change lanes at all, whereas a human driver would cheerfully zip around the roadway.

With the MIT researchers’ system, if the default buffer zones are leading to performance that’s far worse than a human driver’s, the system will compute new buffer zones on the fly — complete with proof of collision avoidance.

That approach depends on a mathematically efficient method of describing buffer zones, so that the collision-avoidance proof can be executed quickly. And that’s what the MIT researchers developed.

They begin with a so-called Gaussian distribution — the familiar bell-curve probability distribution. That distribution represents the current position of the car, factoring in both its length and the uncertainty of its location estimation.

Then, based on estimates of the car’s direction and velocity, the researchers’ system constructs a so-called logistic function. Multiplying the logistic function by the Gaussian distribution skews the distribution in the direction of the car’s movement, with higher speeds increasing the skew.

The skewed distribution defines the vehicle’s new buffer zone. But its mathematical description is so simple — using only a few equation variables — that the system can evaluate it on the fly.
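As a rough illustration of this construction, the sketch below multiplies a Gaussian centered on the vehicle by a logistic function aligned with its heading, so the resulting buffer density stretches in the direction of travel, and stretches more at higher speed. The constants, scaling, and function names are illustrative assumptions, not the values or implementation from the MIT paper.

```python
# Hypothetical sketch of a velocity-skewed buffer zone: a Gaussian over a vehicle's
# position multiplied by a logistic function aligned with its direction of travel.
import numpy as np

def buffer_density(points, center, heading, speed, sigma=2.0, skew_gain=0.5):
    """Evaluate an (unnormalized) buffer-zone density at an array of 2-D points.

    points:  (N, 2) query positions
    center:  (2,) current vehicle position
    heading: unit vector of the vehicle's direction of travel
    speed:   vehicle speed; higher speed -> stronger forward skew
    """
    diff = points - center
    # Bell curve around the vehicle, standing in for position uncertainty and car length.
    gaussian = np.exp(-np.sum(diff ** 2, axis=1) / (2.0 * sigma ** 2))
    # Signed distance along the direction of motion.
    along = diff @ heading
    # Logistic function grows toward the direction of travel; the product skews
    # the Gaussian forward, more so at higher speeds.
    logistic = 1.0 / (1.0 + np.exp(-skew_gain * speed * along))
    return gaussian * logistic

# Example: a car at the origin heading along +x at 10 m/s. A point 3 m ahead gets a
# much larger buffer value than a point 3 m behind.
pts = np.array([[3.0, 0.0], [-3.0, 0.0]])
print(buffer_density(pts, center=np.zeros(2), heading=np.array([1.0, 0.0]), speed=10.0))
```

Because the density is just a product of two closed-form functions of a few variables, checking whether a candidate lane-change trajectory stays below a risk threshold can be done quickly enough to recompute buffer zones on the fly.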

The researchers tested their algorithm in a simulation including up to 16 autonomous cars driving in an environment with several hundred other vehicles.

“The autonomous vehicles were not in direct communication but ran the proposed algorithm in parallel without conflict or collisions,” explains Pierson. “Each car used a different risk threshold that produced a different driving style, allowing us to create conservative and aggressive drivers. Using the static, precomputed buffer zones would only allow for conservative driving, whereas our dynamic algorithm allows for a broader range of driving styles.”

This project was supported, in part, by the Toyota Research Institute and the Office of Naval Research.

Researchers develop virtual-reality testing ground for drones

MIT engineers have developed a new virtual-reality training system for drones that enables a vehicle to “see” a rich, virtual environment while flying in an empty physical space.
Image: William Litant
By Jennifer Chu

Training drones to fly fast, around even the simplest obstacles, is a crash-prone exercise that can have engineers repairing or replacing vehicles with frustrating regularity.

Now MIT engineers have developed a new virtual-reality training system for drones that enables a vehicle to “see” a rich, virtual environment while flying in an empty physical space.

The system, which the team has dubbed “Flight Goggles,” could significantly reduce the number of crashes that drones experience in actual training sessions. It can also serve as a virtual testbed for any number of environments and conditions in which researchers might want to train fast-flying drones.

“We think this is a game-changer in the development of drone technology, for drones that go fast,” says Sertac Karaman, associate professor of aeronautics and astronautics at MIT. “If anything, the system can make autonomous vehicles more responsive, faster, and more efficient.”

Karaman and his colleagues will present details of their virtual training system at the IEEE International Conference on Robotics and Automation next week. Co-authors include Thomas Sayre-McCord, Winter Guerra, Amado Antonini, Jasper Arneberg, Austin Brown, Guilherme Cavalheiro, Dave McCoy, Sebastian Quilter, Fabian Riether, Ezra Tal, Yunus Terzioglu, and Luca Carlone of MIT’s Laboratory for Information and Decision Systems, along with Yajun Fang of MIT’s Computer Science and Artificial Intelligence Laboratory, and Alex Gorodetsky of Sandia National Laboratories.

Pushing boundaries

Karaman was initially motivated by a new, extreme robo-sport: competitive drone racing, in which remote-controlled drones, driven by human players, attempt to out-fly each other through an intricate maze of windows, doors, and other obstacles. Karaman wondered: Could an autonomous drone be trained to fly just as fast, if not faster, than these human-controlled vehicles, with even better precision and control?

“In the next two or three years, we want to enter a drone racing competition with an autonomous drone, and beat the best human player,” Karaman says. To do so, the team would have to develop an entirely new training regimen.

Currently, training autonomous drones is a physical task: Researchers fly drones in large, enclosed testing grounds, in which they often hang large nets to catch any careening vehicles. They also set up props, such as windows and doors, through which a drone can learn to fly. When vehicles crash, they must be repaired or replaced, which delays development and adds to a project’s cost.

Karaman says testing drones in this way can work for vehicles that are not meant to fly fast, such as drones that are programmed to slowly map their surroundings. But for fast-flying vehicles that need to process visual information quickly as they fly through an environment, a new training system is necessary.

“The moment you want to do high-throughput computing and go fast, even the slightest changes you make to its environment will cause the drone to crash,” Karaman says. “You can’t learn in that environment. If you want to push boundaries on how fast you can go and compute, you need some sort of virtual-reality environment.”

Flight Goggles

The team’s new virtual training system comprises a motion capture system, an image rendering program, and electronics that enable the team to quickly process images and transmit them to the drone.

The actual test space — a hangar-like gymnasium in MIT’s new drone-testing facility in Building 31 — is lined with motion-capture cameras that track the orientation of the drone as it’s flying.

With the image-rendering system, Karaman and his colleagues can draw up photorealistic scenes, such as a loft apartment or a living room, and beam these virtual images to the drone as it’s flying through the empty facility.    

“The drone will be flying in an empty room, but will be ‘hallucinating’ a completely different environment, and will learn in that environment,” Karaman explains.

The virtual images can be processed by the drone at a rate of about 90 frames per second — around three times as fast as the human eye can see and process images. To enable this, the team custom-built circuit boards that integrate a powerful embedded supercomputer, along with an inertial measurement unit and a camera. They fit all this hardware into a small, 3-D-printed nylon and carbon-fiber-reinforced drone frame. 
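Conceptually, the pipeline is a tight loop: measure the real drone's pose, render the virtual scene from that pose, and stream the frame to the drone's vision stack. The sketch below outlines that loop at the article's cited 90 frames per second; the callback names (get_drone_pose, render_scene, send_frame_to_drone) are placeholders, not the actual Flight Goggles interfaces.

```python
# Hypothetical sketch of a photorealistic-image-in-the-loop cycle: track the real
# drone's pose, render the virtual scene from that pose, and feed the frame to the
# drone's vision pipeline so it "sees" an environment that isn't physically there.
import time

TARGET_FPS = 90                  # the article cites ~90 rendered frames per second
FRAME_PERIOD = 1.0 / TARGET_FPS

def vr_in_the_loop(get_drone_pose, render_scene, send_frame_to_drone, duration_s=10.0):
    """Stream virtual camera frames to a drone flying in an empty room."""
    t_end = time.time() + duration_s
    while time.time() < t_end:
        t0 = time.time()
        pose = get_drone_pose()        # position + orientation from motion capture
        frame = render_scene(pose)     # photorealistic view from the drone's viewpoint
        send_frame_to_drone(frame)     # the drone perceives the virtual environment
        # Keep the loop close to the target frame rate.
        time.sleep(max(0.0, FRAME_PERIOD - (time.time() - t0)))
```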

A crash course

The researchers carried out a set of experiments, including one in which the drone learned to fly through a virtual window about twice its size. The window was set within a virtual living room. As the drone flew in the actual, empty testing facility, the researchers beamed images of the living room scene, from the drone’s perspective, back to the vehicle. As the drone flew through this virtual room, the researchers tuned a navigation algorithm, enabling the drone to learn on the fly.

Over 10 flights, the drone, flying at around 2.3 meters per second (5 miles per hour), successfully flew through the virtual window 361 times, only “crashing” into the window three times, according to positioning information provided by the facility’s motion-capture cameras. Karaman points out that, even if the drone crashed thousands of times, it wouldn’t make much of an impact on the cost or time of development, as it’s crashing in a virtual environment and not making any physical contact with the real world.

In a final test, the team set up an actual window in the test facility, and turned on the drone’s onboard camera to enable it to see and process its actual surroundings. Using the navigation algorithm that the researchers tuned in the virtual system, the drone, over eight flights, was able to fly through the real window 119 times, only crashing or requiring human intervention six times.

“It does the same thing in reality,” Karaman says. “It’s something we programmed it to do in the virtual environment, by making mistakes, falling apart, and learning. But we didn’t break any actual windows in this process.”

He says the virtual training system is highly malleable. For instance, researchers can pipe in their own scenes or layouts in which to train drones, including detailed, drone-mapped replicas of actual buildings — something the team is considering doing with MIT’s Stata Center. The training system may also be used to test out new sensors, or specifications for existing sensors, to see how they might handle on a fast-flying drone.

“We could try different specs in this virtual environment and say, ‘If you build a sensor with these specs, how would it help a drone in this environment?’” Karaman says.

The system can also be used to train drones to fly safely around humans. For instance, Karaman envisions splitting the actual test facility in two, with a drone flying in one half, while a human, wearing a motion-capture suit, walks in the other half. The drone would “see” the human in virtual reality as it flies around its own space. If it crashes into the person, the result is virtual, and harmless.

“One day, when you’re really confident, you can do it in reality, and have a drone flying around a person as they’re running, in a safe way,” Karaman says. “There are a lot of mind-bending experiments you can do in this whole virtual reality thing. Over time, we will showcase all the things you can do.”

This research was supported, in part, by the U.S. Office of Naval Research, MIT Lincoln Laboratory, and the NVIDIA Corporation.

Albatross robot takes flight

An albatross glider, designed by MIT engineers, skims the Charles River.
Photo: Gabriel Bousquet

By Jennifer Chu

MIT engineers have designed a robotic glider that can skim along the water’s surface, riding the wind like an albatross while also surfing the waves like a sailboat.

In regions of high wind, the robot is designed to stay aloft, much like its avian counterpart. Where there are calmer winds, the robot can dip a keel into the water to ride like a highly efficient sailboat instead.

The robotic system, which borrows from both nautical and biological designs, can cover a given distance using one-third as much wind as an albatross and traveling 10 times faster than a typical sailboat. The glider is also relatively lightweight, weighing about 6 pounds. The researchers hope that in the near future, such compact, speedy robotic water-skimmers may be deployed in teams to survey large swaths of the ocean.

“The oceans remain vastly undermonitored,” says Gabriel Bousquet, a former postdoc in MIT’s Department of Aeronautics and Astronautics, who led the design of the robot as part of his graduate thesis. “In particular, it’s very important to understand the Southern Ocean and how it is interacting with climate change. But it’s very hard to get there. We can now use the energy from the environment in an efficient way to do this long-distance travel, with a system that remains small-scale.”

Bousquet will present details of the robotic system this week at IEEE’s International Conference on Robotics and Automation, in Brisbane, Australia. His collaborators on the project are Jean-Jacques Slotine, professor of mechanical engineering and information sciences and of brain sciences; and Michael Triantafyllou, the Henry L. and Grace Doherty Professor in Ocean Science and Engineering.

The physics of speed

Last year, Bousquet, Slotine, and Triantafyllou published a study on the dynamics of albatross flight, in which they identified the mechanics that enable the tireless traveler to cover vast distances while expending minimal energy. The key to the bird’s marathon voyages is its ability to ride in and out of high- and low-speed layers of air.

Specifically, the researchers found the bird is able to perform a mechanical process called a “transfer of momentum,” in which it takes momentum from higher, faster layers of air, and by diving down transfers that momentum to lower, slower layers, propelling itself without having to continuously flap its wings.

Interestingly, Bousquet observed that the physics of albatross flight is very similar to that of sailboat travel. Both the albatross and the sailboat transfer momentum in order to keep moving. But in the case of the sailboat, that transfer occurs not between layers of air, but between the air and water.

“Sailboats take momentum from the wind with their sail, and inject it into the water by pushing back with their keel,” Bousquet explains. “That’s how energy is extracted for sailboats.”


Bousquet also realized that the speed at which both an albatross and a sailboat can travel depends upon the same general equation, related to the transfer of momentum. Essentially, both the bird and the boat can travel faster if they can either stay aloft easily or interact with two layers, or mediums, of very different speeds.

The albatross does well with the former, as its wings provide natural lift, though it flies between air layers with a relatively small difference in wind speeds. Meanwhile, the sailboat excels at the latter, traveling between two mediums of very different speeds — air versus water — though its hull creates a lot of friction and keeps it from reaching high speeds. Bousquet wondered: What if a vehicle could be designed to perform well on both counts, marrying the high-speed qualities of both the albatross and the sailboat?

“We thought, how could we take the best from both worlds?” Bousquet says.

Out on the water

The team drafted a design for such a hybrid vehicle, which ultimately resembled an autonomous glider with a 3-meter wingspan, similar to that of a typical albatross. They added a tall, triangular sail, as well as a slender, wing-like keel. They then performed some mathematical modeling to predict how such a design would travel.

According to their calculations, the wind-powered vehicle would only need relatively calm winds of about 5 knots to zip across waters at a velocity of about 20 knots, or 23 miles per hour.

“We found that in light winds you can travel about three to 10 times faster than a traditional sailboat, and you need about half as much wind as an albatross, to reach 20 knots,” Bousquet says. “It’s very efficient, and you can travel very fast, even if there is not too much wind.”

The team built a prototype of their design, using a glider airframe designed by Mark Drela, professor of aeronautics and astronautics at MIT. To the bottom of the glider they added a keel, along with various instruments, such as GPS, inertial measurement sensors, auto-pilot instrumentation, and ultrasound, to track the height of the glider above the water.

“The goal here was to show we can control very precisely how high we are above the water, and that we can have the robot fly above the water, then down to where the keel can go under the water to generate a force, and the plane can still fly,” Bousquet says.

The researchers decided to test this “critical maneuver” — the act of transitioning between flying in the air and dipping the keel down to sail in the water. Accomplishing this move doesn’t necessarily require a sail, so Bousquet and his colleagues decided not to include one in order to simplify preliminary experiments.

In the fall of 2016, the team put its design to the test, launching the robot from the MIT Sailing Pavilion out onto the Charles River. As the robot lacked a sail and any mechanism to get it started, the team hung it from a fishing rod attached to a whaler boat. With this setup, the boat towed the robot along the river until it reached about 20 miles per hour, at which point the robot autonomously “took off,” riding the wind on its own.

Once it was flying autonomously, Bousquet used a remote control to give the robot a “down” command, prompting it to dip low enough to submerge its keel in the river. Next, he adjusted the direction of the keel, and observed that the robot was able to steer away from the boat as expected. He then gave a command for the robot to fly back up, lifting the keel out of the water.

“We were flying very close to the surface, and there was very little margin for error — everything had to be in place,” Bousquet says. “So it was very high stress, but very exciting.”

The experiments, he says, prove that the team’s conceptual device can travel successfully, powered by the wind and the water. Eventually, he envisions fleets of such vehicles autonomously and efficiently monitoring large expanses of the ocean.

“Imagine you could fly like an albatross when it’s really windy, and then when there’s not enough wind, the keel allows you to sail like a sailboat,” Bousquet says. “This dramatically expands the kinds of regions where you can go.”

This research was supported, in part, by the Link Ocean Instrumentation fellowship.

Self-driving cars for country roads


A team of MIT researchers tested MapLite on a Toyota Prius outfitted with a range of LIDAR and IMU sensors.
Photo courtesy of CSAIL.

By Adam Conner-Simons | Rachel Gordon

Uber’s recent self-driving car fatality underscores the fact that the technology is still not ready for widespread adoption. The reality is that there aren’t many places where today’s self-driving cars can actually reliably drive. Companies like Google only test their fleets in major cities, where they’ve spent countless hours meticulously labeling the exact 3-D positions of lanes, curbs, and stop signs.

“The cars use these maps to know where they are and what to do in the presence of new obstacles like pedestrians and other cars,” says Daniela Rus, director of MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL). “The need for dense 3-D maps limits the places where self-driving cars can operate.”

Indeed, if you live along the millions of miles of U.S. roads that are unpaved, unlit, or unreliably marked, you’re out of luck. Such streets are often much more complicated to map, and get a lot less traffic, so companies aren’t incentivized to develop 3-D maps for them anytime soon. From California’s Mojave Desert to Vermont’s Green Mountains, there are huge swaths of America that self-driving cars simply aren’t ready for.

One way around this is to create systems advanced enough to navigate without these maps. In an important first step, Rus and colleagues at CSAIL have developed MapLite, a framework that allows self-driving cars to drive on roads they’ve never been on before without 3-D maps.

MapLite combines simple GPS data of the kind you’d find on Google Maps with a series of sensors that observe road conditions. Together, these two elements allowed the team’s vehicle to drive autonomously on multiple unpaved country roads in Devens, Massachusetts, and to reliably detect the road more than 100 feet in advance. (As part of a collaboration with the Toyota Research Institute, the researchers used a Toyota Prius that they outfitted with a range of LIDAR and IMU sensors.)

“The reason this kind of ‘map-less’ approach hasn’t really been done before is because it is generally much harder to reach the same accuracy and reliability as with detailed maps,” says CSAIL graduate student Teddy Ort, who was a lead author on a related paper about the system. “A system like this that can navigate just with on-board sensors shows the potential of self-driving cars being able to actually handle roads beyond the small number that tech companies have mapped.”

The paper, which will be presented in May at the International Conference on Robotics and Automation (ICRA) in Brisbane, Australia, was co-written by Ort, Rus, and PhD graduate Liam Paull, who is now an assistant professor at the University of Montreal.

For all the progress that has been made with self-driving cars, their navigation skills still pale in comparison to humans’. Consider how you yourself get around: If you’re trying to get to a specific location, you probably plug an address into your phone and then consult it occasionally along the way, like when you approach intersections or highway exits.

However, if you were to move through the world like most self-driving cars, you’d essentially be staring at your phone the whole time you’re walking. Existing systems still rely heavily on maps, only using sensors and vision algorithms to avoid dynamic objects like pedestrians and other cars.

In contrast, MapLite uses sensors for all aspects of navigation, relying on GPS data only to obtain a rough estimate of the car’s location. The system first sets both a final destination and what researchers call a “local navigation goal,” which has to be within view of the car. Its perception sensors then generate a path to get to that point, using LIDAR to estimate the location of the road’s edges. MapLite can do this without physical road markings by making the basic assumption that the road will be relatively flatter than the surrounding terrain.
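The flatness idea can be illustrated with a toy example: bin LIDAR returns into a grid and keep the cells whose height variation is small, a crude proxy for drivable surface. This is only a hypothetical sketch of the assumption described above, not the MapLite perception pipeline, and the grid size and height threshold are made-up values.

```python
# Hypothetical illustration of the flatness assumption: classify LIDAR returns as
# "road" when local height variation is small relative to the surrounding terrain.
import numpy as np

def flat_cells(points, cell_size=1.0, max_height_spread=0.05):
    """points: (N, 3) LIDAR returns in the vehicle frame (x forward, y left, z up).
    Returns the set of (i, j) grid cells whose points have nearly uniform height,
    a crude proxy for drivable road surface."""
    cells = {}
    for x, y, z in points:
        key = (int(np.floor(x / cell_size)), int(np.floor(y / cell_size)))
        cells.setdefault(key, []).append(z)
    road = set()
    for key, heights in cells.items():
        if max(heights) - min(heights) < max_height_spread:   # locally flat
            road.add(key)
    return road

# Example: a flat strip of "road" points next to a bumpier "shoulder".
rng = np.random.default_rng(0)
road_pts = np.column_stack([rng.uniform(0, 10, 500), rng.uniform(-2, 2, 500),
                            rng.normal(0.0, 0.01, 500)])
shoulder = np.column_stack([rng.uniform(0, 10, 500), rng.uniform(2, 4, 500),
                            rng.normal(0.0, 0.2, 500)])
print(len(flat_cells(np.vstack([road_pts, shoulder]))))   # mostly the flat strip
```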

“Our minimalist approach to mapping enables autonomous driving on country roads using local appearance and semantic features such as the presence of a parking spot or a side road,” says Rus.

The team developed a system of models that are “parameterized,” which means that they describe multiple situations that are somewhat similar. For example, one model might be broad enough to determine what to do at intersections, or what to do on a specific type of road.

MapLite differs from other map-less driving approaches, which rely more heavily on machine learning and are trained on data from one set of roads and then tested on others.

“At the end of the day we want to be able to ask the car questions like ‘how many roads are merging at this intersection?’” says Ort. “By using modeling techniques, if the system doesn’t work or is involved in an accident, we can better understand why.”

MapLite still has some limitations. For example, it isn’t yet reliable enough for mountain roads, since it doesn’t account for dramatic changes in elevation. As a next step, the team hopes to expand the variety of roads that the vehicle can handle. Ultimately they aspire to have their system reach comparable levels of performance and reliability as mapped systems but with a much wider range.

“I imagine that the self-driving cars of the future will always make some use of 3-D maps in urban areas,” says Ort. “But when called upon to take a trip off the beaten path, these vehicles will need to be as good as humans at driving on unfamiliar roads they have never seen before. We hope our work is a step in that direction.”

This project was supported, in part, by the National Science Foundation and the Toyota Research Institute.

Artificial intelligence in action

Aude Oliva (right), a principal research scientist at the MIT Computer Science and Artificial Intelligence Laboratory, and Dan Gutfreund (left), a principal investigator at the MIT-IBM Watson AI Laboratory and a staff member at IBM Research, are the principal investigators for the Moments in Time dataset, one of the projects related to AI algorithms funded by the MIT-IBM Watson AI Laboratory.
Photo: John Mottern/Feature Photo Service for IBM

By Meg Murphy
A person watching videos that show things opening — a door, a book, curtains, a blooming flower, a yawning dog — easily understands the same type of action is depicted in each clip.

“Computer models fail miserably to identify these things. How do humans do it so effortlessly?” asks Dan Gutfreund, a principal investigator at the MIT-IBM Watson AI Laboratory and a staff member at IBM Research. “We process information as it happens in space and time. How can we teach computer models to do that?”

Such are the big questions behind one of the new projects underway at the MIT-IBM Watson AI Laboratory, a collaboration for research on the frontiers of artificial intelligence. Launched last fall, the lab brings MIT and IBM researchers together to work on AI algorithms, the application of AI to industries, the physics of AI, and ways to use AI to advance shared prosperity.

The Moments in Time dataset is one of the projects related to AI algorithms that is funded by the lab. It pairs Gutfreund with Aude Oliva, a principal research scientist at the MIT Computer Science and Artificial Intelligence Laboratory, as the project’s principal investigators. Moments in Time is built on a collection of 1 million annotated videos of dynamic events unfolding within three seconds. Gutfreund and Oliva, who is also the MIT executive director at the MIT-IBM Watson AI Lab, are using these clips to address one of the next big steps for AI: teaching machines to recognize actions.

Learning from dynamic scenes

The goal is to give deep-learning algorithms broad coverage of an ecosystem of visual and auditory moments, which may enable models to learn information that isn’t necessarily taught in a supervised manner and to generalize to novel situations and tasks, the researchers say.

“As we grow up, we look around, we see people and objects moving, we hear sounds that people and objects make. We have a lot of visual and auditory experiences. An AI system needs to learn the same way and be fed with videos and dynamic information,” Oliva says.

For every action category in the dataset, such as cooking, running, or opening, there are more than 2,000 videos. The short clips enable computer models to better learn the diversity of meaning around specific actions and events.

“This dataset can serve as a new challenge to develop AI models that scale to the level of complexity and abstract reasoning that a human processes on a daily basis,” Oliva adds, describing the factors involved. Events can include people, objects, animals, and nature. They may be symmetrical in time — for example, opening means closing in reverse order. And they can be transient or sustained.
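For a sense of how such annotated clips might be organized for training, here is a hypothetical sketch of a minimal clip record and a per-class count, the kind of bookkeeping needed to verify coverage like the more-than-2,000-videos-per-category figure. The file paths, field names, and labels are illustrative; the real dataset’s format is defined by its public release.

```python
# Hypothetical sketch of organizing short annotated clips for an action classifier.
# Paths and labels are illustrative, not the actual Moments in Time release format.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Clip:
    path: str    # location of the ~3-second video file
    label: str   # one action/activity class, e.g. "opening", "cooking", "running"

def label_histogram(clips: List[Clip]) -> Dict[str, int]:
    """Count clips per action class, e.g. to check per-category coverage."""
    counts: Dict[str, int] = {}
    for clip in clips:
        counts[clip.label] = counts.get(clip.label, 0) + 1
    return counts

clips = [Clip("videos/opening_00001.mp4", "opening"),
         Clip("videos/opening_00002.mp4", "opening"),
         Clip("videos/running_00001.mp4", "running")]
print(label_histogram(clips))   # {'opening': 2, 'running': 1}
```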

Oliva and Gutfreund, along with additional researchers from MIT and IBM, met weekly for more than a year to tackle technical issues, such as how to choose the action categories for annotations, where to find the videos, and how to put together a wide array so the AI system learns without bias. The team also developed machine-learning models, which were then used to scale the data collection. “We aligned very well because we have the same enthusiasm and the same goal,” says Oliva.

Augmenting human intelligence

One key goal at the lab is the development of AI systems that move beyond specialized tasks to tackle more complex problems and benefit from robust and continuous learning. “We are seeking new algorithms that not only leverage big data when available, but also learn from limited data to augment human intelligence,” says Sophie V. Vandebroek, chief operating officer of IBM Research, about the collaboration.

In addition to pairing the unique technical and scientific strengths of the two organizations, IBM is bringing MIT researchers an influx of resources, signaled by its $240 million investment in AI efforts over the next 10 years, dedicated to the MIT-IBM Watson AI Lab. And the alignment of MIT-IBM interests in AI is proving beneficial, according to Oliva.

“IBM came to MIT with an interest in developing new ideas for an artificial intelligence system based on vision. I proposed a project where we build data sets to feed the model about the world. It had not been done before at this level. It was a novel undertaking. Now we have reached the milestone of 1 million videos for visual AI training, and people can go to our website, download the dataset and our deep-learning computer models, which have been taught to recognize actions.”

Qualitative results so far have shown models can recognize moments well when the action is well-framed and close up, but they misfire when the category is fine-grained or there is background clutter, among other things. Oliva says that MIT and IBM researchers have submitted an article describing the performance of neural network models trained on the dataset, which itself was enriched by the two teams’ shared perspectives. “IBM researchers gave us ideas to add action categories to have more richness in areas like health care and sports. They broadened our view. They gave us ideas about how AI can make an impact from the perspective of business and the needs of the world,” she says.

This first version of the Moments in Time dataset is one of the largest human-annotated video datasets capturing visual and audible short events, all of which are tagged with an action or activity label among 339 different classes that include a wide range of common verbs. The researchers intend to produce more datasets with a variety of levels of abstraction to serve as stepping stones toward the development of learning algorithms that can build analogies between things, imagine and synthesize novel events, and interpret scenarios.

In other words, they are just getting started, says Gutfreund. “We expect the Moments in Time dataset to enable models to richly understand actions and dynamics in videos.”
