
A step toward personalized, automated smart homes


MIT researchers have built a system that takes a step toward fully automated smart homes, by identifying occupants even when they’re not carrying mobile devices. Image: Chelsea Turner, MIT

By Rob Matheson

Developing automated systems that track occupants and self-adapt to their preferences is a major next step for the future of smart homes. When you walk into a room, for instance, a system could set the room to your preferred temperature. Or when you sit on the couch, a system could instantly flick the television to your favorite channel.

But enabling a home system to recognize occupants as they move around the house is a more complex problem. Recently, systems have been built that localize humans by measuring the reflections of wireless signals off their bodies. But these systems can’t identify the individuals. Other systems can identify people, but only if they’re always carrying their mobile devices. Both types of systems also rely on tracking signals that can be weak or get blocked by various structures.

MIT researchers have built a system that takes a step toward fully automated smart homes by identifying occupants, even when they’re not carrying mobile devices. The system, called Duet, uses reflected wireless signals to localize individuals. But it also incorporates algorithms that ping nearby mobile devices to predict the individuals’ identities, based on who last used the device and their predicted movement trajectory. It also uses logic to figure out who’s who, even in signal-denied areas.

“Smart homes are still based on explicit input from apps or telling Alexa to do something. Ideally, we want homes to be more reactive to what we do, to adapt to us,” says Deepak Vasisht, a PhD student in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and lead author on a paper describing the system that was presented at last week’s Ubicomp conference. “If you enable location awareness and identification awareness for smart homes, you could do this automatically. Your home knows it’s you walking, and where you’re walking, and it can update itself.”

Experiments done in a two-bedroom apartment with four people and an office with nine people, over two weeks, showed the system can identify individuals with 96 percent and 94 percent accuracy, respectively, including when people weren’t carrying their smartphones or were in blocked areas.

But the system isn’t just novelty. Duet could potentially be used to recognize intruders or ensure visitors don’t enter private areas of your home. Moreover, Vasisht says, the system could capture behavioral-analytics insights for health care applications. Someone suffering from depression, for instance, may move around more or less, depending on how they’re feeling on any given day. Such information, collected over time, could be valuable for monitoring and treatment.

“In behavioral studies, you care about how people are moving over time and how people are behaving,” Vasisht says. “All those questions can be answered by getting information on people’s locations and how they’re moving.”

The researchers envision that their system would be used with explicit consent from anyone who would be identified and tracked with Duet. If needed, they could also develop an app for users to grant or revoke Duet’s access to their location information at any time, Vasisht adds.

Co-authors on the paper are: Dina Katabi, the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science; former CSAIL researcher Anubhav Jain ’16; and CSAIL PhD students Chen-Yu Hsu and Zachary Kabelac.

Tracking and identification

Duet is a wireless sensor, about a foot and a half square, that is installed on a wall. It incorporates a floor map with annotated areas, such as the bedroom, kitchen, bed, and living room couch. It also collects identification tags from the occupants’ phones.

The system builds upon a device-based localization system built by Vasisht, Katabi, and other researchers that tracks individuals within tens of centimeters, based on wireless signal reflections from their devices. It does so by using a central node to calculate the time it takes the signals to hit a person’s device and travel back. In experiments, the system was able to pinpoint where people were in a two-bedroom apartment and in a café.
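
As a rough illustration of the round-trip timing idea described above, the sketch below converts a measured round trip into a distance estimate; the numbers and function name are illustrative only and are not taken from the researchers’ system.

```python
# Minimal sketch of round-trip time-of-flight ranging (illustrative, not the authors' code).
# A central node timestamps when a signal leaves and when its reflection returns;
# half the round-trip time multiplied by the speed of light gives the one-way distance.

SPEED_OF_LIGHT = 3.0e8  # meters per second

def distance_from_round_trip(t_transmit: float, t_receive: float) -> float:
    """Estimate the one-way distance (meters) from a round-trip timing measurement."""
    round_trip = t_receive - t_transmit        # seconds
    return SPEED_OF_LIGHT * round_trip / 2.0   # the signal travels out and back

# Example: a 4-nanosecond round trip corresponds to roughly 0.6 meters.
print(distance_from_round_trip(0.0, 4e-9))
```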

The system, however, relied on people carrying mobile devices. “But in building [Duet] we realized, at home you don’t always carry your phone,” Vasisht says. “Most people leave devices on desks or tables, and walk around the house.”

The researchers combined their device-based localization with a device-free tracking system, called WiTrack, developed by Katabi and other CSAIL researchers, that localizes people by measuring the reflections of wireless signals off their bodies.

Duet locates a smartphone and correlates its movement with individual movement captured by the device-free localization. If both are moving in tightly correlated trajectories, the system pairs the device with the individual and, therefore, knows the identity of the individual.

To ensure Duet knows someone’s identity when they’re away from their device, the researchers designed the system to capture the power profile of the signal received from the phone when it’s used. That profile changes depending on the orientation of the signal, and that change can be mapped to an individual’s trajectory to identify them. For example, when a phone is used and then put down, the system will capture the initial power profile. Then it will estimate how the power profile would look if it were still being carried along a path by a nearby moving individual. The closer the changing power profile correlates to the moving individual’s path, the more likely it is that individual owns the phone.
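
To make the pairing step concrete, here is a hypothetical sketch in which trajectories are short sequences of (x, y) positions and a phone is assigned to whichever tracked person’s path it correlates with most strongly. The data layout, the correlation measure, and the threshold are assumptions for illustration; they are not Duet’s actual implementation.

```python
# Hypothetical pairing sketch: assign a phone to the person whose device-free
# trajectory most closely follows the phone's own motion trace.
import numpy as np

def pair_device_to_person(device_traj, person_trajs, min_similarity=0.9):
    """device_traj: (N, 2) array of the phone's positions over time.
    person_trajs: dict mapping track IDs to (N, 2) arrays from device-free tracking.
    Returns the best-matching track ID, or None if no trajectory correlates strongly."""
    best_id, best_score = None, -np.inf
    for track_id, traj in person_trajs.items():
        # Measure how tightly the two motion traces co-vary.
        score = np.corrcoef(device_traj.ravel(), traj.ravel())[0, 1]
        if score > best_score:
            best_id, best_score = track_id, score
    return best_id if best_score >= min_similarity else None
```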

Logical thinking

One final issue is that structures such as bathroom tiles, television screens, mirrors, and various metal equipment can block signals.

To compensate for that, the researchers incorporated probabilistic algorithms to apply logical reasoning to localization. To do so, they designed the system to recognize entrance and exit boundaries of specific spaces in the home, such as doors to each room, the bedside, and the side of a couch. At any moment, the system will recognize the most likely identity for each individual in each boundary. It then infers who is who by process of elimination.

Suppose an apartment has two occupants: Alisha and Betsy. Duet sees Alisha and Betsy walk into the living room, by pairing their smartphone motion with their movement trajectories. Both then leave their phones on a nearby coffee table to charge — Betsy goes into the bedroom to nap; Alisha stays on the couch to watch television. Duet infers that Betsy has entered the bed boundary and didn’t exit, so she must be on the bed. After a while, Alisha and Betsy move into, say, the kitchen — and the signal drops. Duet reasons that two people are in the kitchen, but it doesn’t know their identities. When Betsy returns to the living room and picks up her phone, however, the system automatically re-tags the individual as Betsy. By process of elimination, the other person still in the kitchen is Alisha.
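
The Alisha-and-Betsy scenario boils down to simple elimination over a known set of occupants. The toy sketch below captures that reasoning under the assumption that each device-free track carries either a confirmed identity or none; it illustrates the logic only and is not Duet’s code.

```python
# Toy sketch of identity resolution by elimination over a known household.
def resolve_identities(tracks, household):
    """tracks: dict mapping track IDs to a confirmed identity or None.
    household: set of all occupant names. Fills in the unknown track when
    only one identity remains unaccounted for."""
    known = {identity for identity in tracks.values() if identity is not None}
    remaining = household - known
    if len(remaining) == 1:
        leftover = remaining.pop()
        for track_id, identity in tracks.items():
            if identity is None:
                tracks[track_id] = leftover
    return tracks

# Betsy picks up her phone and is re-tagged; the other track in the kitchen must be Alisha.
print(resolve_identities({"track_1": "Betsy", "track_2": None}, {"Alisha", "Betsy"}))
```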

“There are blind spots in homes where systems won’t work. But, because you have a logical framework, you can make these inferences,” Vasisht says.

“Duet takes a smart approach of combining the location of different devices and associating it to humans, and leverages device-free localization techniques for localizing humans,” says Ranveer Chandra, a principal researcher at Microsoft, who was not involved in the work. “Accurately determining the location of all residents in a home has the potential to significantly enhance the in-home experience of users. … The home assistant can personalize the responses based on who all are around it; the temperature can be automatically controlled based on personal preferences, thereby resulting in energy savings. Future robots in the home could be more intelligent if they knew who was where in the house. The potential is endless.”

Next, the researchers aim for long-term deployments of Duet in more spaces and to provide high-level analytic services for applications such as health monitoring and responsive smart homes.

Model helps robots navigate more like humans do

MIT researchers have devised a way to help robots navigate environments more like humans do.

By Rob Matheson

When moving through a crowd to reach some end goal, humans can usually navigate the space safely without thinking too much. They can learn from the behavior of others and note any obstacles to avoid. Robots, on the other hand, struggle with such navigational concepts.

MIT researchers have now devised a way to help robots navigate environments more like humans do. Their novel motion-planning model lets robots determine how to reach a goal by exploring the environment, observing other agents, and exploiting what they’ve learned before in similar situations. A paper describing the model was presented at this week’s IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

Popular motion-planning algorithms will create a tree of possible decisions that branches out until it finds good paths for navigation. A robot that needs to navigate a room to reach a door, for instance, will create a step-by-step search tree of possible movements and then execute the best path to the door, considering various constraints. One drawback, however, is these algorithms rarely learn: Robots can’t leverage information about how they or other agents acted previously in similar environments.

“Just like when playing chess, these decisions branch out until [the robots] find a good way to navigate. But unlike chess players, [the robots] explore what the future looks like without learning much about their environment and other agents,” says co-author Andrei Barbu, a researcher at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Center for Brains, Minds, and Machines (CBMM) within MIT’s McGovern Institute. “The thousandth time they go through the same crowd is as complicated as the first time. They’re always exploring, rarely observing, and never using what’s happened in the past.”

The researchers developed a model that combines a planning algorithm with a neural network that learns to recognize paths that could lead to the best outcome, and uses that knowledge to guide the robot’s movement in an environment.

In their paper, “Deep sequential models for sampling-based planning,” the researchers demonstrate the advantages of their model in two settings: navigating through challenging rooms with traps and narrow passages, and navigating areas while avoiding collisions with other agents. A promising real-world application is helping autonomous cars navigate intersections, where they have to quickly evaluate what others will do before merging into traffic. The researchers are currently pursuing such applications through the Toyota-CSAIL Joint Research Center.

“When humans interact with the world, we see an object we’ve interacted with before, or are in some location we’ve been to before, so we know how we’re going to act,” says Yen-Ling Kuo, a PhD student in CSAIL and first author on the paper. “The idea behind this work is to add to the search space a machine-learning model that knows from past experience how to make planning more efficient.”

Boris Katz, a principal research scientist and head of the InfoLab Group at CSAIL, is also a co-author on the paper.

Trading off exploration and exploitation

Traditional motion planners explore an environment by rapidly expanding a tree of decisions that eventually blankets an entire space. The robot then looks at the tree to find a way to reach the goal, such as a door. The researchers’ model, however, offers “a tradeoff between exploring the world and exploiting past knowledge,” Kuo says.

The learning process starts with a few examples. A robot using the model is trained on a few ways to navigate similar environments. The neural network learns what makes these examples succeed by interpreting the environment around the robot, such as the shape of the walls, the actions of other agents, and features of the goals. In short, the model “learns that when you’re stuck in an environment, and you see a doorway, it’s probably a good idea to go through the door to get out,” Barbu says.

The model combines the exploration behavior from earlier methods with this learned information. The underlying planner, called RRT*, was developed by MIT professors Sertac Karaman and Emilio Frazzoli. (It’s a variant of a widely used motion-planning algorithm known as Rapidly-exploring Random Trees, or RRT.) The planner creates a search tree while the neural network mirrors each step and makes probabilistic predictions about where the robot should go next. When the network makes a prediction with high confidence, based on learned information, it guides the robot on a new path. If the network doesn’t have high confidence, it lets the robot explore the environment instead, like a traditional planner.
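
The explore-versus-exploit tradeoff described here can be sketched as a sampling step that defers to the learned model only when it is confident. Everything below (the policy object, its predict method, and the confidence threshold) is a hypothetical placeholder used to illustrate the idea, not the authors’ RRT* implementation.

```python
# Illustrative sampling step for a neural-guided, sampling-based planner.
import random

def sample_next_state(tree, goal, policy, bounds, confidence_threshold=0.8):
    """Choose the next state to expand toward.

    policy.predict (hypothetical) returns a proposed state and a confidence score
    learned from past plans; bounds is a list of (low, high) limits per dimension."""
    proposal, confidence = policy.predict(tree, goal)
    if confidence >= confidence_threshold:
        return proposal  # exploit what was learned from similar environments
    # Fall back to uniform random sampling, like a traditional planner exploring.
    return tuple(random.uniform(low, high) for low, high in bounds)
```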

For example, the researchers demonstrated the model in a simulation known as a “bug trap,” where a 2-D robot must escape from an inner chamber through a central narrow channel and reach a location in a surrounding larger room. Blind alleys on either side of the channel can get robots stuck. In this simulation, the robot was trained on a few examples of how to escape different bug traps. When faced with a new trap, it recognizes features of the trap, escapes, and continues to search for its goal in the larger room. The neural network helps the robot find the exit to the trap, identify the dead ends, and gives the robot a sense of its surroundings so it can quickly find the goal.

Results in the paper are based on how often a path was found within a given time, the total length of the path that reached the goal, and how consistent the paths were. In both simulations, the researchers’ model more quickly plotted paths that were far shorter and more consistent than those of a traditional planner.

Working with multiple agents

In another experiment, the researchers trained and tested the model on navigating environments with multiple moving agents, a useful test for autonomous cars, especially for navigating intersections and roundabouts. In the simulation, several agents are circling an obstacle. A robot agent must successfully navigate around the other agents, avoid collisions, and reach a goal location, such as an exit on a roundabout.

“Situations like roundabouts are hard, because they require reasoning about how others will respond to your actions, how you will then respond to theirs, what they will do next, and so on,” Barbu says. “You eventually discover your first action was wrong, because later on it will lead to a likely accident. This problem gets exponentially worse the more cars you have to contend with.”

Results indicate that the researchers’ model can capture enough information about the future behavior of the other agents (cars) to cut off the process early, while still making good decisions in navigation. This makes planning more efficient. Moreover, they only needed to train the model on a few examples of roundabouts with only a few cars. “The plans the robots make take into account what the other cars are going to do, as any human would,” Barbu says.

Going through intersections or roundabouts is one of the most challenging scenarios facing autonomous cars. This work might one day let cars learn how humans behave and how to adapt to drivers in different environments, according to the researchers. This is the focus of the Toyota-CSAIL Joint Research Center work.

“Not everybody behaves the same way, but people are very stereotypical. There are people who are shy, people who are aggressive. The model recognizes that quickly and that’s why it can plan efficiently,” Barbu says.

More recently, the researchers have been applying this work to robots with manipulators that face similarly daunting challenges when reaching for objects in ever-changing environments.

Machine-learning system tackles speech and object recognition, all at once

MIT computer scientists have developed a system that learns to identify objects within an image, based on a spoken description of the image.
Image: Christine Daniloff

By Rob Matheson

MIT computer scientists have developed a system that learns to identify objects within an image, based on a spoken description of the image. Given an image and an audio caption, the model will highlight in real-time the relevant regions of the image being described.

Unlike current speech-recognition technologies, the model doesn’t require manual transcriptions and annotations of the examples it’s trained on. Instead, it learns words directly from recorded speech clips and objects in raw images, and associates them with one another.

The model can currently recognize only several hundred different words and object types. But the researchers hope that one day their combined speech-object recognition technique could save countless hours of manual labor and open new doors in speech and image recognition.

Speech-recognition systems such as Siri and Google Voice, for instance, require transcriptions of many thousands of hours of speech recordings. Using these data, the systems learn to map speech signals to specific words. Such an approach becomes especially problematic when, say, new terms enter our lexicon, and the systems must be retrained.

“We wanted to do speech recognition in a way that’s more natural, leveraging additional signals and information that humans have the benefit of using, but that machine learning algorithms don’t typically have access to. We got the idea of training a model in a manner similar to walking a child through the world and narrating what you’re seeing,” says David Harwath, a researcher in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Spoken Language Systems Group. Harwath co-authored a paper describing the model that was presented at the recent European Conference on Computer Vision.

In the paper, the researchers demonstrate their model on an image of a young girl with blonde hair and blue eyes, wearing a blue dress, with a white lighthouse with a red roof in the background. The model learned to associate which pixels in the image corresponded with the words “girl,” “blonde hair,” “blue eyes,” “blue dress,” “white lighthouse,” and “red roof.” When an audio caption was narrated, the model then highlighted each of those objects in the image as they were described.

One promising application is learning translations between different languages, without need of a bilingual annotator. Of the estimated 7,000 languages spoken worldwide, only 100 or so have enough transcription data for speech recognition. Consider, however, a situation where two different-language speakers describe the same image. If the model learns speech signals from language A that correspond to objects in the image, and learns the signals in language B that correspond to those same objects, it could assume those two signals — and matching words — are translations of one another.

“There’s potential there for a Babel Fish-type of mechanism,” Harwath says, referring to the fictitious living earpiece in the “Hitchhiker’s Guide to the Galaxy” novels that translates different languages to the wearer.

The CSAIL co-authors are: graduate student Adria Recasens; visiting student Didac Suris; former researcher Galen Chuang; Antonio Torralba, a professor of electrical engineering and computer science who also heads the MIT-IBM Watson AI Lab; and Senior Research Scientist James Glass, who leads the Spoken Language Systems Group at CSAIL.

Audio-visual associations

This work expands on an earlier model developed by Harwath, Glass, and Torralba that correlates speech with groups of thematically related images. In the earlier research, they put images of scenes from a classification database on the crowdsourcing Mechanical Turk platform. They then had people describe the images as if they were narrating to a child, for about 10 seconds. They compiled more than 200,000 pairs of images and audio captions, in hundreds of different categories, such as beaches, shopping malls, city streets, and bedrooms.

They then designed a model consisting of two separate convolutional neural networks (CNNs). One processes images, and one processes spectrograms, a visual representation of audio signals as they vary over time. The highest layer of the model computes outputs of the two networks and maps the speech patterns with image data.

The researchers would, for instance, feed the model caption A and image A, which is correct. Then, they would feed it a random caption B with image A, which is an incorrect pairing. After comparing thousands of wrong captions with image A, the model learns the speech signals corresponding with image A, and associates those signals with words in the captions. As described in a 2016 study, the model learned, for instance, to pick out the signal corresponding to the word “water,” and to retrieve images with bodies of water.
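
One common way to train on matched and mismatched pairs is a margin ranking loss that pushes the score of the correct image-caption pair above the score of a random mismatch. The sketch below shows that generic objective; the papers’ exact loss may differ in its details.

```python
# Generic margin ranking loss over matched vs. mismatched image-caption scores.
import torch

def ranking_loss(score_matched, score_mismatched, margin=1.0):
    """Penalize any mismatched pair whose score comes within `margin` of the true pair."""
    return torch.clamp(margin - score_matched + score_mismatched, min=0.0).mean()

# Example: higher loss when a mismatch scores close to (or above) the true pair.
print(ranking_loss(torch.tensor([2.0]), torch.tensor([1.8])))  # tensor(0.8000)
```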

“But it didn’t provide a way to say, ‘This is the exact point in time that somebody said a specific word that refers to that specific patch of pixels,’” Harwath says.

Making a matchmap

In the new paper, the researchers modified the model to associate specific words with specific patches of pixels. The researchers trained the model on the same database, but with a new total of 400,000 image-caption pairs. They held out 1,000 random pairs for testing.

In training, the model is similarly given correct and incorrect images and captions. But this time, the image-analyzing CNN divides the image into a grid of cells consisting of patches of pixels. The audio-analyzing CNN divides the spectrogram into segments of, say, one second to capture a word or two.

With the correct image and caption pair, the model matches the first cell of the grid to the first segment of audio, then matches that same cell with the second segment of audio, and so on, all the way through each grid cell and across all time segments. For each cell and audio segment, it provides a similarity score, depending on how closely the signal corresponds to the object.
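
In code, the cell-by-segment scoring amounts to a batch of dot products between the image grid’s embeddings and the audio segments’ embeddings. The sketch below uses arbitrary dimensions to illustrate that computation; the real networks produce their own feature maps.

```python
# Illustrative "matchmap": similarity between every image-grid cell and audio segment.
import numpy as np

def matchmap(image_feats, audio_feats):
    """image_feats: (H, W, D) grid of cell embeddings; audio_feats: (T, D) segment embeddings.
    Returns an (H, W, T) array of dot-product similarity scores."""
    return np.einsum("hwd,td->hwt", image_feats, audio_feats)

scores = matchmap(np.random.rand(14, 14, 512), np.random.rand(128, 512))
print(scores.shape)  # (14, 14, 128): one score per cell per audio segment
```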

The challenge is that, during training, the model doesn’t have access to any true alignment information between the speech and the image. “The biggest contribution of the paper,” Harwath says, “is demonstrating that these cross-modal alignments can be inferred automatically by simply teaching the network which images and captions belong together and which pairs don’t.”

The authors dub this automatically learned association between a spoken caption’s waveform and the image pixels a “matchmap.” After training on thousands of image-caption pairs, the network narrows down those alignments to specific words representing specific objects in that matchmap.

“It’s kind of like the Big Bang, where matter was really dispersed, but then coalesced into planets and stars,” Harwath says. “Predictions start dispersed everywhere but, as you go through training, they converge into an alignment that represents meaningful semantic groundings between spoken words and visual objects.”

Robots can now pick up any object after inspecting it

With the DON system, a robot can do novel tasks like look at a shoe it has never seen before and successfully grab it by its tongue.
Photo: Tom Buehler/CSAIL

Humans have long been masters of dexterity, a skill that can largely be credited to the help of our eyes. Robots, meanwhile, are still catching up.

Certainly there’s been some progress: For decades, robots in controlled environments like assembly lines have been able to pick up the same object over and over again. More recently, breakthroughs in computer vision have enabled robots to make basic distinctions between objects. Even then, though, the systems don’t truly understand objects’ shapes, so there’s little the robots can do after a quick pick-up.  

In a new paper, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) say that they’ve made a key development in this area of work: a system that lets robots inspect random objects, and visually understand them enough to accomplish specific tasks without ever having seen them before.

The system, called Dense Object Nets (DON), looks at objects as collections of points that serve as sort of visual roadmaps. This approach lets robots better understand and manipulate items, and, most importantly, allows them to even pick up a specific object among a clutter of similar objects — a valuable skill for the kinds of machines that companies like Amazon and Walmart use in their warehouses.

For example, someone might use DON to get a robot to grab onto a specific spot on an object, say, the tongue of a shoe. From that, it can look at a shoe it has never seen before, and successfully grab its tongue.

“Many approaches to manipulation can’t identify specific parts of an object across the many orientations that object may encounter,” says PhD student Lucas Manuelli, who wrote a new paper about the system with lead author and fellow PhD student Pete Florence, alongside MIT Professor Russ Tedrake. “For example, existing algorithms would be unable to grasp a mug by its handle, especially if the mug could be in multiple orientations, like upright, or on its side.”

The team views potential applications not just in manufacturing settings, but also in homes. Imagine giving the system an image of a tidy house, and letting it clean while you’re at work, or using an image of dishes so that the system puts your plates away while you’re on vacation.

What’s also noteworthy is that none of the data was actually labeled by humans. Instead, the system is what the team calls “self-supervised,” not requiring any human annotations.

Two common approaches to robot grasping involve either task-specific learning, or creating a general grasping algorithm. These techniques both have obstacles: Task-specific methods are difficult to generalize to other tasks, and general grasping doesn’t get specific enough to deal with the nuances of particular tasks, like putting objects in specific spots.

The DON system, however, essentially creates a series of coordinates on a given object, which serve as a kind of visual roadmap, to give the robot a better understanding of what it needs to grasp, and where.

The team trained the system to look at objects as a series of points that make up a larger coordinate system. It can then map different points together to visualize an object’s 3-D shape, similar to how panoramic photos are stitched together from multiple photos. After training, if a person specifies a point on an object, the robot can take a photo of that object, and identify and match points to be able to then pick up the object at that specified point.
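
A dense-descriptor approach of this kind can transfer a user-chosen point to a new image by nearest-neighbor search in descriptor space. The sketch below assumes per-pixel descriptor maps of shape (H, W, D); the shapes and function name are illustrative, not DON’s published interface.

```python
# Illustrative point transfer with dense per-pixel descriptors.
import numpy as np

def find_corresponding_pixel(ref_descriptors, ref_pixel, new_descriptors):
    """ref_descriptors, new_descriptors: (H, W, D) arrays of per-pixel descriptors.
    ref_pixel: (row, col) chosen by the user in the reference image.
    Returns the pixel in the new image whose descriptor is closest (L2) to the reference."""
    target = ref_descriptors[ref_pixel]                            # (D,) descriptor at the chosen point
    distances = np.linalg.norm(new_descriptors - target, axis=-1)  # (H, W) distance map
    return np.unravel_index(np.argmin(distances), distances.shape)
```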

This is different from systems like UC-Berkeley’s DexNet, which can grasp many different items, but can’t satisfy a specific request. Imagine a child at 18 months old, who doesn’t understand which toy you want it to play with but can still grab lots of items, versus a four-year-old who can respond to “go grab your truck by the red end of it.”

In one set of tests done on a soft caterpillar toy, a Kuka robotic arm powered by DON could grasp the toy’s right ear from a range of different configurations. This showed that, among other things, the system has the ability to distinguish left from right on symmetrical objects.

When testing on a bin of different baseball hats, DON could pick out a specific target hat despite all of the hats having very similar designs — and having never seen pictures of the hats in training data before.

“In factories robots often need complex part feeders to work reliably,” says Florence. “But a system like this that can understand objects’ orientations could just take a picture and be able to grasp and adjust the object accordingly.”

In the future, the team hopes to improve the system to a place where it can perform specific tasks with a deeper understanding of the corresponding objects, like learning how to grasp an object and move it with the ultimate goal of say, cleaning a desk.

The team will present their paper on the system next month at the Conference on Robot Learning in Zürich, Switzerland.

A GPS for inside your body


By Adam Conner-Simons | Rachel Gordon

Investigating inside the human body often requires cutting open a patient or swallowing long tubes with built-in cameras. But what if physicians could get a better glimpse in a less expensive, less invasive, and less time-consuming manner?

A team from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) led by Professor Dina Katabi is working on doing exactly that with an “in-body GPS” system dubbed ReMix. The new method can pinpoint the location of ingestible implants inside the body using low-power wireless signals. These implants could be used as tiny tracking devices on shifting tumors to help monitor their slight movements.

In animal tests, the team demonstrated that they can track the implants with centimeter-level accuracy. The team says that, one day, similar implants could be used to deliver drugs to specific regions in the body.

ReMix was developed in collaboration with researchers from Massachusetts General Hospital (MGH). The team describes the system in a paper that’s being presented at this week’s Association for Computing Machinery’s Special Interest Group on Data Communications (SIGCOMM) conference in Budapest, Hungary.

Tracking inside the body

To test ReMix, Katabi’s group first implanted a small marker in animal tissues. To track its movement, the researchers used a wireless device that reflects radio signals off the patient. This was based on a wireless technology that the researchers previously demonstrated to detect heart rate, breathing, and movement. A special algorithm then uses that signal to pinpoint the exact location of the marker.

Interestingly, the marker inside the body does not need to transmit any wireless signal. It simply reflects the signal transmitted by the wireless device outside the body. Therefore, it doesn’t need a battery or any other external source of energy.

A key challenge in using wireless signals in this way is the many competing reflections that bounce off a person’s body. In fact, the signals that reflect off a person’s skin are actually 100 million times more powerful than the signals of the metal marker itself.

To overcome this, the team designed an approach that essentially separates the interfering skin signals from the ones they’re trying to measure. They did this using a small semiconductor device, called a “diode,” that mixes signals together so the team can then filter out the skin-related signals. For example, if the skin reflects at frequencies of F1 and F2, the diode creates new combinations of those frequencies, such as F1-F2 and F1+F2. When all of the signals reflect back to the system, the system only picks up the combined frequencies, filtering out the original frequencies that came from the patient’s skin.
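
The frequency-mixing effect of a nonlinear element can be shown numerically: squaring a signal that contains components at F1 and F2 (a crude square-law diode model) produces new components at F2-F1 and F1+F2, which sit away from the original frequencies and can be filtered separately. The frequencies below are arbitrary; this is only a numerical illustration, not the paper’s model.

```python
# Numerical illustration of square-law mixing: components at F2-F1 and F1+F2 appear.
import numpy as np

fs = 10_000                              # sample rate (Hz)
t = np.arange(0, 1, 1 / fs)              # one second of samples
f1, f2 = 1_000, 1_300                    # two illustrative reflection frequencies (Hz)
signal = np.cos(2 * np.pi * f1 * t) + np.cos(2 * np.pi * f2 * t)

mixed = signal ** 2                      # crude square-law "diode" nonlinearity
spectrum = np.abs(np.fft.rfft(mixed))
freqs = np.fft.rfftfreq(len(mixed), 1 / fs)

strong = freqs[spectrum > 0.25 * spectrum.max()]
print(sorted(int(f) for f in np.round(strong)))  # [0, 300, 2300]: DC, F2-F1, and F1+F2
```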

One potential application for ReMix is in proton therapy, a type of cancer treatment that involves bombarding tumors with beams of magnet-controlled protons. The approach allows doctors to prescribe higher doses of radiation, but requires a very high degree of precision, which means that it’s usually limited to only certain cancers.

Its success hinges on something that’s actually quite unreliable: a tumor staying exactly where it is during the radiation process. If a tumor moves, then healthy areas could be exposed to the radiation. But with a small marker like ReMix’s, doctors could better determine the location of a tumor in real-time and either pause the treatment or steer the beam into the right position. (To be clear, ReMix is not yet accurate enough to be used in clinical settings. Katabi says a margin of error closer to a couple of millimeters would be necessary for actual implementation.)

“The ability to continuously sense inside the human body has largely been a distant dream,” says Romit Roy Choudhury, a professor of electrical engineering and computer science at the University of Illinois, who was not involved in the research. “One of the roadblocks has been wireless communication to a device and its continuous localization. ReMix makes a leap in this direction by showing that the wireless component of implantable devices may no longer be the bottleneck.”

Looking ahead

There are still many ongoing challenges for improving ReMix. The team next hopes to combine the wireless data with medical data, such as that from magnetic resonance imaging (MRI) scans, to further improve the system’s accuracy. In addition, the team will continue to reassess the algorithm and the various tradeoffs needed to account for the complexity of different bodies.

“We want a model that’s technically feasible, while still complex enough to accurately represent the human body,” says MIT PhD student Deepak Vasisht, lead author on the new paper. “If we want to use this technology on actual cancer patients one day, it will have to come from better modeling a person’s physical structure.”

The researchers say that such systems could help enable more widespread adoption of proton therapy centers. Today, there are only about 100 centers globally.

“One reason that [proton therapy] is so expensive is because of the cost of installing the hardware,” Vasisht says. “If these systems can encourage more applications of the technology, there will be more demand, which will mean more therapy centers, and lower prices for patients.”

Katabi and Vasisht co-wrote the paper with MIT PhD student Guo Zhang, University of Waterloo professor Omid Abari, MGH physicist Hsiao-Ming Lu, and MGH technical director Jacob Flanz.

Personalized “deep learning” equips robots for autism therapy

An example of a therapy session augmented with humanoid robot NAO [SoftBank Robotics], which was used in the EngageMe study. Tracking of limbs/faces was performed using the CMU Perceptual Lab’s OpenPose utility.
Image: MIT Media Lab

By Becky Ham

Children with autism spectrum conditions often have trouble recognizing the emotional states of people around them — distinguishing a happy face from a fearful face, for instance. To remedy this, some therapists use a kid-friendly robot to demonstrate those emotions and to engage the children in imitating the emotions and responding to them in appropriate ways.

This type of therapy works best, however, if the robot can smoothly interpret the child’s own behavior — whether he or she is interested and excited or paying attention — during the therapy. Researchers at the MIT Media Lab have now developed a type of personalized machine learning that helps robots estimate the engagement and interest of each child during these interactions, using data that are unique to that child.

Armed with this personalized “deep learning” network, the robots’ perception of the children’s responses agreed with assessments by human experts, with a correlation score of 60 percent, the scientists report June 27 in Science Robotics.

It can be challenging for human observers to reach high levels of agreement about a child’s engagement and behavior. Their correlation scores are usually between 50 and 55 percent. The researchers suggest that robots that are trained on human observations, as in this study, could someday provide more consistent estimates of these behaviors.

“The long-term goal is not to create robots that will replace human therapists, but to augment them with key information that the therapists can use to personalize the therapy content and also make more engaging and naturalistic interactions between the robots and children with autism,” explains Oggi Rudovic, a postdoc at the Media Lab and first author of the study.

Rosalind Picard, a co-author on the paper and professor at MIT who leads research in affective computing, says that personalization is especially important in autism therapy: A famous adage is, “If you have met one person with autism, you have met one person with autism.”

“The challenge of creating machine learning and AI [artificial intelligence] that works in autism is particularly vexing, because the usual AI methods require a lot of data that are similar for each category that is learned. In autism where heterogeneity reigns, the normal AI approaches fail,” says Picard. Rudovic, Picard, and their teammates have also been using personalized deep learning in other areas, finding that it improves results for pain monitoring and for forecasting Alzheimer’s disease progression.  

Meeting NAO

Robot-assisted therapy for autism often works something like this: A human therapist shows a child photos or flash cards of different faces meant to represent different emotions, to teach them how to recognize expressions of fear, sadness, or joy. The therapist then programs the robot to show these same emotions to the child, and observes the child as she or he engages with the robot. The child’s behavior provides valuable feedback that the robot and therapist need to go forward with the lesson.

The researchers used SoftBank Robotics NAO humanoid robots in this study. Almost 2 feet tall and resembling an armored superhero or a droid, NAO conveys different emotions by changing the color of its eyes, the motion of its limbs, and the tone of its voice.

The 35 children with autism who participated in this study, 17 from Japan and 18 from Serbia, ranged in age from 3 to 13. They reacted in various ways to the robots during their 35-minute sessions, from looking bored and sleepy in some cases to jumping around the room with excitement, clapping their hands, and laughing or touching the robot.

Most of the children in the study reacted to the robot “not just as a toy but related to NAO respectfully as if it was a real person,” especially during storytelling, where the therapists asked how NAO would feel if the children took the robot for an ice cream treat, according to Rudovic.

One 4-year-old girl hid behind her mother while participating in the session but became much more open to the robot and ended up laughing by the end of the therapy. The sister of one of the Serbian children gave NAO a hug and said “Robot, I love you!” at the end of a session, saying she was happy to see how much her brother liked playing with the robot.

“Therapists say that engaging the child for even a few seconds can be a big challenge for them, and robots attract the attention of the child,” says Rudovic, explaining why robots have been useful in this type of therapy. “Also, humans change their expressions in many different ways, but the robots always do it in the same way, and this is less frustrating for the child because the child learns in a very structured way how the expressions will be shown.”

Personalized machine learning

The MIT research team realized that a kind of machine learning called deep learning would be useful for the therapy robots to have, to perceive the children’s behavior more naturally. A deep-learning system uses hierarchical, multiple layers of data processing to improve its tasks, with each successive layer amounting to a slightly more abstract representation of the original raw data.

Although the concept of deep learning has been around since the 1980s, says Rudovic, it’s only recently that there has been enough computing power to implement this kind of artificial intelligence. Deep learning has been used in automatic speech and object-recognition programs, making it well-suited for a problem such as making sense of the multiple features of the face, body, and voice that go into understanding a more abstract concept such as a child’s engagement.

“In the case of facial expressions, for instance, what parts of the face are the most important for estimation of engagement?” Rudovic says. “Deep learning allows the robot to directly extract the most important information from that data without the need for humans to manually craft those features.”

For the therapy robots, Rudovic and his colleagues took the idea of deep learning one step further and built a personalized framework that could learn from data collected on each individual child. The researchers captured video of each child’s facial expressions, head and body movements, poses and gestures, audio recordings and data on heart rate, body temperature, and skin sweat response from a monitor on the child’s wrist.

The robots’ personalized deep learning networks were built from layers of these video, audio, and physiological data, information about the child’s autism diagnosis and abilities, their culture and their gender. The researchers then compared their estimates of the children’s behavior with estimates from five human experts, who coded the children’s video and audio recordings on a continuous scale to determine how pleased or upset, how interested, and how engaged the child seemed during the session.

Trained on these personalized data coded by the humans, and tested on data not used in training or tuning the models, the networks significantly improved the robot’s automatic estimation of the child’s behavior for most of the children in the study, beyond what would be estimated if the network combined all the children’s data in a “one-size-fits-all” approach, the researchers found.
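
One way to picture the personalization is as a multimodal network whose inputs concatenate the visual, audio, physiological, and demographic features, with a final layer adapted per child. The layer sizes, fusion strategy, and training details below are assumptions for illustration only; they are not the study’s architecture.

```python
# Hypothetical sketch of a personalized multimodal engagement estimator.
import torch
import torch.nn as nn

class PersonalizedEngagementModel(nn.Module):
    def __init__(self, video_dim=256, audio_dim=128, physio_dim=16, demo_dim=4):
        super().__init__()
        # Shared layers can be trained on all children; the head is adapted per child.
        fused_dim = video_dim + audio_dim + physio_dim + demo_dim
        self.shared = nn.Sequential(nn.Linear(fused_dim, 128), nn.ReLU())
        self.per_child_head = nn.Linear(128, 1)  # continuous engagement estimate

    def forward(self, video, audio, physio, demographics):
        features = torch.cat([video, audio, physio, demographics], dim=-1)
        return self.per_child_head(self.shared(features))
```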

Rudovic and colleagues were also able to probe how the deep learning network made its estimations, which uncovered some interesting cultural differences between the children. “For instance, children from Japan showed more body movements during episodes of high engagement, while in Serbs large body movements were associated with disengagement episodes,” Rudovic says.

The study was funded by grants from the Japanese Ministry of Education, Culture, Sports, Science and Technology; Chubu University; and the European Union’s HORIZON 2020 grant (EngageME).

How to control robots with brainwaves and hand gestures

A system developed at MIT allows a human supervisor to correct a robot’s mistakes using gestures and brainwaves.
Photo: Joseph DelPreto/MIT CSAIL
By Adam Conner-Simons

Getting robots to do things isn’t easy: Usually, scientists have to either explicitly program them or get them to understand how humans communicate via language.

But what if we could control robots more intuitively, using just hand gestures and brainwaves?

A new system spearheaded by researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) aims to do exactly that, allowing users to instantly correct robot mistakes with nothing more than brain signals and the flick of a finger.

Building off the team’s past work focused on simple binary-choice activities, the new work expands the scope to multiple-choice tasks, opening up new possibilities for how human workers could manage teams of robots.

By monitoring brain activity, the system can detect in real-time if a person notices an error as a robot does a task. Using an interface that measures muscle activity, the person can then make hand gestures to scroll through and select the correct option for the robot to execute.

The team demonstrated the system on a task in which a robot moves a power drill to one of three possible targets on the body of a mock plane. Importantly, they showed that the system works on people it’s never seen before, meaning that organizations could deploy it in real-world settings without needing to train it on users.

“This work combining EEG and EMG feedback enables natural human-robot interactions for a broader set of applications than we’ve been able to do before using only EEG feedback,” says CSAIL Director Daniela Rus, who supervised the work. “By including muscle feedback, we can use gestures to command the robot spatially, with much more nuance and specificity.”

PhD candidate Joseph DelPreto was lead author on a paper about the project alongside Rus, former CSAIL postdoc Andres F. Salazar-Gomez, former CSAIL research scientist Stephanie Gil, research scholar Ramin M. Hasani, and Boston University Professor Frank H. Guenther. The paper will be presented at the Robotics: Science and Systems (RSS) conference taking place in Pittsburgh next week.

In most previous work, systems could generally only recognize brain signals when people trained themselves to “think” in very specific but arbitrary ways and when the system was trained on such signals. For instance, a human operator might have to look at different light displays that correspond to different robot tasks during a training session.

Not surprisingly, such approaches are difficult for people to handle reliably, especially if they work in fields like construction or navigation that already require intense concentration.

Meanwhile, Rus’ team harnessed the power of brain signals called “error-related potentials” (ErrPs), which researchers have found to naturally occur when people notice mistakes. If there’s an ErrP, the system stops so the user can correct it; if not, it carries on.

“What’s great about this approach is that there’s no need to train users to think in a prescribed way,” says DelPreto. “The machine adapts to you, and not the other way around.”

For the project the team used “Baxter,” a humanoid robot from Rethink Robotics. With human supervision, the robot went from choosing the correct target 70 percent of the time to more than 97 percent of the time.

To create the system the team harnessed the power of electroencephalography (EEG) for brain activity and electromyography (EMG) for muscle activity, putting a series of electrodes on the users’ scalp and forearm.

Both metrics have some individual shortcomings: EEG signals are not always reliably detectable, while EMG signals can sometimes be difficult to map to motions that are any more specific than “move left or right.” Merging the two, however, allows for more robust bio-sensing and makes it possible for the system to work on new users without training.
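
Conceptually, the supervision loop pairs an EEG-based error detector with an EMG-based gesture classifier. The sketch below shows that control flow with placeholder detector, classifier, and robot objects; none of these names come from the paper.

```python
# Conceptual supervision loop combining EEG error detection with EMG gesture selection.
def supervise_step(eeg_window, emg_window, robot, errp_detector, gesture_classifier):
    """Run one cycle: halt the robot when an error-related potential (ErrP) is detected,
    then apply the target chosen by the human's hand gesture."""
    if errp_detector.detect(eeg_window):                   # placeholder ErrP classifier
        robot.pause()                                      # stop so the person can intervene
        choice = gesture_classifier.classify(emg_window)   # e.g., scroll left/right, confirm
        robot.select_target(choice)
        robot.resume()
    # With no detected error, the robot simply continues its current task.
```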

“By looking at both muscle and brain signals, we can start to pick up on a person’s natural gestures along with their snap decisions about whether something is going wrong,” says DelPreto. “This helps make communicating with a robot more like communicating with another person.”

The team says that they could imagine the system one day being useful for the elderly, or workers with language disorders or limited mobility.

“We’d like to move away from a world where people have to adapt to the constraints of machines,” says Rus. “Approaches like this show that it’s very much possible to develop robotic systems that are a more natural and intuitive extension of us.”

Teaching robots how to move objects

Doctoral student Maria Bauza has been exploring the notion of uncertainty when robots pick up, grasp, or push an object. “If the robot could touch the object, have a notion of tactile information, and be able to react to that information, it will have much more success,” she says.
Photo: Tony Pulsone

By Mary Beth O’Leary
With the push of a button, months of hard work were about to be put to the test. Sixteen teams of engineers convened in a cavernous exhibit hall in Nagoya, Japan, for the 2017 Amazon Robotics Challenge. The robotic systems they built were tasked with removing items from bins and placing them into boxes. For graduate student Maria Bauza, who served as task-planning lead for the MIT-Princeton Team, the moment was particularly nerve-wracking.

“It was super stressful when the competition started,” recalls Bauza. “You just press play and the robot is autonomous. It’s going to do whatever you code it for, but you have no control. If something is broken, then that’s it.”

Robotics has been a major focus for Bauza since her undergraduate career. She studied mathematics and engineering physics at the Polytechnic University of Catalonia in Barcelona. During a year as a visiting student at MIT, Bauza was able to put her interest in computer science and artificial intelligence into practice. “When I came to MIT for that year, I started applying the tools I had learned in machine learning to real problems in robotics,” she adds.

Two creative undergraduate projects gave her even more practice in this area. In one project, she hacked the controller of a toy remote control car to make it drive in a straight line. In another, she developed a portable robot that could draw on the blackboard for teachers. The robot was given an image of the Mona Lisa and, after running it through an algorithm, it drew that image on the blackboard. “That was the first small success in my robotics career,” says Bauza.

After graduating with her bachelor’s degree in 2016, she joined the Manipulation and Mechanisms Laboratory at MIT (known as MCube Lab) under Assistant Professor Alberto Rodriguez’s guidance. “Maria brings together experience in machine learning and a strong background in mathematics, computer science, and mechanics, which makes her a great candidate to grow into a leader in the fields of machine learning and robotics,” says Rodriguez.

For her PhD thesis, Bauza is developing machine-learning algorithms and software to improve how robots interact with the world. MCube’s multidisciplinary team provides the support needed to pursue this goal.

“In the end, machine learning can’t work if you don’t have good data,” Bauza explains. “Good data comes from good hardware, good sensors, good cameras — so in MCube we all collaborate to make sure the systems we build are powerful enough to be autonomous.”

To create these robust autonomous systems, Bauza has been exploring the notion of uncertainty when robots pick up, grasp, or push an object. “If the robot could touch the object, have a notion of tactile information, and be able to react to that information, it will have much more success,” explains Bauza.

Improving how robots interact with the world and reason to find the best possible outcome was crucial to the Amazon Robotics Challenge. Bauza built the code that helped the MIT-Princeton Team robot understand what object it was interacting with, and where to place that object. “Maria was in charge of developing the software for high-level decision making,” explains Rodriguez. “She did it without having prior experience in big robotic systems and it worked out fantastic.”

Bauza’s mind was at ease within a few minutes of the start of the 2017 Amazon Robotics Challenge. “After a few objects that you do well, you start to relax,” she remembers. “You realize the system is working. By the end it was such a good feeling!”

Bauza and the rest of the MCube team walked away with first place in the “stow task” portion of the challenge. They will continue to work with Amazon on perfecting the technology they developed.

While Bauza tackles the challenge of developing software to help robots interact with their environments, she has her own personal challenge to tackle: surviving winter in Boston. “I’m from the island of Menorca off the coast of Spain, so Boston winters have definitely been an adjustment,” she adds. “Every year I buy warmer clothes. But I’m really lucky to be here and be able to collaborate with Professor Rodriguez and the MCube team on developing smart robots that interact with their environment.”

Teaching robots how to move objects

Doctoral student Maria Bauza has been exploring the notion of uncertainty when robots pick up, grasp, or push an object. “If the robot could touch the object, have a notion of tactile information, and be able to react to that information, it will have much more success,” she says.
Photo: Tony Pulsone

By Mary Beth O’Leary
With the push of a button, months of hard work were about to be put to the test. Sixteen teams of engineers convened in a cavernous exhibit hall in Nagoya, Japan, for the 2017 Amazon Robotics Challenge. The robotic systems they built were tasked with removing items from bins and placing them into boxes. For graduate student Maria Bauza, who served as task-planning lead for the MIT-Princeton Team, the moment was particularly nerve-wracking.

“It was super stressful when the competition started,” recalls Bauza. “You just press play and the robot is autonomous. It’s going to do whatever you code it for, but you have no control. If something is broken, then that’s it.”

Robotics has been a major focus for Bauza since her undergraduate career. She studied mathematics and engineering physics at the Polytechnic University of Catalonia in Barcelona. During a year as a visiting student at MIT, Bauza was able to put her interest in computer science and artificial intelligence into practice. “When I came to MIT for that year, I started applying the tools I had learned in machine learning to real problems in robotics,” she adds.

Two creative undergraduate projects gave her even more practice in this area. In one project, she hacked the controller of a toy remote-control car to make it drive in a straight line. In another, she developed a portable robot that could draw on the blackboard for teachers. The robot was given an image of the Mona Lisa and, after going through an algorithm, it drew that image on the blackboard. “That was the first small success in my robotics career,” says Bauza.

After graduating with her bachelor’s degree in 2016, she joined the Manipulation and Mechanisms Laboratory at MIT (known as MCube Lab) under Assistant Professor Alberto Rodriguez’s guidance. “Maria brings together experience in machine learning and a strong background in mathematics, computer science, and mechanics, which makes her a great candidate to grow into a leader in the fields of machine learning and robotics,” says Rodriguez.

For her PhD thesis, Bauza is developing machine-learning algorithms and software to improve how robots interact with the world. MCube’s multidisciplinary team provides the support needed to pursue this goal.

“In the end, machine learning can’t work if you don’t have good data,” Bauza explains. “Good data comes from good hardware, good sensors, good cameras — so in MCube we all collaborate to make sure the systems we build are powerful enough to be autonomous.”

To create these robust autonomous systems, Bauza has been exploring the notion of uncertainty when robots pick up, grasp, or push an object. “If the robot could touch the object, have a notion of tactile information, and be able to react to that information, it will have much more success,” explains Bauza.

Improving how robots interact with the world and reason to find the best possible outcome was crucial to the Amazon Robotics Challenge. Bauza built the code that helped the MIT-Princeton Team robot understand what object it was interacting with, and where to place that object. “Maria was in charge of developing the software for high-level decision making,” explains Rodriguez. “She did it without having prior experience in big robotic systems and it worked out fantastic.”

Bauza’s mind was at ease within a few minutes of the 2017 Amazon Robotics Challenge. “After a few objects that you do well, you start to relax,” she remembers. “You realize the system is working. By the end it was such a good feeling!”

Bauza and the rest of the MCube team walked away with first place in the “stow task” portion of the challenge. They will continue to work with Amazon on perfecting the technology they developed.

While Bauza tackles the challenge of developing software to help robots interact with their environments, she has her own personal challenge to tackle: surviving winter in Boston. “I’m from the island of Menorca off the coast of Spain, so Boston winters have definitely been an adjustment,” she adds. “Every year I buy warmer clothes. But I’m really lucky to be here and be able to collaborate with Professor Rodriguez and the MCube team on developing smart robots that interact with their environment.”

Magnetic 3-D-printed structures crawl, roll, jump, and play catch

MIT engineers have created soft, 3-D-printed structures whose movements can be controlled with a wave of a magnet, much like marionettes without the strings.
Photo: Felice Frankel

By Jennifer Chu
MIT engineers have created soft, 3-D-printed structures whose movements can be controlled with a wave of a magnet, much like marionettes without the strings.

The menagerie of structures that can be magnetically manipulated includes a smooth ring that wrinkles up, a long tube that squeezes shut, a sheet that folds itself, and a spider-like “grabber” that can crawl, roll, jump, and snap together fast enough to catch a passing ball. It can even be directed to wrap itself around a small pill and carry it across a table.

The researchers fabricated each structure from a new type of 3-D-printable ink that they infused with tiny magnetic particles. They fitted an electromagnet around the nozzle of a 3-D printer, which caused the magnetic particles to swing into a single orientation as the ink was fed through the nozzle. By controlling the magnetic orientation of individual sections in the structure, the researchers can produce structures and devices that can almost instantaneously shift into intricate formations, and even move about, as the various sections respond to an external magnetic field.
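
As a very rough illustration of that workflow (not the team’s actual software), a print plan can be thought of as a list of path segments, each tagged with the particle orientation the nozzle electromagnet should impose while that segment is deposited. The Python sketch below is purely hypothetical; the data layout, field names, and coordinates are assumptions.

    import numpy as np
    # Hypothetical print plan: each deposited segment carries the direction along
    # which the nozzle electromagnet should align the magnetic particles while
    # that segment is extruded. Field names and coordinates are illustrative only.
    segments = [
        {"start": (0.0, 0.0), "end": (5.0, 0.0), "particle_dir": (0.0, 0.0, 1.0)},
        {"start": (5.0, 0.0), "end": (10.0, 0.0), "particle_dir": (0.0, 0.0, -1.0)},
    ]
    def unit(v):
        # Normalize a direction vector to unit length.
        v = np.asarray(v, dtype=float)
        return v / np.linalg.norm(v)
    for seg in segments:
        # Command the nozzle electromagnet before extruding each segment.
        direction = unit(seg["particle_dir"])
        print(f"segment {seg['start']} -> {seg['end']}: align particles along {direction}")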

Xuanhe Zhao, the Noyce Career Development Professor in MIT’s Department of Mechanical Engineering and Department of Civil and Environmental Engineering, says the group’s technique may be used to fabricate magnetically controlled biomedical devices.

“We think in biomedicine this technique will find promising applications,” Zhao says. “For example, we could put a structure around a blood vessel to control the pumping of blood, or use a magnet to guide a device through the GI tract to take images, extract tissue samples, clear a blockage, or deliver certain drugs to a specific location. You can design, simulate, and then just print to achieve various functions.”

Zhao and his colleagues have published their results today in the journal Nature. His co-authors include Yoonho Kim, Hyunwoo Yuk, and Ruike Zhao of MIT, and Shawn Chester of the New Jersey Institute of Technology.

A shifting field

The team’s magnetically activated structures fall under the general category of soft actuated devices — squishy, moldable materials that are designed to shape-shift or move about through a variety of mechanical means. For instance, hydrogel devices swell when temperature or pH changes; shape-memory polymers and liquid crystal elastomers deform with sufficient stimuli such as heat or light; pneumatic and hydraulic devices can be actuated by air or water pumped into them; and dielectric elastomers stretch under electric voltages.

But hydrogels, shape-memory polymers, and liquid crystal elastomers are slow to respond, and change shape over the course of minutes to hours. Air- and water-driven devices require tubes that connect them to pumps, making them inefficient for remotely controlled applications. Dielectric elastomers require high voltages, usually above a thousand volts.

“There is no ideal candidate for a soft robot that can perform in an enclosed space like a human body, where you’d want to carry out certain tasks untethered,” Kim says. “That’s why we think there’s great promise in this idea of magnetic actuation, because it is fast, forceful, body-benign, and can be remotely controlled.”

Other groups have fabricated magnetically activated materials, though the movements they have achieved have been relatively simple. For the most part, researchers mix a polymer solution with magnetic beads, and pour the mixture into a mold. Once the material cures, they apply a magnetic field to uniformly magnetize the beads, before removing the structure from the mold.

“People have only made structures that elongate, shrink, or bend,” Yuk says. “The challenge is, how do you design a structure or robot that can perform much more complicated tasks?”

Domain game

Instead of making structures with magnetic particles of the same, uniform orientation, the team looked for ways to create magnetic “domains” — individual sections of a structure, each with a distinct orientation of magnetic particles. When exposed to an external magnetic field, each section should move in a distinct way, depending on the direction its particles move in response to the field. In this way, the group reasoned, structures should be able to carry out more complex articulations and movements.

With their new 3-D-printing platform, the researchers can print sections, or domains, of a structure, and tune the orientation of magnetic particles in a particular domain by changing the direction of the electromagnet encircling the printer’s nozzle, as the domain is printed.  

The team also developed a physical model that predicts how a printed structure will deform under a magnetic field. Given the elasticity of the printed material, the pattern of domains in a structure, and the way in which an external magnetic field is applied, the model can predict the way an overall structure will deform or move. Ruike found that the model’s predictions closely matched with experiments the team carried out with a number of different printed structures.
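
The paper’s model couples magnetics with elasticity, but the driving quantity is simple to state: a domain with remanent magnetization M placed in an applied field B experiences a body torque per unit volume of M × B, so oppositely magnetized domains are twisted in opposite directions and the patterned structure folds rather than rotating as a whole. The numbers in the sketch below are assumed for illustration, not taken from the paper.

    import numpy as np
    # Torque density tau = M x B for two oppositely magnetized domains.
    # Magnetization and field values are assumptions chosen for illustration.
    M_mag = 1.0e4  # A/m, assumed remanent magnetization of the printed ink
    domains = {
        "left half":  M_mag * np.array([0.0, 0.0, 1.0]),
        "right half": M_mag * np.array([0.0, 0.0, -1.0]),
    }
    B_applied = np.array([0.05, 0.0, 0.0])  # tesla, assumed external field
    for name, M in domains.items():
        tau = np.cross(M, B_applied)  # newton-meters per cubic meter
        print(f"{name}: torque density {tau}")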

In addition to a rippling ring, a self-squeezing tube, and a spider-like grabber, the team printed other complex structures, such as a set of “auxetic” structures that rapidly shrink or expand along two directions. Zhao and his colleagues also printed a ring embedded with electrical circuits and red and green LED lights. Depending on the orientation of an external magnetic field, the ring deforms to light up either red or green, in a programmed manner.

“We have developed a printing platform and a predictive model for others to use. People can design their own structures and domain patterns, validate them with the model, and print them to actuate various functions,” Zhao says. “By programming complex information of structure, domain, and magnetic field, one can even print intelligent machines such as robots.”

Jerry Qi, professor of mechanical engineering at Georgia Tech, says the group’s design can enable a range of fast, remotely controlled soft robotics, particularly in the biomedical field.

“This work is very novel,” says Qi, who was not involved in the research. “One could use a soft robot inside a human body or somewhere that is not easily accessible. With this technology reported in this paper, one can apply a magnetic field outside the human body, without using any wiring. Because of its fast responsive speed, the soft robot can fulfill many actions in a short time. These are important for practical applications.”

This research was supported, in part, by the National Science Foundation, the Office of Naval Research, and the MIT Institute for Soldier Nanotechnologies.

Teaching chores to an artificial agent

A system developed at MIT aims to teach artificial agents a range of chores, including setting the table and making coffee.
Image: MIT CSAIL

By Adam Conner-Simons | Rachel Gordon

For many people, household chores are a dreaded, inescapable part of life that we often put off or do with little care. But what if a robot assistant could help lighten the load?

Recently, computer scientists have been working on teaching machines to do a wider range of tasks around the house. In a new paper spearheaded by MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the University of Toronto, researchers demonstrate “VirtualHome,” a system that can simulate detailed household tasks and then have artificial “agents” execute them, opening up the possibility of one day teaching robots to do such tasks.

The team trained the system using nearly 3,000 programs of various activities, which are further broken down into subtasks for the computer to understand. A simple task like “making coffee,” for example, would also include the step “grabbing a cup.” The researchers demonstrated VirtualHome in a 3-D world inspired by the Sims video game.

The team’s artificial agent can execute 1,000 of these interactions in the Sims-style world, with eight different scenes including a living room, kitchen, dining room, bedroom, and home office.

“Describing actions as computer programs has the advantage of providing clear and unambiguous descriptions of all the steps needed to complete a task,” says MIT PhD student Xavier Puig, who was lead author on the paper. “These programs can instruct a robot or a virtual character, and can also be used as a representation for complex tasks with simpler actions.”

The project was co-developed by CSAIL and the University of Toronto alongside researchers from McGill University and the University of Ljubljana. It will be presented at the Computer Vision and Pattern Recognition (CVPR) conference, which takes place this month in Salt Lake City.

Unlike humans, robots need more explicit instructions to complete easy tasks; they can’t just infer and reason with ease.

For example, one might tell a human to “switch on the TV and watch it from the sofa.” Here, actions like “grab the remote control” and “sit/lie on sofa” have been omitted, since they’re part of the commonsense knowledge that humans have.

To demonstrate these kinds of tasks to robots, the researchers needed much more detailed descriptions of each action. To create them, the team first collected verbal descriptions of household activities and then translated them into simple code. Such a program might include steps like: walk to the television, switch on the television, walk to the sofa, sit on the sofa, and watch television.
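
As a concrete (and hypothetical) illustration, a program like that can be written as an ordered list of action-object steps; the snippet below is a sketch of the idea, not VirtualHome’s actual script format.

    # Hypothetical encoding of the "watch TV" activity as an ordered program of
    # (action, object) steps; VirtualHome's real script format may differ.
    watch_tv = [
        ("walk", "television"),
        ("switch_on", "television"),
        ("walk", "sofa"),
        ("sit", "sofa"),
        ("watch", "television"),
    ]
    def describe(program):
        # Render the program as the step-by-step instructions an agent would follow.
        return [f"{action.replace('_', ' ')} {target}" for action, target in program]
    for step in describe(watch_tv):
        print(step)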

Once the programs were created, the team fed them to the VirtualHome 3-D simulator to be turned into videos. Then, a virtual agent would execute the tasks defined by the programs, whether it was watching television, placing a pot on the stove, or turning a toaster on and off.

The end result is not just a system for training robots to do chores, but also a large database of household tasks described using natural language. Companies like Amazon that are working to develop Alexa-like robotic systems at home could eventually use data like these to train their models to do more complex tasks.

The team’s model successfully demonstrated that their agents could learn to reconstruct a program, and therefore perform a task, given either a description (“pour milk into glass”) or a video demonstration of the activity.

“This line of work could facilitate true robotic personal assistants in the future,” says Qiao Wang, a research assistant in arts, media, and engineering at Arizona State University. “Instead of each task programmed by the manufacturer, the robot can learn tasks just by listening to or watching the specific person it accompanies. This allows the robot to do tasks in a personalized way, or even some day invoke an emotional connection as a result of this personalized learning process.”

In the future, the team hopes to train the robots using actual videos instead of Sims-style simulation videos, which would enable a robot to learn simply by watching a YouTube video. The team is also working on implementing a reward-learning system in which the agent gets positive feedback when it does tasks correctly.

“You can imagine a setting where robots are assisting with chores at home and can eventually anticipate personalized wants and needs, or impending action,” says Puig. “This could be especially helpful as an assistive technology for the elderly, or those who may have limited mobility.”

Surgical technique improves sensation, control of prosthetic limb

Two agonist-antagonist myoneural interface devices (AMIs) were surgically created in the patient’s residual limb: One was electrically linked to the robotic ankle joint, and the other to the robotic subtalar joint.
Image: MIT Media Lab/Biomechatronics group. Original artwork by Stephanie Ku.

By Helen Knight

Humans can accurately sense the position, speed, and torque of their limbs, even with their eyes shut. This sense, known as proprioception, allows humans to precisely control their body movements.

Despite significant improvements to prosthetic devices in recent years, researchers have been unable to provide this essential sensation to people with artificial limbs, limiting their ability to accurately control their movements.

Researchers at the Center for Extreme Bionics at the MIT Media Lab have invented a new neural interface and communication paradigm that is able to send movement commands from the central nervous system to a robotic prosthesis, and relay proprioceptive feedback describing movement of the joint back to the central nervous system in return.

This new paradigm, known as the agonist-antagonist myoneural interface (AMI), involves a novel surgical approach to limb amputation in which dynamic muscle relationships are preserved within the amputated limb. The AMI was validated in extensive preclinical experimentation at MIT prior to its first surgical implementation in a human patient at Brigham and Women’s Faulkner Hospital.

In a paper published today in Science Translational Medicine, the researchers describe the first human implementation of the AMI in a person with below-knee amputation.

The paper represents the first time information on joint position, speed, and torque has been fed from a prosthetic limb into the nervous system, according to senior author and project director Hugh Herr, a professor of media arts and sciences at the MIT Media Lab.

“Our goal is to close the loop between the peripheral nervous system’s muscles and nerves, and the bionic appendage,” says Herr.

To do this, the researchers used the same biological sensors that create the body’s natural proprioceptive sensations.

The AMI consists of two opposing muscle-tendons, known as an agonist and an antagonist, which are surgically connected in series so that when one muscle contracts and shortens — upon either volitional or electrical activation — the other stretches, and vice versa.

This coupled movement enables natural biological sensors within the muscle-tendon to transmit electrical signals to the central nervous system, communicating muscle length, speed, and force information, which is interpreted by the brain as natural joint proprioception. 

This is how muscle-tendon proprioception works naturally in human joints, Herr says.

“Because the muscles have a natural nerve supply, when this agonist-antagonist muscle movement occurs information is sent through the nerve to the brain, enabling the person to feel those muscles moving, both their position, speed, and load,” he says.

By connecting the AMI with electrodes, the researchers can detect electrical pulses from the muscle, or apply electricity to the muscle to cause it to contract.
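
Conceptually, that creates a closed loop: the agonist’s electrical activity is decoded into a command for the prosthetic joint, and stimulation applied to the antagonist, scaled by the joint’s measured state, stretches the coupled muscle so the nervous system feels the movement. The sketch below is only a schematic of that idea; the signal names, gains, and linear mappings are assumptions, not the researchers’ controller.

    # Schematic of one control tick in the loop described above; all values,
    # gains, and mappings are illustrative assumptions.
    def decode_intent(agonist_emg, gain=30.0):
        # Map rectified agonist muscle activity (arbitrary units) to a desired
        # prosthetic joint angle in degrees.
        return gain * agonist_emg
    def feedback_stimulation(joint_angle_deg, gain=0.02):
        # Map the measured joint angle to a stimulation level for the antagonist,
        # which stretches the agonist and evokes a sensation of movement.
        return gain * joint_angle_deg
    emg_sample = 0.8                      # hypothetical rectified EMG reading
    commanded = decode_intent(emg_sample)
    measured = commanded                  # assume the ankle tracked the command
    stim = feedback_stimulation(measured)
    print(f"commanded angle: {commanded:.1f} deg, antagonist stimulation: {stim:.2f}")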

“When a person is thinking about moving their phantom ankle, the AMI that maps to that bionic ankle is moving back and forth, sending signals through the nerves to the brain, enabling the person with an amputation to actually feel their bionic ankle moving throughout the whole angular range,” Herr says.

Decoding the electrical language of proprioception within nerves is extremely difficult, according to Tyler Clites, first author of the paper and graduate student lead on the project.

“Using this approach, rather than needing to speak that electrical language ourselves, we use these biological sensors to speak the language for us,” Clites says. “These sensors translate mechanical stretch into electrical signals that can be interpreted by the brain as sensations of position, speed, and force.”

The AMI was first implemented surgically in a human patient at Brigham and Women’s Faulkner Hospital, Boston, by Matthew Carty, one of the paper’s authors, a surgeon in the Division of Plastic and Reconstructive Surgery, and an MIT research scientist.

In this operation, two AMIs were constructed in the residual limb at the time of primary below-knee amputation, with one AMI to control the prosthetic ankle joint, and the other to control the prosthetic subtalar joint.

“We knew that in order for us to validate the success of this new approach to amputation, we would need to couple the procedure with a novel prosthesis that could take advantage of the additional capabilities of this new type of residual limb,” Carty says. “Collaboration was critical, as the design of the procedure informed the design of the robotic limb, and vice versa.”

Toward this end, an advanced prosthetic limb was built at MIT and electrically linked to the patient’s peripheral nervous system using electrodes placed over each AMI muscle following the amputation surgery.

The researchers then compared the movement of the AMI patient with that of four people who had undergone a traditional below-knee amputation procedure, using the same advanced prosthetic limb.

They found that the AMI patient had more stable control over movement of the prosthetic device and was able to move more efficiently than those with the conventional amputation. They also found that the AMI patient quickly displayed natural, reflexive behaviors such as extending the toes toward the next step when walking down a set of stairs.

These behaviors are essential to natural human movement and were absent in all of the people who had undergone a traditional amputation.

What’s more, while the patients with conventional amputation reported feeling disconnected from the prosthesis, the AMI patient quickly described feeling that the bionic ankle and foot had become a part of their own body.

“This is pretty significant evidence that the brain and the spinal cord in this patient adopted the prosthetic leg as if it were their biological limb, enabling those biological pathways to become active once again,” Clites says. “We believe proprioception is fundamental to that adoption.”

It is difficult for an individual with a lower limb amputation to gain a sense of embodiment with their artificial limb, according to Daniel Ferris, the Robert W. Adenbaum Professor of Engineering Innovation at the University of Florida, who was not involved in the research.

“This is groundbreaking. The increased sense of embodiment by the amputee subject is a powerful result of having better control of and feedback from the bionic limb,” Ferris says. “I expect that we will see individuals with traumatic amputations start to seek out this type of surgery and interface for their prostheses — it could provide a much greater quality of life for amputees.”

The researchers have since carried out the AMI procedure on nine other below-knee amputees and are planning to adapt the technique for those needing above-knee, below-elbow, and above-elbow amputations.

“Previously, humans have used technology in a tool-like fashion,” Herr says. “We are now starting to see a new era of human-device interaction, of full neurological embodiment, in which what we design becomes truly part of us, part of our identity.”
