
Nanoparticles take a fantastic, magnetic voyage

MIT engineers have designed a magnetic microrobot that can help push drug-delivery particles into tumor tissue (left). They also employed swarms of naturally magnetic bacteria to achieve the same effect (right).
Image courtesy of the researchers.

By Anne Trafton

MIT engineers have designed tiny robots that can help drug-delivery nanoparticles push their way out of the bloodstream and into a tumor or another disease site. Like crafts in “Fantastic Voyage” — a 1960s science fiction film in which a submarine crew shrinks in size and roams a body to repair damaged cells — the robots swim through the bloodstream, creating a current that drags nanoparticles along with them.

The magnetic microrobots, inspired by bacterial propulsion, could help to overcome one of the biggest obstacles to delivering drugs with nanoparticles: getting the particles to exit blood vessels and accumulate in the right place.

“When you put nanomaterials in the bloodstream and target them to diseased tissue, the biggest barrier to that kind of payload getting into the tissue is the lining of the blood vessel,” says Sangeeta Bhatia, the John and Dorothy Wilson Professor of Health Sciences and Technology and Electrical Engineering and Computer Science, a member of MIT’s Koch Institute for Integrative Cancer Research and its Institute for Medical Engineering and Science, and the senior author of the study.

“Our idea was to see if you can use magnetism to create fluid forces that push nanoparticles into the tissue,” adds Simone Schuerle, a former MIT postdoc and lead author of the paper, which appears in the April 26 issue of Science Advances.

In the same study, the researchers also showed that they could achieve a similar effect using swarms of living bacteria that are naturally magnetic. Each of these approaches could be suited for different types of drug delivery, the researchers say.

Tiny robots

Schuerle, who is now an assistant professor at the Swiss Federal Institute of Technology (ETH Zurich), first began working on tiny magnetic robots as a graduate student in Brad Nelson’s Multiscale Robotics Lab at ETH Zurich. When she came to Bhatia’s lab as a postdoc in 2014, she began investigating whether this kind of bot could help to make nanoparticle drug delivery more efficient.

In most cases, researchers target their nanoparticles to disease sites that are surrounded by “leaky” blood vessels, such as tumors. This makes it easier for the particles to get into the tissue, but the delivery process is still not as effective as it needs to be.

The MIT team decided to explore whether the forces generated by magnetic robots might offer a better way to push the particles out of the bloodstream and into the target site.

The robots that Schuerle used in this study are 35 hundredths of a millimeter long, similar in size to a single cell, and can be controlled by applying an external magnetic field. This bioinspired robot, which the researchers call an “artificial bacterial flagellum,” consists of a tiny helix that resembles the flagella that many bacteria use to propel themselves. These robots are 3-D-printed with a high-resolution 3-D printer and then coated with nickel, which makes them magnetic.

To test a single robot’s ability to control nearby nanoparticles, the researchers created a microfluidic system that mimics the blood vessels that surround tumors. The channel in their system, between 50 and 200 microns wide, is lined with a gel that has holes to simulate the broken blood vessels seen near tumors.

Using external magnets, the researchers applied magnetic fields to the robot, which makes the helix rotate and swim through the channel. Because fluid flows through the channel in the opposite direction, the robot remains stationary and creates a convection current, which pushes 200-nanometer polystyrene particles into the model tissue. These particles penetrated twice as far into the tissue as nanoparticles delivered without the aid of the magnetic robot.

This type of system could potentially be incorporated into stents, which are stationary and would be easy to target with an externally applied magnetic field. Such an approach could be useful for delivering drugs to help reduce inflammation at the site of the stent, Bhatia says.

Bacterial swarms

The researchers also developed a variant of this approach that relies on swarms of naturally magnetotactic bacteria instead of microrobots. Bhatia has previously developed bacteria that can be used to deliver cancer-fighting drugs and to diagnose cancer, exploiting bacteria’s natural tendency to accumulate at disease sites.

For this study, the researchers used a type of bacteria called Magnetospirillum magneticum, which naturally produces chains of iron oxide. These magnetic particles, known as magnetosomes, help bacteria orient themselves and find their preferred environments.

The researchers discovered that when they put these bacteria into the microfluidic system and applied rotating magnetic fields in certain orientations, the bacteria began to rotate in synchrony and move in the same direction, pulling along any nanoparticles that were nearby. In this case, the researchers found that nanoparticles were pushed into the model tissue three times faster than when the nanoparticles were delivered without any magnetic assistance.

This bacterial approach could be better suited for drug delivery in situations such as a tumor, where the swarm, controlled externally without the need for visual feedback, could generate fluidic forces in vessels throughout the tumor.  

The particles that the researchers used in this study are big enough to carry large payloads, including the components required for the CRISPR genome-editing system, Bhatia says. She now plans to collaborate with Schuerle to further develop both of these magnetic approaches for testing in animal models.

The research was funded by the Swiss National Science Foundation, the Branco Weiss Fellowship, the National Institutes of Health, the National Science Foundation, and the Howard Hughes Medical Institute.

Giving robots a better feel for object manipulation


A new “particle simulator” developed by MIT researchers improves robots’ abilities to mold materials into simulated target shapes and interact with solid objects and liquids. This could give robots a refined touch for industrial applications or for personal robotics — such as shaping clay or rolling sticky sushi rice.
Courtesy of the researchers

By Rob Matheson

A new learning system developed by MIT researchers improves robots’ abilities to mold materials into target shapes and make predictions about interacting with solid objects and liquids. The system, known as a learning-based particle simulator, could give industrial robots a more refined touch — and it may have fun applications in personal robotics, such as modelling clay shapes or rolling sticky rice for sushi.

In robotic planning, physical simulators are models that capture how different materials respond to force. Robots are “trained” using the models, to predict the outcomes of their interactions with objects, such as pushing a solid box or poking deformable clay. But traditional learning-based simulators mainly focus on rigid objects and are unable to handle fluids or softer objects. Some more accurate physics-based simulators can handle diverse materials, but rely heavily on approximation techniques that introduce errors when robots interact with objects in the real world.

In a paper being presented at the International Conference on Learning Representations in May, the researchers describe a new model that learns to capture how small portions of different materials — “particles” — interact when they’re poked and prodded. The model directly learns from data in cases where the underlying physics of the movements are uncertain or unknown. Robots can then use the model as a guide to predict how liquids, as well as rigid and deformable materials, will react to the force of its touch. As the robot handles the objects, the model also helps to further refine the robot’s control.

In experiments, a robotic hand with two fingers, called “RiceGrip,” accurately shaped a deformable foam to a desired configuration — such as a “T” shape — that serves as a proxy for sushi rice. In short, the researchers’ model serves as a type of “intuitive physics” brain that robots can leverage to reconstruct three-dimensional objects somewhat similarly to how humans do.

“Humans have an intuitive physics model in our heads, where we can imagine how an object will behave if we push or squeeze it. Based on this intuitive model, humans can accomplish amazing manipulation tasks that are far beyond the reach of current robots,” says first author Yunzhu Li, a graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL). “We want to build this type of intuitive model for robots to enable them to do what humans can do.”

“When children are 5 months old, they already have different expectations for solids and liquids,” adds co-author Jiajun Wu, a CSAIL graduate student. “That’s something we know at an early age, so maybe that’s something we should try to model for robots.”

Joining Li and Wu on the paper are: Russ Tedrake, a CSAIL researcher and a professor in the Department of Electrical Engineering and Computer Science (EECS); Joshua Tenenbaum, a professor in the Department of Brain and Cognitive Sciences; and Antonio Torralba, a professor in EECS and director of the MIT-IBM Watson AI Lab.

Dynamic graphs

A key innovation behind the model, called the “particle interaction network” (DPI-Nets), was creating dynamic interaction graphs, which consist of thousands of nodes and edges that can capture complex behaviors of so-called particles. In the graphs, each node represents a particle. Neighboring nodes are connected with each other using directed edges, which represent the interaction passing from one particle to the other. In the simulator, particles are hundreds of small spheres combined to make up some liquid or a deformable object.

The graphs are constructed as the basis for a machine-learning system called a graph neural network. In training, the model over time learns how particles in different materials react and reshape. It does so by implicitly calculating various properties for each particle — such as its mass and elasticity — to predict if and where the particle will move in the graph when perturbed.

The model then leverages a “propagation” technique, which instantaneously spreads a signal throughout the graph. The researchers customized the technique for each type of material — rigid, deformable, and liquid — to shoot a signal that predicts particle positions at certain incremental time steps. At each step, it moves and reconnects particles, if needed.

For example, if a solid box is pushed, perturbed particles will be moved forward. Because all particles inside the box are rigidly connected with each other, every other particle in the object moves with the same calculated translation and rotation. Particle connections remain intact and the box moves as a single unit. But if an area of deformable foam is indented, the effect will be different. Perturbed particles move forward a lot, surrounding particles move forward only slightly, and particles farther away won’t move at all. With liquids being sloshed around in a cup, particles may completely jump from one end of the graph to the other. The graph must learn to predict where and how much all affected particles move, which is computationally complex.
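To make the structure concrete, the sketch below builds a toy particle-interaction graph in Python and propagates a single push along its edges with a hand-written decay rule. It is only an illustration of the idea described above, not the DPI-Nets code; the radius, damping factor, and helper names are invented for the example, and in the real model a learned graph neural network replaces the hand-written rule.

```python
# Illustrative sketch of a dynamic particle-interaction graph: particles
# are nodes, nearby particles get directed edges, and a perturbation is
# propagated along those edges with a simple decaying rule.
import math
from dataclasses import dataclass


@dataclass
class Particle:
    position: tuple  # (x, y) coordinates


def build_graph(particles, radius):
    """Connect each particle to every neighbor within `radius` (dynamic edges)."""
    edges = []
    for i, p in enumerate(particles):
        for j, q in enumerate(particles):
            if i != j and math.dist(p.position, q.position) <= radius:
                edges.append((i, j))  # directed edge i -> j
    return edges


def propagate(particles, edges, pushed, force, steps=3, damping=0.5):
    """Spread the push on particle `pushed` to its neighbors, decaying by
    `damping` per hop, then move each particle in +x by its received force."""
    influence = {pushed: force}
    for _ in range(steps):
        updated = dict(influence)
        for i, j in edges:
            if i in influence:
                updated[j] = max(updated.get(j, 0.0), influence[i] * damping)
        influence = updated
    for idx, f in influence.items():
        x, y = particles[idx].position
        particles[idx].position = (x + f, y)


if __name__ == "__main__":
    cloud = [Particle((float(i), 0.0)) for i in range(5)]  # a row of particles
    edges = build_graph(cloud, radius=1.5)
    propagate(cloud, edges, pushed=0, force=1.0)
    print([p.position for p in cloud])  # the push decays with distance from particle 0
```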

Shaping and adapting

In their paper, the researchers demonstrate the model by tasking the two-fingered RiceGrip robot with clamping target shapes out of deformable foam. The robot first uses a depth-sensing camera and object-recognition techniques to identify the foam. The researchers randomly select particles inside the perceived shape to initialize the position of the particles. Then, the model adds edges between particles and reconstructs the foam into a dynamic graph customized for deformable materials.

Because of the learned simulations, the robot already has a good idea of how each touch, given a certain amount of force, will affect each of the particles in the graph. As the robot starts indenting the foam, it iteratively matches the real-world position of the particles to the targeted position of the particles. Whenever the particles don’t align, it sends an error signal to the model. That signal tweaks the model to better match the real-world physics of the material.
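That correct-as-you-go loop can be illustrated with a self-contained one-dimensional toy, given below. A “learned simulator” with a single stiffness parameter plans each push, a stand-in function plays the role of the real foam, and the mismatch between prediction and observation refines the parameter online. Everything here is invented for the sketch; it is not the RiceGrip implementation.

```python
# Toy one-dimensional version of the loop described above: plan a push with
# the learned simulator, act on the "real" material, and use the mismatch
# between prediction and observation to refine the simulator online.

class ToySimulator:
    def __init__(self, stiffness_guess=1.0):
        self.stiffness = stiffness_guess  # the single learned parameter

    def predict(self, position, force):
        return position + force / self.stiffness

    def update(self, old_position, new_position, force, lr=0.5):
        # Error signal: infer the stiffness that would explain the observed
        # motion and move the learned estimate toward it.
        moved = new_position - old_position
        if abs(moved) > 1e-9:
            self.stiffness += lr * (force / moved - self.stiffness)


def real_foam(position, force, true_stiffness=2.0):
    """Stand-in for the physical material being indented."""
    return position + force / true_stiffness


def shape_to_target(sim, start=0.0, target=3.0, max_steps=20, tol=1e-3):
    position = start
    for _ in range(max_steps):
        if abs(target - position) < tol:
            break
        force = (target - position) * sim.stiffness  # the push the simulator thinks is enough
        new_position = real_foam(position, force)    # what actually happens
        sim.update(position, new_position, force)
        position = new_position
    return position, sim.stiffness


if __name__ == "__main__":
    final_position, learned_stiffness = shape_to_target(ToySimulator())
    print(round(final_position, 4), round(learned_stiffness, 3))  # approaches 3.0 and 2.0
```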

Next, the researchers aim to improve the model to help robots better predict interactions with partially observable scenarios, such as knowing how a pile of boxes will move when pushed, even if only the boxes at the surface are visible and most of the other boxes are hidden.

The researchers are also exploring ways to combine the model with an end-to-end perception module by operating directly on images. This will be a joint project with Dan Yamins’s group; Yamins recently completed his postdoc at MIT and is now an assistant professor at Stanford University. “You’re dealing with these cases all the time where there’s only partial information,” Wu says. “We’re extending our model to learn the dynamics of all particles, while only seeing a small portion.”

Robots that can sort recycling

RoCycle can detect if an object is paper, metal, or plastic. CSAIL researchers say that such a system could potentially help enable the convenience of single-stream recycling with lower contamination rates that conform to China’s new recycling standards.
Photo: Jason Dorfman

By Adam Conner-Simons

Every year trash companies sift through an estimated 68 million tons of recycling, which is the weight equivalent of more than 30 million cars.

A key step in the process happens on fast-moving conveyor belts, where workers have to sort items into categories like paper, plastic and glass. Such jobs are dull, dirty, and often unsafe, especially in facilities where workers also have to remove normal trash from the mix.

With that in mind, a team led by researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) has developed a robotic system that can detect if an object is paper, metal, or plastic.

The team’s “RoCycle” system includes a soft Teflon hand that uses tactile sensors on its fingertips to detect an object’s size and stiffness. Compatible with any robotic arm, RoCycle was found to be 85 percent accurate at detecting materials when stationary, and 63 percent accurate on a simulated conveyor belt. (Its most common error was identifying paper-covered metal tins as paper, which the team says could be improved by adding more sensors along the contact surface.)

“Our robot’s sensorized skin provides haptic feedback that allows it to differentiate between a wide range of objects, from the rigid to the squishy,” says MIT Professor Daniela Rus, senior author on a related paper that will be presented in April at the IEEE International Conference on Soft Robotics (RoboSoft) in Seoul, South Korea. “Computer vision alone will not be able to solve the problem of giving machines human-like perception, so being able to use tactile input is of vital importance.”

A collaboration with Yale University, RoCycle directly demonstrates the limits of sight-based sorting: It can reliably distinguish between two visually similar Starbucks cups, one made of paper and one made of plastic, that would give vision systems trouble.

Incentivizing recycling

Rus says that the project is part of her larger goal to reduce the back-end cost of recycling, in order to incentivize more cities and countries to create their own programs. Today recycling centers aren’t particularly automated; their main kinds of machinery include optical sorters that use different wavelengths of light to distinguish between plastics, magnetic sorters that separate out iron and steel products, and aluminum sorters that use eddy currents to remove non-magnetic metals.

This is a problem for one very big reason: just last month China raised its standards for the cleanliness of recycled goods it accepts from the United States, meaning that some of the country’s single-stream recycling is now sent to landfills.

“If a system like RoCycle could be deployed on a wide scale, we’d potentially be able to have the convenience of single-stream recycling with the lower contamination rates of multi-stream recycling,” says PhD student Lillian Chin, lead author on the new paper.

It’s surprisingly hard to develop machines that can distinguish between paper, plastic, and metal, which shows how impressive a feat it is for humans. When we pick up an object, we can immediately recognize many of its qualities even with our eyes closed, like whether it’s large and stiff or small and soft. By feeling the object and understanding how that relates to the softness of our own fingertips, we are able to learn how to handle a wide range of objects without dropping or breaking them.

This kind of intuition is tough to program into robots. Traditional hard (“rigid”) robot hands have to know an object’s exact location and size to be able to calculate a precise motion path. Soft hands made of materials like rubber are much more flexible, but have a different problem: Because they’re powered by fluidic forces, they have a balloon-like structure that can puncture quite easily.

How RoCycle works

Rus’ team used a motor-driven hand made of a relatively new material called “auxetics.” Most materials get narrower when pulled on, like a rubber band when you stretch it; auxetics, meanwhile, actually get wider. The MIT team took this concept and put a twist on it, quite literally: They created auxetics that, when cut, twist to either the left or right. Combining a “left-handed” and “right-handed” auxetic for each of the hand’s two large fingers makes them interlock and oppose each other’s rotation, enabling more dynamic movement. (The team calls this “handed-shearing auxetics”, or HSA.)

“In contrast to soft robots, whose fluid-driven approach requires air pumps and compressors, HSA combines twisting with extension, meaning that you’re able to use regular motors,” says Chin.

The team’s gripper first uses its “strain sensor” to estimate an object’s size, and then uses its two pressure sensors to measure the force needed to grasp an object. These metrics — along with calibration data on the size and stiffnesses of objects of different material types — are what give the gripper a sense of what material the object is made of. (Since the tactile sensors are also conductive, they can detect metal by how much it changes the electrical signal.)

“In other words, we estimate the size and measure the pressure difference between the current closed hand and what a normal open hand should look like,” says Chin. “We use this pressure difference and size to classify the specific object based on information about different objects that we’ve already measured.”
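A rough sketch of that decision logic follows. The thresholds, units, and sensor readings are made up for illustration and are not the RoCycle team’s calibration data or code.

```python
# Illustrative classification rule: grip size, the pressure needed to close
# on the object, and the change in the conductive sensors' electrical signal
# stand in for RoCycle's strain, pressure, and conductivity measurements.
# All thresholds here are invented for the example.

def classify_material(size_cm, grip_pressure, conductivity_change):
    if conductivity_change > 0.5:          # metal strongly perturbs the signal
        return "metal"
    stiffness = grip_pressure / max(size_cm, 0.1)  # pressure relative to size
    if stiffness < 1.0:                    # gives easily for its size: paper
        return "paper"
    return "plastic"                       # stiff but non-conductive


if __name__ == "__main__":
    print(classify_material(size_cm=8.0, grip_pressure=4.0, conductivity_change=0.1))   # paper
    print(classify_material(size_cm=6.0, grip_pressure=9.0, conductivity_change=0.05))  # plastic
    print(classify_material(size_cm=7.0, grip_pressure=8.0, conductivity_change=0.9))   # metal
```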

RoCycle builds on a set of sensors that detect the radius of an object to within 30 percent accuracy and tell the difference between “hard” and “soft” objects with 78 percent accuracy. The team’s hand is also almost completely puncture resistant: it withstood being scraped by a sharp lid and punctured by a needle more than 20 times, with minimal structural damage.

As a next step, the researchers plan to build out the system so that it can combine tactile data with actual video data from a robot’s cameras. This would allow the team to further improve its accuracy and potentially allow for even more nuanced differentiation between different kinds of materials.

Chin and Rus co-wrote the RoCycle paper alongside MIT postdoc Jeffrey Lipton, as well as PhD student Michelle Yuen and Professor Rebecca Kramer-Bottiglio of Yale University.

This project was supported in part by Amazon, JD.com, the Toyota Research Institute, and the National Science Foundation.

Teaching machines to reason about what they see

Researchers trained a hybrid AI model to answer questions like “Does the red object left of the green cube have the same shape as the purple matte thing?” by feeding it examples of object colors and shapes followed by more complex scenarios involving multi-object comparisons. The model could transfer this knowledge to new scenarios as well as or better than state-of-the-art models using a fraction of the training data.
Image: Justin Johnson

A child who has never seen a pink elephant can still describe one — unlike a computer. “The computer learns from data,” says Jiajun Wu, a PhD student at MIT. “The ability to generalize and recognize something you’ve never seen before — a pink elephant — is very hard for machines.”

Deep learning systems interpret the world by picking out statistical patterns in data. This form of machine learning is now everywhere, automatically tagging friends on Facebook, narrating Alexa’s latest weather forecast, and delivering fun facts via Google search. But statistical learning has its limits. It requires tons of data, has trouble explaining its decisions, and is terrible at applying past knowledge to new situations; it can’t comprehend an elephant that’s pink instead of gray.

To give computers the ability to reason more like us, artificial intelligence (AI) researchers are returning to abstract, or symbolic, programming. Popular in the 1950s and 1960s, symbolic AI wires in the rules and logic that allow machines to make comparisons and interpret how objects and entities relate. Symbolic AI uses less data, records the chain of steps it takes to reach a decision, and when combined with the brute processing power of statistical neural networks, it can even beat humans in a complicated image comprehension test. 

A new study by a team of researchers at MIT, the MIT-IBM Watson AI Lab, and DeepMind shows the promise of merging statistical and symbolic AI. Led by Wu and Joshua Tenenbaum, a professor in MIT’s Department of Brain and Cognitive Sciences and the Computer Science and Artificial Intelligence Laboratory, the team shows that its hybrid model can learn object-related concepts like color and shape, and leverage that knowledge to interpret complex object relationships in a scene. With minimal training data and no explicit programming, their model could transfer concepts to larger scenes and answer increasingly tricky questions as well as or better than its state-of-the-art peers. The team presents its results at the International Conference on Learning Representations in May.

“One way children learn concepts is by connecting words with images,” says the study’s lead author Jiayuan Mao, an undergraduate at Tsinghua University who worked on the project as a visiting fellow at MIT. “A machine that can learn the same way needs much less data, and is better able to transfer its knowledge to new scenarios.”

The study is a strong argument for moving back toward abstract-program approaches, says Jacob Andreas, a recent graduate of the University of California at Berkeley, who starts at MIT as an assistant professor this fall and was not involved in the work. “The trick, it turns out, is to add more symbolic structure, and to feed the neural networks a representation of the world that’s divided into objects and properties rather than feeding it raw images,” he says. “This work gives us insight into what machines need to understand before language learning is possible.”

The team trained their model on images paired with related questions and answers, part of the CLEVR image comprehension test developed at Stanford University. As the model learns, the questions grow progressively harder, from, “What’s the color of the object?” to “How many objects are both right of the green cylinder and have the same material as the small blue ball?” Once object-level concepts are mastered, the model advances to learning how to relate objects and their properties to each other.

Like other hybrid AI models, MIT’s works by splitting up the task. A perception module of neural networks crunches the pixels in each image and maps the objects. A language module, also made of neural nets, extracts a meaning from the words in each sentence and creates symbolic programs, or instructions, that tell the machine how to answer the question. A third reasoning module runs the symbolic programs on the scene and gives an answer, updating the model when it makes mistakes.
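To make the idea of a symbolic program concrete, the toy sketch below executes a short sequence of operations over an object-based scene representation, the kind of output a perception module would produce. The scene, operation names, and programs are illustrative, not the model’s actual vocabulary.

```python
# Toy reasoning module: execute a short symbolic program over an
# object-based scene representation (the output of a perception module).

SCENE = [
    {"shape": "cube",     "color": "green",  "material": "rubber"},
    {"shape": "cylinder", "color": "green",  "material": "metal"},
    {"shape": "sphere",   "color": "yellow", "material": "rubber"},
]


def run_program(program, scene):
    """Each step either filters the current set of objects or answers."""
    objects = list(scene)
    for op, arg in program:
        if op == "filter_color":
            objects = [o for o in objects if o["color"] == arg]
        elif op == "filter_shape":
            objects = [o for o in objects if o["shape"] == arg]
        elif op == "count":
            return len(objects)
        elif op == "query_shape":
            return objects[0]["shape"] if objects else None
    return objects


# "What's the shape of the yellow thing?"
print(run_program([("filter_color", "yellow"), ("query_shape", None)], SCENE))  # sphere
# "How many green objects are there?"
print(run_program([("filter_color", "green"), ("count", None)], SCENE))         # 2
```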

Key to the team’s approach is a perception module that translates the image into an object-based representation, making the programs easier to execute. Also unique is what they call curriculum learning, or selectively training the model on concepts and scenes that grow progressively more difficult. It turns out that feeding the machine data in a logical way, rather than haphazardly, helps the model learn faster while improving accuracy.

Once the model has a solid foundation, it can interpret new scenes and concepts, and increasingly difficult questions, almost perfectly. Asked to answer an unfamiliar question like, “What’s the shape of the big yellow thing?” it outperformed its peers at Stanford and nearby MIT Lincoln Laboratory with a fraction of the data. 

While other models trained on the full CLEVR dataset of 70,000 images and 700,000 questions, the MIT-IBM model used 5,000 images and 100,000 questions. As the model built on previously learned concepts, it absorbed the programs underlying each question, speeding up the training process. 

Though statistical, deep learning models are now embedded in daily life, much of their decision process remains hidden from view. This lack of transparency makes it difficult to anticipate where the system is susceptible to manipulation, error, or bias. Adding a symbolic layer can open the black box, explaining the growing interest in hybrid AI systems.

“Splitting the task up and letting programs do some of the work is the key to building interpretability into deep learning models,” says Lincoln Laboratory researcher David Mascharka, whose hybrid model, Transparency by Design Network, is benchmarked in the MIT-IBM study.      

The MIT-IBM team is now working to improve the model’s performance on real-world photos and extending it to video understanding and robotic manipulation. Other authors of the study are Chuang Gan and Pushmeet Kohli, researchers at the MIT-IBM Watson AI Lab and DeepMind, respectively.

“Particle robot” works as a cluster of simple units


Researchers have developed computationally simple robots, called particles, that cluster and form a single “particle robot” that moves around, transports objects, and completes other tasks. The work hails from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), Columbia University, and elsewhere.
Image: Felice Frankel

By Rob Matheson

Taking a cue from biological cells, researchers from MIT, Columbia University, and elsewhere have developed computationally simple robots that connect in large groups to move around, transport objects, and complete other tasks.

This so-called “particle robotics” system — based on a project by MIT, Columbia Engineering, Cornell University, and Harvard University researchers — comprises many individual disc-shaped units, which the researchers call “particles.” The particles are loosely connected by magnets around their perimeters, and each unit can only do two things: expand and contract. (Each particle is about 6 inches in its contracted state and about 9 inches when expanded.) That motion, when carefully timed, allows the individual particles to push and pull one another in coordinated movement. On-board sensors enable the cluster to gravitate toward light sources.

In a Nature paper published today, the researchers demonstrate a cluster of two dozen real robotic particles and a virtual simulation of up to 100,000 particles moving through obstacles toward a light bulb. They also show that a particle robot can transport objects placed in its midst.

Particle robots can form into many configurations and fluidly navigate around obstacles and squeeze through tight gaps. Notably, none of the particles directly communicate with or rely on one another to function, so particles can be added or subtracted without any impact on the group. In their paper, the researchers show particle robotic systems can complete tasks even when many units malfunction.

The paper represents a new way to think about robots, which are traditionally designed for one purpose, comprise many complex parts, and stop working when any part malfunctions. Robots made up of these simplistic components, the researchers say, could enable more scalable, flexible, and robust systems.

“We have small robot cells that are not so capable as individuals but can accomplish a lot as a group,” says Daniela Rus, director of the Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science. “The robot by itself is static, but when it connects with other robot particles, all of a sudden the robot collective can explore the world and control more complex actions. With these ‘universal cells,’ the robot particles can achieve different shapes, global transformation, global motion, global behavior, and, as we have shown in our experiments, follow gradients of light. This is very powerful.”

Joining Rus on the paper are: first author Shuguang Li, a CSAIL postdoc; co-first author Richa Batra and corresponding author Hod Lipson, both of Columbia Engineering; David Brown, Hyun-Dong Chang, and Nikhil Ranganathan of Cornell; and Chuck Hoberman of Harvard.

At MIT, Rus has been working on modular, connected robots for nearly 20 years, including an expanding and contracting cube robot that could connect to others to move around. But the square shape limited the robots’ group movement and configurations.

In collaboration with Lipson’s lab, where Li was a graduate student until coming to MIT in 2014, the researchers went for disc-shaped mechanisms that can rotate around one another. They can also connect and disconnect from each other, and form into many configurations.

Each unit of a particle robot has a cylindrical base, which houses a battery, a small motor, sensors that detect light intensity, a microcontroller, and a communication component that sends out and receives signals. Mounted on top is a children’s toy called a Hoberman Flight Ring — its inventor is one of the paper’s co-authors — which consists of small panels connected in a circular formation that can be pulled to expand and pushed back to contract. Two small magnets are installed in each panel.

The trick was programming the robotic particles to expand and contract in an exact sequence to push and pull the whole group toward a destination light source. To do so, the researchers equipped each particle with an algorithm that analyzes broadcasted information about light intensity from every other particle, without the need for direct particle-to-particle communication.

The sensors of a particle detect the intensity of light from a light source; the closer the particle is to the light source, the greater the intensity. Each particle constantly broadcasts a signal that shares its perceived intensity level with all other particles. Say a particle robotic system measures light intensity on a scale of levels 1 to 10: Particles closest to the light register a level 10 and those furthest will register level 1. The intensity level, in turn, corresponds to a specific time that the particle must expand. Particles experiencing the highest intensity — level 10 — expand first. As those particles contract, the next particles in order, level 9, then expand. That timed expanding and contracting motion happens at each subsequent level.

“This creates a mechanical expansion-contraction wave, a coordinated pushing and dragging motion, that moves a big cluster toward or away from environmental stimuli,” Li says. The key component, Li adds, is the precise timing from a shared synchronized clock among the particles that enables movement as efficiently as possible: “If you mess up the synchronized clock, the system will work less efficiently.”
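A minimal sketch of that timing rule is shown below. The level scale and slot logic are simplified for illustration and are not the researchers’ controller code.

```python
# Simplified expansion-contraction schedule: each particle's broadcast light
# level (10 = closest to the bulb) fixes its slot in a shared, synchronized
# cycle, so the expansion wave sweeps from the light source backward.

def expansion_schedule(intensity_levels, levels=10):
    """Map particle id -> time slot; the brightest particles expand first."""
    return {pid: levels - level for pid, level in intensity_levels.items()}


def simulate_cycle(intensity_levels, levels=10):
    schedule = expansion_schedule(intensity_levels, levels)
    for slot in range(levels):
        expanding = sorted(pid for pid, s in schedule.items() if s == slot)
        if expanding:
            print(f"slot {slot}: {expanding} expand, the rest stay contracted")


if __name__ == "__main__":
    # Broadcast light levels for five particles in a cluster.
    simulate_cycle({"A": 10, "B": 9, "C": 9, "D": 7, "E": 4})
```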

In videos, the researchers demonstrate a particle robotic system comprising real particles moving and changing directions toward different light bulbs as they’re flicked on, and working its way through a gap between obstacles. In their paper, the researchers also show that simulated clusters of up to 10,000 particles maintain locomotion, at half their speed, even with up to 20 percent of units failing.

“It’s a bit like the proverbial ‘gray goo,’” says Lipson, a professor of mechanical engineering at Columbia Engineering, referencing the science-fiction concept of a self-replicating robot that comprises billions of nanobots. “The key novelty here is that you have a new kind of robot that has no centralized control, no single point of failure, no fixed shape, and its components have no unique identity.”

The next step, Lipson adds, is miniaturizing the components to make a robot composed of millions of microscopic particles.

Robots track moving objects with unprecedented precision

MIT Media Lab researchers are using RFID tags to help robots home in on moving objects with unprecedented speed and accuracy, potentially enabling greater collaboration in robotic packaging and assembly and among swarms of drones.
Photo courtesy of the researchers

A novel system developed at MIT uses RFID tags to help robots home in on moving objects with unprecedented speed and accuracy. The system could enable greater collaboration and precision by robots working on packaging and assembly, and by swarms of drones carrying out search-and-rescue missions.

In a paper being presented next week at the USENIX Symposium on Networked Systems Design and Implementation, the researchers show that robots using the system can locate tagged objects within 7.5 milliseconds, on average, and with an error of less than a centimeter.

In the system, called TurboTrack, an RFID (radio-frequency identification) tag can be applied to any object. A reader sends a wireless signal that reflects off the RFID tag and other nearby objects, and rebounds to the reader. An algorithm sifts through all the reflected signals to find the RFID tag’s response. Final computations then leverage the RFID tag’s movement — even though this usually decreases precision — to improve its localization accuracy.

The researchers say the system could replace computer vision for some robotic tasks. As with its human counterpart, computer vision is limited by what it can see, and it can fail to notice objects in cluttered environments. Radio frequency signals have no such restrictions: They can identify targets without visualization, within clutter and through walls.

To validate the system, the researchers attached one RFID tag to a cap and another to a bottle. A robotic arm located the cap and placed it onto the bottle, held by another robotic arm. In another demonstration, the researchers tracked RFID-equipped nanodrones during docking, maneuvering, and flying. In both tasks, the system was as accurate and fast as traditional computer-vision systems, while working in scenarios where computer vision fails, the researchers report.

“If you use RF signals for tasks typically done using computer vision, not only do you enable robots to do human things, but you can also enable them to do superhuman things,” says Fadel Adib, an assistant professor and principal investigator in the MIT Media Lab, and founding director of the Signal Kinetics Research Group. “And you can do it in a scalable way, because these RFID tags are only 3 cents each.”

In manufacturing, the system could enable robot arms to be more precise and versatile in, say, picking up, assembling, and packaging items along an assembly line. Another promising application is using handheld “nanodrones” for search and rescue missions. Nanodrones currently use computer vision and image-stitching methods for localization. These drones often get confused in chaotic areas, lose each other behind walls, and can’t uniquely identify each other. This all limits their ability to, say, spread out over an area and collaborate to search for a missing person. Using the researchers’ system, nanodrones in swarms could better locate each other, for greater control and collaboration.

“You could enable a swarm of nanodrones to form in certain ways, fly into cluttered environments, and even environments hidden from sight, with great precision,” says first author Zhihong Luo, a graduate student in the Signal Kinetics Research Group.

The other Media Lab co-authors on the paper are visiting student Qiping Zhang, postdoc Yunfei Ma, and Research Assistant Manish Singh.

Super resolution

Adib’s group has been working for years on using radio signals for tracking and identification purposes, such as detecting contamination in bottled foods, communicating with devices inside the body, and managing warehouse inventory.

Similar systems have attempted to use RFID tags for localization tasks. But these come with trade-offs in either accuracy or speed. To be accurate, it may take them several seconds to find a moving object; to increase speed, they lose accuracy.

The challenge was achieving both speed and accuracy simultaneously. To do so, the researchers drew inspiration from an imaging technique called “super-resolution imaging.” These systems stitch together images from multiple angles to achieve a finer-resolution image.

“The idea was to apply these super-resolution systems to radio signals,” Adib says. “As something moves, you get more perspectives in tracking it, so you can exploit the movement for accuracy.”

The system combines a standard RFID reader with a “helper” component that’s used to localize radio frequency signals. The helper shoots out a wideband signal comprising multiple frequencies, building on a modulation scheme used in wireless communication, called orthogonal frequency-division multiplexing.

The system captures all the signals rebounding off objects in the environment, including off the RFID tag. One of those signals is specific to the RFID tag, because RFID tags reflect and absorb an incoming signal in a certain pattern, corresponding to bits of 0s and 1s, that the system can recognize.

Because these signals travel at the speed of light, the system can compute a “time of flight” — measuring distance by calculating the time it takes a signal to travel between a transmitter and receiver — to gauge the location of the tag, as well as the other objects in the environment. But this provides only a ballpark localization figure, not subcentimeter precision.
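As a back-of-the-envelope illustration of the time-of-flight arithmetic (with invented numbers, not TurboTrack code):

```python
# Time of flight: one-way distance is half the round-trip time multiplied
# by the speed of light. A tiny timing error already shifts the estimate
# by centimeters, which is why this alone gives only a ballpark location.

SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def distance_from_round_trip(round_trip_seconds):
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

print(distance_from_round_trip(20e-9))                                       # ~3.0 m away
print(distance_from_round_trip(20.1e-9) - distance_from_round_trip(20e-9))   # ~0.015 m per 0.1 ns
```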

Leveraging movement

To zoom in on the tag’s location, the researchers developed what they call a “space-time super-resolution” algorithm.

The algorithm combines the location estimations for all rebounding signals, including the RFID signal, which it determined using time of flight. Using some probability calculations, it narrows down that group to a handful of potential locations for the RFID tag.

As the tag moves, its signal angle slightly alters — a change that also corresponds to a certain location. The algorithm then can use that angle change to track the tag’s distance as it moves. By constantly comparing that changing distance measurement to all other distance measurements from other signals, it can find the tag in a three-dimensional space. This all happens in a fraction of a second.

“The high-level idea is that, by combining these measurements over time and over space, you get a better reconstruction of the tag’s position,” Adib says.
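A toy numerical illustration of that point, using plain averaging of noisy range readings, is shown below. It is deliberately far simpler than the actual space-time super-resolution algorithm.

```python
# Combining many noisy measurements sharpens the estimate: one reading is
# off by centimeters, while the average of many readings is much closer.
import random

TRUE_DISTANCE = 2.000   # meters, unknown to the estimator
NOISE_STD = 0.05        # each reading is ~5 cm noisy

random.seed(0)
readings = [TRUE_DISTANCE + random.gauss(0, NOISE_STD) for _ in range(200)]

print(f"single reading error: {abs(readings[0] - TRUE_DISTANCE) * 100:.2f} cm")
combined = sum(readings) / len(readings)
print(f"combined error:       {abs(combined - TRUE_DISTANCE) * 100:.2f} cm")
```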

The work was sponsored, in part, by the National Science Foundation.

Identifying artificial intelligence “blind spots”

By Rob Matheson

A novel model developed by MIT and Microsoft researchers identifies instances in which autonomous systems have “learned” from training examples that don’t match what’s actually happening in the real world. Engineers could use this model to improve the safety of artificial intelligence systems, such as driverless vehicles and autonomous robots.

The AI systems powering driverless cars, for example, are trained extensively in virtual simulations to prepare the vehicle for nearly every event on the road. But sometimes the car makes an unexpected error in the real world because an event occurs that should, but doesn’t, alter the car’s behavior.

Consider a driverless car that wasn’t trained, and more importantly doesn’t have the sensors necessary, to differentiate between distinctly different scenarios, such as large, white cars and ambulances with red, flashing lights on the road. If the car is cruising down the highway and an ambulance flicks on its sirens, the car may not know to slow down and pull over, because it does not perceive the ambulance as different from a big white car.

In a pair of papers — presented at last year’s Autonomous Agents and Multiagent Systems conference and the upcoming Association for the Advancement of Artificial Intelligence conference — the researchers describe a model that uses human input to uncover these training “blind spots.”

As with traditional approaches, the researchers put an AI system through simulation training. But then, a human closely monitors the system’s actions as it acts in the real world, providing feedback when the system made, or was about to make, any mistakes. The researchers then combine the training data with the human feedback data, and use machine-learning techniques to produce a model that pinpoints situations where the system most likely needs more information about how to act correctly.

The researchers validated their method using video games, with a simulated human correcting the learned path of an on-screen character. But the next step is to incorporate the model with traditional training and testing approaches for autonomous cars and robots with human feedback.

“The model helps autonomous systems better know what they don’t know,” says first author Ramya Ramakrishnan, a graduate student in the Computer Science and Artificial Intelligence Laboratory. “Many times, when these systems are deployed, their trained simulations don’t match the real-world setting [and] they could make mistakes, such as getting into accidents. The idea is to use humans to bridge that gap between simulation and the real world, in a safe way, so we can reduce some of those errors.”

Co-authors on both papers are: Julie Shah, an associate professor in the Department of Aeronautics and Astronautics and head of CSAIL’s Interactive Robotics Group; and Ece Kamar, Debadeepta Dey, and Eric Horvitz, all from Microsoft Research. Besmira Nushi is an additional co-author on the upcoming paper.

Taking feedback

Some traditional training methods do provide human feedback during real-world test runs, but only to update the system’s actions. These approaches don’t identify blind spots, which could be useful for safer execution in the real world.

The researchers’ approach first puts an AI system through simulation training, where it will produce a “policy” that essentially maps every situation to the best action it can take in the simulations. Then, the system will be deployed in the real world, where humans provide error signals in regions where the system’s actions are unacceptable.

Humans can provide data in multiple ways, such as through “demonstrations” and “corrections.” In demonstrations, the human acts in the real world, while the system observes and compares the human’s actions to what it would have done in that situation. For driverless cars, for instance, a human would manually control the car while the system produces a signal if its planned behavior deviates from the human’s behavior. Matches and mismatches with the human’s actions provide noisy indications of where the system might be acting acceptably or unacceptably.

Alternatively, the human can provide corrections by monitoring the system as it acts in the real world. A human could sit in the driver’s seat while the autonomous car drives itself along its planned route. If the car’s actions are correct, the human does nothing. If the car’s actions are incorrect, however, the human may take the wheel, which sends a signal that the system was acting unacceptably in that specific situation.

Once the feedback data from the human is compiled, the system essentially has a list of situations and, for each situation, multiple labels saying its actions were acceptable or unacceptable. A single situation can receive many different signals, because the system perceives many situations as identical. For example, an autonomous car may have cruised alongside a large car many times without slowing down and pulling over. But, in only one instance, an ambulance, which appears exactly the same to the system, cruises by. The autonomous car doesn’t pull over and receives a feedback signal that the system took an unacceptable action.

“At that point, the system has been given multiple contradictory signals from a human: some with a large car beside it, and it was doing fine, and one where there was an ambulance in the same exact location, but that wasn’t fine. The system makes a little note that it did something wrong, but it doesn’t know why,” Ramakrishnan says. “Because the agent is getting all these contradictory signals, the next step is compiling the information to ask, ‘How likely am I to make a mistake in this situation where I received these mixed signals?’”

Intelligent aggregation

The end goal is to have these ambiguous situations labeled as blind spots. But that goes beyond simply tallying the acceptable and unacceptable actions for each situation. If the system performed correct actions nine times out of 10 in the ambulance situation, for instance, a simple majority vote would label that situation as safe.

“But because unacceptable actions are far rarer than acceptable actions, the system will eventually learn to predict all situations as safe, which can be extremely dangerous,” Ramakrishnan says.

To that end, the researchers used the Dawid-Skene algorithm, a machine-learning method commonly used in crowdsourcing to handle label noise. The algorithm takes as input a list of situations, each having a set of noisy “acceptable” and “unacceptable” labels. Then it aggregates all the data and uses some probability calculations to identify patterns in the labels of predicted blind spots and patterns for predicted safe situations. Using that information, it outputs a single aggregated “safe” or “blind spot” label for each situation along with its confidence level in that label. Notably, the algorithm can learn that, even in a situation where the system may have performed acceptably 90 percent of the time, the situation is still ambiguous enough to merit a “blind spot” label.
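A deliberately simplified sketch of that aggregation step appears below. It is not the full Dawid-Skene EM procedure, which also models the reliability of each label source; it only shows why a small share of “unacceptable” labels, rather than a majority, is enough to flag a situation.

```python
# Simplified aggregation: a small fraction of "unacceptable" labels is
# enough to flag a situation, since unacceptable actions are far rarer
# than acceptable ones. (The paper uses Dawid-Skene, which additionally
# models the noise in each label source; this toy version does not.)

def aggregate(labels_per_situation, blind_spot_threshold=0.05):
    """Map each situation to ("blind spot" | "safe", fraction unacceptable)."""
    results = {}
    for situation, labels in labels_per_situation.items():
        frac_bad = sum(label == "unacceptable" for label in labels) / len(labels)
        verdict = "blind spot" if frac_bad > blind_spot_threshold else "safe"
        results[situation] = (verdict, round(frac_bad, 2))
    return results


if __name__ == "__main__":
    feedback = {
        # Nine fine passes next to big white cars, one missed ambulance:
        "large_vehicle_alongside": ["acceptable"] * 9 + ["unacceptable"],
        "empty_highway": ["acceptable"] * 10,
    }
    print(aggregate(feedback))
    # A simple majority vote would call both situations safe; here the single
    # contradictory signal is enough to flag the first one for caution.
```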

In the end, the algorithm produces a type of “heat map,” where each situation from the system’s original training is assigned low-to-high probability of being a blind spot for the system.

“When the system is deployed into the real world, it can use this learned model to act more cautiously and intelligently. If the learned model predicts a state to be a blind spot with high probability, the system can query a human for the acceptable action, allowing for safer execution,” Ramakrishnan says.

3Q: Aleksander Madry on building trustworthy artificial intelligence

Aleksander Madry is a leader in the emerging field of building guarantees into artificial intelligence, which has nearly become a branch of machine learning in its own right.
Photo courtesy of CSAIL

By Kim Martineau

Machine learning algorithms now underlie much of the software we use, helping to personalize our news feeds and finish our thoughts before we’re done typing. But as artificial intelligence becomes further embedded in daily life, expectations have risen. Before autonomous systems fully gain our confidence, we need to know they are reliable in most situations and can withstand outside interference; in engineering terms, that they are robust. We also need to understand the reasoning behind their decisions; that they are interpretable.

Aleksander Madry, an associate professor of computer science at MIT and a lead faculty member of the Computer Science and Artificial Intelligence Lab (CSAIL)’s Trustworthy AI initiative, compares AI to a sharp knife, a useful but potentially hazardous tool that society must learn to wield properly. Madry recently spoke at MIT’s Symposium on Robust, Interpretable AI, an event co-sponsored by the MIT Quest for Intelligence and CSAIL, and held Nov. 20 in Singleton Auditorium. The symposium was designed to showcase new MIT work in the area of building guarantees into AI, which has almost become a branch of machine learning in its own right. Six faculty members spoke about their research, 40 students presented posters, and Madry opened the symposium with a talk aptly titled “Robustness and Interpretability.” We spoke with Madry, a leader in this emerging field, about some of the key ideas raised during the event.

Q: AI owes much of its recent progress to deep learning, a branch of machine learning that has significantly improved the ability of algorithms to pick out patterns in text, images and sounds, giving us automated assistants like Siri and Alexa, among other things. But deep learning systems remain vulnerable in surprising ways: stumbling when they encounter slightly unfamiliar examples in the real world or when a malicious attacker feeds them subtly altered images. How are you and others trying to make AI more robust?

A: Until recently, AI researchers focused simply on getting machine-learning algorithms to accomplish basic tasks. Achieving even average-case performance was a major challenge. Now that performance has improved, attention has shifted to the next hurdle: improving the worst-case performance. Most of my research is focused on meeting this challenge. Specifically, I work on developing next-generation machine-learning systems that will be reliable and secure enough for mission-critical applications like self-driving cars and software that filters malicious content. We’re currently building tools to train object-recognition systems to identify what’s happening in a scene or picture, even if the images fed to the model have been manipulated. We are also studying the limits of systems that offer security and reliability guarantees. How much reliability and security can we build into machine-learning models, and what other features might we need to sacrifice to get there?

My colleague Luca Daniel, who also spoke, is working on an important aspect of this problem: developing a way to measure the resilience of a deep learning system in key situations. Decisions made by deep learning systems have major consequences, and thus it’s essential that end-users be able to measure the reliability of each of the model’s outputs. Another way to make a system more robust is during the training process. In her talk, “Robustness in GANs and in Black-box Optimization,” Stefanie Jegelka showed how the learner in a generative adversarial network, or GAN, can be made to withstand manipulations to its input, leading to much better performance. 

Q: The neural networks that power deep learning seem to learn almost effortlessly: Feed them enough data and they can outperform humans at many tasks. And yet, we’ve also seen how easily they can fail, with at least three widely publicized cases of self-driving cars crashing and killing someone. AI applications in health care are not yet under the same level of scrutiny but the stakes are just as high. David Sontag focused his talk on the often life-or-death consequences when an AI system lacks robustness. What are some of the red flags when training an AI on patient medical records and other observational data?

A: This goes back to the nature of guarantees and the underlying assumptions that we build into our models. We often assume that our training datasets are representative of the real-world data we test our models on — an assumption that tends to be too optimistic. Sontag gave two examples of flawed assumptions baked into the training process that could lead an AI to give the wrong diagnosis or recommend a harmful treatment. The first focused on a massive database of patient X-rays released last year by the National Institutes of Health. The dataset was expected to bring big improvements to the automated diagnosis of lung disease until a skeptical radiologist took a closer look and found widespread errors in the scans’ diagnostic labels. An AI trained on chest scans with a lot of incorrect labels is going to have a hard time generating accurate diagnoses. 

A second problem Sontag cited is the failure to correct for gaps and irregularities in the data due to system glitches or changes in how hospitals and health care providers report patient data. For example, a major disaster could limit the amount of data available for emergency room patients. If a machine-learning model failed to take that shift into account, its predictions would not be very reliable.

Q: You’ve covered some of the techniques for making AI more reliable and secure. What about interpretability? What makes neural networks so hard to interpret, and how are engineers developing ways to peer beneath the hood?

A: Understanding neural-network predictions is notoriously difficult. Each prediction arises from a web of decisions made by hundreds to thousands of individual nodes. We are trying to develop new methods to make this process more transparent. In the field of computer vision one of the pioneers is Antonio Torralba, director of The Quest. In his talk, he demonstrated a new tool developed in his lab that highlights the features that a neural network is focusing on as it interprets a scene. The tool lets you identify the nodes in the network responsible for recognizing, say, a door, from a set of windows or a stand of trees. Visualizing the object-recognition process allows software developers to get a more fine-grained understanding of how the network learns. 

Another way to achieve interpretability is to precisely define the properties that make the model understandable, and then train the model to find that type of solution. Tommi Jaakkola showed in his talk, “Interpretability and Functional Transparency,” that models can be trained to be linear or have other desired qualities locally while maintaining the network’s overall flexibility. Explanations are needed at different levels of resolution much as they are in interpreting physical phenomena. Of course, there’s a cost to building guarantees into machine-learning systems — this is a theme that carried through all the talks. But those guarantees are necessary and not insurmountable. The beauty of human intelligence is that while we can’t perform most tasks perfectly, as a machine might, we have the ability and flexibility to learn in a remarkable range of environments. 

Inside “The Laughing Room”

“The Laughing Room,” an interactive installation at Cambridge Public Library, was shown live on monitors at Hayden Library and streamed online.
Still photos courtesy of metaLAB (at) Harvard.

By Brigham Fay

“The Laughing Room,” an interactive art installation by author, illustrator, and MIT graduate student Jonathan “Jonny” Sun, looks like a typical living room: couches, armchairs, coffee table, soft lighting. This cozy scene, however, sits in a glass-enclosed space, flanked by bright lights and a microphone, with a bank of laptops and a video camera positioned across the room. People wander in, take a seat, begin chatting. After a pause in the conversation, a riot of canned laughter rings out, prompting genuine giggles from the group.

Presented at the Cambridge Public Library in Cambridge, Massachusetts, Nov. 16-18, “The Laughing Room” was an artificially intelligent room programmed to play an audio laugh track whenever participants said something that its algorithm deemed funny. Sun, who is currently on leave from his PhD program in the MIT Department of Urban Studies and Planning and is an affiliate at the Berkman Klein Center for Internet and Society at Harvard University and a creative researcher at the metaLAB at Harvard, created the project to explore the increasingly social and cultural roles of technology in public and private spaces, users’ agency within and dependence on such technology, and the issues of privacy raised by these systems. The installations were presented as part of ARTificial Intelligence, an ongoing program led by MIT associate professor of literature Stephanie Frampton that fosters public dialogue about the emerging ethical and social implications of artificial intelligence (AI) through art and design.

Setting the scene

“Cambridge is the birthplace of artificial intelligence, and this installation gives us an opportunity to think about the new roles that AI is playing in our lives every day,” said Frampton. “It was important to us to set the installations in the Cambridge Public Library and MIT Libraries, where they could spark an open conversation at the intersections of art and science.”

“I wanted the installation to resemble a sitcom set from the 1980s: a private, familial space,” said Sun. “I wanted to explore how AI is changing our conception of private space, with things like the Amazon Echo or Google Home, where you’re aware of this third party listening.”

“The Control Room,” a companion installation located in Hayden Library at MIT, displayed a live stream of the action in “The Laughing Room,” while another monitor showed the algorithm evaluating people’s speech in real time. Live streams were also shared online via YouTube and Periscope. “It’s an extension of the sitcom metaphor, the idea that people are watching,” said Sun. The artist was interested to see how people would act, knowing they had an audience. Would they perform for the algorithm? Sun likened it to Twitter users trying to craft the perfect tweet so it will go viral.

Programming funny

“Almost all machine learning starts from a dataset,” said Hannah Davis, an artist, musician, and programmer who collaborated with Sun to create the installation’s algorithm. She described the process at an “Artists Talk Back” event held Saturday, Nov. 17, at Hayden Library. The panel discussion included Davis; Sun; Frampton; collaborator Christopher Sun; research assistant Nikhil Dharmaraj; Reinhard Engels, manager of technology and innovation at Cambridge Public Library; Mark Szarko, librarian at MIT Libraries; and Sarah Newman, creative researcher at the metaLAB. The panel was moderated by metaLAB founder and director Jeffrey Schnapp.

Davis explained how, to train the algorithm, she scraped stand-up comedy routines from YouTube, selecting performances by women and people of color to avoid programming misogyny and racism into how the AI identified humor. “It determines what is the setup to the joke and what shouldn’t be laughed at, and what is the punchline and what should be laughed at,” said Davis. Depending on how likely something is to be a punchline, the laugh track plays at different intensities.
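
The playback logic can be sketched in a few lines. The code below is not the installation's algorithm: the keyword scorer is a crude stand-in for the trained model, and the thresholds mapping punchline probability to laugh intensity are invented for illustration.

    # Hypothetical sketch of the playback step only: map an estimated punchline
    # probability to a laugh-track intensity. The scorer is a placeholder for a
    # trained classifier; all thresholds are made up.
    def punchline_probability(line: str) -> float:
        funny_markers = {"honestly", "literally", "never", "always", "apparently"}
        words = line.lower().split()
        hits = sum(word.strip(".,!?") in funny_markers for word in words)
        return min(1.0, hits / 2)

    def laugh_intensity(prob: float) -> str:
        if prob < 0.3:
            return "silence"
        if prob < 0.6:
            return "light chuckle"
        if prob < 0.9:
            return "hearty laugh"
        return "full laugh track"

    for line in ["I walked into the library.", "And honestly, I never left."]:
        print(line, "->", laugh_intensity(punchline_probability(line)))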

Fake laughs, real connections

Sun acknowledged that the reactions from “The Laughing Room” participants have been mixed: “Half of the people came out saying ‘that was really fun,’” he said. “The other half said ‘that was really creepy.’”

That was the impression shared by Colin Murphy, a student at Tufts University who heard about the project from following Sun on Twitter: “This idea that you are the spectacle of an art piece, that was really weird.”

“It didn’t seem like it was following any kind of structure,” added Henry Scott, who was visiting from Georgia. “I felt like it wasn’t laughing at jokes, but that it was laughing at us. The AI seems mean.”

While many found the experience of “The Laughing Room” uncanny, for others it was intimate, joyous, even magical.

“There’s a laughter that comes naturally after the laugh track that was interesting to me, how it can bring out the humanness,” said Newman at the panel discussion. “The work does that more than I expected it to.”

Frampton noted how the installation’s setup also prompted unexpected connections: “It enabled strangers to have conversations with each other that wouldn’t have happened without someone listening.”

Continuing his sitcom metaphor, Sun described these first installations as a “pilot,” and is looking forward to presenting future versions of “The Laughing Room.” He and his collaborators will keep tweaking the algorithm, using different data sources, and building on what they’ve learned through these installations. “The Laughing Room” will be on display in the MIT Wiesner Student Art Gallery in May 2019, and the team is planning further events at MIT, Harvard, and Cambridge Public Library throughout the coming year.

“This has been an extraordinary collaboration and shown us how much interest there is in this kind of programming and how much energy can come from using the libraries in new ways,” said Frampton.

“The Laughing Room” and “The Control Room” were funded by the metaLAB (at) Harvard, the MIT De Florez Fund for Humor, the Council for the Arts at MIT, and the MIT Center for Art, Science and Technology, and presented in partnership with the Cambridge Public Library and the MIT Libraries.

Reproducing paintings that make an impression

The RePaint system reproduces paintings by combining two approaches called color-contoning and half-toning, as well as a deep learning model focused on determining how to stack 10 different inks to recreate the specific shades of color.
Image courtesy of the researchers

By Rachel Gordon

The empty frames hanging inside the Isabella Stewart Gardner Museum serve as a tangible reminder of the world’s biggest unsolved art heist. While the original masterpieces may never be recovered, a team from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) might be able to help, with a new system aimed at designing reproductions of paintings.

RePaint uses a combination of 3-D printing and deep learning to authentically recreate favorite paintings — regardless of different lighting conditions or placement. RePaint could be used to remake artwork for a home, protect originals from wear and tear in museums, or even help companies create prints and postcards of historical pieces.

“If you just reproduce the color of a painting as it looks in the gallery, it might look different in your home,” says Changil Kim, one of the authors on a new paper about the system, which will be presented at ACM SIGGRAPH Asia in December. “Our system works under any lighting condition, which shows a far greater color reproduction capability than almost any other previous work.”

To test RePaint, the team reproduced a number of oil paintings created by an artist collaborator. The team found that RePaint was more than four times more accurate than state-of-the-art physical models at creating the exact color shades for different artworks.

At this time, the reproductions are only about the size of a business card because the 3-D printing process is so time-consuming. In the future, the team expects that more advanced, commercial 3-D printers could help produce larger paintings more efficiently.

While 2-D printers are most commonly used for reproducing paintings, they have a fixed set of just four inks (cyan, magenta, yellow, and black). The researchers, however, found a better way to capture a fuller spectrum of Degas and Dali. They used a special technique they call “color-contoning,” which involves using a 3-D printer and 10 different transparent inks stacked in very thin layers, much like the wafers and chocolate in a Kit-Kat bar. They combined their method with a decades-old technique called half-toning, where an image is created by lots of little colored dots rather than continuous tones. Combining these, the team says, better captured the nuances of the colors.
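
Half-toning itself is easy to demonstrate. The sketch below shows only that half of the approach (the ink-stacking, or color-contoning, half is not modeled): it applies the classic Floyd-Steinberg error-diffusion algorithm to a grayscale gradient so that dot density stands in for continuous tone. The input image and sizes are placeholders.

    # Minimal illustration of half-toning via Floyd-Steinberg error diffusion.
    # This is a standard textbook algorithm, not RePaint's pipeline.
    import numpy as np

    def floyd_steinberg(gray):
        """gray: 2-D array of floats in [0, 1]; returns a binary dot pattern."""
        img = gray.astype(float).copy()
        h, w = img.shape
        out = np.zeros_like(img)
        for y in range(h):
            for x in range(w):
                old = img[y, x]
                new = 1.0 if old >= 0.5 else 0.0
                out[y, x] = new
                err = old - new
                if x + 1 < w:
                    img[y, x + 1] += err * 7 / 16
                if y + 1 < h and x > 0:
                    img[y + 1, x - 1] += err * 3 / 16
                if y + 1 < h:
                    img[y + 1, x] += err * 5 / 16
                if y + 1 < h and x + 1 < w:
                    img[y + 1, x + 1] += err * 1 / 16
        return out

    gradient = np.tile(np.linspace(0.0, 1.0, 64), (16, 1))
    dots = floyd_steinberg(gradient)
    print(dots.mean(axis=0).round(2))   # dot density rises from left to right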

With a larger color scope to work with, the question of what inks to use for which paintings still remained. Instead of using more laborious physical approaches, the team trained a deep-learning model to predict the optimal stack of different inks. Once the system had a handle on that, they fed in images of paintings and used the model to determine what colors should be used in what particular areas for specific paintings.
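
The ink-selection idea can also be sketched, though only loosely. The code below is not the team's model: a small network learns to map a target color to amounts of 10 inks by training through a deliberately crude, differentiable stand-in for the printer's color behavior. The ink primaries, network architecture, and training settings are all invented.

    # Hypothetical sketch: learn target color -> ink amounts through a toy
    # mixing model. The real system predicts stacks of transparent inks using a
    # far more faithful optical model.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    ink_colors = torch.rand(10, 3)           # stand-in RGB primaries for 10 inks

    def render(ink_amounts):
        """Toy forward model: a weighted mix of ink colors (not real optics)."""
        weights = torch.softmax(ink_amounts, dim=-1)
        return weights @ ink_colors

    predictor = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 10))
    opt = torch.optim.Adam(predictor.parameters(), lr=1e-2)

    for step in range(500):
        target = torch.rand(256, 3)          # random target colors
        loss = nn.functional.mse_loss(render(predictor(target)), target)
        opt.zero_grad()
        loss.backward()
        opt.step()

    swatch = torch.tensor([[0.2, 0.4, 0.7]])
    print("reproduced color:", render(predictor(swatch)).detach().squeeze().tolist())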

Despite the progress so far, the team says it has a few improvements to make before it can whip up a dazzling duplicate of “Starry Night.” For example, mechanical engineer Mike Foshey said they couldn’t completely reproduce certain colors, such as cobalt blue, due to a limited ink library. In the future they plan to expand this library, as well as create a painting-specific algorithm for selecting inks, he says. They also hope to capture finer detail, accounting for aspects like surface texture and reflection, so they can achieve effects such as glossy and matte finishes.

“The value of fine art has rapidly increased in recent years, so there’s an increased tendency for it to be locked up in warehouses away from the public eye,” says Foshey. “We’re building the technology to reverse this trend, and to create inexpensive and accurate reproductions that can be enjoyed by all.”

Kim and Foshey worked on the system alongside lead author Liang Shi; MIT professor Wojciech Matusik; former MIT postdoc Vahid Babaei, now Group Leader at Max Planck Institute of Informatics; Princeton University computer science professor Szymon Rusinkiewicz; and former MIT postdoc Pitchaya Sitthi-Amorn, who is now a lecturer at Chulalongkorn University in Bangkok, Thailand.

This work is supported in part by the National Science Foundation.

Using electricity and water, a new kind of motor can slide microrobots into motion

Water droplets are inserted into the microhydraulic actuator, which rotates when voltage is applied to electrodes that pull the droplets in one direction. This disc-shaped actuator’s inner diameter is 5 millimeters.
Photo: Glen Cooper

Look around and you’ll likely see something that runs on an electric motor. Powerful and efficient, electric motors keep much of our world moving, from our computers to our refrigerators to the automatic windows in our cars. But these qualities change for the worse when such motors are shrunk down to sizes smaller than a cubic centimeter.

“At very small scales, you get a heater instead of a motor,” said Jakub Kedzierski, a staff member in MIT Lincoln Laboratory’s Chemical, Microsystem, and Nanoscale Technologies Group. Today, no motor exists that is both highly efficient and powerful at microsizes. And that’s a problem, because motors on that scale are needed to put miniaturized systems into motion: microgimbals that can point lasers to a fraction of a degree over thousands of miles, tiny drones that can squeeze into wreckage to find survivors, or even bots that can crawl through the human digestive tract.

To help power systems like these, Kedzierski and his team are making a new type of motor called a microhydraulic actuator. The actuators move with a level of precision, efficiency, and power that has not yet been possible at the microscale. A paper describing this work was published in the September 2018 issue of Science Robotics.

The microhydraulic actuators use a technique called electrowetting to achieve motion. Electrowetting applies a voltage to water droplets sitting on a solid surface, changing how the droplets wet that surface and distorting their shape. The actuators take advantage of this distortion to force the water droplets inside the actuator to move, and with them, the entire actuator.

“Think about a droplet of water on a window; the force of gravity distorts it, and it moves down,” said Kedzierski. “Here, we use voltage to cause the distortion, which in turn produces motion.”

The actuator is constructed in two layers. The bottom layer is a sheet of metal with electrodes stamped into it. This layer is covered with a dielectric, an insulator that becomes polarized when an electric field is applied. The top layer is a sheet of polyimide, a strong plastic, that has shallow channels drilled into it. The channels guide the path of dozens of water droplets that are applied in between the two layers and are aligned with the electrodes. To hold off evaporation, the water is premixed with a solution of lithium chloride, which depresses the water’s vapor pressure enough for the micrometer-sized droplets to last for months. The droplets keep their rounded shape (instead of being squashed between the layers) due to their surface tension and relatively small size.

The actuator comes to life when voltage is applied to the electrodes, though not to all of them at once. It’s done in a cycle of turning on two electrodes per droplet at a time. With no voltage, a single water droplet rests neutrally on two electrodes, 1 and 2. But apply a voltage to electrodes 2 and 3, and suddenly the droplet is deformed, stretching to touch the energized electrode 3 and pulling off of electrode 1.

This horizontal force in one droplet isn’t enough to move the actuator. But with this voltage cycle being applied in unison to the electrodes underneath every drop in the array, the entire polyimide layer slides over to appease the drops’ attraction to the energized electrodes. Keep cycling the voltage through, and droplets continue to walk over the electrodes and the layer continues to slide over; turn the voltage off, and the actuator stops in its tracks. The voltage, then, becomes a powerful tool to precisely control the actuator’s movement.
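
Read schematically, the drive works like a stepper motor: energize the next pair of electrodes, the droplets shift over by one electrode, and the layer slides with them; cut the voltage and everything stops. The toy loop below only restates that cycle in code; the electrode pitch, the number of electrode phases per droplet, and the assumption that each step advances the layer by exactly one pitch are simplifications invented for illustration.

    # Schematic restatement of the drive cycle, not a physical model.
    ELECTRODE_PITCH_UM = 10.0    # assumed center-to-center electrode spacing
    PHASES_PER_DROPLET = 4       # assumed number of electrodes per droplet period

    def run_cycle(num_steps):
        layer_position_um = 0.0
        for step in range(num_steps):
            energized = (step % PHASES_PER_DROPLET,
                         (step + 1) % PHASES_PER_DROPLET)
            layer_position_um += ELECTRODE_PITCH_UM   # droplets walk one pitch
            print(f"step {step}: electrodes {energized} on, "
                  f"layer at {layer_position_um:.0f} um")
        return layer_position_um   # turning the voltage off stops the actuator

    run_cycle(4)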

But how does the actuator stand up against other types of motors? The two metrics used to measure performance are power density, the amount of power a motor produces relative to its weight, and efficiency, the fraction of input energy converted into useful motion. One of the best electric motors in terms of efficiency and power density is the motor of the Tesla Model S sedan. When the team tested the microhydraulic actuators, they found them to be just behind the Model S in power density (0.93 kilowatt per kilogram) and efficiency (60 percent at maximum power density). They far exceeded piezoelectric actuators and other types of microactuators on both measures.

“We’re excited because we are meeting that benchmark, and we are still improving as we scale to smaller sizes,” Kedzierski said. The actuators improve at smaller sizes because surface tension remains the same regardless of the water droplet size — and smaller droplets make room for even more droplets to squeeze in and exert their horizontal force on the actuator. “Power density just shoots up. It’s like having a rope whose strength doesn’t weaken as it gets thinner,” he added. 

The latest actuator, the one edging close to the Model S, had a separation of 48 micrometers between droplets. The team is now shrinking that down to 30 micrometers. They project that, at that scale, the actuator will match the Tesla Model S in power density, and, at 15 micrometers, eclipse it.

Scaling the actuators down is just one part of the equation. The other aspect the team is actively working on is 3-D integration. Right now, a single actuator is a two-layer system, thinner than a plastic bag and flexible like one too. They want to stack the actuators in a scaffold-like system that can move in three dimensions.

Kedzierski envisions such a system mimicking our bodies’ muscle matrix, the network of tissues that allow our muscles to achieve instantaneous, powerful, and flexible motion. Ten times more powerful than muscle, the actuators were inspired by muscle in many ways, from their flexibility and lightness to their composition of fluid and solid components.

And just as muscle is an excellent actuator at the scale of an ant or an elephant, these microhydraulic actuators, too, could have a powerful impact not just at the microscale, but at the macro.

“One might imagine,” said Eric Holihan, who has been assembling and testing the actuators, “the technology being applied to exoskeletons,” built with the actuators working as lifelike muscle, configured into flexible joints instead of gears. Or an aircraft wing could shapeshift on electrical command, with thousands of actuators sliding past each other to change the wing’s aerodynamic form.

While their imaginations are churning, the team faces challenges in developing large systems of the actuators. One challenge is how to distribute power at that volume. A parallel effort at the laboratory that is developing microbatteries to integrate with the actuators could help solve that issue. Another challenge is how to package the actuators so that evaporation is eliminated.

“Reliability and packaging will continue to be the predominant questions posed to us about the technology until we demonstrate a solution,” said Holihan. “This is something that we look to attack head on in the coming months.”

Fleets of drones could aid searches for lost hikers


MIT researchers describe an autonomous system for a fleet of drones to collaboratively search under dense forest canopies using only onboard computation and wireless communication — no GPS required.
Images: Melanie Gonick

By Rob Matheson

Finding lost hikers in forests can be a difficult and lengthy process, as helicopters and drones can’t get a glimpse through the thick tree canopy. Recently, it’s been proposed that autonomous drones, which can bob and weave through trees, could aid these searches. But the GPS signals used to guide the aircraft can be unreliable or nonexistent in forest environments.

In a paper being presented at the International Symposium on Experimental Robotics conference next week, MIT researchers describe an autonomous system for a fleet of drones to collaboratively search under dense forest canopies. The drones use only onboard computation and wireless communication — no GPS required.

Each autonomous quadrotor drone is equipped with laser-range finders for position estimation, localization, and path planning. As the drone flies around, it creates an individual 3-D map of the terrain. Algorithms help it recognize unexplored and already-searched spots, so it knows when it’s fully mapped an area. An off-board ground station fuses individual maps from multiple drones into a global 3-D map that can be monitored by human rescuers.

In a real-world implementation, though not in the current system, the drones would come equipped with object detection to identify a missing hiker. When located, the drone would tag the hiker’s location on the global map. Humans could then use this information to plan a rescue mission.

“Essentially, we’re replacing humans with a fleet of drones to make the search part of the search-and-rescue process more efficient,” says first author Yulun Tian, a graduate student in the Department of Aeronautics and Astronautics (AeroAstro).

The researchers tested multiple drones in simulations of randomly generated forests, and tested two drones in a forested area within NASA’s Langley Research Center. In both experiments, each drone mapped a roughly 20-square-meter area in about two to five minutes, and the drones collaboratively fused their maps in real time. The drones also performed well across several metrics, including overall speed and time to complete the mission, detection of forest features, and accurate merging of maps.

Co-authors on the paper are: Katherine Liu, a PhD student in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and AeroAstro; Kyel Ok, a PhD student in CSAIL and the Department of Electrical Engineering and Computer Science; Loc Tran and Danette Allen of the NASA Langley Research Center; Nicholas Roy, an AeroAstro professor and CSAIL researcher; and Jonathan P. How, the Richard Cockburn Maclaurin Professor of Aeronautics and Astronautics.

Exploring and mapping

On each drone, the researchers mounted a LIDAR system, which creates a 2-D scan of the surrounding obstacles by shooting laser beams and measuring the reflected pulses. This can be used to detect trees; however, to drones, individual trees appear remarkably similar. If a drone can’t recognize a given tree, it can’t determine if it’s already explored an area.

The researchers programmed their drones to instead identify multiple trees’ orientations, which is far more distinctive. With this method, when the LIDAR signal returns a cluster of trees, an algorithm calculates the angles and distances between trees to identify that cluster. “Drones can use that as a unique signature to tell if they’ve visited this area before or if it’s a new area,” Tian says.
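
The signature idea can be sketched simply. The code below is a hypothetical illustration, not the team's algorithm: it describes a cluster of trees by its sorted pairwise distances, a quantity that does not change when the same trees are seen from a shifted, rotated viewpoint. The coordinates and matching tolerance are invented.

    # Hypothetical cluster signature: sorted pairwise distances between trees,
    # which are invariant to the drone's position and heading.
    import numpy as np

    def cluster_signature(tree_positions):
        """tree_positions: (N, 2) array of tree centers in the drone's local frame."""
        pts = np.asarray(tree_positions, dtype=float)
        diffs = pts[:, None, :] - pts[None, :, :]
        dists = np.linalg.norm(diffs, axis=-1)
        upper = np.triu_indices(len(pts), k=1)
        return np.sort(dists[upper])

    def same_cluster(sig_a, sig_b, tol=0.3):
        return len(sig_a) == len(sig_b) and np.allclose(sig_a, sig_b, atol=tol)

    trees = np.array([[0.0, 0.0], [2.0, 0.5], [1.0, 3.0]])
    seen = cluster_signature(trees)

    # The same trees observed later from a rotated, shifted viewpoint:
    theta = np.deg2rad(40)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    revisit = cluster_signature(trees @ rot.T + [5.0, -2.0])

    print("revisited a known cluster:", same_cluster(seen, revisit))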

This feature-detection technique helps the ground station accurately merge maps. The drones generally explore an area in loops, producing scans as they go. The ground station continuously monitors the scans. When two drones loop around to the same cluster of trees, the ground station merges the maps by calculating the relative transformation between the drones, and then fusing the individual maps to maintain consistent orientations.

“Calculating that relative transformation tells you how you should align the two maps so it corresponds to exactly how the forest looks,” Tian says.
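
A minimal sketch of what computing that relative transformation can look like in two dimensions follows. It is not the team's full pipeline: given the same trees expressed in each drone's own coordinate frame, a standard SVD-based alignment recovers the rotation and translation that bring one map onto the other. The point coordinates here are invented, and the real system aligns full LIDAR maps rather than three hand-picked landmarks.

    # Minimal 2-D rigid alignment (Kabsch-style) between matched landmarks from
    # two drones' maps. Illustrative only.
    import numpy as np

    def relative_transform(points_a, points_b):
        """Find rotation R and translation t with points_a ~= points_b @ R.T + t."""
        A, B = np.asarray(points_a, float), np.asarray(points_b, float)
        ca, cb = A.mean(axis=0), B.mean(axis=0)
        H = (B - cb).T @ (A - ca)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:     # guard against a reflection
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = ca - cb @ R.T
        return R, t

    # The same three trees as seen in each drone's own coordinate frame:
    map_a = np.array([[0.0, 0.0], [2.0, 0.5], [1.0, 3.0]])
    theta = np.deg2rad(25)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    map_b = (map_a - [4.0, 1.0]) @ rot

    R, t = relative_transform(map_a, map_b)
    print(np.allclose(map_b @ R.T + t, map_a))   # True: the maps now line up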

In the ground station, robotic navigation software called “simultaneous localization and mapping” (SLAM) — which both maps an unknown area and keeps track of an agent inside the area — uses the LIDAR input to localize and capture the position of the drones. This helps it fuse the maps accurately.

The end result is a map with 3-D terrain features. Trees appear as blocks of colored shades of blue to green, depending on height. Unexplored areas are dark but turn gray as they’re mapped by a drone. On-board path-planning software tells a drone to always explore these dark unexplored areas as it flies around. Producing a 3-D map is more reliable than simply attaching a camera to a drone and monitoring the video feed, Tian says. Transmitting video to a central station, for instance, requires a lot of bandwidth that may not be available in forested areas.

More efficient searching

A key innovation is a novel search strategy that lets the drones explore an area more efficiently. In a more traditional approach, a drone would always head for the closest unexplored area. But that area could lie in any direction from the drone’s current position, so the drone usually flies a short distance, then stops to select a new direction.

“That doesn’t respect dynamics of drone [movement],” Tian says. “It has to stop and turn, so that means it’s very inefficient in terms of time and energy, and you can’t really pick up speed.”

Instead, the researchers’ drones explore the closest possible area while considering their speed and direction and maintaining a consistent velocity. This strategy — where the drone tends to travel in a spiral pattern — covers a search area much faster. “In search and rescue missions, time is very important,” Tian says.
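
That trade-off can be captured in a loose sketch, which is not the paper's planner: candidate frontier points are scored by distance plus a penalty on how far the drone would have to turn away from its current heading, so a point slightly farther ahead beats a nearer point behind. The cost weight and coordinates are invented.

    # Hypothetical velocity-aware frontier selection: prefer unexplored points
    # that let the drone keep moving roughly along its current heading.
    import numpy as np

    def pick_frontier(position, velocity, frontiers, turn_weight=2.0):
        pos = np.asarray(position, float)
        vel = np.asarray(velocity, float)
        heading = vel / (np.linalg.norm(vel) + 1e-9)
        best, best_cost = None, np.inf
        for point in np.asarray(frontiers, float):
            offset = point - pos
            dist = np.linalg.norm(offset)
            direction = offset / (dist + 1e-9)
            turn = np.arccos(np.clip(heading @ direction, -1.0, 1.0))  # radians
            cost = dist + turn_weight * turn
            if cost < best_cost:
                best, best_cost = point, cost
        return best

    # The nearest frontier is behind the drone; the chosen one is ahead.
    print(pick_frontier(position=[0.0, 0.0], velocity=[1.0, 0.0],
                        frontiers=[[-3.0, 0.0], [4.0, 1.0], [0.0, 5.0]]))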

In the paper, the researchers compared their new search strategy with a traditional method. Compared to that baseline, the researchers’ strategy helped the drones cover significantly more area, several minutes faster and with higher average speeds.

One limitation for practical use is that the drones still must communicate with an off-board ground station for map merging. In their outdoor experiment, the researchers had to set up a wireless router that connected each drone and the ground station. In the future, they hope to design the drones to communicate wirelessly when approaching one another, fuse their maps, and then cut communication when they separate. The ground station, in that case, would only be used to monitor the updated global map.

Machines that learn language more like kids do

MIT researchers have developed a “semantic parser” that learns through observation to more closely mimic a child’s language-acquisition process, which could greatly extend computing’s capabilities.
Photo: MIT News

By Rob Matheson

Children learn language by observing their environment, listening to the people around them, and connecting the dots between what they see and hear. Among other things, this helps children establish their language’s word order, such as where subjects and verbs fall in a sentence.

In computing, learning language is the task of syntactic and semantic parsers. These systems are trained on sentences annotated by humans that describe the structure and meaning behind words. Parsers are becoming increasingly important for web searches, natural-language database querying, and voice-recognition systems such as Alexa and Siri. Soon, they may also be used for home robotics.

But gathering the annotation data can be time-consuming and difficult for less common languages. Additionally, humans don’t always agree on the annotations, and the annotations themselves may not accurately reflect how people naturally speak.

In a paper being presented at this week’s Empirical Methods in Natural Language Processing conference, MIT researchers describe a parser that learns through observation to more closely mimic a child’s language-acquisition process, which could greatly extend the parser’s capabilities. To learn the structure of language, the parser observes captioned videos, with no other information, and associates the words with recorded objects and actions. Given a new sentence, the parser can then use what it’s learned about the structure of the language to accurately predict a sentence’s meaning, without the video.

This “weakly supervised” approach — meaning it requires limited training data — mimics how children can observe the world around them and learn language, without anyone providing direct context. The approach could expand the types of data and reduce the effort needed for training parsers, according to the researchers. A few directly annotated sentences, for instance, could be combined with many captioned videos, which are easier to come by, to improve performance.

In the future, the parser could be used to improve natural interaction between humans and personal robots. A robot equipped with the parser, for instance, could constantly observe its environment to reinforce its understanding of spoken commands, including when the spoken sentences aren’t fully grammatical or clear. “People talk to each other in partial sentences, run-on thoughts, and jumbled language. You want a robot in your home that will adapt to their particular way of speaking … and still figure out what they mean,” says co-author Andrei Barbu, a researcher in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Center for Brains, Minds, and Machines (CBMM) within MIT’s McGovern Institute.

The parser could also help researchers better understand how young children learn language. “A child has access to redundant, complementary information from different modalities, including hearing parents and siblings talk about the world, as well as tactile information and visual information, [which help him or her] to understand the world,” says co-author Boris Katz, a principal research scientist and head of the InfoLab Group at CSAIL. “It’s an amazing puzzle, to process all this simultaneous sensory input. This work is part of a bigger piece to understand how this kind of learning happens in the world.”

Co-authors on the paper are: first author Candace Ross, a graduate student in the Department of Electrical Engineering and Computer Science and CSAIL, and a researcher in CBMM; Yevgeni Berzak PhD ’17, a postdoc in the Computational Psycholinguistics Group in the Department of Brain and Cognitive Sciences; and CSAIL graduate student Battushig Myanganbayar.

Visual learner

For their work, the researchers combined a semantic parser with a computer-vision component trained in object, human, and activity recognition in video. Semantic parsers are generally trained on sentences annotated with code that ascribes meaning to each word and the relationships between the words. Some have been trained on still images or computer simulations.

The new parser is the first to be trained using video, Ross says. In part, videos are more useful in reducing ambiguity. If the parser is unsure about, say, an action or object in a sentence, it can reference the video to clear things up. “There are temporal components — objects interacting with each other and with people — and high-level properties you wouldn’t see in a still image or just in language,” Ross says.

The researchers compiled a dataset of about 400 videos depicting people carrying out a number of actions, including picking up an object or putting it down, and walking toward an object. Participants on the crowdsourcing platform Mechanical Turk then provided 1,200 captions for those videos. They set aside 840 video-caption examples for training and tuning, and used 360 for testing. One advantage of using vision-based parsing is “you don’t need nearly as much data — although if you had [the data], you could scale up to huge datasets,” Barbu says.

In training, the researchers gave the parser the objective of determining whether a sentence accurately describes a given video. They fed the parser a video and matching caption. The parser extracts possible meanings of the caption as logical mathematical expressions. The sentence, “The woman is picking up an apple,” for instance, may be expressed as: λxy. woman x, pick_up x y, apple y.

Those expressions and the video are inputted to the computer-vision algorithm, called “Sentence Tracker,” developed by Barbu and other researchers. The algorithm looks at each video frame to track how objects and people transform over time, to determine if actions are playing out as described. In this way, it determines if the meaning is possibly true of the video.
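
A highly simplified, hypothetical sketch of that selection step follows. Each candidate meaning is reduced to a bag of predicates and the video to a set of detected labels, and the best-supported candidate wins; the real Sentence Tracker reasons over per-frame object tracks and is far richer than this.

    # Toy version of scoring candidate meanings against video detections.
    video_detections = {"woman", "apple", "pick_up"}      # stand-in vision output

    candidate_meanings = {
        "woman picks up apple": {"woman", "apple", "pick_up"},
        "woman puts down apple": {"woman", "apple", "put_down"},
        "man picks up ball": {"man", "ball", "pick_up"},
    }

    def support(predicates, detections):
        return sum(p in detections for p in predicates) / len(predicates)

    best = max(candidate_meanings,
               key=lambda meaning: support(candidate_meanings[meaning], video_detections))
    print(best, support(candidate_meanings[best], video_detections))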

Connecting the dots

The expression with the most closely matching representations for objects, humans, and actions becomes the most likely meaning of the caption. The expression, initially, may refer to many different objects and actions in the video, but the set of possible meanings serves as a training signal that helps the parser continuously winnow down possibilities. “By assuming that all of the sentences must follow the same rules, that they all come from the same language, and seeing many captioned videos, you can narrow down the meanings further,” Barbu says.

In short, the parser learns through passive observation: To determine if a caption is true of a video, the parser by necessity must identify the highest probability meaning of the caption. “The only way to figure out if the sentence is true of a video [is] to go through this intermediate step of, ‘What does the sentence mean?’ Otherwise, you have no idea how to connect the two,” Barbu explains. “We don’t give the system the meaning for the sentence. We say, ‘There’s a sentence and a video. The sentence has to be true of the video. Figure out some intermediate representation that makes it true of the video.’”

The training produces a syntactic and semantic grammar for the words it’s learned. Given a new sentence, the parser no longer requires videos, but leverages its grammar and lexicon to determine sentence structure and meaning.

Ultimately, this process is learning “as if you’re a kid,” Barbu says. “You see the world around you and hear people speaking to learn meaning. One day, I can give you a sentence and ask what it means and, even without a visual, you know the meaning.”

“This research is exactly the right direction for natural language processing,” says Stefanie Tellex, a professor of computer science at Brown University who focuses on helping robots use natural language to communicate with humans. “To interpret grounded language, we need semantic representations, but it is not practicable to make it available at training time. Instead, this work captures representations of compositional structure using context from captioned videos. This is the paper I have been waiting for!”

In future work, the researchers are interested in modeling interactions, not just passive observations. “Children interact with the environment as they’re learning. Our idea is to have a model that would also use perception to learn,” Ross says.

This work was supported, in part, by the CBMM, the National Science Foundation, a Ford Foundation Graduate Research Fellowship, the Toyota Research Institute, and the MIT-IBM Brain-Inspired Multimedia Comprehension project.

How to mass produce cell-sized robots

This photo shows circles on a graphene sheet where the sheet is draped over an array of round posts, creating stresses that will cause these discs to separate from the sheet. The gray bar across the sheet is liquid being used to lift the discs from the surface.
Image: Felice Frankel

By David L. Chandler

Tiny robots no bigger than a cell could be mass-produced using a new method developed by researchers at MIT. The microscopic devices, which the team calls “syncells” (short for synthetic cells), might eventually be used to monitor conditions inside an oil or gas pipeline, or to search out disease while floating through the bloodstream.

The key to making such tiny devices in large quantities lies in a method the team developed for controlling the natural fracturing process of atomically thin, brittle materials, directing the fracture lines so that they produce minuscule pockets of a predictable size and shape. Embedded inside these pockets are electronic circuits and materials that can collect, record, and output data.

The novel process, called “autoperforation,” is described in a paper published today in the journal Nature Materials, by MIT Professor Michael Strano, postdoc Pingwei Liu, graduate student Albert Liu, and eight others at MIT.

The system uses a two-dimensional form of carbon called graphene, which forms the outer structure of the tiny syncells. One layer of the material is laid down on a surface, then tiny dots of a polymer material, containing the electronics for the devices, are deposited by a sophisticated laboratory version of an inkjet printer. Then, a second layer of graphene is laid on top.

Controlled fracturing

People think of graphene, an ultrathin but extremely strong material, as being “floppy,” but it is actually brittle, Strano explains. Rather than considering that brittleness a problem, though, the team figured out that it could be used to their advantage.

“We discovered that you can use the brittleness,” says Strano, who is the Carbon P. Dubbs Professor of Chemical Engineering at MIT. “It’s counterintuitive. Before this work, if you told me you could fracture a material to control its shape at the nanoscale, I would have been incredulous.”

But the new system does just that. It controls the fracturing process so that rather than generating random shards of material, like the remains of a broken window, it produces pieces of uniform shape and size. “What we discovered is that you can impose a strain field to cause the fracture to be guided, and you can use that for controlled fabrication,” Strano says.

When the top layer of graphene is placed over the array of polymer dots, which form round pillar shapes, the places where the graphene drapes over the round edges of the pillars form lines of high strain in the material. As Albert Liu describes it, “imagine a tablecloth falling slowly down onto the surface of a circular table. One can very easily visualize the developing circular strain toward the table edges, and that’s very much analogous to what happens when a flat sheet of graphene folds around these printed polymer pillars.”

As a result, the fractures are concentrated right along those boundaries, Strano says. “And then something pretty amazing happens: The graphene will completely fracture, but the fracture will be guided around the periphery of the pillar.” The result is a neat, round piece of graphene that looks as if it had been cleanly cut out by a microscopic hole punch.

Because there are two layers of graphene, above and below the polymer pillars, the two resulting disks adhere at their edges to form something like a tiny pita bread pocket, with the polymer sealed inside. “And the advantage here is that this is essentially a single step,” in contrast to many complex clean-room steps needed by other processes to try to make microscopic robotic devices, Strano says.

The researchers have also shown that other two-dimensional materials in addition to graphene, such as molybdenum disulfide and hexagonal boron nitride, work just as well.

Cell-like robots

Ranging in size from that of a human red blood cell, about 10 micrometers across, up to about 10 times that size, these tiny objects “start to look and behave like a living biological cell. In fact, under a microscope, you could probably convince most people that it is a cell,” Strano says.

This work follows up on earlier research by Strano and his students on developing syncells that could gather information about the chemistry or other properties of their surroundings using sensors on their surface, and store the information for later retrieval. A swarm of such particles might, for example, be injected into one end of a pipeline and retrieved at the other end to gather data about conditions inside it. While the new syncells do not yet have as many capabilities as the earlier ones, those earlier versions were assembled individually, whereas this work demonstrates a way to easily mass-produce such devices.

Apart from the syncells’ potential uses for industrial or biomedical monitoring, the way the tiny devices are made is itself an innovation with great potential, according to Albert Liu. “This general procedure of using controlled fracture as a production method can be extended across many length scales,” he says. “[It could potentially be used with] essentially any 2-D materials of choice, in principle allowing future researchers to tailor these atomically thin surfaces into any desired shape or form for applications in other disciplines.”

This is, Albert Liu says, “one of the only ways available right now to produce stand-alone integrated microelectronics on a large scale” that can function as independent, free-floating devices. Depending on the nature of the electronics inside, the devices could be provided with capabilities for movement, detection of various chemicals or other parameters, and memory storage.

There is a wide range of potential new applications for such cell-sized robotic devices, says Strano, who details many possible uses in “Robotic Systems and Autonomous Platforms,” a book on the subject that he co-authored with Shawn Walsh, an expert at Army Research Laboratories, and that is being published this month by Elsevier Press.

As a demonstration, the team “wrote” the letters M, I, and T into a memory array within a syncell, which stores the information as varying levels of electrical conductivity. This information can then be “read” using an electrical probe, showing that the material can function as a form of electronic memory into which data can be written, read, and erased at will. It can also retain the data without the need for power, allowing information to be collected at a later time. The researchers have demonstrated that the particles are stable over a period of months even when floating around in water, which is a harsh solvent for electronics, according to Strano.

“I think it opens up a whole new toolkit for micro- and nanofabrication,” he says.

Daniel Goldman, a professor of physics at Georgia Tech, who was not involved with this work, says, “The techniques developed by Professor Strano’s group have the potential to create microscale intelligent devices that can accomplish tasks together that no single particle can accomplish alone.”

In addition to Strano, Pingwei Liu, who is now at Zhejiang University in China, and Albert Liu, a graduate student in the Strano lab, the team included MIT graduate student Jing Fan Yang, postdocs Daichi Kozawa, Juyao Dong, and Volodomyr Koman, Youngwoo Son PhD ’16, research affiliate Min Hao Wong, and Dartmouth College student Max Saccone and visiting scholar Song Wang. The work was supported by the Air Force Office of Scientific Research, and the Army Research Office through MIT’s Institute for Soldier Nanotechnologies.

How should autonomous vehicles be programmed?

Ethical questions involving autonomous vehicles are the focus of a new global survey conducted by MIT researchers.

By Peter Dizikes

A massive new survey developed by MIT researchers reveals some distinct global preferences concerning the ethics of autonomous vehicles, as well as some regional variations in those preferences.

The survey has global reach and a unique scale, with over 2 million online participants from over 200 countries weighing in on versions of a classic ethical conundrum, the “Trolley Problem.” The problem involves scenarios in which an accident involving a vehicle is imminent, and the vehicle must opt for one of two potentially fatal options. In the case of driverless cars, that might mean swerving toward a couple of people, rather than a large group of bystanders.

“The study is basically trying to understand the kinds of moral decisions that driverless cars might have to resort to,” says Edmond Awad, a postdoc at the MIT Media Lab and lead author of a new paper outlining the results of the project. “We don’t know yet how they should do that.”

Still, Awad adds, “We found that there are three elements that people seem to approve of the most.”

Indeed, the most emphatic global preferences in the survey are for sparing the lives of humans over the lives of other animals; sparing the lives of many people rather than a few; and preserving the lives of the young, rather than older people.

“The main preferences were to some degree universally agreed upon,” Awad notes. “But the degree to which they agree with this or not varies among different groups or countries.” For instance, the researchers found a less pronounced tendency to favor younger people, rather than the elderly, in what they defined as an “eastern” cluster of countries, including many in Asia.

The paper, “The Moral Machine Experiment,” is being published today in Nature.

The authors are Awad; Sohan Dsouza, a doctoral student in the Media Lab; Richard Kim, a research assistant in the Media Lab; Jonathan Schulz, a postdoc at Harvard University; Joseph Henrich, a professor at Harvard; Azim Shariff, an associate professor at the University of British Columbia; Jean-François Bonnefon, a professor at the Toulouse School of Economics; and Iyad Rahwan, an associate professor of media arts and sciences at the Media Lab, and a faculty affiliate in the MIT Institute for Data, Systems, and Society.

Awad is a postdoc in the MIT Media Lab’s Scalable Cooperation group, which is led by Rahwan.

To conduct the survey, the researchers designed what they call “Moral Machine,” a multilingual online game in which participants could state their preferences concerning a series of dilemmas that autonomous vehicles might face. For instance: If it comes right down to it, should autonomous vehicles spare the lives of law-abiding bystanders, or, alternately, law-breaking pedestrians who might be jaywalking? (Most people in the survey opted for the former.)

All told, “Moral Machine” compiled nearly 40 million individual decisions from respondents in 233 countries; the survey collected 100 or more responses from 130 countries. The researchers analyzed the data as a whole, while also breaking participants into subgroups defined by age, education, gender, income, and political and religious views. There were 491,921 respondents who offered demographic data.

The scholars did not find marked differences in moral preferences based on these demographic characteristics, but they did find larger “clusters” of moral preferences based on cultural and geographic affiliations. They defined “western,” “eastern,” and “southern” clusters of countries, and found some more pronounced variations along these lines. For instance: Respondents in southern countries had a relatively stronger tendency to favor sparing young people rather than the elderly, especially compared to the eastern cluster.

Awad suggests that acknowledging these types of preferences should be a basic part of informing public-sphere discussion of these issues. For instance, since there is a moderate preference in all regions for sparing law-abiding bystanders rather than jaywalkers, knowing these preferences could, in theory, inform the way software is written to control autonomous vehicles.

“The question is whether these differences in preferences will matter in terms of people’s adoption of the new technology when [vehicles] employ a specific rule,” he says.

Rahwan, for his part, notes that “public interest in the platform surpassed our wildest expectations,” allowing the researchers to conduct a survey that raised awareness about automation and ethics while also yielding specific public-opinion information.

“On the one hand, we wanted to provide a simple way for the public to engage in an important societal discussion,” Rahwan says. “On the other hand, we wanted to collect data to identify which factors people think are important for autonomous cars to use in resolving ethical tradeoffs.”

Beyond the results of the survey, Awad suggests, seeking public input about issues of innovation and public safety should become a larger part of the dialogue surrounding autonomous vehicles.

“What we have tried to do in this project, and what I would hope becomes more common, is to create public engagement in these sorts of decisions,” Awad says.
