
“Peel-and-go” printable structures fold themselves

A new method produces a printable structure that begins to fold itself up as soon as it’s peeled off the printing platform. Credit: MIT
by Larry Hardesty

As 3-D printing has become a mainstream technology, industry and academic researchers have been investigating printable structures that will fold themselves into useful three-dimensional shapes when heated or immersed in water.

In a paper appearing in the American Chemical Society’s journal ACS Applied Materials & Interfaces, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and colleagues report something new: a printable structure that begins to fold itself up as soon as it’s peeled off the printing platform.

One of the big advantages of devices that self-fold without any outside stimulus, the researchers say, is that they can involve a wider range of materials and more delicate structures.

“If you want to add printed electronics, you’re generally going to be using some organic materials, because a majority of printed electronics rely on them,” says Subramanian Sundaram, an MIT graduate student in electrical engineering and computer science and first author on the paper. “These materials are often very, very sensitive to moisture and temperature. So if you have these electronics and parts, and you want to initiate folds in them, you wouldn’t want to dunk them in water or heat them, because then your electronics are going to degrade.”

To illustrate this idea, the researchers built a prototype self-folding printable device that includes electrical leads and a polymer “pixel” that changes from transparent to opaque when a voltage is applied to it. The device, which is a variation on the “printable goldbug” that Sundaram and his colleagues announced earlier this year, starts out looking something like the letter “H.” But each of the legs of the H folds itself in two different directions, producing a tabletop shape.

The researchers also built several different versions of the same basic hinge design, which show that they can control the precise angle at which a joint folds. In tests, they forcibly straightened the hinges by attaching them to a weight, but when the weight was removed, the hinges resumed their original folds.

In the short term, the technique could enable the custom manufacture of sensors, displays, or antennas whose functionality depends on their three-dimensional shape. Longer term, the researchers envision the possibility of printable robots.

Sundaram is joined on the paper by his advisor, Wojciech Matusik, an associate professor of electrical engineering and computer science (EECS) at MIT; Marc Baldo, also an associate professor of EECS, who specializes in organic electronics; David Kim, a technical assistant in Matusik’s Computational Fabrication Group; and Ryan Hayward, a professor of polymer science and engineering at the University of Massachusetts at Amherst.

This clip shows an example of an accelerated fold. (Image: Tom Buehler/CSAIL)

Stress relief

The key to the researchers’ design is a new printer-ink material that expands after it solidifies, which is unusual. Most printer-ink materials contract slightly as they solidify, a technical limitation that designers frequently have to work around.

Printed devices are built up in layers, and in their prototypes the MIT researchers deposit their expanding material at precise locations in either the top or bottom few layers. The bottom layer adheres slightly to the printer platform, and that adhesion is enough to hold the device flat as the layers are built up. But as soon as the finished device is peeled off the platform, the joints made from the new material begin to expand, bending the device in the opposite direction.
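To get a feel for the mechanics, a classic bilayer model is a useful stand-in (this is an illustration, not the researchers’ published model): Timoshenko’s formula gives the curvature of a strip in which one layer expands by a fixed strain relative to the layer it is bonded to. All numbers in the sketch below are hypothetical.

```python
import math

# Rough, hypothetical illustration (not the authors' published model):
# Timoshenko's classic bilayer formula for the curvature of a strip in which
# one layer expands by a strain eps relative to the layer beneath it.
def bilayer_curvature(eps, t1, t2, E1, E2):
    """Curvature in 1/m; t = layer thicknesses in m, E = elastic moduli in Pa."""
    m = t1 / t2                       # thickness ratio (expanding / substrate)
    n = E1 / E2                       # stiffness ratio
    h = t1 + t2                       # total thickness
    denom = 3 * (1 + m) ** 2 + (1 + m * n) * (m ** 2 + 1 / (m * n))
    return 6 * eps * (1 + m) ** 2 / (h * denom)

# Made-up numbers: a 2 percent expansion strain in a 100-micron layer printed
# on a 300-micron substrate bends a 2 mm hinge by roughly kappa * length.
kappa = bilayer_curvature(eps=0.02, t1=100e-6, t2=300e-6, E1=1e6, E2=10e6)
print(f"fold angle over a 2 mm hinge: {math.degrees(kappa * 2e-3):.1f} degrees")
```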

Like many technological breakthroughs, the CSAIL researchers’ discovery of the material was an accident. Most of the printer materials used by Matusik’s Computational Fabrication Group are combinations of polymers, long molecules that consist of chainlike repetitions of single molecular components, or monomers. Mixing these components is one method for creating printer inks with specific physical properties.

While trying to develop an ink that yielded more flexible printed components, the CSAIL researchers inadvertently hit upon one that expanded slightly after it hardened. They immediately recognized the potential utility of expanding polymers and began experimenting with modifications of the mixture, until they arrived at a recipe that let them build joints that would expand enough to fold a printed device in half.

Whys and wherefores

Hayward’s contribution to the paper was to help the MIT team explain the material’s expansion. The ink that produces the most forceful expansion includes several long molecular chains and one much shorter chain, made up of the monomer isooctyl acrylate. When a layer of the ink is exposed to ultraviolet light — or “cured,” a process commonly used in 3-D printing to harden materials deposited as liquids — the long chains connect to each other, producing a rigid thicket of tangled molecules.

When another layer of the material is deposited on top of the first, the small chains of isooctyl acrylate in the top, liquid layer sink down into the lower, more rigid layer. There, they interact with the longer chains to exert an expansive force, which the adhesion to the printing platform temporarily resists.

The researchers hope that a better theoretical understanding of the reason for the material’s expansion will enable them to design materials tailored to specific applications — including materials that resist the 1–3 percent contraction typical of many printed polymers after curing.

“This work is exciting because it provides a way to create functional electronics on 3-D objects,” says Michael Dickey, a professor of chemical engineering at North Carolina State University. “Typically, electronic processing is done in a planar, 2-D fashion and thus needs a flat surface. The work here provides a route to create electronics using more conventional planar techniques on a 2-D surface and then transform them into a 3-D shape, while retaining the function of the electronics. The transformation happens by a clever trick to build stress into the materials during printing.”

3 Questions: Iyad Rahwan on the “psychological roadblocks” facing self-driving cars

An image of some connected autonomous cars

by Peter Dizikes

This summer, a survey released by the American Automobile Association showed that 78 percent of Americans feared riding in a self-driving car, with just 19 percent trusting the technology. What might it take to alter public opinion on the issue? Iyad Rahwan, the AT&T Career Development Professor in the MIT Media Lab, has studied the issue at length, and, along with Jean-François Bonnefon of the Toulouse School of Economics and Azim Shariff of the University of California at Irvine, has authored a new commentary on the subject, titled “Psychological roadblocks to the adoption of self-driving vehicles,” published today in Nature Human Behaviour. Rahwan spoke to MIT News about the hurdles automakers face if they want greater public buy-in for autonomous vehicles.

Q: Your new paper states that when it comes to autonomous vehicles, trust “will determine how widely they are adopted by consumers, and how tolerated they are by everyone else.” Why is this?

A: It’s a new kind of agent in the world. We’ve always built tools and had to trust that technology will function in the way it was intended. We’ve had to trust that the materials are reliable and don’t have health hazards, and that there are consumer protection entities that promote the interests of consumers. But these are passive products that we choose to use. For the first time in history we are building objects that are proactive and have autonomy and are even adaptive. They are learning behaviors that may be different from the ones they were originally programmed for. We don’t really know how to get people to trust such entities, because humans don’t have mental models of what these entities are, what they’re capable of, how they learn.

Before we can trust machines like autonomous vehicles, we have a number of challenges. The first is technical: the challenge of building an AI [artificial intelligence] system that can drive a car. The second is legal and regulatory: Who is liable for different kinds of faults? A third class of challenges is psychological. Unless people are comfortable putting their lives in the hands of AI, then none of this will matter. People won’t buy the product, the economics won’t work, and that’s the end of the story. What we’re trying to highlight in this paper is that these psychological challenges have to be taken seriously, even if [people] are irrational in the way they assess risk, even if the technology is safe and the legal framework is reliable.

Q: What are the specific psychological issues people have with autonomous vehicles?

A: We classify three psychological challenges that we think are fairly big. One of them is dilemmas: A lot of people are concerned about how autonomous vehicles will resolve ethical dilemmas. How will they decide, for example, whether to prioritize safety for the passenger or safety for pedestrians? Should this influence the way in which the car makes a decision about relative risk? And what we’re finding is that people have an idea about how to solve this dilemma: The car should just minimize harm. But the problem is that people are not willing to buy such cars, because they want to buy cars that will always prioritize themselves.

A second one is that people don’t always reason about risk in an unbiased way. People may overplay the risk of dying in a car crash caused by an autonomous vehicle even if autonomous vehicles are, on average, safer. We’ve seen this kind of overreaction in other fields. Many people are afraid of flying even though you’re far less likely to die in a plane crash than in a car crash. So people don’t always assess risk accurately.

The third class of psychological challenges is this idea that we don’t always have transparency about what the car is thinking and why it’s doing what it’s doing. The carmaker has better knowledge of what the car thinks and how it behaves … which makes it more difficult for people to predict the behavior of autonomous vehicles, which can also diminish trust. One of the preconditions of trust is predictability: If I can trust that you will behave in a particular way, I can behave according to that expectation.

Q: In the paper you state that autonomous vehicles are better depicted “as being perfected, not as perfect.” In essence, is that your advice to the auto industry?

A: Yes, I think setting up very high expectations can be a recipe for disaster, because if you overpromise and underdeliver, you get in trouble. That is not to say that we should underpromise. We should just be a bit realistic about what we promise. If the promise is an improvement on the current status quo, that is, a reduction in risk to everyone, both pedestrians as well as passengers in cars, that’s an admirable goal. Even if we achieve it in a small way, that’s already progress that we should take seriously. I think being transparent about that, and being transparent about the progress being made toward that goal, is crucial.

IBM and MIT to pursue joint research in artificial intelligence, establish new MIT–IBM Watson AI Lab

MIT President L. Rafael Reif, left, and John Kelly III, IBM senior vice president, Cognitive Solutions and Research, shake hands at the conclusion of a signing ceremony establishing the new MIT–IBM Watson AI Lab. Credit: Jake Belcher

IBM and MIT today announced that IBM plans to make a 10-year, $240 million investment to create the MIT–IBM Watson AI Lab in partnership with MIT. The lab will carry out fundamental artificial intelligence (AI) research and seek to propel scientific breakthroughs that unlock the potential of AI. The collaboration aims to advance AI hardware, software, and algorithms related to deep learning and other areas; increase AI’s impact on industries, such as health care and cybersecurity; and explore the economic and ethical implications of AI on society. IBM’s $240 million investment in the lab will support research by IBM and MIT scientists.

The new lab will be one of the largest long-term university-industry AI collaborations to date, mobilizing the talent of more than 100 AI scientists, professors, and students to pursue joint research at IBM’s Research Lab in Cambridge, Massachusetts — co-located with the IBM Watson Health and IBM Security headquarters in Kendall Square — and on the neighboring MIT campus.

The lab will be co-chaired by Dario Gil, IBM Research VP of AI and IBM Q, and Anantha P. Chandrakasan, dean of MIT’s School of Engineering. (Read a related Q&A with Chandrakasan.) IBM and MIT plan to issue a call for proposals to MIT researchers and IBM scientists to submit their ideas for joint research to push the boundaries in AI science and technology in several areas, including:

  • AI algorithms: Developing advanced algorithms to expand capabilities in machine learning and reasoning. Researchers will create AI systems that move beyond specialized tasks to tackle more complex problems and benefit from robust, continuous learning. Researchers will invent new algorithms that can not only leverage big data when available, but also learn from limited data to augment human intelligence.
  • Physics of AI: Investigating new AI hardware materials, devices, and architectures that will support future analog computational approaches to AI model training and deployment, as well as the intersection of quantum computing and machine learning. The latter involves using AI to help characterize and improve quantum devices, and researching the use of quantum computing to optimize and speed up machine-learning algorithms and other AI applications.
  • Application of AI to industries: Given its location in IBM Watson Health and IBM Security headquarters in Kendall Square, a global hub of biomedical innovation, the lab will develop new applications of AI for professional use, including fields such as health care and cybersecurity. The collaboration will explore the use of AI in areas such as the security and privacy of medical data, personalization of health care, image analysis, and the optimum treatment paths for specific patients.
  • Advancing shared prosperity through AI: The MIT–IBM Watson AI Lab will explore how AI can deliver economic and societal benefits to a broader range of people, nations, and enterprises. The lab will study the economic implications of AI and investigate how AI can improve prosperity and help individuals achieve more in their lives.

In addition to IBM’s plan to produce innovations that advance the frontiers of AI, a distinct objective of the new lab is to encourage MIT faculty and students to launch companies that will focus on commercializing AI inventions and technologies that are developed at the lab. The lab’s scientists also will publish their work, contribute to the release of open source material, and foster an adherence to the ethical application of AI.

“The field of artificial intelligence has experienced incredible growth and progress over the past decade. Yet today’s AI systems, as remarkable as they are, will require new innovations to tackle increasingly difficult real-world problems to improve our work and lives,” says John Kelly III, IBM senior vice president, Cognitive Solutions and Research. “The extremely broad and deep technical capabilities and talent at MIT and IBM are unmatched, and will lead the field of AI for at least the next decade.”

“I am delighted by this new collaboration,” MIT President L. Rafael Reif says. “True breakthroughs are often the result of fresh thinking inspired by new kinds of research teams. The combined MIT and IBM talent dedicated to this new effort will bring formidable power to a field with staggering potential to advance knowledge and help solve important challenges.”

Both MIT and IBM have been pioneers in artificial intelligence research, and the new AI lab builds on a decades-long research relationship between the two. In 2016, IBM Research announced a multiyear collaboration with MIT’s Department of Brain and Cognitive Sciences to advance the scientific field of machine vision, a core aspect of artificial intelligence. The collaboration has brought together leading brain, cognitive, and computer scientists to conduct research in the field of unsupervised machine understanding of audio-visual streams of data, using insights from next-generation models of the brain to inform advances in machine vision. In addition, IBM and the Broad Institute of MIT and Harvard have established a five-year, $50 million research collaboration on AI and genomics.

MIT researchers were among those who helped coin and popularize the very phrase “artificial intelligence” in the 1950s. MIT pushed several major advances in the subsequent decades, from neural networks to data encryption to quantum computing to crowdsourcing. Marvin Minsky, a founder of the discipline, collaborated on building the first artificial neural network and he, along with Seymour Papert, advanced learning algorithms. Currently, the Computer Science and Artificial Intelligence Laboratory, the Media Lab, the Department of Brain and Cognitive Sciences, and the MIT Institute for Data, Systems, and Society serve as connected hubs for AI and related research at MIT.

For more than 20 years, IBM has explored the application of AI across many areas and industries. IBM researchers invented and built Watson, which is a cloud-based AI platform being used by businesses, developers, and universities to fight cancer, improve classroom learning, minimize pollution, enhance agriculture and oil and gas exploration, better manage financial investments, and much more. Today, IBM scientists across the globe are working on fundamental advances in AI algorithms, science and technology that will pave the way for the next generation of artificially intelligent systems.

For information about employment opportunities with IBM at the new AI Lab, please visit MITIBMWatsonAILab.mit.edu.

Gregory Falco: Protecting urban infrastructure against cyberterrorism

“The concept of my startup is, ‘Let’s use hacker tools to defeat hackers,’” PhD student Gregory Falco says. “If you don’t know how to break it, you don’t know how to fix it.”
Photo: Ian MacLellan

by Dara Farhadi

While working for the global management consulting company Accenture, Gregory Falco discovered just how vulnerable the technologies underlying smart cities and the “internet of things” — everyday devices that are connected to the internet or a network — are to cyberterrorism attacks.

“What happened was, I was telling sheiks and government officials all around the world about how amazing the internet of things is and how it’s going to solve all their problems and solve sustainability issues and social problems,” Falco says. “And then they asked me, ‘Is it secure?’ I looked at the security guys and they said, ‘There’s no problem.’ And then I looked under the hood myself, and there was nothing going on there.”

Falco is entering the third and final year of his PhD in the Department of Urban Studies and Planning (DUSP), and he is carrying out his research at the Computer Science and Artificial Intelligence Laboratory (CSAIL). His focus is cybersecurity for urban critical infrastructure, and the internet of things, or IoT, is at the center of his work. A washing machine that is connected to an app on its owner’s smartphone, for example, is considered part of the IoT. Billions of IoT devices lack traditional security software because they are built with small amounts of memory and low-power processors. This leaves them susceptible to cyberattacks and can give hackers a gateway to breach other devices on the same network.

Falco’s concentration is on industrial controls and embedded systems such as automatic switches found in subway systems.

“If someone decides to figure out how to access a switch by hacking another access point that is communicating with that switch, then that subway is not going to stop, and people are going to die,” Falco says. “We rely on these systems for our life functions — critical infrastructure like electric grids, water grids, or transportation systems, but also our health care systems. Insulin pumps, for example, are now connected to your smartphone.”

Citing real-world examples, Falco notes that Russian hackers were able to take down the Ukrainian capital city’s electric grid, and that Iranian hackers interfered with the computer-guided controls of a small dam in Rye Brook, New York.

Falco aims to help combat potential cyberattacks through his research. One arm of his dissertation, which he is working on with renowned negotiation expert Professor Lawrence Susskind, is aimed at conflict negotiation and looks at how best to negotiate with cyberterrorists. Also, with CSAIL Principal Research Scientist Howard Shrobe, Falco seeks to determine the possibility of predicting which control-system vulnerabilities could be exploited in critical urban infrastructure. The final branch of Falco’s dissertation is a collaboration with NASA’s Jet Propulsion Laboratory, for which he has secured a contract to develop an artificial-intelligence-powered automated attack generator that can identify all the possible ways someone could hack and destroy NASA’s systems.

“What I really intend to do for my PhD is something that is actionable to the communities I’m working with,” Falco says. “I don’t want to publish something in a book that will sit on a shelf where nobody would read it.”

“Not science fiction anymore”

Falco’s battle against cyberterrorism has also led him to co-found NeuroMesh, a startup dedicated to protecting IoT devices by using the same techniques hackers use.

“The concept of my startup is, ‘Let’s use hacker tools to defeat hackers,’” Falco says. “If you don’t know how to break it, you don’t know how to fix it.”

One tool hackers use is called a botnet. Once a botnet gets onto a device, it often kills off other malware there so that it can use all of the device’s processing power for itself. Botnets also play “king of the hill,” refusing to let other botnets latch onto the device.

NeuroMesh uses a botnet’s features against itself to create a good botnet. By re-engineering the botnet, programmers can use it to defeat any kind of malware that comes onto a device.

“The benefit is also that when you look at securing IoT devices with low memory and low processing power, it’s impossible to put any security on them, but these botnets have no problem getting on there because they are so small,” Falco says.

Much like a vaccine protects against diseases, NeuroMesh applies a cyber vaccine to protect industrial devices from cyberattacks. And, by leveraging the bitcoin blockchain to update devices, NeuroMesh further fortifies the security system to block other malware from attacking vital IoT devices.

Recently, Falco and his team pitched their botnet vaccine at MIT’s $100K Accelerate competition and placed second. Falco’s infant son was in the audience as Falco presented how NeuroMesh’s technology could secure a baby monitor, for example, from being hacked. The startup advanced to MIT’s prestigious $100K Launch competition, where it finished among the top eight competitors. NeuroMesh is now further developing its technology with the help of a grant from the Department of Energy, working with Stuart Madnick, the John Norris Maguire Professor at MIT, and Michael Siegel, a principal research scientist at MIT’s Sloan School of Management.

“Enemies are here. They are on our turf and in our wires. It’s not science fiction anymore,” Falco says. “We’re protecting against this. That’s what NeuroMesh is meant to do.”  

The human tornado

Falco’s abundant energy has led his family to call him “the tornado.”

“One-fourth of my mind is on my startup, one-fourth on finishing my dissertation, and the other half is on my 11-month-old because he comes with me when my wife works,” Falco says. “He comes to all our venture capital meetings and my presentations. He’s always around and he’s generally very good.”

As a high school student, Falco’s energy and excitement for engineering drove him to discover a new physics wave theory. Applying this to the tennis racket, he invented a new, control-enhanced method of stringing, with which he won various science competitions (and tennis matches). He used this knowledge to start a small business for stringing rackets. The thrill of business took him on a path to Cornell University’s School of Hotel Administration. After graduating early, Falco transitioned into the field of sustainability technology and energy systems, and returned to his engineering roots by earning his LEED AP (Leadership in Energy and Environmental Design) accreditation and a master’s degree in sustainability management from Columbia University.

His excitement followed him to Accenture, where he founded the smart cities division and eventually learned about the vulnerability of IoT devices. For the past three years, Falco has also been sharing his newfound knowledge about sustainability and computer science as an adjunct professor at Columbia University.

“My challenge is always to find these interdisciplinary holes because my background is so messed up. You can’t say, this guy is a computer scientist or he’s a business person or an environmental scientist because I’m all over the place,” he says.

That’s part of the reason why Falco enjoys taking care of his son, Milo, so much.

“He’s the most awesome thing ever. I see him learning and it’s really amazing,” Falco says. “Spending so much time with him is very fun. He does things that my wife gets frustrated at because he’s a ball of energy and all over the place — just like me.”

Robotic system monitors specific neurons

MIT engineers have devised a way to automate the process of monitoring neurons in a living brain using a computer algorithm that analyzes microscope images and guides a robotic arm to the target cell. In this image, a pipette guided by a robotic arm approaches a neuron identified with a fluorescent stain.
Credit: Ho-Jun Suk

by Anne Trafton

Recording electrical signals from inside a neuron in the living brain can reveal a great deal of information about that neuron’s function and how it coordinates with other cells in the brain. However, performing this kind of recording is extremely difficult, so only a handful of neuroscience labs around the world do it.

To make this technique more widely available, MIT engineers have now devised a way to automate the process, using a computer algorithm that analyzes microscope images and guides a robotic arm to the target cell.

This technology could allow more scientists to study single neurons and learn how they interact with other cells to enable cognition, sensory perception, and other brain functions. Researchers could also use it to learn more about how neural circuits are affected by brain disorders.

“Knowing how neurons communicate is fundamental to basic and clinical neuroscience. Our hope is this technology will allow you to look at what’s happening inside a cell, in terms of neural computation, or in a disease state,” says Ed Boyden, an associate professor of biological engineering and brain and cognitive sciences at MIT, and a member of MIT’s Media Lab and McGovern Institute for Brain Research.

Boyden is the senior author of the paper, which appears in the Aug. 30 issue of Neuron. The paper’s lead author is MIT graduate student Ho-Jun Suk.

Precision guidance

For more than 30 years, neuroscientists have been using a technique known as patch clamping to record the electrical activity of cells. This method, which involves bringing a tiny, hollow glass pipette in contact with the cell membrane of a neuron, then opening up a small pore in the membrane, usually takes a graduate student or postdoc several months to learn. Learning to perform this on neurons in the living mammalian brain is even more difficult.

There are two types of patch clamping: a “blind” (not image-guided) method, which is limited because researchers cannot see where the cells are and can only record from whatever cell the pipette encounters first, and an image-guided version that allows a specific cell to be targeted.

Five years ago, Boyden and colleagues at MIT and Georgia Tech, including co-author Craig Forest, devised a way to automate the blind version of patch clamping. They created a computer algorithm that could guide the pipette to a cell based on measurements of a property called electrical impedance — which reflects how difficult it is for electricity to flow out of the pipette. If there are no cells around, electricity flows and impedance is low. When the tip hits a cell, electricity can’t flow as well and impedance goes up.

Once the pipette detects a cell, it can stop moving instantly, preventing it from poking through the membrane. A vacuum pump then applies suction to form a seal with the cell’s membrane. Then, the electrode can break through the membrane to record the cell’s internal electrical activity.
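A minimal sketch of that detection loop, with the hardware calls replaced by a toy impedance simulator so it runs on its own (this is an illustration, not the published Autopatcher code):

```python
import random

# Hedged sketch, not the published Autopatcher code: detect cell contact from
# a rising pipette impedance, as described above. The hardware is replaced by
# a toy simulator so the sketch is self-contained.

def measure_impedance(depth_um, cell_at_um=120.0, baseline_mohm=5.0):
    """Toy model: ~5 MOhm in free tissue, jumping when the tip meets a cell."""
    z = baseline_mohm + random.gauss(0, 0.05)
    if depth_um >= cell_at_um:
        z += 1.5                        # tip pressed against a membrane
    return z

def descend_until_contact(step_um=2.0, max_depth_um=300.0, jump_mohm=0.3):
    baseline = measure_impedance(0.0)
    depth = 0.0
    while depth < max_depth_um:
        depth += step_um                # advance the pipette one small step
        z = measure_impedance(depth)
        if z - baseline > jump_mohm:    # impedance jump: stop immediately,
            return depth                # then suction to seal and break in
    return None                         # no cell encountered (blind method)

print("contact at", descend_until_contact(), "microns")
```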

The researchers achieved very high accuracy using this technique, but it still could not be used to target a specific cell. For most studies, neuroscientists have a particular cell type they would like to learn about, Boyden says.

“It might be a cell that is compromised in autism, or is altered in schizophrenia, or a cell that is active when a memory is stored. That’s the cell that you want to know about,” he says. “You don’t want to patch a thousand cells until you find the one that is interesting.”

To enable this kind of precise targeting, the researchers set out to automate image-guided patch clamping. This technique is difficult to perform manually because, although the scientist can see the target neuron and the pipette through a microscope, he or she must compensate for the fact that nearby cells will move as the pipette enters the brain.

“It’s almost like trying to hit a moving target inside the brain, which is a delicate tissue,” Suk says. “For machines it’s easier because they can keep track of where the cell is, they can automatically move the focus of the microscope, and they can automatically move the pipette.”

By combining several image-processing techniques, the researchers came up with an algorithm that guides the pipette to within about 25 microns of the target cell. At that point, the system begins to rely on a combination of imagery and impedance, which is more accurate at detecting contact between the pipette and the target cell than either signal alone.

The researchers imaged the cells with two-photon microscopy, a commonly used technique that uses a pulsed laser to send infrared light into the brain, lighting up cells that have been engineered to express a fluorescent protein.

Using this automated approach, the researchers were able to successfully target and record from two types of cells — a class of interneurons, which relay messages between other neurons, and a set of excitatory neurons known as pyramidal cells. They achieved a success rate of about 20 percent, which is comparable to the performance of highly trained scientists performing the process manually.

Unraveling circuits

This technology paves the way for in-depth studies of the behavior of specific neurons, which could shed light on both their normal functions and how they go awry in diseases such as Alzheimer’s or schizophrenia. For example, the interneurons that the researchers studied in this paper have previously been linked with Alzheimer’s. A recent study of mice, led by Li-Huei Tsai, director of MIT’s Picower Institute for Learning and Memory, and conducted in collaboration with Boyden, reported that inducing a specific frequency of brain-wave oscillation in interneurons in the hippocampus could help to clear amyloid plaques similar to those found in Alzheimer’s patients.

“You really would love to know what’s happening in those cells,” Boyden says. “Are they signaling to specific downstream cells, which then contribute to the therapeutic result? The brain is a circuit, and to understand how a circuit works, you have to be able to monitor the components of the circuit while they are in action.”

This technique could also enable studies of fundamental questions in neuroscience, such as how individual neurons interact with each other as the brain makes a decision or recalls a memory.

Bernardo Sabatini, a professor of neurobiology at Harvard Medical School, says he is interested in adapting this technique to use in his lab, where students spend a great deal of time recording electrical activity from neurons growing in a lab dish.

“It’s silly to have amazingly intelligent students doing tedious tasks that could be done by robots,” says Sabatini, who was not involved in this study. “I would be happy to have robots do more of the experimentation so we can focus on the design and interpretation of the experiments.”

To help other labs adopt the new technology, the researchers plan to put the details of their approach on their web site, autopatcher.org.

Other co-authors include Ingrid van Welie, Suhasa Kodandaramaiah, and Brian Allen. The research was funded by Jeremy and Joyce Wertheimer, the National Institutes of Health (including the NIH Single Cell Initiative and the NIH Director’s Pioneer Award), the HHMI-Simons Faculty Scholars Program, and the New York Stem Cell Foundation-Robertson Award.

Robot learns to follow orders like Alexa

ComText allows robots to understand contextual commands such as, “Pick up the box I put down.”
Photo: Tom Buehler/MIT CSAIL

by Adam Conner-Simons & Rachel Gordon

Despite what you might see in movies, today’s robots are still very limited in what they can do. They can be great for many repetitive tasks, but their inability to understand the nuances of human language makes them mostly useless for more complicated requests.

For example, if you put a specific tool in a toolbox and ask a robot to “pick it up,” it would be completely lost. Picking it up means being able to see and identify objects, understand commands, recognize that the “it” in question is the tool you put down, go back in time to remember the moment when you put down the tool, and distinguish the tool you put down from other ones of similar shapes and sizes.

Recently researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have gotten closer to making this type of request easier: In a new paper, they present an Alexa-like system that allows robots to understand a wide range of commands that require contextual knowledge about objects and their environments. They’ve dubbed the system “ComText,” for “commands in context.”

The toolbox situation above was among the types of tasks that ComText can handle. If you tell the system that “the tool I put down is my tool,” it adds that fact to its knowledge base. You can then update the robot with more information about other objects and have it execute a range of tasks like picking up different sets of objects based on different commands.

“Where humans understand the world as a collection of objects and people and abstract concepts, machines view it as pixels, point-clouds, and 3-D maps generated from sensors,” says CSAIL postdoc Rohan Paul, one of the lead authors of the paper. “This semantic gap means that, for robots to understand what we want them to do, they need a much richer representation of what we do and say.”

The team tested ComText on Baxter, a two-armed humanoid robot developed by Rethink Robotics, the company founded by former CSAIL director Rodney Brooks.

The project was co-led by research scientist Andrei Barbu, alongside research scientist Sue Felshin, senior research scientist Boris Katz, and Professor Nicholas Roy. They presented the paper at last week’s International Joint Conference on Artificial Intelligence (IJCAI) in Australia.

How it works

Things like dates, birthdays, and facts are forms of “declarative memory.” There are two kinds of declarative memory: semantic memory, which is based on general facts, like “the sky is blue,” and episodic memory, which is based on personal facts, like remembering what happened at a party.

Most approaches to robot learning have focused only on semantic memory, which obviously leaves a big knowledge gap about events or facts that may be relevant context for future actions. ComText, meanwhile, can observe a range of visuals and natural language to glean “episodic memory” about an object’s size, shape, position, and type, and even whether it belongs to somebody. From this knowledge base, it can then reason, infer meaning, and respond to commands.
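As a rough sketch of the episodic side of that idea (hypothetical names throughout, not the ComText implementation), a time-stamped event log is enough to ground a reference like “the tool I put down”:

```python
from dataclasses import dataclass
from typing import Optional

# Hedged sketch (not the ComText system): a time-stamped episodic log lets a
# command such as "pick up the tool I put down" be grounded by searching for
# the speaker's most recent matching event. All names here are hypothetical.
@dataclass
class Event:
    time: int
    actor: str
    action: str
    obj: str

episodic_memory = [
    Event(1, "human", "put_down", "red box"),
    Event(2, "human", "put_down", "wrench"),
]

def resolve_reference(action: str, actor: str = "human") -> Optional[str]:
    """Return the object of the actor's most recent event of this type."""
    for ev in reversed(episodic_memory):
        if ev.actor == actor and ev.action == action:
            return ev.obj
    return None

print(resolve_reference("put_down"))   # wrench: the last thing put down
```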

“The main contribution is this idea that robots should have different kinds of memory, just like people,” says Barbu. “We have the first mathematical formulation to address this issue, and we’re exploring how these two types of memory play and work off of each other.”

With ComText, Baxter was successful in executing the right command about 90 percent of the time. In the future, the team hopes to enable robots to understand more complicated information, such as multi-step commands, the intent of actions, and how to use an object’s properties to interact with it more naturally.

For example, if you tell a robot that one box on a table has crackers, and one box has sugar, and then ask the robot to “pick up the snack,” the hope is that the robot could deduce that sugar is a raw material and therefore unlikely to be somebody’s “snack.”

By creating much less constrained interactions, this line of research could enable better communications for a range of robotic systems, from self-driving cars to household helpers.

“This work is a nice step towards building robots that can interact much more naturally with people,” says Luke Zettlemoyer, an associate professor of computer science at the University of Washington who was not involved in the research. “In particular, it will help robots better understand the names that are used to identify objects in the world, and interpret instructions that use those names to better do what users ask.”

The work was funded, in part, by the Toyota Research Institute, the National Science Foundation, the Robotics Collaborative Technology Alliance of the U.S. Army, and the Air Force Research Laboratory.

New robot rolls with the rules of pedestrian conduct


by Jennifer Chu
Engineers at MIT have designed an autonomous robot with “socially aware navigation” that can keep pace with foot traffic while observing general codes of pedestrian conduct.
Credit: MIT

Just as drivers observe the rules of the road, most pedestrians follow certain social codes when navigating a hallway or a crowded thoroughfare: Keep to the right, pass on the left, maintain a respectable berth, and be ready to weave or change course to avoid oncoming obstacles while keeping up a steady walking pace.

Now engineers at MIT have designed an autonomous robot with “socially aware navigation” that can keep pace with foot traffic while observing these general codes of pedestrian conduct.

In drive tests performed inside MIT’s Stata Center, the robot, which resembles a knee-high kiosk on wheels, successfully avoided collisions while keeping up with the average flow of pedestrians. The researchers have detailed their robotic design in a paper that they will present at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) in September.

“Socially aware navigation is a central capability for mobile robots operating in environments that require frequent interactions with pedestrians,” says Yu Fan “Steven” Chen, who led the work as a former MIT graduate student and is the lead author of the study. “For instance, small robots could operate on sidewalks for package and food delivery. Similarly, personal mobility devices could transport people in large, crowded spaces, such as shopping malls, airports, and hospitals.”

Chen’s co-authors are graduate student Michael Everett, former postdoc Miao Liu, and Jonathan How, the Richard Cockburn Maclaurin Professor of Aeronautics and Astronautics at MIT.

Social drive

In order for a robot to make its way autonomously through a heavily trafficked environment, it must solve four main challenges: localization (knowing where it is in the world), perception (recognizing its surroundings), motion planning (identifying the optimal path to a given destination), and control (physically executing its desired path).

Chen and his colleagues used standard approaches to solve the problems of localization and perception. For the latter, they outfitted the robot with off-the-shelf sensors, such as webcams, a depth sensor, and a high-resolution lidar sensor. For the problem of localization, they used open-source algorithms to map the robot’s environment and determine its position. To control the robot, they employed standard methods used to drive autonomous ground vehicles.

“The part of the field that we thought we needed to innovate on was motion planning,” Everett says. “Once you figure out where you are in the world, and know how to follow trajectories, which trajectories should you be following?”

That’s a tricky problem, particularly in pedestrian-heavy environments, where individual paths are often difficult to predict. As a solution, roboticists sometimes take a trajectory-based approach, in which they program a robot to compute an optimal path that accounts for everyone’s desired trajectories. These trajectories must be inferred from sensor data, because people don’t explicitly tell the robot where they are trying to go. 

“But this takes forever to compute. Your robot is just going to be parked, figuring out what to do next, and meanwhile the person’s already moved way past it before it decides ‘I should probably go to the right,’” Everett says. “So that approach is not very realistic, especially if you want to drive faster.”

Others have used faster, “reactive-based” approaches, in which a robot is programmed with a simple model, using geometry or physics, to quickly compute a path that avoids collisions.

The problem with reactive-based approaches, Everett says, is the unpredictability of human nature — people rarely stick to a straight, geometric path, but rather weave and wander, veering off to greet a friend or grab a coffee. In such an unpredictable environment, reactive robots tend to collide with people or to look as if they are being pushed around, because they avoid people excessively.

 “The knock on robots in real situations is that they might be too cautious or aggressive,” Everett says. “People don’t find them to fit into the socially accepted rules, like giving people enough space or driving at acceptable speeds, and they get more in the way than they help.”

Training days

The team found a way around such limitations, enabling the robot to adapt to unpredictable pedestrian behavior while continuously moving with the flow and following typical social codes of pedestrian conduct.

They used reinforcement learning, a machine-learning approach in which they performed computer simulations to train a robot to take certain paths, given the speed and trajectory of other objects in the environment. The team also incorporated social norms into this offline training phase, encouraging the robot in simulations to pass on the right and penalizing it when it passed on the left.

“We want it to be traveling naturally among people and not be intrusive,” Everett says. “We want it to be following the same rules as everyone else.”
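As a loose illustration of that reward shaping (the terms and weights below are invented for the sketch, not taken from the paper), a simulated training step might score a candidate motion like this:

```python
# Hedged sketch (not the paper's exact reward terms): during simulated
# training, reward progress toward the goal, penalize close approaches to
# pedestrians, and add a small social-norm penalty for passing on the left.
def social_reward(progress_m, min_separation_m, passed_on_left):
    r = progress_m                           # progress toward the goal
    if min_separation_m < 0.2:
        r -= 2.5                             # collision / near-collision
    elif min_separation_m < 0.5:
        r -= 0.1 * (0.5 - min_separation_m)  # discomfort for crowding
    if passed_on_left:
        r -= 0.05                            # norm violation: pass on the right
    return r

print(social_reward(0.12, 0.45, passed_on_left=True))  # slightly penalized
```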

The advantage of reinforcement learning is that the researchers can perform these training scenarios, which take extensive time and computing power, offline. Once the robot is trained in simulation, the researchers can program it to carry out the optimal paths, identified in the simulations, when it recognizes a similar scenario in the real world.

The researchers enabled the robot to assess its environment and adjust its path every one-tenth of a second. In this way, the robot can continue rolling through a hallway at a typical walking speed of 1.2 meters per second, without pausing to reprogram its route.

“We’re not planning an entire path to the goal — it doesn’t make sense to do that anymore, especially if you’re assuming the world is changing,” Everett says. “We just look at what we see, choose a velocity, do that for a tenth of a second, then look at the world again, choose another velocity, and go again. This way, we think our robot looks more natural, and is anticipating what people are doing.”
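A self-contained sketch of that ten-times-a-second sense-decide-act loop, with the trained value network replaced by a toy scoring function (an assumption for illustration, not the team’s code):

```python
import random

# Self-contained sketch of the 10 Hz loop described above; the learned value
# network is replaced by a toy scoring function (illustration only).
SPEEDS = [0.0, 0.6, 1.2]               # candidate speeds, m/s
HEADINGS = [-0.3, 0.0, 0.3]            # candidate headings, rad from goal line

def score(speed, heading, observation):
    """Stand-in for the trained value network."""
    return speed - abs(heading) + random.gauss(0, 0.01)

def control_step(observation):
    # Re-plan from scratch: best (speed, heading) for just the next 0.1 s.
    return max(((s, h) for s in SPEEDS for h in HEADINGS),
               key=lambda a: score(a[0], a[1], observation))

print(control_step(observation={}))    # e.g. (1.2, 0.0): full speed, straight
```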

Crowd control

Everett and his colleagues test-drove the robot in the busy, winding halls of MIT’s Stata Center, where it was able to drive autonomously for 20 minutes at a time. It rolled smoothly with the pedestrian flow, generally keeping to the right of hallways, occasionally passing people on the left, and avoiding any collisions.

“We wanted to bring it somewhere where people were doing their everyday things, going to class, getting food, and we showed we were pretty robust to all that,” Everett says. “One time there was even a tour group, and it perfectly avoided them.”

Everett says going forward, he plans to explore how robots might handle crowds in a pedestrian environment.

“Crowds have a different dynamic than individual people, and you may have to learn something totally different if you see five people walking together,” Everett says. “There may be a social rule of, ‘Don’t move through people, don’t split people up, treat them as one mass.’ That’s something we’re looking at in the future.”

This research was funded by Ford Motor Company.  

Custom robots in a matter of minutes

Interactive Robogami enables the fabrication of a wide range of robot designs. Photo: MIT CSAIL

Even as robots become increasingly common, they remain incredibly difficult to make. From designing and modeling to fabricating and testing, the process is slow and costly: Even one small change can mean days or weeks of rethinking and revising important hardware.

But what if there were a way to let non-experts craft different robotic designs — in one sitting?

Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) are getting closer to doing exactly that. In a new paper, they present a system called “Interactive Robogami” that lets you design a robot in minutes, and then 3-D print and assemble it in as little as four hours.

One of the key features of the system is that it allows designers to determine both the robot’s movement (“gait”) and shape (“geometry”), a capability that’s often separated in design systems.

“Designing robots usually requires expertise that only mechanical engineers and roboticists have,” says PhD student and co-lead author Adriana Schulz. “What’s exciting here is that we’ve created a tool that allows a casual user to design their own robot by giving them this expert knowledge.”

The paper, which is being published in the new issue of the International Journal of Robotics Research, was co-led by PhD graduate Cynthia Sung alongside MIT professors Wojciech Matusik and Daniela Rus.

The other co-authors include PhD student Andrew Spielberg, former master’s student Wei Zhao, former undergraduate Robin Cheng, and Columbia University professor Eitan Grinspun. (Sung is now an assistant professor at the University of Pennsylvania.)

How it works

3-D printing has transformed the way that people can turn ideas into real objects, allowing users to move away from more traditional manufacturing. Despite these developments, current design tools still have space and motion limitations, and there’s a steep learning curve to understanding the various nuances.

Interactive Robogami aims to be much more intuitive. It uses simulations and interactive feedback with algorithms for design composition, allowing users to focus on high-level conceptual design. Users can choose from a library of over 50 different bodies, wheels, legs, and “peripherals,” as well as a selection of different steps (“gaits”).

Importantly, the system is able to guarantee that a design is actually possible, analyzing factors such as speed and stability to make suggestions and ensure that, for example, the user doesn’t create a robot so top-heavy that it can’t move without tipping over.

Once designed, the robot is then fabricated. The team’s origami-inspired “3-D print and fold” technique involves printing the design as flat faces connected at joints, and then folding the design into the final shape, combining the most effective parts of 2-D and 3-D printing.  

“3-D printing lets you print complex, rigid structures, while 2-D fabrication gives you lightweight but strong structures that can be produced quickly,” Sung says. “By 3-D printing 2-D patterns, we can leverage these advantages to develop strong, complex designs with lightweight materials.”
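One plausible way to represent such a design, sketched here with hypothetical names rather than Robogami’s actual format, is as a set of flat faces plus hinge edges that carry target fold angles:

```python
from dataclasses import dataclass

# Hedged sketch (not the Robogami file format): a print-and-fold design can be
# described as flat faces plus hinge edges carrying target fold angles. All
# names and numbers here are hypothetical.
@dataclass
class Face:
    name: str
    outline_mm: list                 # 2-D polygon, printed flat

@dataclass
class Hinge:
    face_a: str
    face_b: str
    fold_deg: float                  # fold applied after printing

design = {
    "faces": [Face("body", [(0, 0), (60, 0), (60, 40), (0, 40)]),
              Face("leg", [(0, 0), (20, 0), (20, 40), (0, 40)])],
    "hinges": [Hinge("body", "leg", 90.0)],   # fold the leg down 90 degrees
}
print(len(design["faces"]), "faces,", len(design["hinges"]), "hinge")
```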

Results

To test the system, the team recruited eight subjects, who were given 20 minutes of training and asked to perform two tasks.

One task involved creating a mobile, stable car design in just 10 minutes. In a second task, users were given a robot design and asked to create a trajectory to navigate the robot through an obstacle course in the least amount of travel time.

The team fabricated a total of six robots, each of which took 10 to 15 minutes to design, three to seven hours to print and 30 to 90 minutes to assemble. The team found that their 3-D print-and-fold method reduced printing time by 73 percent and the amount of material used by 70 percent. The robots also demonstrated a wide range of movement, like using single legs to walk, using different step sequences, and using legs and wheels simultaneously.

“You can quickly design a robot that you can print out, and that will help you do these tasks very quickly, easily, and cheaply,” says Sung. “It’s lowering the barrier to have everyone design and create their own robots.”

Rus hopes people will be able to incorporate robots to help with everyday tasks, and that similar systems with rapid printing technologies will enable large-scale customization and production of robots.

“These tools enable new approaches to teaching computational thinking and creating,” says Rus. “Students can not only learn by coding and making their own robots, but by bringing to life conceptual ideas about what their robots can actually do.”

While the current version focuses on designs that can walk, the team hopes that in the future, the robots can take flight. Another goal is to have the user be able to go into the system and define the behavior of the robot in terms of tasks it can perform.

“This tool enables rapid exploration of dynamic robots at an early stage in the design process,” says Moritz Bächer, a research scientist and head of the computational design and manufacturing group at Disney Research. “The expert defines the building blocks, with constraints and composition rules, and paves the way for non-experts to make complex robotic systems. This system will likely inspire follow-up work targeting the computational design of even more intricate robots.”

This research was supported by the National Science Foundation’s Expeditions in Computing program.

Using machine learning to improve patient care

Credit: Shutterstock / MIT

Doctors are often deluged by signals from charts, test results, and other metrics to keep track of. It can be difficult to integrate and monitor all of these data for multiple patients while making real-time treatment decisions, especially when data is documented inconsistently across hospitals.

In a new pair of papers, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) explore ways for computers to help doctors make better medical decisions.

One team created a machine-learning approach called “ICU Intervene” that takes large amounts of intensive-care-unit (ICU) data, from vitals and labs to notes and demographics, to determine what kinds of treatments are needed for different symptoms. The system uses “deep learning” to make real-time predictions, learning from past ICU cases to make suggestions for critical care, while also explaining the reasoning behind these decisions.

“The system could potentially be an aid for doctors in the ICU, which is a high-stress, high-demand environment,” says PhD student Harini Suresh, lead author on the paper about ICU Intervene. “The goal is to leverage data from medical records to improve health care and predict actionable interventions.”

Another team developed an approach called “EHR Model Transfer” that can facilitate the application of predictive models on an electronic health record (EHR) system, despite being trained on data from a different EHR system. Specifically, using this approach the team showed that predictive models for mortality and prolonged length of stay can be trained on one EHR system and used to make predictions in another.

ICU Intervene was co-developed by Suresh, undergraduate student Nathan Hunt, postdoc Alistair Johnson, researcher Leo Anthony Celi, MIT Professor Peter Szolovits, and PhD student Marzyeh Ghassemi. It was presented this month at the Machine Learning for Healthcare Conference in Boston.

EHR Model Transfer was co-developed by lead authors Jen Gong and Tristan Naumann, both PhD students at CSAIL, as well as Szolovits and John Guttag, who is the Dugald C. Jackson Professor in Electrical Engineering. It was presented at the ACM’s Special Interest Group on Knowledge Discovery and Data Mining in Halifax, Canada.

Both models were trained using data from the critical care database MIMIC, which includes de-identified data from roughly 40,000 critical care patients and was developed by the MIT Lab for Computational Physiology.

ICU Intervene

Integrated ICU data is vital to automating the process of predicting patients’ health outcomes.

“Much of the previous work in clinical decision-making has focused on outcomes such as mortality (likelihood of death), while this work predicts actionable treatments,” Suresh says. “In addition, the system is able to use a single model to predict many outcomes.”

ICU Intervene focuses on hourly prediction of five different interventions that cover a wide variety of critical care needs, such as breathing assistance, improving cardiovascular function, lowering blood pressure, and fluid therapy.

At each hour, the system extracts values from the data that represent vital signs, as well as clinical notes and other data points. All of the data are represented as values that indicate how far a patient is from the average, which the system then uses to evaluate possible further treatment.

Importantly, ICU Intervene can make predictions far into the future. For example, the model can predict whether a patient will need a ventilator six hours later rather than just 30 minutes or an hour later. The team also focused on providing reasoning for the model’s predictions, giving physicians more insight.
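A toy sketch of that setup (random data and a linear stand-in for the deep model, purely illustrative): standardize hourly vitals into z-scores, window the recent hours, and score the risk of an intervention six hours ahead.

```python
import numpy as np

# Toy sketch (random data, linear stand-in for the deep model): standardize
# hourly vitals into z-scores, window the recent hours, and score the risk
# of needing an intervention (e.g., a ventilator) six hours ahead.
rng = np.random.default_rng(0)
hours, n_feats = 48, 5
vitals = rng.normal(size=(hours, n_feats))           # one toy patient record
z = (vitals - vitals.mean(0)) / vitals.std(0)        # "how far off average"

window, horizon = 6, 6                               # look back 6 h, ahead 6 h
X = np.stack([z[t - window:t].ravel() for t in range(window, hours - horizon)])
y = rng.integers(0, 2, size=len(X))                  # toy labels: vent at t+6

w = np.linalg.lstsq(X, y, rcond=None)[0]             # stand-in for the LSTM
risk = 1 / (1 + np.exp(-X @ w))                      # hourly risk scores
print(np.round(risk[:3], 2))
```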

“Deep neural-network-based predictive models in medicine are often criticized for their black-box nature,” says Nigam Shah, an associate professor of medicine at Stanford University who was not involved in the paper. “However, these authors predict the start and end of medical interventions with high accuracy, and are able to demonstrate interpretability for the predictions they make.”

The team found that the system outperformed previous work in predicting interventions, and was especially good at predicting the need for vasopressors, medications that tighten blood vessels and raise blood pressure.

In the future, the researchers will be trying to improve ICU Intervene to be able to give more individualized care and provide more advanced reasoning for decisions, such as why one patient might be able to taper off steroids, or why another might need a procedure like an endoscopy.

EHR Model Transfer

Another important consideration for leveraging ICU data is how it’s stored and what happens when that storage method gets changed. Existing machine-learning models need data to be encoded in a consistent way, so the fact that hospitals often change their EHR systems can create major problems for data analysis and prediction.

That’s where EHR Model Transfer comes in. The approach works across different versions of EHR platforms, using natural language processing to identify clinical concepts that are encoded differently across systems and then mapping them to a common set of clinical concepts (such as “blood pressure” and “heart rate”).

For example, a patient switching hospitals might need their data transferred to a different type of platform. EHR Model Transfer aims to ensure that a model could still predict aspects of that patient’s ICU visit, such as their likelihood of a prolonged stay or even of dying in the unit.

“Machine-learning models in health care often suffer from low external validity, and poor portability across sites,” says Shah. “The authors devise a nifty strategy for using prior knowledge in medical ontologies to derive a shared representation across two sites that allows models trained at one site to perform well at another site. I am excited to see such creative use of codified medical knowledge in improving portability of predictive models.”

With EHR Model Transfer, the team tested their model’s ability to predict two outcomes: mortality and the need for a prolonged stay. Trained on one EHR platform and tested on another, EHR Model Transfer outperformed baseline approaches and transferred predictive models across EHR versions better than approaches that rely on EHR-specific events alone.

In the future, the EHR Model Transfer team plans to evaluate the system on data and EHR systems from other hospitals and care settings.

Both papers were supported, in part, by the Intel Science and Technology Center for Big Data and the National Library of Medicine. The paper detailing EHR Model Transfer was additionally supported by the National Science Foundation and Quanta Computer, Inc.

New AI algorithm monitors sleep with radio waves

Researchers have devised a new way to monitor sleep stages without sensors attached to the body. Their device uses an advanced artificial intelligence algorithm to analyze the radio signals around the person and translate those measurements into sleep stages: light, deep, or rapid eye movement (REM). Image: Christine Daniloff/MIT

More than 50 million Americans suffer from sleep disorders, and diseases including Parkinson’s and Alzheimer’s can also disrupt sleep. Diagnosing and monitoring these conditions usually requires attaching electrodes and a variety of other sensors to patients, which can further disrupt their sleep.

To make it easier to diagnose and study sleep problems, researchers at MIT and Massachusetts General Hospital have devised a new way to monitor sleep stages without sensors attached to the body. Their device uses an advanced artificial intelligence algorithm to analyze the radio signals around the person and translate those measurements into sleep stages: light, deep, or rapid eye movement (REM).

“Imagine if your Wi-Fi router knows when you are dreaming, and can monitor whether you are having enough deep sleep, which is necessary for memory consolidation,” says Dina Katabi, the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science, who led the study. “Our vision is developing health sensors that will disappear into the background and capture physiological signals and important health metrics, without asking the user to change her behavior in any way.”

Katabi worked on the study with Matt Bianchi, chief of the division of sleep medicine at MGH, and Tommi Jaakkola, the Thomas Siebel Professor of Electrical Engineering and Computer Science and a member of the Institute for Data, Systems, and Society at MIT. Mingmin Zhao, an MIT graduate student, is the paper’s first author, and Shichao Yue, another MIT graduate student, is also a co-author.

The researchers will present their new sensor at the International Conference on Machine Learning on Aug. 9.

Remote sensing

Katabi and members of her group in MIT’s Computer Science and Artificial Intelligence Laboratory have previously developed radio-based sensors that enable them to remotely measure vital signs and behaviors that can be indicators of health. These sensors consist of a wireless device, about the size of a laptop computer, that emits low-power radio frequency (RF) signals. As the radio waves reflect off of the body, any slight movement of the body alters the frequency of the reflected waves. Analyzing those waves can reveal vital signs such as pulse and breathing rate.

“It’s a smart Wi-Fi-like box that sits in the home and analyzes these reflections and discovers all of these changes in the body, through a signature that the body leaves on the RF signal,” Katabi says.

Katabi and her students have also used this approach to create a sensor called WiGait that can measure walking speed using wireless signals, which could help doctors predict cognitive decline, falls, certain cardiac or pulmonary diseases, or other health problems.

After developing those sensors, Katabi thought that a similar approach could also be useful for monitoring sleep, which is currently done while patients spend the night in a sleep lab hooked up to monitors such as electroencephalography (EEG) machines.

“The opportunity is very big because we don’t understand sleep well, and a high fraction of the population has sleep problems,” says Zhao. “We have this technology that, if we can make it work, can move us from a world where we do sleep studies once every few months in the sleep lab to continuous sleep studies in the home.”

To achieve that, the researchers had to come up with a way to translate their measurements of pulse, breathing rate, and movement into sleep stages. Recent advances in artificial intelligence have made it possible to train computer algorithms known as deep neural networks to extract and analyze information from complex datasets, such as the radio signals obtained from the researchers’ sensor. However, these signals carry a great deal of information that is irrelevant to sleep and can confuse existing algorithms. The MIT researchers had to devise a new AI algorithm based on deep neural networks that eliminates this irrelevant information.

“The surrounding conditions introduce a lot of unwanted variation in what you measure. The novelty lies in preserving the sleep signal while removing the rest,” says Jaakkola. Their algorithm can be used in different locations and with different people, without any calibration.
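
The published architecture is an adversarial one, and the heavily simplified sketch below captures that flavor: a stage classifier is trained alongside an adversary that tries to identify the recording source, and the shared encoder is rewarded for confusing the adversary. All sizes, names, and loss weights are invented.

```python
import torch
import torch.nn as nn

# Simplified sketch in the spirit of the approach described above: an
# encoder learns features that predict sleep stages while an adversary
# tries to identify the recording subject/environment, and the encoder is
# trained to fool it, squeezing out source-specific information. All
# dimensions and the loss weighting below are invented for illustration.
N_FEATURES, N_STAGES, N_SOURCES = 64, 3, 10   # toy sizes

encoder    = nn.Sequential(nn.Linear(N_FEATURES, 32), nn.ReLU())
stage_clf  = nn.Linear(32, N_STAGES)    # predicts light/deep/REM
source_clf = nn.Linear(32, N_SOURCES)   # adversary: guesses who/where

opt_main = torch.optim.Adam(list(encoder.parameters()) + list(stage_clf.parameters()))
opt_adv  = torch.optim.Adam(source_clf.parameters())
xent = nn.CrossEntropyLoss()

def train_step(x, stage, source, lam=0.1):
    # 1) Teach the adversary to recognize the recording source.
    adv_loss = xent(source_clf(encoder(x).detach()), source)
    opt_adv.zero_grad(); adv_loss.backward(); opt_adv.step()
    # 2) Train encoder + stage classifier to predict stages while
    #    *fooling* the adversary (the minus sign rewards confusing it).
    z = encoder(x)
    loss = xent(stage_clf(z), stage) - lam * xent(source_clf(z), source)
    opt_main.zero_grad(); loss.backward(); opt_main.step()

# Toy batch of RF-derived features with stage and source labels:
train_step(torch.randn(32, N_FEATURES),
           torch.randint(0, N_STAGES, (32,)),
           torch.randint(0, N_SOURCES, (32,)))
```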

Using this approach in tests of 25 healthy volunteers, the researchers found that their technique was about 80 percent accurate, which is comparable to the accuracy of ratings determined by sleep specialists based on EEG measurements.

“Our device allows you not only to remove all of these sensors that you put on the person, and make it a much better experience that can be done at home, it also makes the job of the doctor and the sleep technologist much easier,” Katabi says. “They don’t have to go through the data and manually label it.”

Sleep deficiencies

Other researchers have tried to use radio signals to monitor sleep, but these systems are accurate only 65 percent of the time and mainly determine whether a person is awake or asleep, not what sleep stage they are in. Katabi and her colleagues were able to improve on that by training their algorithm to ignore wireless signals that bounce off of other objects in the room and include only data reflected from the sleeping person.

The researchers now plan to use this technology to study how Parkinson’s disease affects sleep.

“When you think about Parkinson’s, you think about it as a movement disorder, but the disease is also associated with very complex sleep deficiencies, which are not very well understood,” Katabi says.

The sensor could also be used to learn more about sleep changes produced by Alzheimer’s disease, as well as sleep disorders such as insomnia and sleep apnea. It may also be useful for studying epileptic seizures that happen during sleep, which are usually difficult to detect.

Automatic image retouching system

A new system can automatically retouch images in the style of a professional photographer. It can run on a cellphone and display retouched images in real time. Courtesy of the researchers (edited by MIT News)

The data captured by today’s digital cameras is often treated as the raw material of a final image. Before uploading pictures to social networking sites, even casual cellphone photographers might spend a minute or two balancing color and tuning contrast, with one of the many popular image-processing programs now available.

This week at SIGGRAPH, the premier digital graphics conference, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory and Google are presenting a new system that can automatically retouch images in the style of a professional photographer. It’s so energy-efficient that it can run on a cellphone, and so fast that it can display retouched images in real time, letting the photographer see the final version of the image while still framing the shot.

The same system can also speed up existing image-processing algorithms. In tests involving a new Google algorithm for producing high-dynamic-range images, which capture subtleties of color lost in standard digital images, the new system produced results that were visually indistinguishable from those of the algorithm in about one-tenth the time — again, fast enough for real-time display.

The system is a machine-learning system, meaning that it learns to perform tasks by analyzing training data; in this case, for each new task it learned, it was trained on thousands of pairs of images, raw and retouched.

The work builds on an earlier project from the MIT researchers, in which a cellphone would send a low-resolution version of an image to a web server. The server would send back a “transform recipe” that could be used to retouch the high-resolution version of the image on the phone, reducing bandwidth consumption.

“Google heard about the work I’d done on the transform recipe,” says Michaël Gharbi, an MIT graduate student in electrical engineering and computer science and first author on both papers. “They themselves did a follow-up on that, so we met and merged the two approaches. The idea was to do everything we were doing before but, instead of having to process everything on the cloud, to learn it. And the first goal of learning it was to speed it up.”

Short cuts

In the new work, the bulk of the image processing is performed on a low-resolution image, which drastically reduces time and energy consumption. But this introduces a new difficulty, because the color values of the individual pixels in the high-res image have to be inferred from the much coarser output of the machine-learning system.

In the past, researchers have attempted to use machine learning to learn how to “upsample” a low-res image, or increase its resolution by guessing the values of the omitted pixels. During training, the input to the system is a low-res image, and the output is a high-res image. But this doesn’t work well in practice; the low-res image just leaves out too much data.

Gharbi and his colleagues — MIT professor of electrical engineering and computer science Frédo Durand and Jiawen Chen, Jon Barron, and Sam Hasinoff of Google — address this problem with two clever tricks. The first is that the output of their machine-learning system is not an image; rather, it’s a set of simple formulae for modifying the colors of image pixels. During training, the performance of the system is judged according to how well the output formulae, when applied to the original image, approximate the retouched version.

Taking bearings

The second trick is a technique for determining how to apply those formulae to individual pixels in the high-res image. The output of the researchers’ system is a three-dimensional grid, 16 by 16 by 8. The 16-by-16 faces of the grid correspond to pixel locations in the source image; the eight layers stacked on top of them correspond to different pixel intensities. Each cell of the grid contains formulae that determine modifications of the color values of the source image.

That means that each cell of one of the grid’s 16-by-16 faces has to stand in for thousands of pixels in the high-res image. But suppose that each set of formulae corresponds to a single location at the center of its cell. Then any given high-res pixel falls within a square defined by four sets of formulae.

Roughly speaking, the modification of that pixel’s color value is a combination of the formulae at the square’s corners, weighted according to distance. A similar weighting occurs in the third dimension of the grid, the one corresponding to pixel intensity.
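
One way to picture this is to assume each grid cell holds a small affine color matrix, a plausible reading of “simple formulae”; the sketch below blends neighboring cells’ matrices by distance in x, y, and intensity and applies the result to a pixel, with a random grid standing in for the network’s real output.

```python
import numpy as np

# Rough sketch of the weighting described above, assuming each grid cell
# holds a 3x4 affine color matrix (one plausible form of "simple
# formulae"). The random grid is a stand-in for the network's output.
GW, GH, GD = 16, 16, 8
grid = np.random.randn(GH, GW, GD, 3, 4)

def retouch_pixel(rgb, x, y, width, height):
    """Blend the formulae of nearby cells, weighted by distance in
    x, y, and intensity, then apply the blended formula to the color."""
    gx = x / width * (GW - 1)
    gy = y / height * (GH - 1)
    gz = rgb.mean() * (GD - 1)            # crude intensity coordinate
    x0, y0, z0 = int(gx), int(gy), int(gz)
    fx, fy, fz = gx - x0, gy - y0, gz - z0
    blended = np.zeros((3, 4))
    for dx, wx in ((0, 1 - fx), (1, fx)):
        for dy, wy in ((0, 1 - fy), (1, fy)):
            for dz, wz in ((0, 1 - fz), (1, fz)):
                cell = grid[min(y0 + dy, GH - 1),
                            min(x0 + dx, GW - 1),
                            min(z0 + dz, GD - 1)]
                blended += wx * wy * wz * cell
    return blended @ np.append(rgb, 1.0)  # affine transform of [r, g, b, 1]

print(retouch_pixel(np.array([0.2, 0.5, 0.7]), x=1920, y=1080,
                    width=3840, height=2160))
```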

The researchers trained their system on a data set created by Durand’s group and Adobe Systems, the creators of Photoshop. The data set includes 5,000 images, each retouched by five different photographers. They also trained their system on thousands of pairs of images produced by the application of particular image-processing algorithms, such as the one for creating high-dynamic-range (HDR) images. The software for performing each modification takes up about as much space in memory as a single digital photo, so in principle, a cellphone could be equipped to process images in a range of styles.

Finally, the researchers compared their system’s performance to that of a machine-learning system that processed images at full resolution rather than low resolution. During processing, the full-res version needed about 12 gigabytes of memory to execute its operations; the researchers’ version needed about 100 megabytes, or one-hundredth as much. The full-resolution version of the HDR system took about 10 times as long to produce an image as the original algorithm, or 100 times as long as the researchers’ system.

“This technology has the potential to be very useful for real-time image enhancement on mobile platforms,” says Barron. “Using machine learning for computational photography is an exciting prospect but is limited by the severe computational and power constraints of mobile phones. This paper may provide us with a way to sidestep these issues and produce new, compelling, real-time photographic experiences without draining your battery or giving you a laggy viewfinder experience.”

SMART trials self-driving wheelchair at hospital

Image: MIT CSAIL

Singapore and MIT have been at the forefront of autonomous vehicle development. First, there were self-driving golf buggies. Then, an autonomous electric car. Now, leveraging similar technology, MIT and Singaporean researchers have developed and deployed a self-driving wheelchair at a hospital.

Spearheaded by Daniela Rus, the Andrew (1956) and Erna Viterbi Professor of Electrical Engineering and Computer Science and director of MIT’s Computer Science and Artificial Intelligence Laboratory, this autonomous wheelchair is an extension of the self-driving scooter that launched at MIT last year — and it is a testament to the success of the Singapore-MIT Alliance for Research and Technology, or SMART, a collaboration between researchers at MIT and in Singapore.

Rus, who is also the principal investigator of the SMART Future Urban Mobility research group, says this newest innovation can help nurses focus more on patient care by relieving them of logistical work, such as searching for wheelchairs and wheeling patients through the complex hospital network.

“When we visited several retirement communities, we realized that the quality of life is dependent on mobility. We want to make it really easy for people to move around,” Rus says.

Reshaping computer-aided design

Adriana Schulz, an MIT PhD student in the Computer Science and Artificial Intelligence Laboratory, demonstrates the InstantCAD computer-aided-design-optimizing interface. Photo: Rachel Gordon/MIT CSAIL

Almost every object we use is developed with computer-aided design (CAD). Ironically, while CAD programs are good for creating designs, using them is actually very difficult and time-consuming when you’re trying to improve an existing design into the best possible product. Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and Columbia University are trying to make the process faster and easier: In a new paper, they’ve developed InstantCAD, a tool that lets designers interactively edit, improve, and optimize CAD models using a more streamlined and intuitive workflow.

InstantCAD integrates seamlessly with existing CAD programs as a plug-in, meaning that designers don’t have to learn new tools to use it.

“From more ergonomic desks to higher-performance cars, this is really about creating better products in less time,” says Department of Electrical Engineering and Computer Science PhD student and lead author Adriana Schulz, who will be presenting the paper at this month’s SIGGRAPH computer-graphics conference in Los Angeles. “We think this could be a real game changer for automakers and other companies that want to be able to test and improve complex designs in a matter of seconds to minutes, instead of hours to days.”

The paper was co-written by Associate Professor Wojciech Matusik, PhD student Jie Xu, and postdoc Bo Zhu of CSAIL, as well as Associate Professor Eitan Grinspun and Assistant Professor Changxi Zheng of Columbia University.

Traditional CAD systems are “parametric,” which means that when engineers design models, they can change properties like shape and size (“parameters”) based on different priorities. For example, when designing a wind turbine you might have to make trade-offs between how much airflow you can get versus how much energy it will generate.

InstantCAD enables designers to interactively edit, improve, and optimize CAD models using a more streamlined and intuitive workflow. Photo: Rachel Gordon/MIT CSAIL

However, it can be difficult to determine the absolute best design for what you want your object to do, because there are many different options for modifying the design. On top of that, the process is time-consuming because changing a single property means having to wait to regenerate the new design, run a simulation, see the result, and then figure out what to do next.

With InstantCAD, the process of improving and optimizing the design can be done in real-time, saving engineers days or weeks. After an object is designed in a commercial CAD program, it is sent to a cloud platform where multiple geometric evaluations and simulations are run at the same time.

With this precomputed data, you can instantly improve and optimize the design in two ways. With “interactive exploration,” a user interface provides real-time feedback on how design changes will affect performance, like how the shape of a plane wing impacts air pressure distribution. With “automatic optimization,” you simply tell the system to give you a design with specific characteristics, like a drone that’s as lightweight as possible while still being able to carry the maximum amount of weight.

Optimizing an object’s design is hard because the design space (the number of possible design options) is massive.

“It’s too data-intensive to compute every single point, so we have to come up with a way to predict any point in this space from just a small number of sampled data points,” says Schulz. “This is called ‘interpolation,’ and our key technical contribution is a new algorithm we developed to take these samples and estimate points in the space.”
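
As a toy version of that idea (not the paper’s algorithm), the snippet below estimates performance at an unsampled design point by inverse-distance weighting of a few precomputed samples; all numbers are invented.

```python
import numpy as np

# Toy illustration of estimating performance anywhere in a design space
# from a few precomputed samples. The paper's interpolation algorithm is
# more sophisticated; inverse-distance weighting is used here only to show
# the idea, and the sample points and values are invented.
samples = np.array([[0.1, 0.2], [0.5, 0.9], [0.8, 0.3]])  # parameter settings
values  = np.array([12.0, 30.0, 18.0])                    # simulated results

def predict(query, power=2.0):
    """Estimate performance at a new design point without re-simulating."""
    d = np.linalg.norm(samples - query, axis=1)
    if d.min() < 1e-12:              # query sits exactly on a sample
        return float(values[d.argmin()])
    w = 1.0 / d ** power
    return float((w * values).sum() / w.sum())

print(predict(np.array([0.4, 0.5])))  # instant estimate, no new simulation
```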

Matusik says InstantCAD could be particularly helpful for intricate designs for objects like cars, planes, and robots, especially in industries like car manufacturing that care a lot about squeezing every little bit of performance out of a product.

“Our system doesn’t just save you time for changing designs, but has the potential to dramatically improve the quality of the products themselves,” says Matusik. “The more complex your design gets, the more important this kind of a tool can be.”

Because of the system’s productivity boosts and CAD integration, Schulz is confident that it will have immediate applications for industry. Down the line, she hopes that InstantCAD can also help lower the barrier to entry for casual users.

“In a world where 3-D printing and industrial robotics are making manufacturing more accessible, we need systems that make the actual design process more accessible, too,” Schulz says. “With systems like this that make it easier to customize objects to meet your specific needs, we hope to be paving the way to a new age of personal manufacturing and DIY design.”

The project was supported by the National Science Foundation.

Artificial intelligence suggests recipes based on food photos

Pic2Recipe, an artificial intelligence system developed at MIT, can take a photo of an entree and suggest a similar recipe. Photo: Jason Dorfman/MIT CSAIL

There are few things social media users love more than flooding their feeds with photos of food. Yet we seldom use these images for much more than a quick scroll on our cellphones. Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) believe that analyzing photos like these could help us learn recipes and better understand people’s eating habits.

In a new paper with the Qatar Computing Research Institute (QCRI), the team trained an artificial intelligence system called Pic2Recipe to look at a photo of food, predict its ingredients, and suggest similar recipes.

“In computer vision, food is mostly neglected because we don’t have the large-scale datasets needed to make predictions,” says Yusuf Aytar, an MIT postdoc who co-wrote a paper about the system with MIT Professor Antonio Torralba. “But seemingly useless photos on social media can actually provide valuable insight into health habits and dietary preferences.”

The paper will be presented later this month at the Computer Vision and Pattern Recognition conference in Honolulu. CSAIL graduate student Nick Hynes was lead author alongside Amaia Salvador of the Polytechnic University of Catalonia in Spain. Co-authors include CSAIL postdoc Javier Marin, as well as scientist Ferda Ofli and research director Ingmar Weber of QCRI.

How it works

The web has spurred huge growth in research on classifying food data, but the majority of it has relied on much smaller datasets, which often leads to major gaps in labeling foods.

In 2014 Swiss researchers created the “Food-101” dataset and used it to develop an algorithm that could recognize images of food with 50 percent accuracy. Future iterations only improved accuracy to about 80 percent, suggesting that the size of the dataset may be a limiting factor.

Even the larger datasets have often been somewhat limited in how well they generalize across populations. A database from City University of Hong Kong has over 110,000 images and 65,000 recipes, each with ingredient lists and instructions, but contains only Chinese cuisine.

The CSAIL team’s project aims to build off of this work but dramatically expand in scope. Researchers combed websites like All Recipes and Food.com to develop “Recipe1M,” a database of over 1 million recipes that were annotated with information about the ingredients in a wide range of dishes. They then used that data to train a neural network to find patterns and make connections between the food images and the corresponding ingredients and recipes.

Given a photo of a food item, Pic2Recipe could identify ingredients like flour, eggs, and butter, and then suggest several recipes that it determined to be similar to images from the database. (The team has an online demo where people can upload their own food photos to test it out.)
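
One common way to implement such retrieval, and plausibly what happens under the hood here, is nearest-neighbor search in a learned joint embedding space; the hedged sketch below shows only that final ranking step, with random vectors standing in for learned embeddings.

```python
import numpy as np

# Hedged sketch of the retrieval step: if photos and recipes are embedded
# into a shared vector space, suggesting recipes reduces to ranking them
# by similarity to the photo's embedding. The random vectors below are
# stand-ins for learned embeddings, and the sizes are made up.
rng = np.random.default_rng(0)
recipe_embeddings = rng.normal(size=(10_000, 128))  # one row per recipe
photo_embedding   = rng.normal(size=128)            # from the image network

def top_recipes(photo, recipes, k=5):
    """Rank recipes by cosine similarity to the photo embedding."""
    sims = recipes @ photo / (np.linalg.norm(recipes, axis=1)
                              * np.linalg.norm(photo))
    return np.argsort(-sims)[:k]      # indices of the k closest recipes

print(top_recipes(photo_embedding, recipe_embeddings))
```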

“You can imagine people using this to track their daily nutrition, or to photograph their meal at a restaurant and know what’s needed to cook it at home later,” says Christoph Trattner, an assistant professor at MODUL University Vienna in the New Media Technology Department who was not involved in the paper. “The team’s approach works at a similar level to human judgment, which is remarkable.”

The system did particularly well with desserts like cookies or muffins, since that was a main theme in the database. However, it had difficulty determining ingredients for more ambiguous foods, like sushi rolls and smoothies.

It was also often stumped when there were similar recipes for the same dishes. For example, there are dozens of ways to make lasagna, so the team needed to make sure that the system wouldn’t “penalize” recipes that are similar when trying to separate those that are different. (One way to solve this was by seeing if the ingredients in each are generally similar before comparing the recipes themselves.)

In the future, the team hopes to be able to improve the system so that it can understand food in even more detail. This could mean being able to infer how a food is prepared (e.g., stewed versus diced) or distinguish different variations of foods, like mushrooms or onions.

The researchers are also interested in potentially developing the system into a “dinner aide” that could figure out what to cook given a dietary preference and a list of items in the fridge.

“This could potentially help people figure out what’s in their food when they don’t have explicit nutritional information,” says Hynes. “For example, if you know what ingredients went into a dish but not the amount, you can take a photo, enter the ingredients, and run the model to find a similar recipe with known quantities, and then use that information to approximate your own meal.”

The project was funded, in part, by QCRI, as well as the European Regional Development Fund (ERDF) and the Spanish Ministry of Economy, Industry, and Competitiveness.

Bringing neural networks to cellphones

Image: Jose-Luis Olivares/MIT

In recent years, the best-performing artificial-intelligence systems — in areas such as autonomous driving, speech recognition, computer vision, and automatic translation — have come courtesy of software systems known as neural networks.

But neural networks take up a lot of memory and consume a lot of power, so they usually run on servers in the cloud, which receive data from desktop or mobile devices and then send back their analyses.

Last year, MIT associate professor of electrical engineering and computer science Vivienne Sze and colleagues unveiled a new, energy-efficient computer chip optimized for neural networks, which could enable powerful artificial-intelligence systems to run locally on mobile devices.

Now, Sze and her colleagues have approached the same problem from the opposite direction, with a battery of techniques for designing more energy-efficient neural networks. First, they developed an analytic method that can determine how much power a neural network will consume when run on a particular type of hardware. Then they used the method to evaluate new techniques for paring down neural networks so that they’ll run more efficiently on handheld devices.

The researchers describe the work in a paper they’re presenting next week at the Computer Vision and Pattern Recognition Conference. In the paper, they report that the methods offered as much as a 73 percent reduction in power consumption over the standard implementation of neural networks, and as much as a 43 percent reduction over the best previous method for paring the networks down.

Energy evaluator

Loosely based on the anatomy of the brain, neural networks consist of thousands or even millions of simple but densely interconnected information-processing nodes, usually organized into layers. Different types of networks vary according to their number of layers, the number of connections between the nodes, and the number of nodes in each layer.

The connections between nodes have “weights” associated with them, which determine how much a given node’s output will contribute to the next node’s computation. During training, in which the network is presented with examples of the computation it’s learning to perform, those weights are continually readjusted, until the output of the network’s last layer consistently corresponds with the result of the computation.
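
The sketch below shows that structure in miniature: two layers of weighted connections and a loop that repeatedly readjusts the weights until the output approaches the target; sizes, data, and learning rate are toy values, not anything from the researchers’ networks.

```python
import numpy as np

# Miniature version of the structure just described: two layers of weighted
# connections and a training loop that repeatedly readjusts the weights so
# the last layer's output moves toward the target. All sizes, data, and the
# learning rate are toy values.
rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 8)) * 0.1    # weights: inputs -> hidden layer
W2 = rng.normal(size=(8, 1)) * 0.1    # weights: hidden layer -> output

def forward(x):
    h = np.maximum(0, x @ W1)         # each node sums weighted inputs
    return h @ W2, h

x = rng.normal(size=(16, 4))
target = rng.normal(size=(16, 1))
for _ in range(500):                  # "training": readjust the weights
    out, h = forward(x)
    err = out - target
    g2 = h.T @ err / len(x)                          # gradient for W2
    g1 = x.T @ ((err @ W2.T) * (h > 0)) / len(x)     # gradient for W1
    W2 -= 0.05 * g2
    W1 -= 0.05 * g1
print(float(((forward(x)[0] - target) ** 2).mean()))  # error after training
```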

“The first thing we did was develop an energy-modeling tool that accounts for data movement, transactions, and data flow,” Sze says. “If you give it a network architecture and the value of its weights, it will tell you how much energy this neural network will take. One of the questions that people had is ‘Is it more energy efficient to have a shallow network and more weights or a deeper network with fewer weights?’ This tool gives us better intuition as to where the energy is going, so that an algorithm designer could have a better understanding and use this as feedback. The second thing we did is that, now that we know where the energy is actually going, we started to use this model to drive our design of energy-efficient neural networks.”

In the past, Sze explains, researchers attempting to reduce neural networks’ power consumption used a technique called “pruning.” Low-weight connections between nodes contribute very little to a neural network’s final output, so many of them can be safely eliminated, or pruned.

Principled pruning

With the aid of their energy model, Sze and her colleagues — first author Tien-Ju Yang and Yu-Hsin Chen, both graduate students in electrical engineering and computer science — varied this approach. Although cutting even a large number of low-weight connections can have little effect on a neural net’s output, cutting all of them probably would, so pruning techniques must have some mechanism for deciding when to stop.

The MIT researchers thus begin pruning those layers of the network that consume the most energy. That way, the cuts translate to the greatest possible energy savings. They call this method “energy-aware pruning.”

Weights in a neural network can be either positive or negative, so the researchers’ method also looks for cases in which connections with weights of opposite sign tend to cancel each other out. The inputs to a given node are the outputs of nodes in the layer below, multiplied by the weights of their connections. So the researchers’ method looks not only at the weights but also at the way the associated nodes handle training data. Only if groups of connections with positive and negative weights consistently offset each other can they be safely cut. This leads to more efficient networks with fewer connections than earlier pruning methods produce.
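
Putting the pieces together, a minimal sketch of energy-aware pruning under a stand-in energy model might look like the following; the layer shapes and pruning fraction are invented, and the cancellation test is summarized only as a comment.

```python
import numpy as np

# Minimal sketch of energy-aware pruning under a stand-in energy model
# (the real tool also accounts for data movement and data flow). Layer
# shapes and the pruning fraction are invented.
layers = {"conv1": np.random.randn(64, 147), "fc1": np.random.randn(10, 1024)}

def energy_estimate(w):
    return w.size        # placeholder cost model: bigger layer, more energy

# Prune the most energy-hungry layers first, cutting the weakest weights.
# (In the full method, a group of opposite-sign connections is cut only if
# it consistently cancels out on training data.)
for name, w in sorted(layers.items(), key=lambda kv: -energy_estimate(kv[1])):
    threshold = np.quantile(np.abs(w), 0.5)   # drop the weakest half
    w[np.abs(w) < threshold] = 0.0
    print(name, "kept", int((w != 0).sum()), "of", w.size, "weights")
```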

“Recently, much activity in the deep-learning community has been directed toward development of efficient neural-network architectures for computationally constrained platforms,” says Hartwig Adam, the team lead for mobile vision at Google. “However, most of this research is focused on either reducing model size or computation, while for smartphones and many other devices energy consumption is of utmost importance because of battery usage and heat restrictions. This work is taking an innovative approach to CNN [convolutional neural net] architecture optimization that is directly guided by minimization of power consumption using a sophisticated new energy estimation tool, and it demonstrates large performance gains over computation-focused methods. I hope other researchers in the field will follow suit and adopt this general methodology to neural-network-model architecture design.”
