Assembler robots make large structures from little pieces

Photo shows two prototype assembler robots at work putting together a series of small units, known as voxels, into a larger structure.
Image courtesy of Benjamin Jenett

By David L. Chandler

Today’s commercial aircraft are typically manufactured in sections, often in different locations — wings at one factory, fuselage sections at another, tail components somewhere else — and then flown to a central plant in huge cargo planes for final assembly.

But what if the final assembly was the only assembly, with the whole plane built out of a large array of tiny identical pieces, all put together by an army of tiny robots?

That’s the vision that graduate student Benjamin Jenett, working with Professor Neil Gershenfeld in MIT’s Center for Bits and Atoms (CBA), has been pursuing as his doctoral thesis work. It’s now reached the point that prototype versions of such robots can assemble small structures and even work together as a team to build up larger assemblies.

The new work appears in the October issue of the IEEE Robotics and Automation Letters, in a paper by Jenett, Gershenfeld, fellow graduate student Amira Abdel-Rahman, and CBA alumnus Kenneth Cheung SM ’07, PhD ’12, who is now at NASA’s Ames Research Center, where he leads the ARMADAS project to design a lunar base that could be built with robotic assembly.

“This paper is a treat,” says Aaron Becker, an associate professor of electrical and computer engineering at the University of Houston, who was not associated with this work. “It combines top-notch mechanical design with jaw-dropping demonstrations, new robotic hardware, and a simulation suite with over 100,000 elements,” he says.

“What’s at the heart of this is a new kind of robotics, that we call relative robots,” Gershenfeld says. Historically, he explains, there have been two broad categories of robotics — ones made out of expensive custom components that are carefully optimized for particular applications such as factory assembly, and ones made from inexpensive mass-produced modules with much lower performance. The new robots, however, are an alternative to both. They’re much simpler than the former, while much more capable than the latter, and they have the potential to revolutionize the production of large-scale systems, from airplanes to bridges to entire buildings.

Experiments demonstrating relative robotic assembly of 1D, 2D, and 3D discrete cellular structures

According to Gershenfeld, the key difference lies in the relationship between the robotic device and the materials that it is handling and manipulating. With these new kinds of robots, “you can’t separate the robot from the structure — they work together as a system,” he says. For example, while most mobile robots require highly precise navigation systems to keep track of their position, the new assembler robots only need to keep track of where they are in relation to the small subunits, called voxels, that they are currently working on. Every time the robot takes a step onto the next voxel, it readjusts its sense of position, always in relation to the specific components that it is standing on at the moment.
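The step-by-step positioning described above can be illustrated with a short sketch. It is not the researchers’ control software: the `LatticeWalker` class, the `walk` helper, and the direction names are hypothetical, and the structure is modeled simply as a set of integer voxel indices. The point it shows is that the robot’s position is a voxel index that is updated, and made exact again, each time it steps onto a neighboring voxel.

```python
from dataclasses import dataclass

# Unit steps along the lattice axes. The direction names are illustrative
# assumptions, not taken from the paper.
STEPS = {
    "+x": (1, 0, 0), "-x": (-1, 0, 0),
    "+y": (0, 1, 0), "-y": (0, -1, 0),
    "+z": (0, 0, 1), "-z": (0, 0, -1),
}


@dataclass
class LatticeWalker:
    """Hypothetical robot that knows only which voxel it is standing on."""
    position: tuple = (0, 0, 0)  # integer voxel index, not metric coordinates

    def step(self, direction, structure):
        """Move one voxel in the given direction if a voxel exists there."""
        dx, dy, dz = STEPS[direction]
        x, y, z = self.position
        target = (x + dx, y + dy, z + dz)
        if target not in structure:
            raise ValueError(f"no voxel at {target} to step onto")
        # Stepping onto a voxel re-registers the robot to the lattice,
        # so no positioning error accumulates between steps.
        self.position = target
        return target


def walk(structure, start, directions):
    """Follow a list of directions and return the final voxel index."""
    walker = LatticeWalker(position=start)
    for direction in directions:
        walker.step(direction, structure)
    return walker.position


# A small L-shaped run of voxels.
structure = {(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0)}
final = walk(structure, (0, 0, 0), ["+x", "+x", "+y"])
print(final)  # (2, 1, 0)
```

Because the position is counted in whole voxels, the only state the sketch has to keep is a triple of integers, which mirrors the idea that the precision comes from the structure rather than from the robot.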

The underlying vision is that just as the most complex of images can be reproduced by using an array of pixels on a screen, virtually any physical object can be recreated as an array of smaller three-dimensional pieces, or voxels, which can themselves be made up of simple struts and nodes. The team has shown that these simple components can be arranged to distribute loads efficiently; they are largely made up of open space so that the overall weight of the structure is minimized. The units can be picked up and placed in position next to one another by the simple assemblers, and then fastened together using latching systems built into each voxel.
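As a rough illustration of the pixel analogy, the sketch below approximates a solid shape by the set of unit voxels whose centers fall inside it. The function name `voxelize_sphere` and the choice of a sphere are assumptions made for the example; the actual voxels in this work are open strut-and-node units, which the sketch does not model.

```python
def voxelize_sphere(radius):
    """Return the integer voxel indices whose centers lie inside a sphere.

    The sphere is centered on the origin and voxels have unit spacing.
    """
    r = int(radius)  # no voxel center farther than this along any axis can qualify
    return {
        (x, y, z)
        for x in range(-r, r + 1)
        for y in range(-r, r + 1)
        for z in range(-r, r + 1)
        if x * x + y * y + z * z <= radius * radius
    }


voxels = voxelize_sphere(1)
print(len(voxels))  # 7: the center voxel plus its six face neighbors
```

A larger radius gives a finer approximation of the same shape, just as more pixels give a sharper image.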

The robots themselves resemble a small arm, with two long segments that are hinged in the middle, and devices for clamping onto the voxel structures on each end. The simple devices move around like inchworms, advancing along a row of voxels by repeatedly opening and closing their V-shaped bodies to move from one to the next. Jenett has dubbed the little robots BILL-E (a nod to the movie robot WALL-E), which stands for Bipedal Isotropic Lattice Locomoting Explorer.

Computer simulation shows a group of four assembler robots at work on building a three-dimensional structure. Whole swarms of such robots could be unleashed to create large structures such as airplane wings or space habitats. Illustration courtesy of the researchers

Jenett has built several versions of the assemblers as proof-of-concept designs, along with corresponding voxel designs featuring latching mechanisms to easily attach or detach each one from its neighbors. He has used these prototypes to demonstrate the assembly of the blocks into linear, two-dimensional, and three-dimensional structures. “We’re not putting the precision in the robot; the precision comes from the structure” as it gradually takes shape, Jenett says. “That’s different from all other robots. It just needs to know where its next step is.”

As it works on assembling the pieces, each of the tiny robots can count its steps over the structure, says Gershenfeld, who is the director of CBA. Along with navigation, this lets the robots correct errors at each step, eliminating most of the complexity of typical robotic systems, he says. “It’s missing most of the usual control systems, but as long as it doesn’t miss a step, it knows where it is.” For practical assembly applications, swarms of such units could be working together to speed up the process, thanks to control software developed by Abdel-Rahman that can allow the robots to coordinate their work and avoid getting in each other’s way.

This kind of assembly of large structures from identical subunits using a simple robotic system, much like a child assembling a large castle out of LEGO blocks, has already attracted the interest of some major potential users, including NASA, MIT’s collaborator on this research, and the European aerospace company Airbus SE, which also helped to sponsor the study.

One advantage of such assembly is that repairs and maintenance can be handled easily by the same kind of robotic process as the initial assembly. Damaged sections can be disassembled from the structure and replaced with new ones, producing a structure that is just as robust as the original. “Unbuilding is as important as building,” Gershenfeld says, and this process can also be used to make modifications or improvements to the system over time.

“For a space station or a lunar habitat, these robots would live on the structure, continuously maintaining and repairing it,” says Jenett.

Ultimately, such systems could be used to construct entire buildings, especially in difficult environments such as in space, or on the moon or Mars, Gershenfeld says. This could eliminate the need to ship large preassembled structures all the way from Earth. Instead it could be possible to send large batches of the tiny subunits — or form them from local materials using systems that could crank out these subunits at their final destination point. “If you can make a jumbo jet, you can make a building,” Gershenfeld says.

Sandor Fekete, director of the Institute of Operating Systems and Computer Networks at the Technical University of Braunschweig, in Germany, who was not involved in this work, says “Ultralight, digital materials such as [these] open amazing perspectives for constructing efficient, complex, large-scale structures, which are of vital importance in aerospace applications.”

But assembling such systems is a challenge, says Fekete, who plans to join the research team for further development of the control systems. “This is where the use of small and simple robots promises to provide the next breakthrough: Robots don’t get tired or bored, and using many miniature robots seems like the only way to get this critical job done. This extremely original and clever work by Ben Jenett and collaborators makes a giant leap towards the construction of dynamically adjustable airplane wings, enormous solar sails or even reconfigurable space habitats.”

In the process, Gershenfeld says, “we feel like we’re uncovering a new field of hybrid material-robot systems.”

Robots help patients manage chronic illness at home

Catalia Health uses a personal robot assistant, Mabu, to help patients managing chronic diseases.
Courtesy of Catalia Health

By Zach Winn

The Mabu robot, with its small yellow body and friendly expression, serves, literally, as the face of the care management startup Catalia Health. The most innovative part of the company’s solution, however, lies behind Mabu’s large blue eyes.

Catalia Health’s software incorporates expertise in psychology, artificial intelligence, and medical treatment plans to help patients manage their chronic conditions. The result is a sophisticated robot companion that uses daily conversations to give patients tips, medication reminders, and information on their condition while relaying relevant data to care providers. The information exchange can also take place on patients’ mobile phones.

“Ultimately, what we’re building are care management programs to help patients in particular disease states,” says Catalia Health founder and CEO Cory Kidd SM ’03, PhD ’08. “A lot of that is getting information back to the people providing care. We’re helping them scale up their efforts to interact with every patient more frequently.”

Heart failure patients first brought Mabu into their homes about a year and a half ago as part of a partnership with the health care provider Kaiser Permanente, which pays for the service. Since then, Catalia Health has also partnered with health care systems and pharmaceutical companies to help patients dealing with conditions including rheumatoid arthritis and kidney cancer.

Treatment plans for chronic diseases can be challenging for patients to manage consistently, and many people don’t follow them as prescribed. Kidd says Mabu’s daily conversations help not only patients, but also human caregivers as they make treatment decisions using data collected by their robot counterpart.

Robotics for change

Kidd was a student and faculty member at Georgia Tech before coming to MIT for his master’s degree in 2001. His work focused on addressing problems in health care caused by an aging population and an increase in the number of people managing chronic diseases.

“The way we deliver health care doesn’t scale to the needs we have, so I was looking for technologies that might help with that,” Kidd says.

Many studies have found that communicating with someone in person, as opposed to over the phone or online, makes that person appear more trustworthy, engaging, and likeable. At MIT, Kidd conducted studies aimed at understanding if those findings translated to robots.

“What I found was when we used an interactive robot that you could look in the eye and share the same physical space with, you got the same psychological effects as face-to-face interaction,” Kidd says.

As part of his PhD in the Media Lab’s Media Arts and Sciences program, Kidd tested that finding in a randomized, controlled trial with patients in a diabetes and weight management program at the Boston University Medical Center. A portion of the patients were given a robotic weight-loss coach to take home, while another group used a computer running the same software. The tabletop robot conducted regular check-ups and offered tips on maintaining a healthy diet and lifestyle. Patients who received the robot were much more likely to stick with the weight-loss program.

Upon finishing his PhD in 2007, Kidd immediately sought to apply his research by starting the company Intuitive Automata to help people manage their diabetes using robot coaches. Even as he pursued the idea, though, Kidd says he knew it was too early to be introducing such sophisticated technology to a health care industry that, at the time, was still adjusting to electronic health records.

Intuitive Automata ultimately wasn’t a major commercial success, but it did help Kidd understand the health care sector at a much deeper level as he worked to sell the diabetes and weight management programs to providers, pharmaceutical companies, insurers, and patients.

“I was able to build a big network across the industry and understand how these people think about challenges in health care,” Kidd says. “It let me see how different entities think about how they fit in the health care ecosystem.”

Since then, Kidd has watched the costs associated with robotics and computing plummet. Many people have also enthusiastically adopted computer assistance like Amazon’s Alexa and Apple’s Siri. Finally, Kidd says members of the health care industry have developed an appreciation for technology’s potential to complement traditional methods of care.

“The common ways [care is delivered] on the provider side is by bringing patients to the doctor’s office or hospital,” Kidd explains. “Then on the pharma side, it’s call center-based. In the middle of these is the home visitation model. They’re all very human powered. If you want to help twice as many patients, you hire twice as many people. There’s no way around that.”

In the summer of 2014, he founded Catalia Health to help patients with chronic conditions at scale.

“It’s very exciting because I’ve seen how well this can work with patients,” Kidd says of the company’s potential. “The biggest challenge with the early studies was that, in the end, the patients didn’t want to give the robots back. From my perspective, that’s one of the things that shows this really does work.”

Mabu makes friends

Catalia Health uses artificial intelligence to help Mabu learn about each patient through daily conversations, which vary in length depending on the patient’s answers.

“A lot of conversations start off with ‘How are you feeling?’ similar to what a doctor or nurse might ask,” Kidd explains. “From there, it might go off in many directions. There are a few things doctors or nurses would ask if they could talk to these patients every day.”

For example, Mabu would ask heart failure patients how they are feeling, if they have shortness of breath, and about their weight.

“Based on patients’ answers, Mabu might say ‘You might want to call your doctor,’ or ‘I’ll send them this information,’ or ‘Let’s check in tomorrow,’” Kidd says.

Last year, Catalia Health announced a collaboration with the American Heart Association that has allowed Mabu to deliver the association’s guidelines for patients living with heart failure.

“A patient might say ‘I’m feeling terrible today’ and Mabu might ask ‘Is it one of these symptoms a lot of people with your condition deal with?’ We’re trying to get down to whether it’s the disease or the drug. When that happens, we do two things: Mabu has a lot of information about problems a patient might be dealing with, so she’s able to give quick feedback. Simultaneously, she’s sending that information to a clinician (a doctor, nurse, or pharmacist), whoever’s providing care.”

In addition to health care providers, Catalia also partners with pharmaceutical companies. In each case, patients pay nothing out of pocket for their robot companions. Although the data Catalia Health sends pharmaceutical companies is completely anonymized, it can help them follow their treatment’s effects on patients in real time and better understand the patient experience.

Details about many of Catalia Health’s partnerships have not been disclosed, but the company did announce a collaboration with Pfizer last month to test the impact of Mabu on patient treatment plans.

Over the next year, Kidd hopes to add to the company’s list of partnerships and help patients dealing with a wider swath of diseases. Regardless of how fast Catalia Health scales, he says the service it provides will not diminish as Mabu brings its trademark attentiveness and growing knowledge base to every conversation.

“In a clinical setting, if we talk about a doctor with good bedside manner, we don’t mean that he or she has more clinical knowledge than the next person, we simply mean they’re better at connecting with patients,” Kidd says. “I’ve looked at the psychology behind that — what does it mean to be able to do that? — and turned that into the algorithms we use to help create conversations with patients.”

Soft robotics breakthrough manages immune response for implanted devices

Depiction of a soft robotic device known as a dynamic soft reservoir (DSR)
Image courtesy of the researchers.

Researchers from the Institute for Medical Engineering and Science (IMES) at MIT; the National University of Ireland Galway (NUI Galway); and AMBER, the SFI Research Centre for Advanced Materials and BioEngineering Research, recently announced a significant breakthrough in soft robotics that could help patients requiring in-situ (implanted) medical devices such as breast implants, pacemakers, neural probes, glucose biosensors, and drug and cell delivery devices.

The implantable medical devices market is currently estimated at approximately $100 billion, with significant growth potential into the future as new technologies for drug delivery and health monitoring are developed. These devices are not without problems, caused in part by the body’s own protection responses. These complex and unpredictable foreign-body responses impair device function and drastically limit the long-term performance and therapeutic efficacy of these devices.

One such foreign-body response is fibrosis, a process whereby a dense fibrous capsule surrounds the implanted device, which can cause device failure or impede its function. Implantable medical devices have various failure rates that can be attributed to fibrosis, including 30-50 percent for implantable pacemakers and 30 percent for mammoplasty prosthetics. In the case of biosensors or drug/cell delivery devices, the dense fibrous capsule that can build up around the implanted device can seriously impede its function, with consequences for the patient and costs to the health care system.

A radical new vision for medical devices to address this problem was published in the internationally respected journal, Science Robotics. The study was led by researchers from NUI Galway, IMES, and the SFI research center AMBER, among others. The research describes the use of soft robotics to modify the body’s response to implanted devices. Soft robots are flexible devices that can be implanted into the body.

The transatlantic partnership of scientists has created a tiny, mechanically actuated soft robotic device known as a dynamic soft reservoir (DSR) that has been shown to significantly reduce the build-up of the fibrous capsule by manipulating the environment at the interface between the device and the body. The device uses mechanical oscillation to modulate how cells respond around the implant. In a bio-inspired design, the DSR can change its shape at a microscopic scale through an actuating membrane.

IMES core faculty member, assistant professor at the Department of Mechanical Engineering, and W.M. Keck Career Development Professor in Biomedical Engineering Ellen Roche, the senior co-author of the study, is a former researcher at NUI Galway who won international acclaim in 2017 for her work in creating a soft robotic sleeve to help patients with heart failure. Of this research, Roche says “This study demonstrates how mechanical perturbations of an implant can modulate the host foreign body response. This has vast potential for a range of clinical applications and will hopefully lead to many future collaborative studies between our teams.”

Garry Duffy, professor in anatomy at NUI Galway and AMBER principal investigator, and a senior co-author of the study, adds “We feel the ideas described in this paper could transform future medical devices and how they interact with the body. We are very excited to develop this technology further and to partner with people interested in the potential of soft robotics to better integrate devices for longer use and superior patient outcomes. It’s fantastic to build and continue the collaboration with the Dolan and Roche labs, and to develop a trans-Atlantic network of soft roboticists.”

The first author of the study, Eimear Dolan, lecturer of biomedical engineering at NUI Galway and former researcher in the Roche and Duffy labs at MIT and NUI Galway, says “We are very excited to publish this study, as it describes an innovative approach to modulate the foreign-body response using soft robotics. I recently received a Science Foundation Ireland Royal Society University Research Fellowship to bring this technology forward with a focus on Type 1 diabetes. It is a privilege to work with such a talented multi-disciplinary team, and I look forward to continuing working together.”

Guided by AI, robotic platform automates molecule manufacture

By Becky Ham

Guided by artificial intelligence and powered by a robotic platform, a system developed by MIT researchers moves a step closer to automating the production of small molecules that could be used in medicine, solar energy, and polymer chemistry.

The system, described in the August 8 issue of Science, could free up bench chemists from a variety of routine and time-consuming tasks, and may suggest possibilities for how to make new molecular compounds, according to the study co-leaders Klavs F. Jensen, the Warren K. Lewis Professor of Chemical Engineering, and Timothy F. Jamison, the Robert R. Taylor Professor of Chemistry and associate provost at MIT.

The technology “has the promise to help people cut out all the tedious parts of molecule building,” including looking up potential reaction pathways and building the components of a molecular assembly line each time a new molecule is produced, says Jensen.

“And as a chemist, it may give you inspirations for new reactions that you hadn’t thought about before,” he adds.

Other MIT authors on the Science paper include Connor W. Coley, Dale A. Thomas III, Justin A. M. Lummiss, Jonathan N. Jaworski, Christopher P. Breen, Victor Schultz, Travis Hart, Joshua S. Fishman, Luke Rogers, Hanyu Gao, Robert W. Hicklin, Pieter P. Plehiers, Joshua Byington, John S. Piotti, William H. Green, and A. John Hart.

From inspiration to recipe to finished product

The new system combines three main steps. First, software guided by artificial intelligence suggests a route for synthesizing a molecule, then expert chemists review this route and refine it into a chemical “recipe,” and finally the recipe is sent to a robotic platform that automatically assembles the hardware and performs the reactions that build the molecule.

Coley and his colleagues have been working for more than three years to develop the open-source software suite that suggests and prioritizes possible synthesis routes. At the heart of the software are several neural network models, which the researchers trained on millions of previously published chemical reactions drawn from the Reaxys and U.S. Patent and Trademark Office databases. The software uses these data to identify the reaction transformations and conditions that it believes will be suitable for building a new compound.

“It helps make high-level decisions about what kinds of intermediates and starting materials to use, and then slightly more detailed analyses about what conditions you might want to use and if those reactions are likely to be successful,” says Coley.

“One of the primary motivations behind the design of the software is that it doesn’t just give you suggestions for molecules we know about or reactions we know about,” he notes. “It can generalize to new molecules that have never been made.”

Chemists then review the suggested synthesis routes produced by the software to build a more complete recipe for the target molecule. The chemists sometimes need to perform lab experiments or tinker with reagent concentrations and reaction temperatures, among other changes.

“They take some of the inspiration from the AI and convert that into an executable recipe file, largely because the chemical literature at present does not have enough information to move directly from inspiration to execution on an automated system,” Jamison says.

The final recipe is then loaded on to a platform where a robotic arm assembles modular reactors, separators, and other processing units into a continuous flow path, connecting pumps and lines that bring in the molecular ingredients.

“You load the recipe — that’s what controls the robotic platform — you load the reagents on, and press go, and that allows you to generate the molecule of interest,” says Thomas. “And then when it’s completed, it flushes the system and you can load the next set of reagents and recipe, and allow it to run.”

Unlike the continuous flow system the researchers presented last year, which had to be manually configured after each synthesis, the new system is entirely configured by the robotic platform.

“This gives us the ability to sequence one molecule after another, as well as generate a library of molecules on the system, autonomously,” says Jensen.

The design for the platform, which is about two cubic meters in size — slightly smaller than a standard chemical fume hood — resembles a telephone switchboard and operator system that moves connections between the modules on the platform.

“The robotic arm is what allowed us to manipulate the fluidic paths, which reduced the number of process modules and fluidic complexity of the system, and by reducing the fluidic complexity we can increase the molecular complexity,” says Thomas. “That allowed us to add additional reaction steps and expand the set of reactions that could be completed on the system within a relatively small footprint.”

Toward full automation

The researchers tested the full system by creating 15 different medicinal small molecules of different synthesis complexity, with processes taking anywhere from two hours for the simplest creations to about 68 hours for manufacturing multiple compounds.

The team synthesized a variety of compounds: aspirin and the antibiotic secnidazole in back-to-back processes; the painkiller lidocaine and the antianxiety drug diazepam in back-to-back processes using a common feedstock of reagents; the blood thinner warfarin and the Parkinson’s disease drug safinamide, to show how the software could design compounds with similar molecular components but differing 3-D structures; and a family of five ACE inhibitor drugs and a family of four nonsteroidal anti-inflammatory drugs.

“I’m particularly proud of the diversity of the chemistry and the kinds of different chemical reactions,” says Jamison, who adds that the system handled about 30 different reactions, compared to about 12 in the previous continuous flow system.

“We are really trying to close the gap between idea generation from these programs and what it takes to actually run a synthesis,” says Coley. “We hope that next-generation systems will increase further the fraction of time and effort that scientists can focus their efforts on creativity and design.”  

The research was supported, in part, by the U.S. Defense Advanced Research Projects Agency (DARPA) Make-It program.

Professor Emeritus Fernando Corbató, MIT computing pioneer, dies at 93

Corbató in 1965, using one of MIT’s mainframe computers
Image: Computer History Museum

By Adam Conner-Simons | Rachel Gordon

Fernando “Corby” Corbató, an MIT professor emeritus whose work in the 1960s on time-sharing systems broke important ground in democratizing the use of computers, died on Friday, July 12, at his home in Newburyport, Massachusetts. He was 93.

Decades before the existence of concepts like cybersecurity and the cloud, Corbató led the development of one of the world’s first operating systems. His “Compatible Time-Sharing System” (CTSS) allowed multiple people to use a computer at the same time, greatly increasing the speed at which programmers could work. It’s also widely credited as the first computer system to use passwords.

After CTSS Corbató led a time-sharing effort called Multics, which directly inspired operating systems like Linux and laid the foundation for many aspects of modern computing. Multics doubled as a fertile training ground for an emerging generation of programmers that included C programming language creator Dennis Ritchie, Unix developer Ken Thompson, and spreadsheet inventors Dan Bricklin and Bob Frankston.

Before time-sharing, using a computer was tedious and required detailed knowledge. Users would create programs on cards and submit them in batches to an operator, who would enter them to be run one at a time over a series of hours. Minor errors would require repeating this sequence, often more than once.

But with CTSS, which was first demonstrated in 1961, answers came back in mere seconds, forever changing the model of program development. Decades before the PC revolution, Corbató and his colleagues also opened up communication between users with early versions of email, instant messaging, and word processing. 

“Corby was one of the most important researchers for making computing available to many people for many purposes,” says long-time colleague Tom Van Vleck. “He saw that these concepts don’t just make things more efficient; they fundamentally change the way people use information.”

Besides making computing more efficient, CTSS also inadvertently helped establish the very concept of digital privacy itself. With different users wanting to keep their own files private, CTSS introduced the idea of having people create individual accounts with personal passwords. Corbató’s vision of making high-performance computers available to more people also foreshadowed trends in cloud computing, in which tech giants like Amazon and Microsoft rent out shared servers to companies around the world. 

“Other people had proposed the idea of time-sharing before,” says Jerry Saltzer, who worked on CTSS with Corbató after starting out as his teaching assistant. “But what he brought to the table was the vision and the persistence to get it done.”

CTSS was also the spark that convinced MIT to launch “Project MAC,” the precursor to the Laboratory for Computer Science (LCS). LCS later merged with the Artificial Intelligence Lab to become MIT’s largest research lab, the Computer Science and Artificial Intelligence Laboratory (CSAIL), which is now home to more than 600 researchers. 

“It’s no overstatement to say that Corby’s work on time-sharing fundamentally transformed computers as we know them today,” says CSAIL Director Daniela Rus. “From PCs to smartphones, the digital revolution can directly trace its roots back to the work that he led at MIT nearly 60 years ago.” 

In 1990 Corbató was honored for his work with the Association for Computing Machinery’s Turing Award, often described as “the Nobel Prize for computing.”

From sonar to CTSS

Corbató was born on July 1, 1926 in Oakland, California. At 17 he enlisted as a technician in the U.S. Navy, where he first got the engineering bug working on a range of radar and sonar systems. After World War II he earned his bachelor’s degree at Caltech before heading to MIT to complete a PhD in physics. 

As a PhD student, Corbató met Professor Philip Morse, who recruited him to work with his team on Project Whirlwind, the first computer capable of real-time computation. After graduating, Corbató joined MIT’s Computation Center as a research assistant, soon moving up to become deputy director of the entire center. 

It was there that he started thinking about ways to make computing more efficient. For all its innovation, Whirlwind was still a rather clunky machine. Researchers often had trouble getting much work done on it, since they had to take turns using it for half-hour chunks of time. (Corbató said that it had a habit of crashing every 20 minutes or so.) 

Since computer input and output devices were much slower than the computer itself, in the late 1950s a scheme called multiprogramming was developed to allow a second program to run whenever the first program was waiting for some device to finish. Time-sharing built on this idea, allowing other programs to run while the first program was waiting for a human user to type a request, thus allowing the user to interact directly with the first program.
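The gain from overlapping one program's waiting with another program's computing can be shown with a small Python sketch. It is an illustration of the general idea only, not a model of CTSS. It assumes a single processor and describes each program as a hypothetical list of (compute time, waiting time) bursts, where the waiting time stands in for a slow device or a user typing. The function names and the sample workload are invented for this example.

```python
def run_one_at_a_time(jobs):
    """Total time when each job runs to completion before the next one starts."""
    return sum(cpu + wait for job in jobs for cpu, wait in job)


def run_overlapped(jobs):
    """Total time when the processor picks up another job during each wait.

    Assumption: waits of different jobs can overlap freely, and the
    processor always takes the pending job that became ready earliest.
    """
    clock = 0
    next_burst = [0] * len(jobs)    # index of each job's next burst
    ready_at = [0] * len(jobs)      # time at which each job can compute again
    finished_at = [0] * len(jobs)   # time at which each job's latest wait ends
    while True:
        pending = [i for i in range(len(jobs)) if next_burst[i] < len(jobs[i])]
        if not pending:
            return max(finished_at, default=0)
        job = min(pending, key=lambda i: (ready_at[i], i))
        clock = max(clock, ready_at[job])   # processor idles if nothing is ready
        cpu, wait = jobs[job][next_burst[job]]
        clock += cpu                        # compute phase occupies the processor
        ready_at[job] = clock + wait        # wait phase happens off the processor
        finished_at[job] = clock + wait
        next_burst[job] += 1


# Hypothetical workload: job 0 waits a lot, job 1 computes a lot.
jobs = [[(1, 4), (1, 4)], [(2, 1), (2, 1)]]
print("one at a time:", run_one_at_a_time(jobs))
print("overlapped:   ", run_overlapped(jobs))
```

With this sample workload the one-at-a-time schedule takes 16 time units and the overlapped schedule takes 11, because the second program computes while the first is waiting.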

Saltzer says that Corbató pioneered a programming approach that would be described today as agile design. 

“It’s a buzzword now, but back then it was just this iterative approach to coding that Corby encouraged and that seemed to work especially well,” he says.  

In 1962 Corbató published a paper about CTSS that quickly became the talk of the slowly growing computer science community. The following year MIT invited several hundred programmers to campus to try out the system, spurring a flurry of further research on time-sharing.

Foreshadowing future technological innovation, Corbató was amazed — and amused — by how quickly people got habituated to CTSS’ efficiency.

“Once a user gets accustomed to [immediate] computer response, delays of even a fraction of a minute are exasperatingly long,” he presciently wrote in his 1962 paper. “First indications are that programmers would readily use such a system if it were generally available.”

Multics, meanwhile, expanded on CTSS’ more ad hoc design with a hierarchical file system, better interfaces to email and instant messaging, and more precise privacy controls. Peter Neumann, who worked at Bell Labs when they were collaborating with MIT on Multics, says that its design prevented the possibility of many vulnerabilities that impact modern systems, like “buffer overflow” (which happens when a program writes data beyond the bounds of the block of memory set aside for it).

“Multics was so far ahead of the rest of the industry,” says Neumann. “It was intensely software-engineered, years before software engineering was even viewed as a discipline.” 

In spearheading these time-sharing efforts, Corbató served as a soft-spoken but driven commander in chief — a logical thinker who led by example and had a distinctly systems-oriented view of the world.

“One thing I liked about working for Corby was that I knew he could do my job if he wanted to,” says Van Vleck. “His understanding of all the gory details of our work inspired intense devotion to Multics, all while still being a true gentleman to everyone on the team.” 

Another legacy of the professor’s is “Corbató’s Law,” which states that the number of lines of code someone can write in a day is the same regardless of the language used. This maxim is often cited by programmers when arguing in favor of using higher-level languages.

Corbató was an active member of the MIT community, serving as associate department head for computer science and engineering from 1974 to 1978 and 1983 to 1993. He was a member of the National Academy of Engineering, and a fellow of the Institute of Electrical and Electronics Engineers and the American Association for the Advancement of Science. 

Corbató is survived by his wife, Emily Corbató, from Brooklyn, New York; his stepsons, David and Jason Gish; his brother, Charles; his daughters, Carolyn and Nancy, from his marriage to his late wife Isabel; and five grandchildren.

In lieu of flowers, gifts may be made to MIT’s Fernando Corbató Fellowship Fund via Bonny Kellermann in the Memorial Gifts Office. 

CSAIL will host an event to honor and celebrate Corbató in the coming months. 

Automated system generates robotic parts for novel tasks

A new MIT-invented system automatically designs and 3-D prints complex robotic actuators optimized according to an enormous number of specifications, such as appearance and flexibility. To demonstrate the system, the researchers fabricated floating water lilies with petals equipped with arrays of actuators and hinges that fold up in response to magnetic fields run through conductive fluids.
Credit: Subramanian Sundaram

By Rob Matheson

An automated system developed by MIT researchers designs and 3-D prints complex robotic parts called actuators that are optimized according to an enormous number of specifications. In short, the system does automatically what is virtually impossible for humans to do by hand.  

In a paper published today in Science Advances, the researchers demonstrate the system by fabricating actuators — devices that mechanically control robotic systems in response to electrical signals — that show different black-and-white images at different angles. One actuator, for instance, portrays a Vincent van Gogh portrait when laid flat. Tilted at an angle when it’s activated, however, it portrays the famous Edvard Munch painting “The Scream.” The researchers also 3-D printed floating water lilies with petals equipped with arrays of actuators and hinges that fold up in response to magnetic fields run through conductive fluids.

The actuators are made from a patchwork of three different materials, each with a different light or dark color and a property — such as flexibility and magnetization — that controls the actuator’s angle in response to a control signal. Software first breaks down the actuator design into millions of three-dimensional pixels, or “voxels,” that can each be filled with any of the materials. Then, it runs millions of simulations, filling different voxels with different materials. Eventually, it lands on the optimal placement of each material in each voxel to generate two different images at two different angles. A custom 3-D printer then fabricates the actuator by dropping the right material into the right voxel, layer by layer.

“Our ultimate goal is to automatically find an optimal design for any problem, and then use the output of our optimized design to fabricate it,” says first author Subramanian Sundaram PhD ’18, a former graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL). “We go from selecting the printing materials, to finding the optimal design, to fabricating the final product in almost a completely automated way.”

The shifting images demonstrate what the system can do. But actuators optimized for appearance and function could also be used for biomimicry in robotics. For instance, other researchers are designing underwater robotic skins with actuator arrays meant to mimic denticles on shark skin. Denticles collectively deform to decrease drag for faster, quieter swimming. “You can imagine underwater robots having whole arrays of actuators coating the surface of their skins, which can be optimized for drag and turning efficiently, and so on,” Sundaram says.

Joining Sundaram on the paper are: Melina Skouras, a former MIT postdoc; David S. Kim, a former researcher in the Computational Fabrication Group; Louise van den Heuvel ’14, SM ’16; and Wojciech Matusik, an MIT associate professor in electrical engineering and computer science and head of the Computational Fabrication Group.

Navigating the “combinatorial explosion”

Robotic actuators today are becoming increasingly complex. Depending on the application, they must be optimized for weight, efficiency, appearance, flexibility, power consumption, and various other functions and performance metrics. Generally, experts manually calculate all those parameters to find an optimal design.  

Adding to that complexity, new 3-D-printing techniques can now use multiple materials to create one product. That means the design’s dimensionality becomes incredibly high. “What you’re left with is what’s called a ‘combinatorial explosion,’ where you essentially have so many combinations of materials and properties that you don’t have a chance to evaluate every combination to create an optimal structure,” Sundaram says.

In their work, the researchers first customized three polymer materials with specific properties they needed to build their actuators: color, magnetization, and rigidity. In the end, they produced a near-transparent rigid material, an opaque flexible material used as a hinge, and a brown nanoparticle material that responds to a magnetic signal. They plugged all that characterization data into a property library.

The system takes as input grayscale image examples — such as the flat actuator that displays the Van Gogh portrait but tilts at an exact angle to show “The Scream.” It basically executes a complex form of trial and error that’s somewhat like rearranging a Rubik’s Cube, but in this case around 5.5 million voxels are iteratively reconfigured to match an image and meet a measured angle.
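The scale of the “combinatorial explosion” described above can be estimated with a few lines of Python. The sketch assumes that each of the roughly 5.5 million voxels can independently hold any one of the three materials, so the number of candidate designs is 3 raised to the number of voxels. The function name is illustrative, and the count ignores any physical constraints the real system may impose.

```python
import math

MATERIALS = 3          # the three printable materials
VOXELS = 5_500_000     # approximate voxel count quoted in the article


def decimal_digits_of_design_count(materials, voxels):
    """Number of decimal digits in materials**voxels, via logarithms."""
    return math.floor(voxels * math.log10(materials)) + 1


digits = decimal_digits_of_design_count(MATERIALS, VOXELS)
print(f"The number of possible designs has about {digits:,} decimal digits")
```

Under that assumption the design count is a number with more than 2.6 million digits, which is why the system searches by iterative improvement rather than by testing every combination.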

Initially, the system draws from the property library to randomly assign different materials to different voxels. Then, it runs a simulation to see if that arrangement portrays the two target images, straight on and at an angle. If not, it gets an error signal. That signal lets it know which voxels are on the mark and which should be changed. Adding, removing, and shifting around brown magnetic voxels, for instance, will change the actuator’s angle when a magnetic field is applied. But the system also has to consider how aligning those brown voxels will affect the image.
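The loop described above (random start, simulate, measure error, change a voxel, keep what helps) resembles a simple greedy search. The Python sketch below is a toy version under loudly stated assumptions, not the researchers’ software: the design is a single row of eight voxels, a pixel counts as dark only where a voxel is brown, and the number of brown voxels stands in for the tilt angle. All names and the error formula are hypothetical.

```python
import random

MATERIALS = ("clear", "hinge", "brown")  # stand-ins for the three materials


def toy_error(design, target_dark, target_brown_count):
    """Toy error: mismatched pixels plus deviation from a target brown count."""
    image_error = sum((m == "brown") != dark for m, dark in zip(design, target_dark))
    angle_error = abs(design.count("brown") - target_brown_count)
    return image_error + angle_error


def greedy_search(num_voxels, error, steps, seed=0):
    """Start from a random design; keep single-voxel changes that do not hurt."""
    rng = random.Random(seed)
    design = [rng.choice(MATERIALS) for _ in range(num_voxels)]
    best = error(design)
    for _ in range(steps):
        i = rng.randrange(num_voxels)
        old = design[i]
        design[i] = rng.choice(MATERIALS)   # propose a change to one voxel
        new = error(design)
        if new <= best:
            best = new                      # keep the change
        else:
            design[i] = old                 # undo the change
    return design, best


target_dark = [True, False, True, True, False, False, True, False]


def error(design):
    return toy_error(design, target_dark, target_brown_count=4)


_, start_error = greedy_search(len(target_dark), error, steps=0)
design, final_error = greedy_search(len(target_dark), error, steps=2000)
print("error before search:", start_error)
print("error after search: ", final_error)
```

Because a change is kept only when the error does not rise, the final error can never exceed the starting error. The two terms in the toy error mirror the tension in the text: moving brown voxels affects both the image and the angle.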

Voxel by voxel

To compute the actuator’s appearances at each iteration, the researchers adopted a computer graphics technique called “ray-tracing,” which simulates the path of light interacting with objects. Simulated light beams shoot through the actuator at each column of voxels. Actuators can be fabricated with more than 100 voxel layers. Columns can contain more than 100 voxels, with different sequences of the materials that radiate a different shade of gray when flat or at an angle.

When the actuator is flat, for instance, the light beam may shine down on a column containing many brown voxels, producing a dark tone. But when the actuator tilts, the beam will shine on misaligned voxels. Brown voxels may shift away from the beam, while more clear voxels may shift into the beam, producing a lighter tone. The system uses that technique to align dark and light voxel columns where they need to be in the flat and angled image. After 100 million or more iterations, and anywhere from a few to dozens of hours, the system will find an arrangement that fits the target images.
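How the same column can look dark when flat and light when tilted can be sketched in a few lines of Python. This is a deliberately crude stand-in for ray tracing, not the method used in the paper: it assumes a two-dimensional cross-section, only brown and clear voxels, and a tilt modeled as the ray drifting sideways by a whole number of columns per layer. The grid and function name are hypothetical.

```python
def column_shade(grid, x, shift_per_layer):
    """Fraction of layers in which a ray entering above column x meets brown.

    grid[z][x] is True for a brown voxel and False for a clear one.
    shift_per_layer = 0 models the flat actuator; a nonzero value is a
    rough stand-in for the tilted one.
    """
    hits = 0
    for z, layer in enumerate(grid):
        col = x + z * shift_per_layer
        if 0 <= col < len(layer) and layer[col]:
            hits += 1
    return hits / len(grid)


# Three layers, four columns. Column 0 is solid brown; a brown diagonal
# starts at column 1 in the top layer.
grid = [
    [True, True, False, False],
    [True, False, True, False],
    [True, False, False, True],
]

for x in (0, 1):
    print(x, column_shade(grid, x, 0), column_shade(grid, x, 1))
```

In this toy grid, the ray above column 0 meets brown in every layer when flat but in only one of three layers when tilted. The ray above column 1 does the opposite, because the tilt lines it up with the brown diagonal.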

“We’re comparing what that [voxel column] looks like when it’s flat or when it’s tilted, to match the target images,” Sundaram says. “If not, you can swap, say, a clear voxel with a brown one. If that’s an improvement, we keep this new suggestion and make other changes over and over again.”

To fabricate the actuators, the researchers built a custom 3-D printer that uses a technique called “drop-on-demand.” Tubs of the three materials are connected to print heads with hundreds of nozzles that can be individually controlled. The printer fires a 30-micron-sized droplet of the designated material into its respective voxel location. Once the droplet lands on the substrate, it’s solidified. In that way, the printer builds an object, layer by layer.

The work could be used as a stepping stone for designing larger structures, such as airplane wings, Sundaram says. Researchers, for instance, have similarly started breaking down airplane wings into smaller voxel-like blocks to optimize their designs for weight and lift, and other metrics. “We’re not yet able to print wings or anything on that scale, or with those materials. But I think this is a first step toward that goal,” Sundaram says.

Professor Patrick Winston, former director of MIT’s Artificial Intelligence Laboratory, dies at 76

A devoted teacher and cherished colleague, Patrick Winston led CSAIL’s Genesis Group, which focused on developing AI systems that have human-like intelligence, including the ability to tell, perceive and comprehend stories.
Photo: Jason Dorfman/MIT CSAIL

By Adam Conner-Simons and Rachel Gordon

Patrick Winston, a beloved professor and computer scientist at MIT, died on July 19 at Massachusetts General Hospital in Boston. He was 76.
 
A professor at MIT for almost 50 years, Winston was director of MIT’s Artificial Intelligence Laboratory from 1972 to 1997 before it merged with the Laboratory for Computer Science to become MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL).

 
A devoted teacher and cherished colleague, Winston led CSAIL’s Genesis Group, which focused on developing AI systems that have human-like intelligence, including the ability to tell, perceive, and comprehend stories. He believed that such work could help illuminate aspects of human intelligence that scientists don’t yet understand.
 
“My principal interest is in figuring out what’s going on inside our heads, and I’m convinced that one of the defining features of human intelligence is that we can understand stories,” said Winston, the Ford Professor of Artificial Intelligence and Computer Science, in a 2011 interview for CSAIL. “Believing as I do that stories are important, it was natural for me to try to build systems that understand stories, and that shed light on what the story-understanding process is all about.”
 
He was renowned for his accessible and informative lectures, and gave a hugely popular talk every year during the Independent Activities Period called “How to Speak.” 
 
“As a speaker he always had his audience in the palm of his hand,” says MIT Professor Peter Szolovits. “He put a tremendous amount of work into his lectures, and yet managed to make them feel loose and spontaneous. He wasn’t flashy, but he was compelling and direct.”
 
Winston’s dedication to teaching earned him many accolades over the years, including the Baker Award, the Eta Kappa Nu Teaching Award, and the Graduate Student Council Teaching Award.
 
“Patrick’s humanity and his commitment to the highest principles made him the soul of EECS,” MIT President L. Rafael Reif wrote in a letter to the MIT community. “I called on him often for advice and feedback, and he always responded with kindness, candor, wisdom and integrity. I will be forever grateful for his counsel, his objectivity, and his tremendous inspiration and dedication to our students.”
 
Teaching computers to think

Born Feb. 5, 1943, in Peoria, Illinois, Winston was always exceptionally curious about science, technology and how to use such tools to explore what it means to be human. He was an MIT-lifer starting in 1961, earning his bachelor’s, master’s and doctoral degrees from the Institute before joining the faculty of the Department of Electrical Engineering and Computer Science in 1970.
 
His thesis work with Marvin Minsky centered on the difficulty of learning, setting off a trajectory of work where he put a playful, yet laser-sharp focus on fine-tuning AI systems to better understand stories.
 
His Genesis project aimed to faithfully model computers after human intelligence in order to fully grasp the inner workings of our own motivations, rationality, and perception. Using MIT research scientist Boris Katz’s START natural language processing system and a vision system developed by former MIT PhD student Sajit Rao, Genesis can digest short, simple chunks of text, then spit out reports about how it interpreted connections between events.
 
While the system has processed many works, Winston chose “Macbeth” as a primary text because the tragedy offers an opportunity to take big human themes, such as greed and revenge, and map out their components.
 
“[Shakespeare] was pretty good at his portrayal of ‘the human condition,’ as my friends in the humanities would say,” Winston told The Boston Globe. “So there’s all kinds of stuff in there about what’s typical when we humans wander through the world.”
 
His deep fascination with humanity, human intelligence, and how we communicate information spilled over into what he often described as his favorite academic activity: teaching.
 
“He was a superb educator who introduced the field to generations of students,” says MIT Professor and longtime colleague Randall Davis. “His lectures had an uncanny ability to move in minutes from the details of an algorithm to the larger issues it illustrated, to yet larger lessons about how to be a scientist and a human being.”
 
A past president of the Association for the Advancement of Artificial Intelligence (AAAI), Winston also wrote and edited numerous books, including a seminal textbook on AI that’s still used in classrooms around the world. Outside of the lab he also co-founded Ascent Technology, which produces scheduling and workforce management applications for major airports.
 
He is survived by his wife Karen Prendergast and his daughter Sarah.

Tiny motor can “walk” to carry out tasks

This walking microrobot was built by the MIT team from a set of just five basic parts, including a coil, a magnet, and stiff and flexible structural pieces.
Photo by Will Langford

By David L. Chandler

Years ago, MIT Professor Neil Gershenfeld had an audacious thought. Struck by the fact that all the world’s living things are built out of combinations of just 20 amino acids, he wondered: Might it be possible to create a kit of just 20 fundamental parts that could be used to assemble all of the different technological products in the world?

Gershenfeld and his students have been making steady progress in that direction ever since. Their latest achievement, presented this week at an international robotics conference, consists of a set of five tiny fundamental parts that can be assembled into a wide variety of functional devices, including a tiny “walking” motor that can move back and forth across a surface or turn the gears of a machine.

Previously, Gershenfeld and his students showed that structures assembled from many small, identical subunits can have numerous mechanical properties. Next, they demonstrated that a combination of rigid and flexible part types can be used to create morphing airplane wings, a longstanding goal in aerospace engineering. Their latest work adds components for movement and logic, and will be presented at the International Conference on Manipulation, Automation and Robotics at Small Scales (MARSS) in Helsinki, Finland, in a paper by Gershenfeld and MIT graduate student Will Langford.

Their work offers an alternative to today’s approaches to constructing robots, which largely fall into one of two types: custom machines that work well but are relatively expensive and inflexible, and reconfigurable ones that sacrifice performance for versatility. In the new approach, Langford came up with a set of five millimeter-scale components, all of which can be attached to each other by a standard connector. These parts include the previous rigid and flexible types, along with electromagnetic parts, a coil, and a magnet. In the future, the team plans to make these out of still smaller basic part types.

Using this simple kit of tiny parts, Langford assembled a novel kind of motor that moves an appendage in discrete mechanical steps, which can be used to turn a gear wheel, and a mobile form of the motor that turns those steps into locomotion, allowing it to “walk” across a surface in a way that is reminiscent of the molecular motors that move muscles. These parts could also be assembled into hands for gripping, or legs for walking, as needed for a particular task, and then later reassembled as those needs change. Gershenfeld refers to them as “digital materials,” discrete parts that can be reversibly joined, forming a kind of functional micro-LEGO.

The new system is a significant step toward creating a standardized kit of parts that could be used to assemble robots with specific capabilities adapted to a particular task or set of tasks. Such purpose-built robots could then be disassembled and reassembled as needed in a variety of forms, without the need to design and manufacture new robots from scratch for each application.

Langford’s initial motor has an ant-like ability to lift seven times its own weight. But if greater forces are required, many of these parts can be added to provide more oomph. Or if the robot needs to move in more complex ways, these parts could be distributed throughout the structure. The size of the building blocks can be chosen to match their application; the team has made nanometer-sized parts to make nanorobots, and meter-sized parts to make megarobots. Previously, specialized techniques were needed at each of these length scale extremes.

“One emerging application is to make tiny robots that can work in confined spaces,” Gershenfeld says. Some of the devices assembled in this project, for example, are smaller than a penny yet can carry out useful tasks.

To build in the “brains,” Langford has added part types that contain millimeter-sized integrated circuits, along with a few other part types to take care of connecting electrical signals in three dimensions.

The simplicity and regularity of these structures make it relatively easy for their assembly to be automated. To do that, Langford has developed a novel machine that’s like a cross between a 3-D printer and the pick-and-place machines that manufacture electronic circuits, but unlike either of those, this one can produce complete robotic systems directly from digital designs. Gershenfeld says this machine is a first step toward the project’s ultimate goal of “making an assembler that can assemble itself out of the parts that it’s assembling.”

Study: Social robots can benefit hospitalized children

A new study by researchers from MIT, Boston Children’s Hospital, and elsewhere shows that a “social robot,” named Huggable (pictured), can be used in support sessions to boost positive emotions in hospitalized children.
Image: Courtesy of the Personal Robots Group, MIT Media Lab

A new study demonstrates, for the first time, that “social robots” used in support sessions held in pediatric units at hospitals can lead to more positive emotions in sick children.

Many hospitals host interventions in pediatric units, where child life specialists provide clinical interventions to hospitalized children for developmental and coping support. This involves play, preparation, education, and behavioral distraction during routine medical care, as well as before, during, and after difficult procedures. Traditional interventions include therapeutic medical play and normalizing the environment through activities such as arts and crafts, games, and celebrations.

For the study, published today in the journal Pediatrics, researchers from the MIT Media Lab, Boston Children’s Hospital, and Northeastern University deployed a robotic teddy bear, “Huggable,” across several pediatric units at Boston Children’s Hospital. More than 50 hospitalized children were randomly split into three groups of interventions that involved Huggable, a tablet-based virtual Huggable, or a traditional plush teddy bear. In general, Huggable improved various patient outcomes over those other two options.  

The study primarily demonstrated the feasibility of integrating Huggable into the interventions. But results also indicated that children playing with Huggable experienced more positive emotions overall. They also got out of bed and moved around more, and emotionally connected with the robot, asking it personal questions and inviting it to come back later to meet their families. “Such improved emotional, physical, and verbal outcomes are all positive factors that could contribute to better and faster recovery in hospitalized children,” the researchers write in their study.

Although it is a small study, it is the first to explore social robotics in a real-world inpatient pediatric setting with ill children, the researchers say. Other studies have been conducted in labs, have studied very few children, or were conducted in public settings without any patient identification.

But Huggable is designed only to assist health care specialists — not replace them, the researchers stress. “It’s a companion,” says co-author Cynthia Breazeal, an associate professor of media arts and sciences and founding director of the Personal Robots group. “Our group designs technologies with the mindset that they’re teammates. We don’t just look at the child-robot interaction. It’s about [helping] specialists and parents, because we want technology to support everyone who’s invested in the quality care of a child.”

“Child life staff provide a lot of human interaction to help normalize the hospital experience, but they can’t be with every kid, all the time. Social robots create a more consistent presence throughout the day,” adds first author Deirdre Logan, a pediatric psychologist at Boston Children’s Hospital. “There may also be kids who don’t always want to talk to people, and respond better to having a robotic stuffed animal with them. It’s exciting knowing what types of support we can provide kids who may feel isolated or scared about what they’re going through.”

Joining Breazeal and Logan on the paper are: Sooyeon Jeong, a PhD student in the Personal Robots group; Brianna O’Connell, Duncan Smith-Freedman, and Peter Weinstock, all of Boston Children’s Hospital; and Matthew Goodwin and James Heathers, both of Northeastern University.

Boosting mood

First prototyped in 2006, Huggable is a plush teddy bear with a screen depicting animated eyes. While the eventual goal is to make the robot fully autonomous, it is currently operated remotely by a specialist in the hall outside a child’s room. Through custom software, a specialist can control the robot’s facial expressions and body actions, and direct its gaze. The specialists could also talk through a speaker — with their voice automatically shifted to a higher pitch to sound more childlike — and monitor the participants via camera feed. The tablet-based avatar of the bear had identical gestures and was also remotely operated.

During the interventions involving Huggable, which included kids ages 3 to 10 years, a specialist would sing nursery rhymes to younger children through the robot and move its arms during the song. Older kids would play the I Spy game, where they have to guess an object in the room described by the specialist through Huggable.

Through self-reports and questionnaires, the researchers recorded how much the patients and families liked interacting with Huggable. Additional questionnaires assessed patients’ positive moods, as well as anxiety and perceived pain levels. The researchers also used cameras mounted in the child’s room to capture speech patterns, then used software to analyze them and characterize them as joyful or sad.

A greater percentage of children and their parents reported that the children enjoyed playing with Huggable more than with the avatar or traditional teddy bear. Speech analysis backed up that result, detecting significantly more joyful expressions among the children during robotic interventions. Additionally, parents noted lower levels of perceived pain among their children.

The researchers noted that 93 percent of patients completed the Huggable-based interventions, and found few barriers to practical implementation, as determined by comments from the specialists.

A previous paper based on the same study found that the robot also seemed to facilitate greater family involvement in the interventions, compared to the other two methods, which improved the intervention overall. “Those are findings we didn’t necessarily expect in the beginning,” says Jeong, also a co-author on the previous paper. “We didn’t tell family to join any of the play sessions — it just happened naturally. When the robot came in, the child and robot and parents all interacted more, playing games or introducing the robot.”

An automated, take-home bot

The study also generated valuable insights for developing a fully autonomous Huggable robot, which is the researchers’ ultimate goal. They were able to determine which physical gestures are used most and least often, and which features specialists may want for future iterations. Huggable, for instance, could introduce doctors before they enter a child’s room or learn a child’s interests and share that information with specialists. The researchers may also equip the robot with computer vision, so it can detect certain objects in a room to talk about those with children.

“In these early studies, we capture data … to wrap our heads around an authentic use-case scenario where, if the bear was automated, what does it need to do to provide high-quality standard of care,” Breazeal says.

In the future, that automated robot could be used to improve continuity of care. A child would take home a robot after a hospital visit to further support engagement, adherence to care regimens, and monitoring of well-being.

“We want to continue thinking about how robots can become part of the whole clinical team and help everyone,” Jeong says. “When the robot goes home, we want to see the robot monitor a child’s progress. … If there’s something clinicians need to know earlier, the robot can let the clinicians know, so [they’re not] surprised at the next appointment that the child hasn’t been doing well.”

Next, the researchers are hoping to zero in on which specific patient populations may benefit the most from the Huggable interventions. “We want to find the sweet spot for the children who need this type of extra support,” Logan says.

Spotting objects amid clutter

Robots currently attempt to identify objects in a point cloud by comparing a template object — a 3-D dot representation of an object, such as a rabbit — with a point cloud representation of the real world that may contain that object.
Image: Christine Daniloff, MIT

A new MIT-developed technique enables robots to quickly identify objects hidden in a three-dimensional cloud of data, reminiscent of how some people can make sense of a densely patterned “Magic Eye” image if they observe it in just the right way.

Robots typically “see” their environment through sensors that collect and translate a visual scene into a matrix of dots. Think of the world of, well, “The Matrix,” except that the 1s and 0s seen by the fictional character Neo are replaced by dots — lots of dots — whose patterns and densities outline the objects in a particular scene.

Conventional techniques that try to pick out objects from such clouds of dots, or point clouds, can do so with either speed or accuracy, but not both.

With their new technique, the researchers say a robot can accurately pick out an object, such as a small animal, that is otherwise obscured within a dense cloud of dots, within seconds of receiving the visual data. The team says the technique can be used to improve a host of situations in which machine perception must be both speedy and accurate, including driverless cars and robotic assistants in the factory and the home.

“The surprising thing about this work is, if I ask you to find a bunny in this cloud of thousands of points, there’s no way you could do that,” says Luca Carlone, assistant professor of aeronautics and astronautics and a member of MIT’s Laboratory for Information and Decision Systems (LIDS). “But our algorithm is able to see the object through all this clutter. So we’re getting to a level of superhuman performance in localizing objects.”

Carlone and graduate student Heng Yang will present details of the technique later this month at the Robotics: Science and Systems conference in Germany.

“Failing without knowing”

Robots currently attempt to identify objects in a point cloud by comparing a template object — a 3-D dot representation of an object, such as a rabbit — with a point cloud representation of the real world that may contain that object. The template image includes “features,” or collections of dots that indicate characteristic curvatures or angles of that object, such as the bunny’s ear or tail. Existing algorithms first extract similar features from the real-life point cloud, then attempt to match those features and the template’s features, and ultimately rotate and align the features to the template to determine if the point cloud contains the object in question.
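The final “rotate and align” step can be illustrated with a short Python sketch. It is a simplified, two-dimensional example that assumes every feature has already been matched correctly, and it uses a standard closed-form least-squares fit rather than any algorithm from the researchers. The function names and the sample points are hypothetical.

```python
import math


def transform(point, theta, shift):
    """Rotate a 2-D point by theta radians, then translate it by shift."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y + shift[0], s * x + c * y + shift[1])


def align_2d(template, scene):
    """Least-squares rotation and translation mapping template onto scene.

    Assumes template[i] and scene[i] are the same feature (a correct match).
    """
    n = len(template)
    tcx = sum(x for x, _ in template) / n
    tcy = sum(y for _, y in template) / n
    scx = sum(x for x, _ in scene) / n
    scy = sum(y for _, y in scene) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(template, scene):
        ax, ay = px - tcx, py - tcy      # template point relative to its centroid
        bx, by = qx - scx, qy - scy      # scene point relative to its centroid
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)       # best-fit rotation angle
    c, s = math.cos(theta), math.sin(theta)
    shift = (scx - (c * tcx - s * tcy), scy - (s * tcx + c * tcy))
    return theta, shift


# Hypothetical template, and a scene made by rotating it 30 degrees and moving it.
angle = math.radians(30)
template = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
scene = [transform(p, angle, (3.0, -1.0)) for p in template]

theta, shift = align_2d(template, scene)
print("recovered angle (degrees):", round(math.degrees(theta), 6))
print("recovered shift:", tuple(round(v, 6) for v in shift))
```

With perfect matches the fit recovers the rotation and shift that were applied. The fit treats every pair as trustworthy, which is why wrong matches are so damaging in practice.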

But the point cloud data that streams into a robot’s sensor invariably includes errors, in the form of dots that are in the wrong position or incorrectly spaced, which can significantly confuse the process of feature extraction and matching. As a consequence, robots can make a huge number of wrong associations, or what researchers call “outliers” between point clouds, and ultimately misidentify objects or miss them entirely.

Carlone says state-of-the-art algorithms are able to sift the bad associations from the good once features have been matched, but they do so in “exponential time,” meaning that even a cluster of processing-heavy computers, sifting through dense point cloud data with existing algorithms, would not be able to solve the problem in a reasonable time. Such techniques, while accurate, are impractical for analyzing larger, real-life datasets containing dense point clouds.

Other algorithms that can quickly identify features and associations do so hastily, creating a huge number of outliers or misdetections in the process, without being aware of these errors.

“That’s terrible if this is running on a self-driving car, or any safety-critical application,” Carlone says. “Failing without knowing you’re failing is the worst thing an algorithm can do.”

A relaxed view

Yang and Carlone instead devised a technique that prunes away outliers in “polynomial time,” meaning that it can do so quickly, even for increasingly dense clouds of dots. The technique can thus quickly and accurately identify objects hidden in cluttered scenes.

The MIT-developed technique quickly and smoothly matches objects to those hidden in dense point clouds (left), versus existing techniques (right) that produce incorrect, disjointed matches. Gif: Courtesy of the researchers

The researchers first used conventional techniques to extract features of a template object from a point cloud. They then developed a three-step process to match the size, position, and orientation of the object in a point cloud with the template object, while simultaneously identifying good from bad feature associations.

The team developed an “adaptive voting scheme” algorithm to prune outliers and match an object’s size and position. For size, the algorithm makes associations between template and point cloud features, then compares the relative distance between features in a template and corresponding features in the point cloud. If, say, the distance between two features in the point cloud is five times that of the corresponding points in the template, the algorithm assigns a “vote” to the hypothesis that the object is five times larger than the template object.

The algorithm does this for every feature association. Then, the algorithm selects those associations that fall under the size hypothesis with the most votes, and identifies those as the correct associations, while pruning away the others.  In this way, the technique simultaneously reveals the correct associations and the relative size of the object represented by those associations. The same process is used to determine the object’s position.  
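
The voting idea for size can be illustrated with a minimal sketch. It is not the researchers' implementation: it assumes feature associations are already given as index-aligned point lists, and it swaps their adaptive voting for plain fixed-width bins. The function name `vote_for_scale` and the `bin_width` tolerance are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import dist


def vote_for_scale(template_pts, cloud_pts, bin_width=0.1):
    """Estimate object scale by voting; return (scale, kept_association_indices).

    Assumes template_pts[i] has been matched to cloud_pts[i]; some matches
    may be wrong. bin_width is a hypothetical tolerance for grouping ratios.
    """
    votes = Counter()   # scale bin -> number of votes
    ratios = {}         # (i, j) -> distance ratio for that pair of associations
    for i, j in combinations(range(len(template_pts)), 2):
        d_template = dist(template_pts[i], template_pts[j])
        if d_template == 0:
            continue  # coincident template features carry no scale information
        ratio = dist(cloud_pts[i], cloud_pts[j]) / d_template
        ratios[(i, j)] = ratio
        votes[round(ratio / bin_width)] += 1

    # The scale hypothesis with the most votes wins.
    best_bin = votes.most_common(1)[0][0]
    winners = {pair: r for pair, r in ratios.items()
               if round(r / bin_width) == best_bin}

    # Keep associations that appear often in winning pairs; prune the rest.
    support = Counter(i for pair in winners for i in pair)
    threshold = max(support.values()) / 2
    kept = sorted(i for i, n in support.items() if n >= threshold)
    scale = sum(winners.values()) / len(winners)
    return scale, kept


template = [(0, 0), (1, 0), (0, 1), (1, 1), (3, 2)]
# First four matches: template scaled by 5 and shifted; the last is a bad match.
cloud = [(10, 20), (15, 20), (10, 25), (15, 25), (40, -7)]
scale, kept = vote_for_scale(template, cloud)
```

In this toy case the four consistent associations all vote for a scale near 5, and the fifth association is pruned because none of its pairwise ratios agree with the winning hypothesis. Because the votes use distances between features, they are unaffected by where the object sits in the scene.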

The researchers developed a separate algorithm for rotation, which finds the orientation of the template object in three-dimensional space.

Doing this is an incredibly tricky computational task. Imagine holding a mug and trying to tilt it just so, to match a blurry image of something that might be that same mug. There are any number of angles you could tilt that mug, and each of those angles has a certain likelihood of matching the blurry image.

Existing techniques handle this problem by considering each possible tilt or rotation of the object as a “cost” — the lower the cost, the more likely that that rotation creates an accurate match between features. Each rotation and associated cost is represented in a topographic map of sorts, made up of multiple hills and valleys, with lower elevations associated with lower cost.

But Carlone says this can easily confuse an algorithm, especially if there are multiple valleys and no discernible lowest point representing the true, exact match between a particular rotation of an object and the object in a point cloud. Instead, the team developed a “convex relaxation” algorithm that simplifies the topographic map, with one single valley representing the optimal rotation. In this way, the algorithm is able to quickly identify the rotation that defines the orientation of the object in the point cloud.

With their approach, the team was able to quickly and accurately identify three different objects (a bunny, a dragon, and a Buddha) hidden in point clouds of increasing density. They were also able to identify objects in real-life scenes, including a living room, in which the algorithm was quickly able to spot a cereal box and a baseball hat.

Carlone says that because the approach is able to work in “polynomial time,” it can be easily scaled up to analyze even denser point clouds, resembling the complexity of sensor data for driverless cars, for example.

“Navigation, collaborative manufacturing, domestic robots, search and rescue, and self-driving cars is where we hope to make an impact,” Carlone says.

This research was supported in part by the Army Research Laboratory, the Office of Naval Research, and the Google Daydream Research Program.

Chip design drastically reduces energy needed to compute with light


A new photonic chip design drastically reduces energy needed to compute with light, with simulations suggesting it could run optical neural networks 10 million times more efficiently than its electrical counterparts.
Image: courtesy of the researchers, edited by MIT News

By Rob Matheson

MIT researchers have developed a novel “photonic” chip that uses light instead of electricity — and consumes relatively little power in the process. The chip could be used to process massive neural networks millions of times more efficiently than today’s classical computers do.

Neural networks are machine-learning models that are widely used for such tasks as robotic object identification, natural language processing, drug development, medical imaging, and powering driverless cars. Novel optical neural networks, which use optical phenomena to accelerate computation, can run much faster and more efficiently than their electrical counterparts.  

But as traditional and optical neural networks grow more complex, they eat up tons of power. To tackle that issue, researchers and major tech companies — including Google, IBM, and Tesla — have developed “AI accelerators,” specialized chips that improve the speed and efficiency of training and testing neural networks.

For electrical chips, including most AI accelerators, there is a theoretical minimum limit for energy consumption. Recently, MIT researchers have started developing photonic accelerators for optical neural networks. These chips perform orders of magnitude more efficiently, but they rely on some bulky optical components that limit their use to relatively small neural networks.

In a paper published in Physical Review X, MIT researchers describe a new photonic accelerator that uses more compact optical components and optical signal-processing techniques, to drastically reduce both power consumption and chip area. That allows the chip to scale to neural networks several orders of magnitude larger than its counterparts.

Simulated training of neural networks on the MNIST image-classification dataset suggests the accelerator can theoretically process neural networks more than 10 million times below the energy-consumption limit of traditional electrical-based accelerators and about 1,000 times below the limit of photonic accelerators. The researchers are now working on a prototype chip to experimentally prove the results.

“People are looking for technology that can compute beyond the fundamental limits of energy consumption,” says Ryan Hamerly, a postdoc in the Research Laboratory of Electronics. “Photonic accelerators are promising … but our motivation is to build a [photonic accelerator] that can scale up to large neural networks.”

Practical applications for such technologies include reducing energy consumption in data centers. “There’s a growing demand for data centers for running large neural networks, and it’s becoming increasingly computationally intractable as the demand grows,” says co-author Alexander Sludds, a graduate student in the Research Laboratory of Electronics. The aim is “to meet computational demand with neural network hardware … to address the bottleneck of energy consumption and latency.”

Joining Sludds and Hamerly on the paper are: co-author Liane Bernstein, an RLE graduate student; Marin Soljacic, an MIT professor of physics; and Dirk Englund, an MIT associate professor of electrical engineering and computer science, a researcher in RLE, and head of the Quantum Photonics Laboratory.  

Compact design

Neural networks process data through many computational layers containing interconnected nodes, called “neurons,” to find patterns in the data. Neurons receive input from their upstream neighbors and compute an output signal that is sent to neurons further downstream. Each input is also assigned a “weight,” a value based on its relative importance to all other inputs. As the data propagate “deeper” through layers, the network learns progressively more complex information. In the end, an output layer generates a prediction based on the calculations throughout the layers.

All AI accelerators aim to reduce the energy needed to process and move around data during a specific linear algebra step in neural networks, called “matrix multiplication.” There, neurons and weights are encoded into separate tables of rows and columns and then combined to calculate the outputs.

In traditional photonic accelerators, pulsed lasers encoded with information about each neuron in a layer flow into waveguides and through beam splitters. The resulting optical signals are fed into a grid of square optical components, called “Mach-Zehnder interferometers,” which are programmed to perform matrix multiplication. The interferometers, which are encoded with information about each weight, use signal-interference techniques that process the optical signals and weight values to compute an output for each neuron. But there’s a scaling issue: For each neuron there must be one waveguide and, for each weight, there must be one interferometer. Because the number of weights squares with the number of neurons, those interferometers take up a lot of real estate.

“You quickly realize the number of input neurons can never be larger than 100 or so, because you can’t fit that many components on the chip,” Hamerly says. “If your photonic accelerator can’t process more than 100 neurons per layer, then it makes it difficult to implement large neural networks into that architecture.”

The researchers’ chip relies on a more compact, energy efficient “optoelectronic” scheme that encodes data with optical signals, but uses “balanced homodyne detection” for matrix multiplication. That’s a technique that produces a measurable electrical signal after calculating the product of the amplitudes (wave heights) of two optical signals.
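
As a rough illustration of why balanced homodyne detection yields a product, the sketch below models an idealized case: two real-valued amplitudes meet on a lossless 50/50 beam splitter, two detectors measure the output intensities, and the difference is proportional to the product of the inputs. Noise, phase, and detector details are ignored, and the function names are hypothetical.

```python
from math import sqrt


def balanced_homodyne(signal, weight):
    """Difference of the two detector intensities behind an ideal 50/50 splitter."""
    plus = (signal + weight) / sqrt(2)    # amplitude at one output port
    minus = (signal - weight) / sqrt(2)   # amplitude at the other output port
    # (s + w)^2 / 2 - (s - w)^2 / 2 simplifies to 2 * s * w
    return plus ** 2 - minus ** 2


def optical_dot_product(inputs, weights):
    """Accumulate the detector output over successive pulses (idealized)."""
    total = sum(balanced_homodyne(x, w) for x, w in zip(inputs, weights))
    return total / 2  # remove the factor of 2 from the intensity difference


result = optical_dot_product([1.0, 2.0, 3.0], [0.5, -1.0, 2.0])
```

Summing the detector output over a train of pulses gives one row-times-vector product, which is the building block of the matrix multiplication described above.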

Pulses of light encoded with information about the input and output neurons for each neural network layer, which are needed to train the network, flow through a single channel. Separate pulses encoded with information of entire rows of weights in the matrix multiplication table flow through separate channels. Optical signals carrying the neuron and weight data fan out to a grid of homodyne photodetectors. The photodetectors use the amplitude of the signals to compute an output value for each neuron. Each detector feeds an electrical output signal for each neuron into a modulator, which converts the signal back into a light pulse. That optical signal becomes the input for the next layer, and so on.

The design requires only one channel per input and output neuron, and only as many homodyne photodetectors as there are neurons, not weights. Because there are always far fewer neurons than weights, this saves significant space, so the chip is able to scale to neural networks with more than a million neurons per layer.
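
The scaling argument can be made concrete with a back-of-the-envelope count. The sketch below assumes a layer with the same number of input and output neurons, and tallies components exactly as the article describes them; it is a rough comparison, not a model of either chip, and both function names are hypothetical.

```python
def interferometer_mesh_components(neurons):
    """One waveguide per neuron plus one interferometer per weight."""
    weights = neurons * neurons  # every input connects to every output
    return neurons + weights


def homodyne_components(neurons):
    """One channel per input and output neuron plus one detector per neuron."""
    channels = 2 * neurons
    detectors = neurons
    return channels + detectors


for n in (100, 1_000_000):
    print(n, interferometer_mesh_components(n), homodyne_components(n))
```

For 100 neurons the mesh needs 10,100 components against 300 for the homodyne layout; at a million neurons the mesh count grows to roughly a trillion, while the homodyne count is three million.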

Finding the sweet spot

With photonic accelerators, there’s unavoidable noise in the signal. The more light that’s fed into the chip, the lower the noise and the greater the accuracy, but that gets to be pretty inefficient. Less input light increases efficiency but negatively impacts the neural network’s performance. But there’s a “sweet spot,” Bernstein says, that uses minimum optical power while maintaining accuracy.

That sweet spot for AI accelerators is measured in how many joules it takes to perform a single operation of multiplying two numbers — such as during matrix multiplication. Right now, traditional accelerators are measured in picojoules, or one-trillionth of a joule. Photonic accelerators measure in attojoules, which is a million times more efficient.

In their simulations, the researchers found their photonic accelerator could operate with sub-attojoule efficiency. “There’s some minimum optical power you can send in, before losing accuracy. The fundamental limit of our chip is a lot lower than traditional accelerators … and lower than other photonic accelerators,” Bernstein says.

Autonomous boats can target and latch onto each other


MIT researchers have given their fleet of autonomous “roboats” the ability to automatically target and clasp onto each other — and keep trying if they fail. The roboats are being designed to transport people, collect trash, and self-assemble into floating structures in the canals of Amsterdam.
Courtesy of the researchers

By Rob Matheson

The city of Amsterdam envisions a future where fleets of autonomous boats cruise its many canals to transport goods and people, collect trash, or self-assemble into floating stages and bridges. To further that vision, MIT researchers have given their fleet of robotic boats, which are being developed as part of an ongoing project, new capabilities that let them target and clasp onto each other, and keep trying if they fail.

About a quarter of Amsterdam’s surface area is water, with 165 canals winding alongside busy city streets. Several years ago, MIT and the Amsterdam Institute for Advanced Metropolitan Solutions (AMS Institute) teamed up on the “Roboat” project. The idea is to build a fleet of autonomous robotic boats — rectangular hulls equipped with sensors, thrusters, microcontrollers, GPS modules, cameras, and other hardware — that provides intelligent mobility on water to relieve congestion in the city’s busy streets.

One of the project’s objectives is to create roboat units that provide on-demand transportation on waterways. Another objective is using the roboat units to automatically form “pop-up” structures, such as foot bridges, performance stages, or even food markets. The structures could then automatically disassemble at set times and reform into target structures for different activities. Additionally, the roboat units could be used as agile sensors to gather data on the city’s infrastructure, and air and water quality, among other things.

In 2016, MIT researchers tested a roboat prototype that cruised around Amsterdam’s canals, moving forward, backward, and laterally along a preprogrammed path. Last year, researchers designed low-cost, 3-D-printed, one-quarter scale versions of the boats, which were more efficient and agile, and came equipped with advanced trajectory-tracking algorithms. 

In a paper presented at the International Conference on Robotics and Automation, the researchers describe roboat units that can now identify and connect to docking stations. Control algorithms guide the roboats to the target, where they automatically connect to a customized latching mechanism with millimeter precision. Moreover, the roboat notices if it has missed the connection, backs up, and tries again.

The researchers tested the latching technique in a swimming pool at MIT and in the Charles River, where waters are rougher. In both instances, the roboat units were usually able to successfully connect in about 10 seconds, starting from around 1 meter away, or they succeeded after a few failed attempts. In Amsterdam, the system could be especially useful for overnight garbage collection. Roboat units could sail around a canal, locate and latch onto platforms holding trash containers, and haul them back to collection facilities.

“In Amsterdam, canals were once used for transportation and other things the roads are now used for. Roads near canals are now very congested — and have noise and pollution — so the city wants to add more functionality back to the canals,” says first author Luis Mateos, a graduate student in the Department of Urban Studies and Planning (DUSP) and a researcher in the MIT Senseable City Lab. “Self-driving technologies can save time, costs and energy, and improve the city moving forward.”

“The aim is to use roboat units to bring new capabilities to life on the water,” adds co-author Daniela Rus, director of the Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science. “The new latching mechanism is very important for creating pop-up structures. Roboat does not need latching for autonomous transportation on water, but you need the latching to create any structure, whether it’s mobile or fixed.”

Joining Mateos on the paper are: Wei Wang, a joint postdoc in CSAIL and the Senseable City Lab; Banti Gheneti, a graduate student in the Department of Electrical Engineering and Computer Science; Fabio Duarte, a DUSP and Senseable City Lab research scientist; and Carlo Ratti, director of the Senseable City Lab and a principal investigator and professor of the practice in DUSP.

Making the connection

Each roboat is equipped with latching mechanisms, including ball and socket components, on its front, back, and sides. The ball component resembles a badminton shuttlecock — a cone-shaped, rubber body with a metal ball at the end. The socket component is a wide funnel that guides the ball component into a receptor. Inside the funnel, a laser beam acts like a security system that detects when the ball crosses into the receptor. That activates a mechanism with three arms that closes around and captures the ball, while also sending a feedback signal to both roboats that the connection is complete.

On the software side, the roboats run on custom computer vision and control techniques. Each roboat has a LIDAR system and camera, so they can autonomously move from point to point around the canals. Each docking station — typically an unmoving roboat — has a sheet of paper imprinted with an augmented reality tag, called an AprilTag, which resembles a simplified QR code. Commonly used for robotic applications, AprilTags enable robots to detect and compute their precise 3-D position and orientation relative to the tag.

Both the AprilTags and cameras are located in the same locations in the center of the roboats. When a traveling roboat is roughly one or two meters away from the stationary AprilTag, the roboat calculates its position and orientation relative to the tag. Typically, this would generate a 3-D map for boat motion, including roll, pitch, and yaw (left and right). But an algorithm strips away everything except yaw. This produces an easy-to-compute 2-D plane that measures the roboat camera’s distance away and distance left and right of the tag. Using that information, the roboat steers itself toward the tag. By keeping the camera and tag perfectly aligned, the roboat is able to precisely connect.
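
The reduction from a full 3-D pose to a 2-D plane can be sketched as follows. The pose fields, their units, and the `bearing_to_tag` helper are assumptions made for illustration; the article does not describe the roboats' actual data structures or controller.

```python
from dataclasses import dataclass
from math import atan2, degrees


@dataclass
class TagPose3D:
    """Hypothetical pose of the tag relative to the roboat's camera."""
    forward: float   # meters ahead of the camera
    lateral: float   # meters to the right (+) or left (-) of the camera
    vertical: float  # meters above or below the camera
    roll: float      # degrees
    pitch: float     # degrees
    yaw: float       # degrees


@dataclass
class PlanarPose:
    forward: float
    lateral: float
    yaw: float


def to_planar(pose: TagPose3D) -> PlanarPose:
    """Drop roll, pitch, and vertical offset; keep only what steering needs."""
    return PlanarPose(pose.forward, pose.lateral, pose.yaw)


def bearing_to_tag(pose: PlanarPose) -> float:
    """Angle in degrees the roboat would turn to point straight at the tag."""
    return degrees(atan2(pose.lateral, pose.forward))


planar = to_planar(TagPose3D(1.0, 1.0, 0.05, 2.0, -1.5, 10.0))
bearing = bearing_to_tag(planar)  # tag is ahead and to the right
```

Working in the plane leaves three numbers to control instead of six, which is what makes the problem easy to compute on board.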

The funnel compensates for any misalignment in the roboat’s pitch (rocking up and down) and heave (vertical up and down), as canal waves are relatively small. If, however, the roboat goes beyond its calculated distance, and doesn’t receive a feedback signal from the laser beam, it knows it has missed. “In challenging waters, sometimes roboat units at the current one-quarter scale, are not strong enough to overcome wind gusts or heavy water currents,” Mateos says. “A logic component on the roboat says, ‘You missed, so back up, recalculate your position, and try again.’”
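
The retry behavior Mateos describes amounts to a simple loop. The sketch below is a hypothetical outline, not the roboat's software: the three callbacks stand in for the approach maneuver, the laser-beam feedback signal, and the back-up-and-recalculate step, and the `max_attempts` cap is an added assumption.

```python
def attempt_docking(approach, latch_confirmed, back_up, max_attempts=5):
    """Return the attempt number that succeeded, or None if all attempts failed."""
    for attempt in range(1, max_attempts + 1):
        approach()             # steer toward the tag over the calculated distance
        if latch_confirmed():  # feedback signal from the laser beam in the funnel
            return attempt
        back_up()              # missed: retreat and recalculate position
    return None


# Simulated run: the first two approaches miss, the third one latches.
outcomes = iter([False, False, True])
log = []
succeeded_on = attempt_docking(
    approach=lambda: log.append("approach"),
    latch_confirmed=lambda: next(outcomes),
    back_up=lambda: log.append("back up"),
)
```

In the simulated run the roboat backs up twice and latches on the third approach.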

Future iterations

The researchers are now designing roboat units roughly four times the size of the current iterations, so they’ll be more stable on water. Mateos is also working on an update to the funnel that includes tentacle-like rubber grippers that tighten around the pin — like a squid grasping its prey. That could help give the roboat units more control when, say, they’re towing platforms or other roboats through narrow canals.

Also in the works is a system that displays the AprilTags on an LCD monitor that changes codes to signal multiple roboat units to assemble in a given order. At first, all roboat units will be given a code to stay exactly a meter apart. Then, the code changes to direct the first roboat to latch. After, the screen switches codes to order the next roboat to latch, and so on. “It’s like the telephone game. The changing code passes a message to one roboat at a time, and that message tells them what to do,” Mateos says.

Darwin Caldwell, the research director of Advanced Robotics at the Italian Institute of Technology, envisions even more possible applications for the autonomous latching capability. “I can certainly see this type of autonomous docking being of use in many areas of robotic ‘refuelling’ and docking … beyond aquatic/naval systems,” he says, “including inflight refuelling, space docking, cargo container handling, [and] robot in-house recharging.”

The research was funded by the AMS Institute and the City of Amsterdam.

Sensor-packed glove learns signatures of the human grasp

MIT researchers have developed a low-cost, sensor-packed glove that captures pressure signals as humans interact with objects. The glove can be used to create high-resolution tactile datasets that robots can leverage to better identify, weigh, and manipulate objects.
Image: Courtesy of the researchers

By Rob Matheson

Wearing a sensor-packed glove while handling a variety of objects, MIT researchers have compiled a massive dataset that enables an AI system to recognize objects through touch alone. The information could be leveraged to help robots identify and manipulate objects, and may aid in prosthetics design.

The researchers developed a low-cost knitted glove, called “scalable tactile glove” (STAG), equipped with about 550 tiny sensors across nearly the entire hand. Each sensor captures pressure signals as humans interact with objects in various ways. A neural network processes the signals to “learn” a dataset of pressure-signal patterns related to specific objects. Then, the system uses that dataset to classify the objects and predict their weights by feel alone, with no visual input needed.

In a paper published today in Nature, the researchers describe a dataset they compiled using STAG for 26 common objects — including a soda can, scissors, tennis ball, spoon, pen, and mug. Using the dataset, the system predicted the objects’ identities with up to 76 percent accuracy. The system can also predict the correct weights of most objects within about 60 grams.

Similar sensor-based gloves used today run thousands of dollars and often contain only around 50 sensors that capture less information. Even though STAG produces very high-resolution data, it’s made from commercially available materials totaling around $10.

The tactile sensing system could be used in combination with traditional computer vision and image-based datasets to give robots a more human-like understanding of interacting with objects.

“Humans can identify and handle objects well because we have tactile feedback. As we touch objects, we feel around and realize what they are. Robots don’t have that rich feedback,” says Subramanian Sundaram PhD ’18, a former graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL). “We’ve always wanted robots to do what humans can do, like doing the dishes or other chores. If you want robots to do these things, they must be able to manipulate objects really well.”

The researchers also used the dataset to measure the cooperation between regions of the hand during object interactions. For example, when someone uses the middle joint of their index finger, they rarely use their thumb. But the tips of the index and middle fingers always correspond to thumb usage. “We quantifiably show, for the first time, that, if I’m using one part of my hand, how likely I am to use another part of my hand,” Sundaram says.

Prosthetics manufacturers can potentially use this information to, say, choose optimal spots for placing pressure sensors and help customize prosthetics to the tasks and objects people regularly interact with.

Joining Sundaram on the paper are: CSAIL postdocs Petr Kellnhofer and Jun-Yan Zhu; CSAIL graduate student Yunzhu Li; Antonio Torralba, a professor in EECS and director of the MIT-IBM Watson AI Lab; and Wojciech Matusik, an associate professor in electrical engineering and computer science and head of the Computational Fabrication group.  

STAG is laminated with an electrically conductive polymer that changes resistance to applied pressure. The researchers sewed conductive threads through holes in the conductive polymer film, from fingertips to the base of the palm. The threads overlap in a way that turns them into pressure sensors. When someone wearing the glove feels, lifts, holds, and drops an object, the sensors record the pressure at each point.

The threads connect from the glove to an external circuit that translates the pressure data into “tactile maps,” which are essentially brief videos of dots growing and shrinking across a graphic of a hand. The dots represent the location of pressure points, and their size represents the force — the bigger the dot, the greater the pressure.

From those maps, the researchers compiled a dataset of about 135,000 video frames from interactions with 26 objects. Those frames can be used by a neural network to predict the identity and weight of objects, and provide insights about the human grasp.

To identify objects, the researchers designed a convolutional neural network (CNN), which is usually used to classify images, to associate specific pressure patterns with specific objects. But the trick was choosing frames from different types of grasps to get a full picture of the object.

The idea was to mimic the way humans can hold an object in a few different ways in order to recognize it, without using their eyesight. Similarly, the researchers’ CNN chooses up to eight semirandom frames from the video that represent the most dissimilar grasps — say, holding a mug from the bottom, top, and handle.

But the CNN can’t just choose random frames from the thousands in each video, or it probably won’t choose distinct grips. Instead, it groups similar frames together, resulting in distinct clusters corresponding to unique grasps. Then, it pulls one frame from each of those clusters, ensuring it has a representative sample. Then the CNN uses the contact patterns it learned in training to predict an object classification from the chosen frames.
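
The group-then-sample step can be sketched with a small clustering routine. The article does not name the clustering method, so the example below uses a plain k-means as a stand-in, treats each frame as a short list of pressure values, and picks the frame nearest each cluster center. The function name and toy data are hypothetical.

```python
from math import dist


def pick_representative_frames(frames, k, iterations=10):
    """Cluster frames with a small k-means; return one frame index per cluster."""
    # Farthest-point initialization keeps this example deterministic.
    centers = [list(frames[0])]
    while len(centers) < k:
        farthest = max(frames, key=lambda f: min(dist(f, c) for c in centers))
        centers.append(list(farthest))

    for _ in range(iterations):
        # Assign every frame to its nearest center.
        clusters = [[] for _ in range(k)]
        for idx, frame in enumerate(frames):
            nearest = min(range(k), key=lambda c: dist(frame, centers[c]))
            clusters[nearest].append(idx)
        # Move each center to the mean of its members (empty clusters stay put).
        for c, members in enumerate(clusters):
            if members:
                centers[c] = [sum(frames[i][d] for i in members) / len(members)
                              for d in range(len(frames[0]))]

    # One representative per non-empty cluster: the frame closest to the center.
    return [min(members, key=lambda i: dist(frames[i], centers[c]))
            for c, members in enumerate(clusters) if members]


# Toy "frames": three visibly different grasps, each recorded a few times.
frames = [[0, 0], [0.1, 0], [0, 0.1], [5, 5], [5.1, 5], [10, 0], [10, 0.2]]
chosen = pick_representative_frames(frames, k=3)
```

Each returned index comes from a different group of similar frames, so the classifier would see one example of each distinct grasp rather than several near-duplicates.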

“We want to maximize the variation between the frames to give the best possible input to our network,” Kellnhofer says. “All frames inside a single cluster should have a similar signature that represent the similar ways of grasping the object. Sampling from multiple clusters simulates a human interactively trying to find different grasps while exploring an object.”

For weight estimation, the researchers built a separate dataset of around 11,600 frames from tactile maps of objects being picked up by finger and thumb, held, and dropped. Notably, the CNN wasn’t trained on any frames it was tested on, meaning it couldn’t learn to just associate weight with an object. In testing, a single frame was fed into the CNN. Essentially, the CNN picks out the pressure around the hand caused by the object’s weight, and ignores pressure caused by other factors, such as hand positioning to prevent the object from slipping. Then it calculates the weight based on the appropriate pressures.

The system could be combined with the sensors already on robot joints that measure torque and force to help them better predict object weight. “Joints are important for predicting weight, but there are also important components of weight from fingertips and the palm that we capture,” Sundaram says.

Bringing human-like reasoning to driverless car navigation

To bring more human-like reasoning to autonomous vehicle navigation, MIT researchers have created a system that enables driverless cars to check a simple map and use visual data to follow routes in new, complex environments.
Image: Chelsea Turner

By Rob Matheson

With aims of bringing more human-like reasoning to autonomous vehicles, MIT researchers have created a system that uses only simple maps and visual data to enable driverless cars to navigate routes in new, complex environments.

Human drivers are exceptionally good at navigating roads they haven’t driven on before, using observation and simple tools. We simply match what we see around us to what we see on our GPS devices to determine where we are and where we need to go. Driverless cars, however, struggle with this basic reasoning. In every new area, the cars must first map and analyze all the new roads, which is very time consuming. The systems also rely on complex maps — usually generated by 3-D scans — which are computationally intensive to generate and process on the fly.

In a paper being presented at this week’s International Conference on Robotics and Automation, MIT researchers describe an autonomous control system that “learns” the steering patterns of human drivers as they navigate roads in a small area, using only data from video camera feeds and a simple GPS-like map. Then, the trained system can control a driverless car along a planned route in a brand-new area, by imitating the human driver.

Like human drivers, the system also detects any mismatches between its map and features of the road. This helps the system determine if its position, sensors, or mapping are incorrect, in order to correct the car’s course.

To train the system initially, a human operator controlled an automated Toyota Prius — equipped with several cameras and a basic GPS navigation system — to collect data from local suburban streets including various road structures and obstacles. When deployed autonomously, the system successfully navigated the car along a preplanned path in a different forested area, designated for autonomous vehicle tests.

“With our system, you don’t need to train on every road beforehand,” says first author Alexander Amini, an MIT graduate student. “You can download a new map for the car to navigate through roads it has never seen before.”

“Our objective is to achieve autonomous navigation that is robust for driving in new environments,” adds co-author Daniela Rus, director of the Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science. “For example, if we train an autonomous vehicle to drive in an urban setting such as the streets of Cambridge, the system should also be able to drive smoothly in the woods, even if that is an environment it has never seen before.”

Joining Rus and Amini on the paper are Guy Rosman, a researcher at the Toyota Research Institute, and Sertac Karaman, an associate professor of aeronautics and astronautics at MIT.

Point-to-point navigation

Traditional navigation systems process data from sensors through multiple modules customized for tasks such as localization, mapping, object detection, motion planning, and steering control. For years, Rus’s group has been developing “end-to-end” navigation systems, which process inputted sensory data and output steering commands, without a need for any specialized modules.

Until now, however, these models were strictly designed to safely follow the road, without any real destination in mind. In the new paper, the researchers advanced their end-to-end system to drive from a starting point to a destination, in a previously unseen environment. To do so, the researchers trained their system to predict a full probability distribution over all possible steering commands at any given instant while driving.

The system uses a machine learning model called a convolutional neural network (CNN), commonly used for image recognition. During training, the system watches and learns how to steer from a human driver. The CNN correlates steering wheel rotations to road curvatures it observes through cameras and an inputted map. Eventually, it learns the most likely steering command for various driving situations, such as straight roads, four-way or T-shaped intersections, forks, and rotaries.

“Initially, at a T-shaped intersection, there are many different directions the car could turn,” Rus says. “The model starts by thinking about all those directions, but as it sees more and more data about what people do, it will see that some people turn left and some turn right, but nobody goes straight. Straight ahead is ruled out as a possible direction, and the model learns that, at T-shaped intersections, it can only move left or right.”

What does the map say?

In testing, the researchers fed the system a map with a randomly chosen route. When driving, the system extracts visual features from the camera, which enables it to predict road structures. For instance, it identifies a distant stop sign or line breaks on the side of the road as signs of an upcoming intersection. At each moment, it uses its predicted probability distribution of steering commands to choose the most likely one to follow its route.
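The selection step described above can be sketched in a few lines. This is only an illustration, not the researchers’ implementation: it assumes the network’s output is a set of scores over a handful of discretized steering angles, and that the planned route can be expressed as a mask over those angles. The names `STEERING_BINS`, `choose_steering`, and `route_mask` are hypothetical.

```python
import numpy as np

# Hypothetical discretization of steering angles in degrees (negative = left).
STEERING_BINS = np.array([-30.0, -15.0, 0.0, 15.0, 30.0])

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    z = np.asarray(logits, dtype=float)
    z = z - z.max()              # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def choose_steering(logits, route_mask):
    """Pick the most likely steering bin among those the route allows.

    logits     -- assumed raw network scores, one per steering bin
    route_mask -- 1 for bins consistent with the planned route, else 0
    """
    probs = softmax(logits)
    masked = probs * np.asarray(route_mask, dtype=float)
    if masked.sum() == 0.0:
        raise ValueError("route allows no steering bin")
    idx = int(np.argmax(masked))
    return STEERING_BINS[idx], probs

# Made-up scores for a T-shaped intersection: left and right are likely,
# straight ahead has been learned to be very unlikely.
logits = [0.2, 2.0, -4.0, 2.1, 0.1]
right_turn_mask = [0, 0, 0, 1, 1]    # the route says "turn right"
angle, probs = choose_steering(logits, right_turn_mask)
```

With these made-up scores, the distribution has two peaks (left and right), and the route mask resolves the ambiguity in favor of the right-hand bin.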

Importantly, the researchers say, the system uses maps that are easy to store and process. Autonomous control systems typically use LIDAR scans to create massive, complex maps that take roughly 4,000 gigabytes (4 terabytes) of data to store just the city of San Francisco. For every new destination, the car must create new maps, which amounts to tons of data processing. Maps used by the researchers’ system, however, capture the entire world using just 40 gigabytes of data.

During autonomous driving, the system also continuously matches its visual data to the map data and notes any mismatches. Doing so helps the autonomous vehicle better determine where it is located on the road. And it ensures the car stays on the safest path if it’s being fed contradictory input information: If, say, the car is cruising on a straight road with no turns, and the GPS indicates the car must turn right, the car will know to keep driving straight or to stop.
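One way to picture this fallback behavior is the sketch below. The article does not say how the system actually decides, so the thresholds (`min_support`, `straight_confidence`) and the function name `resolve_command` are assumptions made purely for illustration.

```python
def resolve_command(probs, route_bins, straight_bin,
                    min_support=0.05, straight_confidence=0.5):
    """Reconcile a vision-based steering distribution with the routed command.

    probs               -- probabilities over steering bins, from the cameras
    route_bins          -- indices of bins consistent with the map/GPS route
    straight_bin        -- index of the "go straight" bin
    min_support         -- assumed minimum probability mass needed to trust the route
    straight_confidence -- assumed confidence needed to keep driving straight
    """
    support = sum(probs[i] for i in route_bins)
    if support >= min_support:
        # Vision agrees the routed maneuver is possible: follow the route.
        best = max(route_bins, key=lambda i: probs[i])
        return "follow_route", best
    # Mismatch: the map asks for a maneuver the cameras say is not available.
    if probs[straight_bin] >= straight_confidence:
        return "keep_straight", straight_bin
    return "stop", None

# Straight road with no turns, but the route demands a right turn (bins 3 and 4).
straight_road = [0.01, 0.01, 0.96, 0.01, 0.01]
decision = resolve_command(straight_road, route_bins=[3, 4], straight_bin=2)
```

Here the cameras put almost no probability on a right turn, so the sketch flags a mismatch and keeps the car going straight instead of obeying the contradictory route input.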

“In the real world, sensors do fail,” Amini says. “We want to make sure that the system is robust to different failures of different sensors by building a system that can accept these noisy inputs and still navigate and localize itself correctly on the road.”

How to tell whether machine-learning systems are robust enough for the real world

Adversarial examples are slightly altered inputs that cause neural networks to make classification mistakes they normally wouldn’t, such as classifying an image of a cat as a dog.
Image: MIT News Office

By Rob Matheson

MIT researchers have devised a method for assessing how robust machine-learning models known as neural networks are for various tasks, by detecting when the models make mistakes they shouldn’t.

Convolutional neural networks (CNNs) are designed to process and classify images for computer vision and many other tasks. But slight modifications that are imperceptible to the human eye — say, a few darker pixels within an image — may cause a CNN to produce a drastically different classification. Such modifications are known as “adversarial examples.” Studying the effects of adversarial examples on neural networks can help researchers determine how their models could be vulnerable to unexpected inputs in the real world.

For example, driverless cars can use CNNs to process visual input and produce an appropriate response. If the car approaches a stop sign, it would recognize the sign and stop. But a 2018 paper found that placing a certain black-and-white sticker on the stop sign could, in fact, fool a driverless car’s CNN into misclassifying the sign, which could potentially cause it to not stop at all.

However, there has been no way to fully evaluate a large neural network’s resilience to adversarial examples for all test inputs. In a paper they are presenting this week at the International Conference on Learning Representations, the researchers describe a technique that, for any input, either finds an adversarial example or guarantees that all perturbed inputs — that still appear similar to the original — are correctly classified. In doing so, it gives a measurement of the network’s robustness for a particular task.

Similar evaluation techniques do exist but have not been able to scale up to more complex neural networks. Compared to those methods, the researchers’ technique runs three orders of magnitude faster and can scale to more complex CNNs.

The researchers evaluated the robustness of a CNN designed to classify images in the MNIST dataset of handwritten digits, which comprises 60,000 training images and 10,000 test images. The researchers found around 4 percent of test inputs can be perturbed slightly to generate adversarial examples that would lead the model to make an incorrect classification.

“Adversarial examples fool a neural network into making mistakes that a human wouldn’t,” says first author Vincent Tjeng, a graduate student in the Computer Science and Artificial Intelligence Laboratory (CSAIL). “For a given input, we want to determine whether it is possible to introduce small perturbations that would cause a neural network to produce a drastically different output than it usually would. In that way, we can evaluate how robust different neural networks are, finding at least one adversarial example similar to the input or guaranteeing that none exist for that input.”

Joining Tjeng on the paper are CSAIL graduate student Kai Xiao and Russ Tedrake, a CSAIL researcher and a professor in the Department of Electrical Engineering and Computer Science (EECS).

CNNs process images through many computational layers containing units called neurons. For CNNs that classify images, the final layer consists of one neuron for each category. The CNN classifies an image based on the neuron with the highest output value. Consider a CNN designed to classify images into two categories: “cat” or “dog.” If it processes an image of a cat, the value for the “cat” classification neuron should be higher. An adversarial example occurs when a tiny modification to that image causes the “dog” classification neuron’s value to be higher.
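The cat-or-dog example can be written out directly. The sketch below skips the network itself and works only with assumed final-layer output values; the numbers and the helper names `classify` and `is_adversarial` are made up for illustration.

```python
import numpy as np

CLASSES = ["cat", "dog"]   # one final-layer neuron per category

def classify(final_layer_outputs):
    """Return the category whose neuron has the highest output value."""
    return CLASSES[int(np.argmax(final_layer_outputs))]

def is_adversarial(original_outputs, perturbed_outputs, true_label):
    """True if the original is classified correctly but the perturbed copy is not."""
    return (classify(original_outputs) == true_label
            and classify(perturbed_outputs) != true_label)

# Made-up neuron values for a cat image and a slightly modified copy of it.
original = [2.3, 0.7]     # "cat" neuron is higher
perturbed = [1.1, 1.4]    # after tiny pixel changes, "dog" neuron is higher
flag = is_adversarial(original, perturbed, "cat")
```

In this toy case `flag` is true: the slight modification pushed the “dog” value above the “cat” value, which is exactly the situation the paragraph describes.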

The researchers’ technique checks all possible modifications to each pixel of the image. Basically, if the CNN assigns the correct classification (“cat”) to each modified image, no adversarial examples exist for that image.

Behind the technique is a modified version of “mixed-integer programming,” an optimization method where some of the variables are restricted to be integers. Essentially, mixed-integer programming is used to find a maximum of some objective function, given certain constraints on the variables, and can be designed to scale efficiently to evaluating the robustness of complex neural networks.

The researchers set limits that allow every pixel in each input image to be brightened or darkened by up to some set value. Given those limits, the modified image will still look remarkably similar to the original input image, meaning the CNN shouldn’t be fooled. Mixed-integer programming is used to find the smallest possible modification to the pixels that could potentially cause a misclassification.
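Written as an optimization problem, the search described above might look like the following. The notation is an assumption introduced here, not taken from the paper: x is the original image, x' the modified image, f_i the output of the final-layer neuron for category i, y the correct category, and ε the per-pixel limit. Pixel values are assumed to lie between 0 and 1.

```latex
\begin{aligned}
\min_{x'} \quad & \lVert x' - x \rVert_\infty \\
\text{subject to} \quad & \arg\max_i f_i(x') \neq y, \\
& \lVert x' - x \rVert_\infty \le \epsilon, \\
& 0 \le x'_j \le 1 \quad \text{for every pixel } j .
\end{aligned}
```

If no x' satisfies the constraints, then no adversarial example exists within the limit ε for that image.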

The idea is that tweaking the pixels could cause the value of an incorrect classification to rise. If a cat image were fed into the pet-classifying CNN, for instance, the algorithm would keep perturbing the pixels to see if it can raise the value for the neuron corresponding to “dog” to be higher than that for “cat.”

If the algorithm succeeds, it has found at least one adversarial example for the input image. The algorithm can continue tweaking pixels to find the minimum modification that was needed to cause that misclassification. The larger the minimum modification — called the “minimum adversarial distortion” — the more resistant the network is to adversarial examples. If, however, the correct classifying neuron fires for all different combinations of modified pixels, then the algorithm can guarantee that the image has no adversarial example.
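For a real CNN this search requires the mixed-integer program, but the idea can be illustrated on a much simpler stand-in. The sketch below swaps the CNN for a plain linear classifier, for which the minimum adversarial distortion has a closed form: the score gap between the correct class and a rival, divided by the sum of absolute weight differences. It also ignores clipping pixel values to a valid range. The function names and example numbers are hypothetical.

```python
import numpy as np

def min_adversarial_distortion(W, b, x, true_class):
    """Smallest per-pixel change (L-infinity) that can flip a linear classifier.

    Scores are W @ x + b. For each rival class, moving every pixel by at most
    eps can shrink the score gap by at most eps * sum(|w_true - w_rival|).
    """
    scores = W @ x + b
    best, rival = np.inf, None
    for other in range(W.shape[0]):
        if other == true_class:
            continue
        diff = W[true_class] - W[other]
        gap = scores[true_class] - scores[other]
        if gap <= 0:
            return 0.0, other            # already misclassified (or tied)
        l1 = np.abs(diff).sum()
        if l1 > 0 and gap / l1 < best:
            best, rival = gap / l1, other
    return best, rival

def verify(W, b, x, true_class, eps):
    """Return ('adversarial', example) or ('robust', None) for the limit eps."""
    dist, rival = min_adversarial_distortion(W, b, x, true_class)
    if dist < eps:                       # strict: at dist == eps the scores only tie
        direction = np.sign(W[true_class] - W[rival])
        return "adversarial", x - eps * direction
    return "robust", None

# Made-up two-class ("cat" = 0, "dog" = 1) classifier over three pixels.
W = np.array([[1.0, -1.0, 0.5],
              [-1.0, 1.0, 0.0]])
b = np.array([0.0, 0.0])
x = np.array([0.8, 0.2, 0.4])

dist, _ = min_adversarial_distortion(W, b, x, true_class=0)
status_small, _ = verify(W, b, x, 0, eps=0.1)
status_large, x_adv = verify(W, b, x, 0, eps=0.4)
```

With these numbers the minimum adversarial distortion is about 0.31, so a limit of 0.1 yields a robustness guarantee while a limit of 0.4 yields a concrete adversarial example. A larger minimum distortion would mean a more resistant classifier.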

“Given one input image, we want to know if we can modify it in a way that it triggers an incorrect classification,” Tjeng says. “If we can’t, then we have a guarantee that we searched across the whole space of allowable modifications, and found that there is no perturbed version of the original image that is misclassified.”

In the end, this generates a percentage for how many input images have at least one adversarial example, and guarantees the remainder don’t have any adversarial examples. In the real world, CNNs have many neurons and will train on massive datasets with dozens of different classifications, so the technique’s scalability is critical, Tjeng says.
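The final percentage is a simple tally over the test inputs. The sketch below assumes each input’s minimum adversarial distortion has already been computed by some verifier; the values shown are invented, and `adversarial_rate` is a hypothetical helper.

```python
def adversarial_rate(min_distortions, eps):
    """Percentage of inputs that have an adversarial example within eps."""
    vulnerable = sum(1 for d in min_distortions if d < eps)
    return 100.0 * vulnerable / len(min_distortions)

# Invented minimum adversarial distortions for five test images.
distortions = [0.31, 0.05, 0.5, 0.02, 0.2]
rate = adversarial_rate(distortions, eps=0.1)
```

Inputs not counted as vulnerable are the ones for which the verifier has guaranteed that no adversarial example exists within the limit.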

“Across different networks designed for different tasks, it’s important for CNNs to be robust against adversarial examples,” he says. “The larger the fraction of test samples where we can prove that no adversarial example exists, the better the network should perform when exposed to perturbed inputs.”

“Provable bounds on robustness are important as almost all [traditional] defense mechanisms could be broken again,” says Matthias Hein, a professor of mathematics and computer science at Saarland University, who was not involved in the study but has tried the technique. “We used the exact verification framework to show that our networks are indeed robust … [and] made it also possible to verify them compared to normal training.”
