
Custom carpentry with help from robots

PhD student Adriana Schulz was co-lead on AutoSaw, which lets nonexperts customize different items that can then be constructed with the help of robots.
Photo: Jason Dorfman, MIT CSAIL

By Adam Conner-Simons and Rachel Gordon

Every year thousands of carpenters injure their hands and fingers doing dangerous tasks such as sawing.

In an effort to minimize injury and let carpenters focus on design and other bigger-picture tasks, a team from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) has created AutoSaw, a system that lets nonexperts customize different items that can then be constructed with the help of robots.

Users can choose from a range of carpenter-designed templates for chairs, desks, and other furniture. The team says that AutoSaw could eventually be used for projects as large as a deck or a porch.

“If you’re building a deck, you have to cut large sections of lumber to length, and that’s often done on site,” says CSAIL postdoc Jeffrey Lipton, who was a lead author on a related paper about the system. “Every time you put a hand near a blade, you’re at risk. To avoid that, we’ve largely automated the process using a chop-saw and jigsaw.”

The system also offers flexibility for designing furniture to fit space-constrained houses and apartments. For example, it could allow a user to modify a desk to squeeze into an L-shaped living room, or customize a table to fit in a microkitchen.  

“Robots have already enabled mass production, but with artificial intelligence (AI) they have the potential to enable mass customization and personalization in almost everything we produce,” says CSAIL director and co-author Daniela Rus. “AutoSaw shows this potential for easy access and customization in carpentry.”

The paper, which will be presented in May at the International Conference on Robotics and Automation (ICRA) in Brisbane, Australia, was co-written by Lipton, Rus, and PhD student Adriana Schulz. Other co-authors include MIT Professor Wojciech Matusik, PhD student Andrew Spielberg, and undergraduate Luis Trueba.

How it works

Software isn’t a foreign concept for some carpenters. “Computer Numerical Control” (CNC) can convert designs into numbers that are fed to specially programmed tools to execute. However, the machines used for CNC fabrication are usually large and cumbersome, and users are limited to the size of the existing CNC tools.

As a result, many carpenters continue to use chop-saws, jigsaws, and other hand tools that are low cost, easy to move, and simple to use. These tools, while useful for customization, still put people at a high risk of injury.

AutoSaw draws on expert knowledge for the design side and on robotics for the riskier cutting tasks. Using the existing CAD system OnShape with an interface of design templates, users can customize their furniture for things like size, sturdiness, and aesthetics. Once the design is finalized, it's sent to the robots to assist in the cutting process using the jigsaw and chop-saw.
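
To make that workflow concrete, here is a minimal sketch of how a customized template might expand into per-tool cut instructions. The part names, dimensions, and cut-list format are illustrative assumptions, not AutoSaw's actual data model.

```python
# Hypothetical sketch of turning a parametric furniture template into a cut list.
# The field names and dimensions are illustrative, not AutoSaw's real data model.
from dataclasses import dataclass

@dataclass
class TableTemplate:
    length_cm: float = 120.0   # user-adjustable in the CAD interface
    width_cm: float = 60.0
    height_cm: float = 75.0
    leg_stock_cm: float = 5.0  # square cross-section of the leg lumber

def cut_list(t: TableTemplate):
    """Expand the customized template into individual cuts for the robots."""
    cuts = []
    # Four legs, cut to height on the chop saw.
    cuts += [("chop_saw", "leg", t.height_cm)] * 4
    # Two long and two short apron pieces joining the legs.
    cuts += [("chop_saw", "apron_long", t.length_cm - 2 * t.leg_stock_cm)] * 2
    cuts += [("chop_saw", "apron_short", t.width_cm - 2 * t.leg_stock_cm)] * 2
    # The tabletop outline is cut by the jigsaw robot.
    cuts.append(("jigsaw", "tabletop", (t.length_cm, t.width_cm)))
    return cuts

if __name__ == "__main__":
    for tool, part, dim in cut_list(TableTemplate(length_cm=90)):  # fit a microkitchen
        print(f"{tool:9s} {part:12s} {dim}")
```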

To cut lumber the team used motion-tracking software and small mobile robots — an approach that takes up less space and is more cost-effective than large robotic arms.

Specifically, the team used a modified Roomba with a jigsaw attached to cut lumber of any shape on a plank. For the chopping, the team used two Kuka youBots to lift the beam, place it on the chop saw, and cut.

“We added soft grippers to the robots to give them more flexibility, like that of a human carpenter,” says Lipton. “This meant we could rely on the accuracy of the power tools instead of the rigid-bodied robots.”

After the robots finish with cutting, the user then assembles the new piece of furniture using step-by-step directions from the system.

Democratizing custom furniture

When testing the system, the team’s simulations showed that it could be used to build a chair, a shed, and a deck. Using the robots, the team also made a table with an accuracy comparable to that of a human, without a real hand ever getting near a blade.

“There have been many recent AI achievements in virtual environments, like playing Go and composing music,” says Hod Lipson, a professor of mechanical engineering and data science at Columbia University. “Systems that can work in unstructured physical environments, such as this carpentry system, are notoriously difficult to make. This is truly a fascinating step forward.”

While AutoSaw is still a research platform, in the future the team plans to use materials such as wood, and integrate complex tasks such as drilling and gluing.

“Our aim is to democratize furniture-customization,” says Schulz. “We’re trying to open up a realm of opportunities so users aren’t bound to what they’ve bought at Ikea. Instead, they can make what best fits their needs.”

The project was supported in part by the National Science Foundation.

Robo-picker grasps and packs

The “pick-and-place” system consists of a standard industrial robotic arm that the researchers outfitted with a custom gripper and suction cup. They developed an “object-agnostic” grasping algorithm that enables the robot to assess a bin of random objects and determine the best way to grip or suction onto an item amid the clutter, without having to know anything about the object before picking it up.
Image: Melanie Gonick/MIT
By Jennifer Chu

Unpacking groceries is a straightforward albeit tedious task: You reach into a bag, feel around for an item, and pull it out. A quick glance will tell you what the item is and where it should be stored.

Now engineers from MIT and Princeton University have developed a robotic system that may one day lend a hand with this household chore, as well as assist in other picking and sorting tasks, from organizing products in a warehouse to clearing debris from a disaster zone.

The team’s “pick-and-place” system consists of a standard industrial robotic arm that the researchers outfitted with a custom gripper and suction cup. They developed an “object-agnostic” grasping algorithm that enables the robot to assess a bin of random objects and determine the best way to grip or suction onto an item amid the clutter, without having to know anything about the object before picking it up.

Once it has successfully grasped an item, the robot lifts it out from the bin. A set of cameras then takes images of the object from various angles, and with the help of a new image-matching algorithm the robot can compare the images of the picked object with a library of other images to find the closest match. In this way, the robot identifies the object, then stows it away in a separate bin.

In general, the robot follows a “grasp-first-then-recognize” workflow, which turns out to be an effective sequence compared to other pick-and-place technologies.
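
In rough outline, that workflow is a loop: score candidate grasps, execute the most promising one, lift the item clear of the clutter, and only then identify it. The self-contained sketch below uses trivial stand-ins (random scores, hand-written feature vectors) for the learned components; none of the names come from the researchers' actual code.

```python
# Minimal, self-contained sketch of a "grasp-first-then-recognize" loop.
# The scoring and recognition steps are trivial stand-ins for the learned
# components described in the article, not the researchers' actual code.
import random

GRASP_BEHAVIORS = ["suction_down", "suction_side", "grip_down", "grip_flush"]
PRODUCT_LIBRARY = {"duct_tape": (0.9, 0.1), "masking_tape": (0.8, 0.3)}

def recognize(features):
    # Stand-in for image matching: nearest product by squared feature distance.
    return min(PRODUCT_LIBRARY, key=lambda name: sum(
        (a - b) ** 2 for a, b in zip(PRODUCT_LIBRARY[name], features)))

# Items currently in the bin, with made-up appearance features.
bin_state = {"item_a": (0.85, 0.15), "item_b": (0.75, 0.35)}

while bin_state:
    # 1. Score every (behavior, item) pair; random numbers stand in for the
    #    object-agnostic grasping network's predictions.
    scores = {(b, i): random.random() for b in GRASP_BEHAVIORS for i in bin_state}
    behavior, item = max(scores, key=scores.get)

    # 2. Grasp first: lift the most promising item clear of the clutter.
    features = bin_state.pop(item)       # the cameras would now image the held item

    # 3. Then recognize: match the isolated item against the product library.
    print(f"{behavior:12s} picked {item} -> recognized as {recognize(features)}")
```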

“This can be applied to warehouse sorting, but also may be used to pick things from your kitchen cabinet or clear debris after an accident. There are many situations where picking technologies could have an impact,” says Alberto Rodriguez, the Walter Henry Gale Career Development Professor in Mechanical Engineering at MIT.

Rodriguez and his colleagues at MIT and Princeton will present a paper detailing their system at the IEEE International Conference on Robotics and Automation, in May. 

Building a library of successes and failures

While pick-and-place technologies may have many uses, existing systems are typically designed to function only in tightly controlled environments.

Today, most industrial picking robots are designed for one specific, repetitive task, such as gripping a car part off an assembly line, always in the same, carefully calibrated orientation. However, Rodriguez is working to design robots that are more flexible, adaptable, and intelligent pickers, for unstructured settings such as retail warehouses, where a picker may consistently encounter and have to sort hundreds, if not thousands, of novel objects each day, often amid dense clutter.

The team’s design is based on two general operations: picking — the act of successfully grasping an object, and perceiving — the ability to recognize and classify an object, once grasped.   

The researchers trained the robotic arm to pick novel objects out of a cluttered bin, using any one of four main grasping behaviors: suctioning onto an object, either vertically or from the side; gripping the object vertically like the claw in an arcade game; or, for objects that lie flush against a wall, gripping vertically and then using a flexible spatula to slide between the object and the wall.

Rodriguez and his team showed the robot images of bins cluttered with objects, captured from the robot’s vantage point. They then showed the robot which objects were graspable, with which of the four main grasping behaviors, and which were not, marking each example as a success or failure. They did this for hundreds of examples, and over time, the researchers built up a library of picking successes and failures. They then incorporated this library into a “deep neural network” — a class of learning algorithms that enables the robot to match the current problem it faces with a successful outcome from the past, based on its library of successes and failures.
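
A heavily simplified version of that training step is sketched below: labeled examples of past grasp attempts fit a classifier that predicts whether a proposed grasp will succeed. A plain logistic model stands in for the team's deep neural network, and the two features are invented for illustration.

```python
# Toy stand-in for learning grasp success from a library of labeled attempts.
# A logistic model replaces the deep network described in the article, and the
# two features (surface flatness, clutter density) are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Library of past attempts: [flatness, clutter], label 1 = success, 0 = failure.
X = rng.uniform(0, 1, size=(200, 2))
y = (X[:, 0] - X[:, 1] + 0.1 * rng.standard_normal(200) > 0).astype(float)

w, b = np.zeros(2), 0.0
for _ in range(2000):                       # plain gradient descent
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted success probability
    grad = p - y
    w -= 0.1 * X.T @ grad / len(y)
    b -= 0.1 * grad.mean()

# Score a new candidate grasp: flat surface, little clutter -> likely success.
candidate = np.array([0.9, 0.2])
print("predicted success probability:", 1.0 / (1.0 + np.exp(-(candidate @ w + b))))
```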

“We developed a system where, just by looking at a tote filled with objects, the robot knew how to predict which ones were graspable or suctionable, and which configuration of these picking behaviors was likely to be successful,” Rodriguez says. “Once it was in the gripper, the object was much easier to recognize, without all the clutter.”

From pixels to labels

The researchers developed a perception system in a similar manner, enabling the robot to recognize and classify an object once it’s been successfully grasped.

To do so, they first assembled a library of product images taken from online sources such as retailer websites. They labeled each image with the correct identification — for instance, duct tape versus masking tape — and then developed another learning algorithm to relate the pixels in a given image to the correct label for a given object.

“We’re comparing things that, for humans, may be very easy to identify as the same, but in reality, as pixels, they could look significantly different,” Rodriguez says. “We make sure that this algorithm gets it right for these training examples. Then the hope is that we’ve given it enough training examples that, when we give it a new object, it will also predict the correct label.”
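
Conceptually, the matching step is a nearest-neighbor lookup in some feature space. The sketch below uses hand-written feature vectors and cosine similarity purely as a stand-in for the learned image-matching algorithm; the library entries are made up.

```python
# Toy nearest-neighbor matching of a picked item against a product image library.
# Feature vectors here are made up; in the real system they would come from a
# learned image-matching network rather than hand-written numbers.
import numpy as np

library = {
    "duct_tape":    np.array([0.9, 0.1, 0.3]),
    "masking_tape": np.array([0.8, 0.3, 0.2]),
    "notebook":     np.array([0.1, 0.9, 0.7]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

picked_item = np.array([0.85, 0.25, 0.25])   # features of the item in the gripper
label = max(library, key=lambda name: cosine(picked_item, library[name]))
print("closest match:", label)
```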

Last July, the team packed up the 2-ton robot and shipped it to Japan, where, a month later, they reassembled it to participate in the Amazon Robotics Challenge, a yearly competition sponsored by the online megaretailer to encourage innovations in warehouse technology. Rodriguez’s team was one of 16 taking part in a competition to pick and stow objects from a cluttered bin.

In the end, the team’s robot had a 54 percent success rate in picking objects up using suction and a 75 percent success rate using grasping, and was able to recognize novel objects with 100 percent accuracy. The robot also stowed all 20 objects within the allotted time.

For his work, Rodriguez was recently granted an Amazon Research Award and will be working with the company to further improve pick-and-place technology — foremost, its speed and reactivity.

“Picking in unstructured environments is not reliable unless you add some level of reactiveness,” Rodriguez says. “When humans pick, we sort of do small adjustments as we are picking. Figuring out how to do this more responsive picking, I think, is one of the key technologies we’re interested in.”

The team has already taken some steps toward this goal by adding tactile sensors to the robot’s gripper and running the system through a new training regime.

“The gripper now has tactile sensors, and we’ve enabled a system where the robot spends all day continuously picking things from one place to another. It’s capturing information about when it succeeds and fails, and how it feels to pick up, or fails to pick up objects,” Rodriguez says. “Hopefully it will use that information to start bringing that reactiveness to grasping.”

This research was sponsored in part by ABB Inc., Mathworks, and Amazon.

Programming drones to fly in the face of uncertainty

Researchers trail a drone on a test flight outdoors.
Photo: Jonathan How/MIT

Companies like Amazon have big ideas for drones that can deliver packages right to your door. But even putting aside the policy issues, programming drones to fly through cluttered spaces like cities is difficult. Being able to avoid obstacles while traveling at high speeds is computationally complex, especially for small drones that are limited in how much they can carry onboard for real-time processing.

Many existing approaches rely on intricate maps that aim to tell drones exactly where they are relative to obstacles, which isn’t particularly practical in real-world settings with unpredictable objects. If their estimated location is off by even just a small margin, they can easily crash.

With that in mind, a team from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) has developed NanoMap, a system that allows drones to consistently fly at 20 miles per hour through dense environments such as forests and warehouses.

One of NanoMap’s key insights is a surprisingly simple one: The system considers the drone’s position in the world over time to be uncertain, and actually models and accounts for that uncertainty.

“Overly confident maps won’t help you if you want drones that can operate at higher speeds in human environments,” says graduate student Pete Florence, lead author on a new related paper. “An approach that is better aware of uncertainty gets us a much higher level of reliability in terms of being able to fly in close quarters and avoid obstacles.”

Specifically, NanoMap uses a depth-sensing system to stitch together a series of measurements about the drone’s immediate surroundings. This allows it to not only make motion plans for its current field of view, but also anticipate how it should move around in the hidden fields of view that it has already seen.

“It’s kind of like saving all of the images you’ve seen of the world as a big tape in your head,” says Florence. “For the drone to plan motions, it essentially goes back into time to think individually of all the different places that it was in.”

The team’s tests demonstrate the impact of uncertainty. For example, if NanoMap wasn’t modeling uncertainty and the drone drifted just 5 percent away from where it was expected to be, the drone would crash more than once in every four flights. Meanwhile, when it accounted for uncertainty, the crash rate dropped to 2 percent.

The paper was co-written by Florence and MIT Professor Russ Tedrake alongside research software engineers John Carter and Jake Ware. It was recently accepted to the IEEE International Conference on Robotics and Automation, which takes place in May in Brisbane, Australia.

For years computer scientists have worked on algorithms that allow drones to know where they are, what’s around them, and how to get from one point to another. Common approaches such as simultaneous localization and mapping (SLAM) take raw data of the world and convert them into mapped representations.

But the outputs of SLAM methods aren’t typically used to plan motions. That’s where researchers often use methods like “occupancy grids,” in which many measurements are incorporated into one specific representation of the 3-D world.

The problem is that such data can be both unreliable and hard to gather quickly. At high speeds, computer-vision algorithms can’t make much of their surroundings, forcing drones to rely on inexact data from the inertial measurement unit (IMU) sensor, which measures things like the drone’s acceleration and rate of rotation.

The way NanoMap handles this is that it essentially doesn’t sweat the minor details. It operates under the assumption that, to avoid an obstacle, you don’t have to take 100 different measurements and find the average to figure out its exact location in space; instead, you can simply gather enough information to know that the object is in a general area.
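
One way to picture that idea is a collision check that inflates the safety margin around an obstacle by the pose uncertainty accumulated since the obstacle was observed. The drift model and the numbers below are illustrative assumptions, not NanoMap's implementation.

```python
# Sketch of an uncertainty-aware collision check in the spirit of NanoMap.
# The drift model and numbers are illustrative assumptions, not the actual system.
import numpy as np

def too_close(plan_point, obstacle, drift_sigma, robot_radius=0.3, k=2.0):
    """Reject a plan point if it may lie within the robot radius of an obstacle,
    inflating the margin by k standard deviations of accumulated pose drift."""
    margin = robot_radius + k * drift_sigma
    return np.linalg.norm(plan_point - obstacle) < margin

obstacle = np.array([2.0, 0.0, 1.0])      # tree seen in an earlier depth frame
plan     = np.array([1.5, 0.1, 1.0])      # candidate waypoint

# The further back in time the observation, the more drift has accumulated.
for seconds_ago in (0.1, 0.5, 1.0):
    sigma = 0.05 + 0.2 * seconds_ago      # assumed drift growth rate
    verdict = "avoid" if too_close(plan, obstacle, sigma) else "ok"
    print(f"observation {seconds_ago:.1f} s old -> {verdict}")
```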

“The key difference to previous work is that the researchers created a map consisting of a set of images with their position uncertainty rather than just a set of images and their positions and orientation,” says Sebastian Scherer, a systems scientist at Carnegie Mellon University’s Robotics Institute. “Keeping track of the uncertainty has the advantage of allowing the use of previous images even if the robot doesn’t know exactly where it is and allows in improved planning.”

Florence describes NanoMap as the first system that enables drone flight with 3-D data that is aware of “pose uncertainty,” meaning that the drone takes into consideration that it doesn’t perfectly know its position and orientation as it moves through the world. Future iterations might also incorporate other pieces of information, such as the uncertainty in the drone’s individual depth-sensing measurements.

NanoMap is particularly effective for smaller drones moving through smaller spaces, and works well in tandem with a second system that is focused on more long-horizon planning. (The researchers tested NanoMap last year in a program tied to the Defense Advanced Research Projects Agency, or DARPA.)

The team says that the system could be used in fields ranging from search and rescue and defense to package delivery and entertainment. It can also be applied to self-driving cars and other forms of autonomous navigation.

“The researchers demonstrated impressive results avoiding obstacles and this work enables robots to quickly check for collisions,” says Scherer. “Fast flight among obstacles is a key capability that will allow better filming of action sequences, more efficient information gathering and other advances in the future.”

This work was supported in part by DARPA’s Fast Lightweight Autonomy program.

Robotic interiors

MIT Media Lab spinout Ori is developing smart robotic furniture that transforms into a bedroom, working or storage area, or large closet — or slides back against the wall — to optimize space in small apartments.
Courtesy of Ori

By Rob Matheson

Imagine living in a cramped studio apartment in a large city — but being able to summon your bed or closet through a mobile app, call forth your desk using voice command, or have everything retract at the push of a button.

MIT Media Lab spinout Ori aims to make that type of robotic living a reality. The Boston-based startup is selling smart robotic furniture that transforms into a bedroom, working or storage area, or large closet — or slides back against the wall — to optimize space in small apartments.

Based on years of Media Lab work, Ori’s system is an L-shaped unit installed on a track along a wall, so it can slide back and forth. One side features a closet, a small fold-out desk, and several drawers and large cubbies. At the bottom is a pull-out bed. The other side of the unit includes a horizontal surface that can open out to form a table. The vertical surface above that features a large nook where a television can be placed, and additional drawers and cubbies. The third side, opposite the wall, contains still more shelving, and pegs to hang coats and other items.

Users control the unit through a control hub plugged into a wall, or through Ori’s mobile app or a smart home system, such as Amazon’s Echo.

Essentially, a small studio can at any time become a bedroom, lounge, walk-in closet, or living and working area, says Ori founder and CEO Hasier Larrea SM ’15. “We use robotics to … make small spaces act like they were two or three times bigger,” he says. “Around 200 square feet seems too small [total area] to live in, but a 200-square-foot bedroom or living room doesn’t seem so small.” Larrea was named to Forbes’ 2017 30 Under 30 list for his work with Ori.

The first commercial line of the systems, which goes for about $10,000, is now being sold to real estate developers in Boston and other major cities across the U.S. and Canada, for newly built or available apartments. In Boston, partners include Skanska, which has apartments in the Seaport; Samuels and Associates, with buildings around Harvard Square; and Hines for its Marina Bay units. Someday, Larrea says, the system could be bought directly by consumers.

Once the system catches on and the technology evolves, Larrea imagines future apartments could be furnished entirely with robotic furniture from Ori and other companies.

“These technologies can evolve for kitchens, bathrooms, and general partition walls. At some point, a two-bedroom apartment could turn into a large studio, transform into three rooms for your startup, or go into ‘party mode,’ where it all opens up again,” Larrea says. “Spaces will adapt to us, instead of us adapting to spaces, which is what we’ve been doing for so many years.”

Architectural robotics

In 2011, Larrea joined the Media Lab’s City Science research group, directed by Principal Research Scientist Kent Larson, which included his three co-founders: Chad Bean ’14, Carlos Rubio ’14, and Ivan Fernandez de Casadevante, who was a visiting researcher.

The group’s primary focus was tackling challenges of mass urbanization, as cities are becoming increasingly popular living destinations. “Data tells us that, in places like China and India, 600 million people will move from towns to cities in the next 15 years,” Larrea says. “Not only is the way we move through cities and feed people going to need to evolve, but so will the way people live and work in spaces.”

A second emerging phenomenon was the Internet of Things, which saw an influx of smart gadgets, including household items and furniture, designed to connect to the Internet. “Those two megatrends were bound to converge,” Larrea says.

The group started a project called CityHome, creating what it called “architectural robotics,” which integrated robotics, architecture, computer science, and engineering to design smart, modular furniture. The group prototyped a moveable wall that could be controlled with gestures — which looked similar to today’s Ori system — and constructed a mock 200-square-foot studio apartment on the fifth floor of the Media Lab to test it out. Within the group, the unit was called “furniture with superpowers,” Larrea says, as it made small spaces seem bigger.

After they had constructed their working prototype, in early 2015 the researchers wanted to scale up. Inspiration came from the Media Lab-LEGO MindStorms collaboration from the late 1990s, where researchers created kits that incorporated sensors and motors inside traditional LEGO bricks so kids could build robots and researchers could prototype.

Drawing from that concept, the group built standardized components that could be assembled into a larger piece of modular furniture — what Ori now calls the robotic “muscle,” “skeleton,” “brains,” and the furniture “skins.” Specifically, the muscle consists of the track, motors, and electronics that actuate the system. The skeleton is the frame and the wheels that give the unit structure and movement. The brain is the microcomputer that controls all the safety features and connects the device to the Internet. And the skin is the various pieces of furniture that can be integrated, using the same robotic architecture.

Today, units fit full- or queen-size mattresses and come in different colors. In the future, however, any type of furniture could be integrated, creating units of various shapes, sizes, uses, and prices. “The robotics will keep evolving but stay standardized … so, by adding different skins, you can really create anything you can imagine,” Larrea says.

Kickstarting Ori

Going through the Martin Trust Center for MIT Entrepreneurship’s summer accelerator delta V (then called the Global Founders Skills Accelerator) in 2015 “kickstarted” the startup, Larrea says. One lesson that particularly stood out: the importance of conducting market research. “At MIT, sometimes we assume, because we have such a cool technology, marketing it will be easy. … But we forget to talk to people,” he says.

In the early days, the co-founders put tech development aside to speak with owners of studios, offices, and hotels, as well as tenants. In doing so, they learned studio renters in particular had three major complaints: Couples wanted separate living areas, and everyone wanted walk-in closets and space to host parties. The startup then focused on developing a furniture unit that addressed those issues.

After earning one of its first investors in the Media Lab’s E14 Fund in fall 2015, the startup installed an early version of its system in several Boston apartments for renters to test and provide feedback. Soon after, the system hit apartments in 10 major cities across the U.S. and Canada, including San Francisco, Vancouver, Chicago, Miami, and New York. Over the past two years, the startup has used feedback from those pilots to refine the system into today’s commercial model.

Ori will ship an initial production run of 500 units for apartments over the next few months. Soon, Larrea says, the startup also aims to penetrate adjacent markets, such as hotels, dormitories, and offices. “The idea is to prove this isn’t a one-trick pony,” Larrea says. “It’s part of a more comprehensive strategy to unlock the potential of space.”

3Q: Daron Acemoglu on technology and the future of work

K. Daron Acemoglu, the Elizabeth and James Killian Professor of Economics at MIT, is a leading thinker on the labor market implications of artificial intelligence, robotics, automation, and new technologies.
Photo: Jared Charney

By Meg Murphy
K. Daron Acemoglu, the Elizabeth and James Killian Professor of Economics at MIT, is a leading thinker on the labor market implications of artificial intelligence, robotics, automation, and new technologies. His innovative work challenges the way people think about how these technologies intersect with the world of work. In 2005, he won the John Bates Clark Medal, an honor shared by a number of Nobel Prize recipients and luminaries in the field of economics.

Acemoglu holds a bachelor’s degree in economics from the University of York. His master’s degree in mathematical economics and econometrics and doctorate in economics are from the London School of Economics. With political scientist James Robinson, Acemoglu co-authored the much-discussed books “Why Nations Fail” (Crown Business, 2012) and “Economic Origins of Dictatorship and Democracy” (Cambridge University Press, 2006). He also wrote “Introduction to Modern Economic Growth” (Princeton University Press, 2008). Acemoglu recently answered a few questions about technology and work.

Q: How do we begin to understand the rise of artificial intelligence and its future impact on society?

A: We need to look to the past in the face of modern innovations in machine learning, robotics, artificial intelligence, big data, and beyond. The process of machines replacing labor in the production process is not a new one. It’s been going on pretty much continuously since the Industrial Revolution. Spinning and weaving machines took jobs away from spinners and weavers. One innovation would follow another, and people would be thrown out of work by a machine performing the job in a cheaper way.

But at the end of the day, the Industrial Revolution and its aftermath created much better opportunities for people. For much of the 20th century in the U.S., workers’ wages and employment kept growing. New occupations and new tasks and new jobs were generated within the framework of new technological knowledge. A huge number of occupations in the American economy today did not exist 50 years ago — radiologists, management consultants, software developers, and so on. Go back a century and most of the white-collar jobs today did not exist.

Q:  Do you think public fears about the future of work are just?

A: The way we live continuously changes in significant ways — how we learn, how we acquire food, what we emphasize, our social organizations.

Our adjustments to technology — especially transformative technologies — are not a walk in the park. It is not going to be easy and seamless and just sort itself out. A lot of historical evidence shows the process is a painful one. The mechanization of agriculture is one of the greatest achievements of the American economy, but it was hugely disruptive for millions of people who suffered joblessness.

At the same time, we are capable technologically and socially of creating many new jobs that will take people to new horizons in terms of productivity and freedom from the hardest types of manual labor. There are great opportunities with artificial intelligence but whether or not we exploit them is a different question. I think you should never be too optimistic but neither should you be too pessimistic.

Q: How do you suggest people prepare for the future job market?

A: We are very much in the midst of understanding what sort of process we are going through. We don’t even necessarily know what skills are needed for the jobs of the future.

Imagine one scenario. Artificial intelligence removes the need for seasoned accountants to fulfill numeracy-related tasks. But we need tax professionals, for instance, to inform clients about their choices and options in some sort of empathetic, human way. They will have to become the interface between the machines and the customers. The jobs of the future, in this instance and many others, would require communication, flexibility, and social skills.

However, I don’t know if my hypothesis is true because we haven’t tested it. We haven’t lived through it. That’s where I see the biggest void in our knowledge. People at institutions like MIT must learn more about what is going on so that we are better prepared to understand the future.

Engineers design artificial synapse for “brain-on-a-chip” hardware

From left: MIT researchers Scott H. Tan, Jeehwan Kim, and Shinhyun Choi
Image: Kuan Qiao

By Jennifer Chu

When it comes to processing power, the human brain just can’t be beat.

Packed within the squishy, football-sized organ are somewhere around 100 billion neurons. At any given moment, a single neuron can relay instructions to thousands of other neurons via synapses — the spaces between neurons, across which neurotransmitters are exchanged. There are more than 100 trillion synapses that mediate neuron signaling in the brain, strengthening some connections while pruning others, in a process that enables the brain to recognize patterns, remember facts, and carry out other learning tasks, at lightning speeds.

Researchers in the emerging field of “neuromorphic computing” have attempted to design computer chips that work like the human brain. Instead of carrying out computations based on binary, on/off signaling, like digital chips do today, the elements of a “brain on a chip” would work in an analog fashion, exchanging a gradient of signals, or “weights,” much like neurons that activate in various ways depending on the type and number of ions that flow across a synapse.

In this way, small neuromorphic chips could, like the brain, efficiently process millions of streams of parallel computations that are currently only possible with large banks of supercomputers. But one significant hangup on the way to such portable artificial intelligence has been the neural synapse, which has been particularly tricky to reproduce in hardware.

Now engineers at MIT have designed an artificial synapse in such a way that they can precisely control the strength of an electric current flowing across it, similar to the way ions flow between neurons. The team has built a small chip with artificial synapses, made from silicon germanium. In simulations, the researchers found that the chip and its synapses could be used to recognize samples of handwriting, with 95 percent accuracy.

The design, published today in the journal Nature Materials, is a major step toward building portable, low-power neuromorphic chips for use in pattern recognition and other learning tasks.

The research was led by Jeehwan Kim, the Class of 1947 Career Development Assistant Professor in the departments of Mechanical Engineering and Materials Science and Engineering, and a principal investigator in MIT’s Research Laboratory of Electronics and Microsystems Technology Laboratories. His co-authors are Shinhyun Choi (first author), Scott Tan (co-first author), Zefan Li, Yunjo Kim, Chanyeol Choi, and Hanwool Yeon of MIT, along with Pai-Yu Chen and Shimeng Yu of Arizona State University.

Too many paths

Most neuromorphic chip designs attempt to emulate the synaptic connection between neurons using two conductive layers separated by a “switching medium,” or synapse-like space. When a voltage is applied, ions should move in the switching medium to create conductive filaments, similarly to how the “weight” of a synapse changes.

But it’s been difficult to control the flow of ions in existing designs. Kim says that’s because most switching mediums, made of amorphous materials, have unlimited possible paths through which ions can travel — a bit like Pachinko, a mechanical arcade game that funnels small steel balls down through a series of pins and levers, which act to either divert or direct the balls out of the machine.

Like Pachinko, existing switching mediums contain multiple paths that make it difficult to predict where ions will make it through. Kim says that can create unwanted nonuniformity in a synapse’s performance.

“Once you apply some voltage to represent some data with your artificial neuron, you have to erase and be able to write it again in the exact same way,” Kim says. “But in an amorphous solid, when you write again, the ions go in different directions because there are lots of defects. This stream is changing, and it’s hard to control. That’s the biggest problem — nonuniformity of the artificial synapse.”

A perfect mismatch

Instead of using amorphous materials as an artificial synapse, Kim and his colleagues looked to single-crystalline silicon, a defect-free conducting material made from atoms arranged in a continuously ordered alignment. The team sought to create a precise, one-dimensional line defect, or dislocation, through the silicon, through which ions could predictably flow.

To do so, the researchers started with a wafer of silicon, resembling, at microscopic resolution, a chicken-wire pattern. They then grew a similar pattern of silicon germanium — a material also used commonly in transistors — on top of the silicon wafer. Silicon germanium’s lattice is slightly larger than that of silicon, and Kim found that together, the two perfectly mismatched materials can form a funnel-like dislocation, creating a single path through which ions can flow. 

The researchers fabricated a neuromorphic chip consisting of artificial synapses made from silicon germanium, each synapse measuring about 25 nanometers across. They applied voltage to each synapse and found that all synapses exhibited more or less the same current, or flow of ions, with about a 4 percent variation between synapses — a much more uniform performance compared with synapses made from amorphous material.

They also tested a single synapse over multiple trials, applying the same voltage over 700 cycles, and found the synapse exhibited the same current, with just 1 percent variation from cycle to cycle.

“This is the most uniform device we could achieve, which is the key to demonstrating artificial neural networks,” Kim says.

Writing, recognized

As a final test, Kim’s team explored how its device would perform if it were to carry out actual learning tasks — specifically, recognizing samples of handwriting, which researchers consider to be a first practical test for neuromorphic chips. Such chips would consist of “input/hidden/output neurons,” each connected to other “neurons” via filament-based artificial synapses.

Scientists believe such stacks of neural nets can be made to “learn.” For instance, when fed an input that is a handwritten ‘1,’ with an output that labels it as ‘1,’ certain output neurons will be activated by input neurons and weights from an artificial synapse. When more examples of handwritten ‘1s’ are fed into the same chip, the same output neurons may be activated when they sense similar features between different samples of the same letter, thus “learning” in a fashion similar to what the brain does.

Kim and his colleagues ran a computer simulation of an artificial neural network consisting of three sheets of neural layers connected via two layers of artificial synapses, the properties of which they based on measurements from their actual neuromorphic chip. They fed into their simulation tens of thousands of samples from a handwriting-recognition dataset commonly used by neuromorphic designers, and found that their neural network hardware recognized handwritten samples 95 percent of the time, compared to the 97 percent accuracy of existing software algorithms.
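
The importance of that uniformity can be illustrated with a toy experiment: train a small classifier on synthetic data, then perturb its weights, which play the role of synapses, by different amounts of device-to-device variation and compare accuracy. The task, the 40 percent comparison level, and everything else in the sketch are invented for illustration; this is not the team's handwriting simulation.

```python
# Toy illustration of why synapse (weight) uniformity matters.
# A small softmax classifier is trained on synthetic data, then its weights are
# perturbed by different levels of device-to-device variation. The data and
# variation levels are invented; this is not the authors' handwriting experiment.
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 10))
true_W = rng.standard_normal((10, 3))
y = (X @ true_W).argmax(axis=1)                       # synthetic 3-class labels

W = np.zeros((10, 3))
for _ in range(300):                                   # simple softmax regression
    logits = X @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(len(y)), y] -= 1
    W -= 0.1 * X.T @ p / len(y)

def accuracy(weights):
    return ((X @ weights).argmax(axis=1) == y).mean()

for variation in (0.04, 0.40):                         # 4% vs 40% weight spread
    noisy_W = W * (1 + variation * rng.standard_normal(W.shape))
    print(f"{variation:.0%} device variation -> accuracy {accuracy(noisy_W):.2f}")
```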

The team is in the process of fabricating a working neuromorphic chip that can carry out handwriting-recognition tasks, not in simulation but in reality. Looking beyond handwriting, Kim says the team’s artificial synapse design will enable much smaller, portable neural network devices that can perform complex computations that currently are only possible with large supercomputers.

“Ultimately we want a chip as big as a fingernail to replace one big supercomputer,” Kim says. “This opens a stepping stone to produce real artificial hardware.”

This research was supported in part by the National Science Foundation.

Computer systems predict objects’ responses to physical forces

As part of an investigation into the nature of humans’ physical intuitions, MIT researchers trained a neural network to predict how unstably stacked blocks would respond to the force of gravity.
Image: Christine Daniloff/MIT

Josh Tenenbaum, a professor of brain and cognitive sciences at MIT, directs research on the development of intelligence at the Center for Brains, Minds, and Machines, a multiuniversity, multidisciplinary project based at MIT that seeks to explain and replicate human intelligence.

Presenting their work at this year’s Conference on Neural Information Processing Systems, Tenenbaum and one of his students, Jiajun Wu, are co-authors on four papers that examine the fundamental cognitive abilities that an intelligent agent requires to navigate the world: discerning distinct objects and inferring how they respond to physical forces.

By building computer systems that begin to approximate these capacities, the researchers believe they can help answer questions about what information-processing resources human beings use at what stages of development. Along the way, the researchers might also generate some insights useful for robotic vision systems.

“The common theme here is really learning to perceive physics,” Tenenbaum says. “That starts with seeing the full 3-D shapes of objects, and multiple objects in a scene, along with their physical properties, like mass and friction, then reasoning about how these objects will move over time. Jiajun’s four papers address this whole space. Taken together, we’re starting to be able to build machines that capture more and more of people’s basic understanding of the physical world.”

Three of the papers deal with inferring information about the physical structure of objects, from both visual and aural data. The fourth deals with predicting how objects will behave on the basis of that data.

Two-way street

Something else that unites all four papers is their unusual approach to machine learning, a technique in which computers learn to perform computational tasks by analyzing huge sets of training data. In a typical machine-learning system, the training data are labeled: Human analysts will have, say, identified the objects in a visual scene or transcribed the words of a spoken sentence. The system attempts to learn what features of the data correlate with what labels, and it’s judged on how well it labels previously unseen data.

In Wu and Tenenbaum’s new papers, the system is trained to infer a physical model of the world — the 3-D shapes of objects that are mostly hidden from view, for instance. But then it works backward, using the model to resynthesize the input data, and its performance is judged on how well the reconstructed data matches the original data.

For instance, using visual images to build a 3-D model of an object in a scene requires stripping away any occluding objects; filtering out confounding visual textures, reflections, and shadows; and inferring the shape of unseen surfaces. Once Wu and Tenenbaum’s system has built such a model, however, it rotates it in space and adds visual textures back in until it can approximate the input data.
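
A minimal version of that reconstruct-and-compare idea is sketched below: infer a hidden parameter, re-render the observation from it, and judge the inference by how well the synthesis matches. The sine-wave "renderer" and the hidden amplitude are toy stand-ins for the papers' graphics models.

```python
# Minimal reconstruct-and-compare loop: infer a hidden parameter, re-render the
# observation from it, and score the inference by reconstruction error.
# The sine-wave "renderer" is a toy stand-in for the papers' graphics models.
import numpy as np

def render(amplitude, t):
    return amplitude * np.sin(t)

rng = np.random.default_rng(2)
t = np.linspace(0, 2 * np.pi, 200)
observed = render(3.0, t) + 0.1 * rng.standard_normal(t.size)  # hidden amplitude = 3

# Search the latent space for the amplitude whose re-rendering best matches.
candidates = np.linspace(0.5, 5.0, 91)
errors = [np.mean((render(a, t) - observed) ** 2) for a in candidates]
best = candidates[int(np.argmin(errors))]
print(f"inferred amplitude: {best:.2f} (reconstruction error {min(errors):.4f})")
```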

Indeed, two of the researchers’ four papers address the complex problem of inferring 3-D models from visual data. On those papers, they’re joined by four other MIT researchers, including William Freeman, the Perkins Professor of Electrical Engineering and Computer Science, and by colleagues at DeepMind, ShanghaiTech University, and Shanghai Jiao Tong University.

Divide and conquer

The researchers’ system is based on the influential theories of the MIT neuroscientist David Marr, who died in 1980 at the tragically young age of 35. Marr hypothesized that in interpreting a visual scene, the brain first creates what he called a 2.5-D sketch of the objects it contained — a representation of just those surfaces of the objects facing the viewer. Then, on the basis of the 2.5-D sketch — not the raw visual information about the scene — the brain infers the full, three-dimensional shapes of the objects.

“Both problems are very hard, but there’s a nice way to disentangle them,” Wu says. “You can do them one at a time, so you don’t have to deal with both of them at the same time, which is even harder.”

Wu and his colleagues’ system needs to be trained on data that include both visual images and 3-D models of the objects the images depict. Constructing accurate 3-D models of the objects depicted in real photographs would be prohibitively time consuming, so initially, the researchers train their system using synthetic data, in which the visual image is generated from the 3-D model, rather than vice versa. The process of creating the data is like that of creating a computer-animated film.

Once the system has been trained on synthetic data, however, it can be fine-tuned using real data. That’s because its ultimate performance criterion is the accuracy with which it reconstructs the input data. It’s still building 3-D models, but they don’t need to be compared to human-constructed models for performance assessment.

In evaluating their system, the researchers used a measure called intersection over union, which is common in the field. On that measure, their system outperforms its predecessors. But a given intersection-over-union score leaves a lot of room for local variation in the smoothness and shape of a 3-D model. So Wu and his colleagues also conducted a qualitative study of the models’ fidelity to the source images. Of the study’s participants, 74 percent preferred the new system’s reconstructions to those of its predecessors.
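
For voxelized 3-D shapes, intersection over union is simply the number of voxels two shapes share divided by the number either occupies. The spheres in the sketch below are synthetic examples, not the papers' reconstructions.

```python
# Intersection over union (IoU) for two voxelized 3-D shapes.
# The 16x16x16 spheres below are synthetic examples, not the papers' models.
import numpy as np

def iou(a, b):
    """a, b: boolean occupancy grids of the same shape."""
    return np.logical_and(a, b).sum() / np.logical_or(a, b).sum()

def voxel_sphere(center, radius, size=16):
    grid = np.indices((size, size, size)).transpose(1, 2, 3, 0)
    return np.linalg.norm(grid - np.asarray(center), axis=-1) < radius

prediction   = voxel_sphere((8, 8, 8), 5)
ground_truth = voxel_sphere((9, 8, 8), 5)   # slightly offset "true" shape
print(f"IoU = {iou(prediction, ground_truth):.2f}")
```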

All that fall

In another of Wu and Tenenbaum’s papers, on which they’re joined again by Freeman and by researchers at MIT, Cambridge University, and ShanghaiTech University, they train a system to analyze audio recordings of an object being dropped, to infer properties such as the object’s shape, its composition, and the height from which it fell. Again, the system is trained to produce an abstract representation of the object, which, in turn, it uses to synthesize the sound the object would make when dropped from a particular height. The system’s performance is judged on the similarity between the synthesized sound and the source sound.

Finally, in their fourth paper, Wu, Tenenbaum, Freeman, and colleagues at DeepMind and Oxford University describe a system that begins to model humans’ intuitive understanding of the physical forces acting on objects in the world. This paper picks up where the previous papers leave off: It assumes that the system has already deduced objects’ 3-D shapes.

Those shapes are simple: balls and cubes. The researchers trained their system to perform two tasks. The first is to estimate the velocities of balls traveling on a billiard table and, on that basis, to predict how they will behave after a collision. The second is to analyze a static image of stacked cubes and determine whether they will fall and, if so, where the cubes will land.

Wu developed a representational language he calls scene XML that can quantitatively characterize the relative positions of objects in a visual scene. The system first learns to describe input data in that language. It then feeds that description to something called a physics engine, which models the physical forces acting on the represented objects. Physics engines are a staple of both computer animation, where they generate the movement of clothing, falling objects, and the like, and of scientific computing, where they’re used for large-scale physical simulations.
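
A drastically simplified version of that pipeline is sketched below: a small structured scene description is checked with a crude stability rule (a block topples if its center of mass overhangs the block beneath it). Both the scene format and the rule are illustrative stand-ins for scene XML and a full physics engine.

```python
# Crude stand-in for the scene-description + physics-engine pipeline:
# each cube is described by its x-center and size, and a block is judged
# unstable if its center of mass overhangs the edge of the block beneath it.
# This toy rule is not the papers' physics engine.
scene = [                       # bottom-to-top stack of unit cubes
    {"name": "cube0", "x": 0.00, "size": 1.0},
    {"name": "cube1", "x": 0.30, "size": 1.0},
    {"name": "cube2", "x": 0.90, "size": 1.0},
]

def will_fall(stack):
    for below, above in zip(stack, stack[1:]):
        overhang = abs(above["x"] - below["x"])
        if overhang > below["size"] / 2:       # center of mass past the edge
            return True, above["name"]
    return False, None

falls, which = will_fall(scene)
print("stack falls:", falls, "- first unstable block:", which)
```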

After the physics engine has predicted the motions of the balls and boxes, that information is fed to a graphics engine, whose output is, again, compared with the source images. As with the work on visual discrimination, the researchers train their system on synthetic data before refining it with real data.

In tests, the researchers’ system again outperformed its predecessors. In fact, in some of the tests involving billiard balls, it frequently outperformed human observers as well.

Reading a neural network’s mind

Neural nets are so named because they roughly approximate the structure of the human brain. Typically, they’re arranged into layers, and each layer consists of many simple processing units — nodes — each of which is connected to several nodes in the layers above and below. Data is fed into the lowest layer, whose nodes process it and pass it to the next layer. The connections between layers have different “weights,” which determine how much the output of any one node figures into the calculation performed by the next.
Image: Chelsea Turner/MIT

By Larry Hardesty

Neural networks, which learn to perform computational tasks by analyzing huge sets of training data, have been responsible for the most impressive recent advances in artificial intelligence, including speech-recognition and automatic-translation systems.

During training, however, a neural net continually adjusts its internal settings in ways that even its creators can’t interpret. Much recent work in computer science has focused on clever techniques for determining just how neural nets do what they do.

In several recent papers, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Qatar Computing Research Institute have used a recently developed interpretive technique, which had been applied in other areas, to analyze neural networks trained to do machine translation and speech recognition.

They find empirical support for some common intuitions about how the networks probably work. For example, the systems seem to concentrate on lower-level tasks, such as sound recognition or part-of-speech recognition, before moving on to higher-level tasks, such as transcription or semantic interpretation.

But the researchers also find a surprising omission in the type of data the translation network considers, and they show that correcting that omission improves the network’s performance. The improvement is modest, but it points toward the possibility that analysis of neural networks could help improve the accuracy of artificial intelligence systems.

“In machine translation, historically, there was sort of a pyramid with different layers,” says Jim Glass, a CSAIL senior research scientist who worked on the project with Yonatan Belinkov, an MIT graduate student in electrical engineering and computer science. “At the lowest level there was the word, the surface forms, and the top of the pyramid was some kind of interlingual representation, and you’d have different layers where you were doing syntax, semantics. This was a very abstract notion, but the idea was the higher up you went in the pyramid, the easier it would be to translate to a new language, and then you’d go down again. So part of what Yonatan is doing is trying to figure out what aspects of this notion are being encoded in the network.”

The work on machine translation was presented recently in two papers at the International Joint Conference on Natural Language Processing. On one, Belinkov is first author and Glass is senior author; on the other, Belinkov is a co-author. On both, they’re joined by researchers from the Qatar Computing Research Institute (QCRI), including Lluís Màrquez, Hassan Sajjad, Nadir Durrani, Fahim Dalvi, and Stephan Vogel. Belinkov and Glass are sole authors on the paper analyzing speech recognition systems, which Belinkov presented last week at the Conference on Neural Information Processing Systems.

Leveling down

Neural nets are so named because they roughly approximate the structure of the human brain. Typically, they’re arranged into layers, and each layer consists of many simple processing units — nodes — each of which is connected to several nodes in the layers above and below. Data is fed into the lowest layer, whose nodes process it and pass it to the next layer. The connections between layers have different “weights,” which determine how much the output of any one node figures into the calculation performed by the next.

During training, the weights between nodes are constantly readjusted. After the network is trained, its creators can determine the weights of all the connections, but with thousands or even millions of nodes, and even more connections between them, deducing what algorithm those weights encode is nigh impossible.

The MIT and QCRI researchers’ technique consists of taking a trained network and using the output of each of its layers, in response to individual training examples, to train another neural network to perform a particular task. This enables them to determine what task each layer is optimized for.
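
In outline, the procedure trains a small "probe" classifier on each layer's activations for an auxiliary task (in the speech case, identifying phones, as described below) and compares accuracies across layers. The sketch here uses synthetic activations and a deliberately simple nearest-centroid probe purely to show the shape of the method; it is not the researchers' setup.

```python
# Outline of layer-wise probing: train a small classifier on each layer's
# activations for an auxiliary task (e.g., phone identity) and compare accuracy.
# Activations here are synthetic stand-ins; real ones come from a trained network.
import numpy as np

rng = np.random.default_rng(3)
n_examples, n_phones = 400, 5
phone_labels = rng.integers(0, n_phones, n_examples)

def probe_accuracy(activations, labels):
    """Nearest-class-centroid probe: a deliberately simple stand-in classifier."""
    centroids = np.stack([activations[labels == c].mean(axis=0)
                          for c in range(n_phones)])
    predictions = np.argmin(
        np.linalg.norm(activations[:, None, :] - centroids[None], axis=-1), axis=1)
    return (predictions == labels).mean()

# Synthetic activations: higher "layers" mix in more label-independent noise,
# mimicking the finding that lower layers keep more phonetic detail.
for layer in range(1, 5):
    signal = np.eye(n_phones)[phone_labels]            # phonetic information
    noise = rng.standard_normal((n_examples, n_phones))
    activations = signal + 0.5 * layer * noise
    print(f"layer {layer}: probe accuracy {probe_accuracy(activations, phone_labels):.2f}")
```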

In the case of the speech recognition network, Belinkov and Glass used individual layers’ outputs to train a system to identify “phones,” distinct phonetic units particular to a spoken language. The “t” sounds in the words “tea,” “tree,” and “but,” for instance, might be classified as separate phones, but a speech recognition system has to transcribe all of them using the letter “t.” And indeed, Belinkov and Glass found that lower levels of the network were better at recognizing phones than higher levels, where, presumably, the distinction is less important.

Similarly, in an earlier paper, presented last summer at the Annual Meeting of the Association for Computational Linguistics, Glass, Belinkov, and their QCRI colleagues showed that the lower levels of a machine-translation network were particularly good at recognizing parts of speech and morphology — features such as tense, number, and conjugation.

Making meaning

But in the new paper, they show that higher levels of the network are better at something called semantic tagging. As Belinkov explains, a part-of-speech tagger will recognize that “herself” is a pronoun, but the meaning of that pronoun — its semantic sense — is very different in the sentences “she bought the book herself” and “she herself bought the book.” A semantic tagger would assign different tags to those two instances of “herself,” just as a machine translation system might find different translations for them in a given target language.

The best-performing machine-translation networks use so-called encoding-decoding models, so the MIT and QCRI researchers’ network uses them as well. In such systems, the input, in the source language, passes through several layers of the network — known as the encoder — to produce a vector, a string of numbers that somehow represent the semantic content of the input. That vector passes through several more layers of the network — the decoder — to yield a translation in the target language.

Although the encoder and decoder are trained together, they can be thought of as separate networks. The researchers discovered that, curiously, the lower layers of the encoder are good at distinguishing morphology, but the higher layers of the decoder are not. So Belinkov and the QCRI researchers retrained the network, scoring its performance according to not only accuracy of translation but also analysis of morphology in the target language. In essence, they forced the decoder to get better at distinguishing morphology.
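
One standard way to express that kind of joint objective is a weighted sum of the translation loss and a morphology-tagging loss. The sketch below is a generic illustration of the idea, with placeholder values and a placeholder weight, not the MIT and QCRI training code.

```python
# Generic sketch of a joint objective: translation loss plus a weighted
# morphology-tagging loss. The weight and the example values are placeholders,
# not the MIT/QCRI training setup.
def joint_loss(translation_loss: float, morphology_loss: float, weight: float = 0.5) -> float:
    return translation_loss + weight * morphology_loss

# During training, both terms would come from the network on each batch.
print(joint_loss(translation_loss=2.31, morphology_loss=0.87))
```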

Using this technique, they retrained the network to translate English into German and found that its accuracy increased by 3 percent. That’s not an overwhelming improvement, but it’s an indication that looking under the hood of neural networks could be more than an academic exercise.

Can artificial intelligence learn to scare us?

Just in time for Halloween, a research team from the MIT Media Lab’s Scalable Cooperation group has introduced Shelley: the world’s first artificial intelligence-human horror story collaboration.

Shelley, named for English writer Mary Shelley — best known as the author of “Frankenstein: or, the Modern Prometheus” — is a deep-learning powered artificial intelligence (AI) system that was trained on over 140,000 horror stories on Reddit’s infamous r/nosleep subreddit. She lives on Twitter, where every hour, @shelley_ai tweets out the beginning of a new horror story and the hashtag #yourturn to invite a human collaborator. Anyone is welcome to reply to the tweet with the next part of the story, then Shelley will reply again with the next part, and so on. The results are weird, fun, and unpredictable horror stories that represent both creativity and collaboration — traits that explore the limits of artificial intelligence and machine learning.

“Shelley is a combination of a multi-layer recurrent neural network and an online learning algorithm that learns from crowd’s feedback over time,” explains Pinar Yanardhag, the project’s lead researcher. “The more collaboration Shelley gets from people, the more and scarier stories she will write.”

Shelley starts stories based on the AI’s own learning dataset, but she responds directly to additions to the story from human contributors — which, in turn, adds to her knowledge base. Each completed story is then collected on the Shelley project website.

“Shelley’s creative mind has no boundaries,” the research team says. “She writes stories about a pregnant man who woke up in a hospital, a mouth on the floor with a calm smile, an entire haunted town, a faceless man on the mirror ... anything is possible!”

One final note on Shelley: The AI was trained on a subreddit filled with adult content, and the researchers have limited control over her — so parents beware.

Teleoperating robots with virtual reality


Consisting of a headset and hand controllers, CSAIL’s new VR system enables users to teleoperate a robot using an Oculus Rift headset.
Photo: Jason Dorfman/MIT CSAIL

By Rachel Gordon

Certain industries have traditionally not had the luxury of telecommuting. Many manufacturing jobs, for example, require a physical presence to operate machinery.

But what if such jobs could be done remotely? Last week researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) presented a virtual reality (VR) system that lets you teleoperate a robot using an Oculus Rift headset.

The system embeds the user in a VR control room with multiple sensor displays, making it feel like they’re inside the robot’s head. By using hand controllers, users can match their movements to the robot’s movements to complete various tasks.

“A system like this could eventually help humans supervise robots from a distance,” says CSAIL postdoc Jeffrey Lipton, who was the lead author on a related paper about the system. “By teleoperating robots from home, blue-collar workers would be able to tele-commute and benefit from the IT revolution just as white-collar workers do now.”

The researchers even imagine that such a system could help employ increasing numbers of jobless video-gamers by “gameifying” manufacturing positions.

The team used the Baxter humanoid robot from Rethink Robotics, but said that it can work on other robot platforms and is also compatible with the HTC Vive headset.

Lipton co-wrote the paper with CSAIL Director Daniela Rus and researcher Aidan Fay. They presented the paper at the recent IEEE/RSJ International Conference on Intelligent Robots and Systems in Vancouver.

There have traditionally been two main approaches to using VR for teleoperation.

In a direct model, the user’s vision is directly coupled to the robot’s state. With these systems, a delayed signal could lead to nausea and headaches, and the user’s viewpoint is limited to one perspective.

In a cyber-physical model, the user is separate from the robot. The user interacts with a virtual copy of the robot and the environment. This requires much more data, and specialized spaces.

The CSAIL team’s system is halfway between these two methods. It solves the delay problem, since the user is constantly receiving visual feedback from the virtual world. It also solves the cyber-physical issue of being distinct from the robot: Once a user puts on the headset and logs into the system, they’ll feel as if they’re inside Baxter’s head.

The system mimics the homunculus model of the mind — the idea that there’s a small human inside our brains controlling our actions, viewing the images we see, and understanding them for us. While it’s a peculiar idea for humans, for robots it fits: Inside the robot is a human in a virtual control room, seeing through its eyes and controlling its actions.

Using Oculus’ controllers, users can interact with controls that appear in the virtual space to open and close the hand grippers to pick up, move, and retrieve items. A user can plan movements based on the distance between the arm’s location marker and their hand while looking at the live display of the arm.

To make these movements possible, the human’s space is mapped into the virtual space, and the virtual space is then mapped into the robot space to provide a sense of co-location.
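
The paper’s exact calibration isn’t reproduced here, but the frame-chaining idea can be sketched with homogeneous transformation matrices: a controller pose measured in the human’s tracking frame is re-expressed in the shared virtual frame and then in the robot’s base frame. In the sketch below, the frame names and calibration matrices are placeholder assumptions, not values from the CSAIL system.

# Illustrative sketch: chaining homogeneous transforms to map a hand-controller
# pose from the human's tracking frame into the robot's base frame.
import numpy as np

def make_transform(rotation, translation):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a 3-vector."""
    T = np.eye(4)
    T[:3, :3] = rotation
    T[:3, 3] = translation
    return T

# Assumed calibration: human tracking frame -> virtual control-room frame,
# and virtual frame -> robot base frame (identity rotations for simplicity).
T_virtual_from_human = make_transform(np.eye(3), np.array([0.0, 0.0, 1.2]))
T_robot_from_virtual = make_transform(np.eye(3), np.array([0.5, 0.0, -0.3]))

def hand_pose_in_robot_frame(T_human_from_controller):
    """Compose the chain: robot <- virtual <- human <- controller."""
    return T_robot_from_virtual @ T_virtual_from_human @ T_human_from_controller

# Example: a controller held 0.4 m in front of the tracked origin.
controller_pose = make_transform(np.eye(3), np.array([0.0, 0.4, 0.0]))
target_for_gripper = hand_pose_in_robot_frame(controller_pose)
print(target_for_gripper[:3, 3])  # position the robot's end effector should track

Chaining the transforms this way is what gives the operator the sense of co-location described above: moving a hand in the tracked space moves the corresponding target in the robot’s own coordinate frame.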

The system is also more flexible compared to previous systems that require many resources. Other systems might extract 2-D information from each camera, build out a full 3-D model of the environment, and then process and redisplay the data. In contrast, the CSAIL team’s approach bypasses all of that by simply taking the 2-D images that are displayed to each eye. (The human brain does the rest by automatically inferring the 3-D information.) 

To test the system, the team first teleoperated Baxter to do simple tasks like picking up screws or stapling wires. They then had the test users teleoperate the robot to pick up and stack blocks.

Users successfully completed the tasks at a much higher rate than with the direct model. Unsurprisingly, users with gaming experience found the system much easier to use.

Tested against current state-of-the-art systems, CSAIL’s system was better at grasping objects 95 percent of the time and 57 percent faster at doing tasks. The team also showed that the system could pilot the robot from hundreds of miles away; testing included controlling Baxter at MIT from a hotel’s wireless network in Washington.

“This contribution represents a major milestone in the effort to connect the user with the robot’s space in an intuitive, natural, and effective manner,” says Oussama Khatib, a computer science professor at Stanford University who was not involved in the paper.

The team eventually wants to focus on making the system more scalable, so that many users can teleoperate many different types of robots that are compatible with current automation technologies.

The project was funded, in part, by the Boeing Company and the National Science Foundation.

“Superhero” robot wears different outfits for different tasks

Dubbed “Primer,” a new cube-shaped robot can be controlled via magnets to make it walk, roll, sail, and glide. It carries out these actions by wearing different exoskeletons, which start out as sheets of plastic that fold into specific shapes when heated. After Primer finishes its task, it can shed its “skin” by immersing itself in water, which dissolves the exoskeleton. Credit: the researchers.

From butterflies that sprout wings to hermit crabs that switch their shells, many animals must adapt their exterior features in order to survive. While humans don’t undergo that kind of metamorphosis, we often try to create functional objects that are similarly adaptive — including our robots.

Despite what you might have seen in “Transformers” movies, though, today’s robots are still pretty inflexible. Each of their parts usually has a fixed structure and a single defined purpose, making it difficult for them to perform a wide variety of actions.

Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) are aiming to change that with a new shape-shifting robot that’s something of a superhero: It can transform itself with different “outfits” that allow it to perform different tasks.

Dubbed “Primer,” the cube-shaped robot can be controlled via magnets to make it walk, roll, sail, and glide. It carries out these actions by wearing different exoskeletons, which start out as sheets of plastic that fold into specific shapes when heated. After Primer finishes its task, it can shed its “skin” by immersing itself in water, which dissolves the exoskeleton.

“If we want robots to help us do things, it’s not very efficient to have a different one for each task,” says Daniela Rus, CSAIL director and principal investigator on the project. “With this metamorphosis-inspired approach, we can extend the capabilities of a single robot by giving it different ‘accessories’ to use in different situations.”

Primer’s various forms have a range of advantages. For example, “Wheel-bot” has wheels that allow it to move twice as fast as “Walk-bot.” “Boat-bot” can float on water and carry nearly twice its weight. “Glider-bot” can soar across longer distances, which could be useful for deploying robots or switching environments.

Primer can even wear multiple outfits at once, like a Russian nesting doll. It can add one exoskeleton to become “Walk-bot,” and then interface with another, larger exoskeleton that allows it to carry objects and move two body lengths per second. To deploy the second exoskeleton, “Walk-bot” steps onto the sheet, which then blankets the bot with its four self-folding arms.

“Imagine future applications for space exploration, where you could send a single robot with a stack of exoskeletons to Mars,” says postdoc Shuguang Li, one of the co-authors of the study. “The robot could then perform different tasks by wearing different ‘outfits.’”

The project was led by Rus and Shuhei Miyashita, a former CSAIL postdoc who is now director of the Microrobotics Group at the University of York. Their co-authors include Li and graduate student Steven Guitron. An article about the work appears in the journal Science Robotics on Sept. 27.

Robot metamorphosis

Primer builds on several previous projects from Rus’ team, including magnetic blocks that can assemble themselves into different shapes and centimeter-long microrobots that can be precisely customized from sheets of plastic.

While robots that can change their form or function have been developed at larger sizes, it’s generally been difficult to build such structures at much smaller scales.

“This work represents an advance over the authors’ previous work in that they have now demonstrated a scheme that allows for the creation of five different functionalities,” says Eric Diller, a microrobotics expert and assistant professor of mechanical engineering at the University of Toronto, who was not involved in the paper. “Previous work at most shifted between only two functionalities, such as ‘open’ or ‘closed’ shapes.”

The team outlines many potential applications for robots that can perform multiple actions with just a quick costume change. For example, say some equipment needs to be moved across a stream. A single robot with multiple exoskeletons could potentially sail across the stream and then carry objects on the other side.

“Our approach shows that origami-inspired manufacturing allows us to have robotic components that are versatile, accessible, and reusable,” says Rus, the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science at MIT.

Designed in a matter of hours, the exoskeletons fold into shape after being heated for just a few seconds, suggesting a new approach to rapid fabrication of robots.

“I could envision devices like these being used in ‘microfactories’ where prefabricated parts and tools would enable a single microrobot to do many complex tasks on demand,” Diller says.

As a next step, the team plans to explore giving the robots an even wider range of capabilities, from driving through water and burrowing in sand to camouflaging their color. Guitron pictures a future robotics community that shares open-source designs for parts much the way 3-D-printing enthusiasts trade ideas on sites such as Thingiverse.

“I can imagine one day being able to customize robots with different arms and appendages,” says Rus. “Why update a whole robot when you can just update one part of it?”

This project was supported, in part, by the National Science Foundation.

Automatic code reuse

“CodeCarbonCopy enables one of the holy grails of software engineering: automatic code reuse,” says Stelios Sidiroglou-Douskos, a research scientist at CSAIL. Credit: MIT News

by Larry Hardesty

Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have developed a new system that allows programmers to transplant code from one program into another. The programmer can select the code from one program and an insertion point in a second program, and the system will automatically make the modifications necessary — such as changing variable names — to integrate the code into its new context.

Crucially, the system is able to translate between “data representations” used by the donor and recipient programs. An image-processing program, for instance, needs to be able to handle files in a range of formats, such as JPEG, TIFF, or PNG. But internally, it will represent all such images using a single standardized scheme. Different programs, however, may use different internal schemes. The CSAIL researchers’ system automatically maps the donor program’s scheme onto that of the recipient, to import code seamlessly.

The researchers presented the new system, dubbed CodeCarbonCopy, at the Association for Computing Machinery’s Symposium on the Foundations of Software Engineering.

“CodeCarbonCopy enables one of the holy grails of software engineering: automatic code reuse,” says Stelios Sidiroglou-Douskos, a research scientist at CSAIL and first author on the paper. “It’s another step toward automating the human away from the development cycle. Our view is that perhaps we have written most of the software that we’ll ever need — we now just need to reuse it.”

The researchers conducted eight experiments in which they used CodeCarbonCopy to transplant code between six popular open-source image-processing programs. Seven of the eight transplants were successful, with the recipient program properly executing the new functionality.

Joining Sidiroglou-Douskos on the paper are Martin Rinard, a professor of electrical engineering and computer science; Fan Long, an MIT graduate student in electrical engineering and computer science; and Eric Lahtinen and Anthony Eden, who were contract programmers at MIT when the work was done.

Mutatis mutandis

With CodeCarbonCopy, the first step in transplanting code from one program to another is to feed both of them the same input file. The system then compares how the two programs process the file.

If, for instance, the donor program performs a series of operations on a particular piece of data and loads the result into a variable named “mem_clip->width,” and the recipient performs the same operations on the same piece of data and loads the result into a variable named “picture.width,” the system will infer that the variables are playing the same roles in their respective programs.

Once it has identified correspondences between variables, CodeCarbonCopy presents them to the user. It also presents all the variables in the donor for which it could not find matches in the recipient, together with those variables’ initial definitions. Frequently, those variables are playing some role in the donor that’s irrelevant to the recipient. The user can flag those variables as unnecessary, and CodeCarbonCopy will automatically excise any operations that make use of them from the transplanted code.
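
The article doesn’t spell out the matching algorithm itself, but the core idea it describes, run both programs on the same input, record the values each variable takes, and pair variables whose value histories agree, can be sketched roughly as follows. The trace format and the example variable names are assumptions for illustration only, not CodeCarbonCopy’s internal representation.

# Hypothetical sketch of value-based variable matching, in the spirit of the
# description above: donor and recipient variables are paired when they hold
# the same sequence of observed values while processing the shared input file.
from collections import defaultdict

def match_variables(donor_trace, recipient_trace):
    """
    Each trace maps a variable name (e.g. 'mem_clip->width') to the ordered
    list of values it held during execution on the shared input.
    Returns (matches, unmatched_donor_vars).
    """
    by_values = defaultdict(list)
    for var, values in recipient_trace.items():
        by_values[tuple(values)].append(var)

    matches, unmatched = {}, []
    for var, values in donor_trace.items():
        candidates = by_values.get(tuple(values), [])
        if candidates:
            matches[var] = candidates[0]   # pick one; a real tool would disambiguate further
        else:
            unmatched.append(var)          # these are the variables the user can flag as irrelevant
    return matches, unmatched

donor = {"mem_clip->width": [640, 640], "mem_clip->dpi": [72]}
recipient = {"picture.width": [640, 640], "picture.height": [480]}
print(match_variables(donor, recipient))
# ({'mem_clip->width': 'picture.width'}, ['mem_clip->dpi'])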

New order

To map the data representations from one program onto those of the other, CodeCarbonCopy looks at the precise values that both programs store in memory. Every pixel in a digital image, for instance, is governed by three color values: red, green, and blue. Some programs, however, store those triplets of values in the order red, green, blue, and others store them in the order blue, green, red.

If CodeCarbonCopy finds a systematic relationship between the values stored by one program and those stored by the other, it generates a set of operations for translating between representations.
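
As a simplified, hypothetical example of such a systematic relationship, consider a donor that stores each pixel in red-green-blue order and a recipient that stores it in blue-green-red order. Once the channel permutation is detected from the matching stored values, the generated translation is just a reordering of channels. The sketch below is illustrative only and is not CodeCarbonCopy’s actual mechanism or output.

# Simplified illustration of representation translation: detect a channel
# permutation between two programs' pixel layouts and generate a converter.
import numpy as np

def infer_channel_permutation(donor_pixels, recipient_pixels):
    """Find the permutation p such that recipient[:, i] == donor[:, p[i]]."""
    perm = []
    for i in range(recipient_pixels.shape[1]):
        for j in range(donor_pixels.shape[1]):
            if np.array_equal(recipient_pixels[:, i], donor_pixels[:, j]):
                perm.append(j)
                break
    return perm

def make_translator(perm):
    """Return a function that reorders donor-format pixels into recipient format."""
    return lambda pixels: pixels[:, perm]

# The same image stored as RGB by the donor and BGR by the recipient.
rgb = np.array([[255, 0, 0], [0, 128, 64]])
bgr = rgb[:, [2, 1, 0]]

perm = infer_channel_permutation(rgb, bgr)   # [2, 1, 0]
to_recipient = make_translator(perm)
assert np.array_equal(to_recipient(rgb), bgr)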

CodeCarbonCopy works well with file formats, such as images, whose data is rigidly organized, and with programs, such as image processors, that store data representations in arrays, which are essentially rows of identically sized memory units. In ongoing work, the researchers are looking to generalize their approach to file formats that permit more flexible data organization and programs that use data structures other than arrays, such as trees or linked lists.

“In general, code quoting is where a lot of problems in software come from,” says Vitaly Shmatikov, a professor of computer science at Cornell Tech, a joint academic venture between Cornell University and Israel’s Technion. “Both bugs and security vulnerabilities — a lot of them occur when there is functionality in one place, and someone tries to either cut and paste or reimplement this functionality in another place. They make a small mistake, and that’s how things break. So having an automated way of moving code from one place to another would be a huge, huge deal, and this is a very solid step toward having it.”

“Recognizing irrelevant code that’s not important for the functionality that they’re quoting, that’s another technical innovation that’s important,” Shmatikov adds. “That’s the kind of thing that was an obstacle for a lot of previous approaches — that you know the right code is there, but it’s mixed up with a lot of code that is not relevant to what you’re trying to do. So being able to separate that out is a fairly significant technical contribution.”

“Peel-and-go” printable structures fold themselves

A new method produces a printable structure that begins to fold itself up as soon as it’s peeled off the printing platform. Credit: MIT

by Larry Hardesty

As 3-D printing has become a mainstream technology, industry and academic researchers have been investigating printable structures that will fold themselves into useful three-dimensional shapes when heated or immersed in water.

In a paper appearing in the American Chemical Society’s journal Applied Materials and Interfaces, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and colleagues report something new: a printable structure that begins to fold itself up as soon as it’s peeled off the printing platform.

One of the big advantages of devices that self-fold without any outside stimulus, the researchers say, is that they can involve a wider range of materials and more delicate structures.

“If you want to add printed electronics, you’re generally going to be using some organic materials, because a majority of printed electronics rely on them,” says Subramanian Sundaram, an MIT graduate student in electrical engineering and computer science and first author on the paper. “These materials are often very, very sensitive to moisture and temperature. So if you have these electronics and parts, and you want to initiate folds in them, you wouldn’t want to dunk them in water or heat them, because then your electronics are going to degrade.”

To illustrate this idea, the researchers built a prototype self-folding printable device that includes electrical leads and a polymer “pixel” that changes from transparent to opaque when a voltage is applied to it. The device, which is a variation on the “printable goldbug” that Sundaram and his colleagues announced earlier this year, starts out looking something like the letter “H.” But each of the legs of the H folds itself in two different directions, producing a tabletop shape.

The researchers also built several different versions of the same basic hinge design, which show that they can control the precise angle at which a joint folds. In tests, they forcibly straightened the hinges by attaching them to a weight, but when the weight was removed, the hinges resumed their original folds.

In the short term, the technique could enable the custom manufacture of sensors, displays, or antennas whose functionality depends on their three-dimensional shape. Longer term, the researchers envision the possibility of printable robots.

Sundaram is joined on the paper by his advisor, Wojciech Matusik, an associate professor of electrical engineering and computer science (EECS) at MIT; Marc Baldo, also an associate professor of EECS, who specializes in organic electronics; David Kim, a technical assistant in Matusik’s Computational Fabrication Group; and Ryan Hayward, a professor of polymer science and engineering at the University of Massachusetts at Amherst.

This clip shows an example of an accelerated fold. (Image: Tom Buehler/CSAIL)

Stress relief

The key to the researchers’ design is a new printer-ink material that expands after it solidifies, which is unusual. Most printer-ink materials contract slightly as they solidify, a technical limitation that designers frequently have to work around.

Printed devices are built up in layers, and in their prototypes the MIT researchers deposit their expanding material at precise locations in either the top or bottom few layers. The bottom layer adheres slightly to the printer platform, and that adhesion is enough to hold the device flat as the layers are built up. But as soon as the finished device is peeled off the platform, the joints made from the new material begin to expand, bending the device in the opposite direction.

Like many technological breakthroughs, the CSAIL researchers’ discovery of the material was an accident. Most of the printer materials used by Matusik’s Computational Fabrication Group are combinations of polymers, long molecules that consist of chainlike repetitions of single molecular components, or monomers. Mixing these components is one method for creating printer inks with specific physical properties.

While trying to develop an ink that yielded more flexible printed components, the CSAIL researchers inadvertently hit upon one that expanded slightly after it hardened. They immediately recognized the potential utility of expanding polymers and began experimenting with modifications of the mixture, until they arrived at a recipe that let them build joints that would expand enough to fold a printed device in half.

Whys and wherefores

Hayward’s contribution to the paper was to help the MIT team explain the material’s expansion. The ink that produces the most forceful expansion includes several long molecular chains and one much shorter chain, made up of the monomer isooctyl acrylate. When a layer of the ink is exposed to ultraviolet light — or “cured,” a process commonly used in 3-D printing to harden materials deposited as liquids — the long chains connect to each other, producing a rigid thicket of tangled molecules.

When another layer of the material is deposited on top of the first, the small chains of isooctyl acrylate in the top, liquid layer sink down into the lower, more rigid layer. There, they interact with the longer chains to exert an expansive force, which the adhesion to the printing platform temporarily resists.

The researchers hope that a better theoretical understanding of the reason for the material’s expansion will enable them to design material tailored to specific applications — including materials that resist the 1–3 percent contraction typical of many printed polymers after curing.

“This work is exciting because it provides a way to create functional electronics on 3-D objects,” says Michael Dickey, a professor of chemical engineering at North Carolina State University. “Typically, electronic processing is done in a planar, 2-D fashion and thus needs a flat surface. The work here provides a route to create electronics using more conventional planar techniques on a 2-D surface and then transform them into a 3-D shape, while retaining the function of the electronics. The transformation happens by a clever trick to build stress into the materials during printing.”

3 Questions: Iyad Rahwan on the “psychological roadblocks” facing self-driving cars

An image of some connected autonomous cars

by Peter Dizikes

This summer, a survey released by the American Automobile Association showed that 78 percent of Americans feared riding in a self-driving car, with just 19 percent trusting the technology. What might it take to alter public opinion on the issue? Iyad Rahwan, the AT&T Career Development Professor in the MIT Media Lab, has studied the issue at length and, along with Jean-Francois Bonnefon of the Toulouse School of Economics and Azim Shariff of the University of California at Irvine, has authored a new commentary on the subject, titled “Psychological roadblocks to the adoption of self-driving vehicles,” published today in Nature Human Behaviour. Rahwan spoke to MIT News about the hurdles automakers face if they want greater public buy-in for autonomous vehicles.

Q: Your new paper states that when it comes to autonomous vehicles, trust “will determine how widely they are adopted by consumers, and how tolerated they are by everyone else.” Why is this?

A: It’s a new kind of agent in the world. We’ve always built tools and had to trust that technology will function in the way it was intended. We’ve had to trust that the materials are reliable and don’t have health hazards, and that there are consumer protection entities that promote the interests of consumers. But these are passive products that we choose to use. For the first time in history we are building objects that are proactive and have autonomy and are even adaptive. They are learning behaviors that may be different from the ones they were originally programmed for. We don’t really know how to get people to trust such entities, because humans don’t have mental models of what these entities are, what they’re capable of, how they learn.

Before we can trust machines like autonomous vehicles, we have a number of challenges. The first is technical: the challenge of building an AI [artificial intelligence] system that can drive a car. The second is legal and regulatory: Who is liable for different kinds of faults? A third class of challenges is psychological. Unless people are comfortable putting their lives in the hands of AI, then none of this will matter. People won’t buy the product, the economics won’t work, and that’s the end of the story. What we’re trying to highlight in this paper is that these psychological challenges have to be taken seriously, even if [people] are irrational in the way they assess risk, even if the technology is safe and the legal framework is reliable.

Q: What are the specific psychological issues people have with autonomous vehicles?

A: We classify three psychological challenges that we think are fairly big. One of them is dilemmas: A lot of people are concerned about how autonomous vehicles will resolve ethical dilemmas. How will they decide, for example, whether to prioritize safety for the passenger or safety for pedestrians? Should this influence the way in which the car makes a decision about relative risk? And what we’re finding is that people have an idea about how to solve this dilemma: The car should just minimize harm. But the problem is that people are not willing to buy such cars, because they want to buy cars that will always prioritize themselves.

A second one is that people don’t always reason about risk in an unbiased way. People may overplay the risk of dying in a car crash caused by an autonomous vehicle even if autonomous vehicles are, on average, safer. We’ve seen this kind of overreaction in other fields. Many people are afraid of flying even though you’re far less likely to die in a plane crash than in a car crash. So people don’t always reason about risk.

The third class of psychological challenges is this idea that we don’t always have transparency about what the car is thinking and why it’s doing what it’s doing. The carmaker has better knowledge of what the car thinks and how it behaves … which makes it more difficult for people to predict the behavior of autonomous vehicles, which can also diminish trust. One of the preconditions of trust is predictability: If I can trust that you will behave in a particular way, I can behave according to that expectation.

Q: In the paper you state that autonomous vehicles are better depicted “as being perfected, not as perfect.” In essence, is that your advice to the auto industry?

A: Yes, I think setting up very high expectations can be a recipe for disaster, because if you overpromise and underdeliver, you get in trouble. That is not to say that we should underpromise. We should just be a bit realistic about what we promise. If the promise is an improvement on the current status quo, that is, a reduction in risk to everyone, both pedestrians as well as passengers in cars, that’s an admirable goal. Even if we achieve it in a small way, that’s already progress that we should take seriously. I think being transparent about that, and being transparent about the progress being made toward that goal, is crucial.

IBM and MIT to pursue joint research in artificial intelligence, establish new MIT–IBM Watson AI Lab

MIT President L. Rafael Reif, left, and John Kelly III, IBM senior vice president, Cognitive Solutions and Research, shake hands at the conclusion of a signing ceremony establishing the new MIT–IBM Watson AI Lab. Credit: Jake Belcher

IBM and MIT today announced that IBM plans to make a 10-year, $240 million investment to create the MIT–IBM Watson AI Lab in partnership with MIT. The lab will carry out fundamental artificial intelligence (AI) research and seek to propel scientific breakthroughs that unlock the potential of AI. The collaboration aims to advance AI hardware, software, and algorithms related to deep learning and other areas; increase AI’s impact on industries, such as health care and cybersecurity; and explore the economic and ethical implications of AI on society. IBM’s $240 million investment in the lab will support research by IBM and MIT scientists.

The new lab will be one of the largest long-term university-industry AI collaborations to date, mobilizing the talent of more than 100 AI scientists, professors, and students to pursue joint research at IBM’s Research Lab in Cambridge, Massachusetts — co-located with the IBM Watson Health and IBM Security headquarters in Kendall Square — and on the neighboring MIT campus.

The lab will be co-chaired by Dario Gil, IBM Research VP of AI and IBM Q, and Anantha P. Chandrakasan, dean of MIT’s School of Engineering. (Read a related Q&A with Chandrakasan.) IBM and MIT plan to issue a call for proposals to MIT researchers and IBM scientists to submit their ideas for joint research to push the boundaries in AI science and technology in several areas, including:

  • AI algorithms: Developing advanced algorithms to expand capabilities in machine learning and reasoning. Researchers will create AI systems that move beyond specialized tasks to tackle more complex problems and benefit from robust, continuous learning. Researchers will invent new algorithms that can not only leverage big data when available, but also learn from limited data to augment human intelligence.
  • Physics of AI: Investigating new AI hardware materials, devices, and architectures that will support future analog computational approaches to AI model training and deployment, as well as the intersection of quantum computing and machine learning. The latter involves using AI to help characterize and improve quantum devices, and researching the use of quantum computing to optimize and speed up machine-learning algorithms and other AI applications.
  • Application of AI to industries: Given its location in IBM Watson Health and IBM Security headquarters in Kendall Square, a global hub of biomedical innovation, the lab will develop new applications of AI for professional use, including fields such as health care and cybersecurity. The collaboration will explore the use of AI in areas such as the security and privacy of medical data, personalization of health care, image analysis, and the optimum treatment paths for specific patients.
  • Advancing shared prosperity through AI: The MIT–IBM Watson AI Lab will explore how AI can deliver economic and societal benefits to a broader range of people, nations, and enterprises. The lab will study the economic implications of AI and investigate how AI can improve prosperity and help individuals achieve more in their lives.

In addition to IBM’s plan to produce innovations that advance the frontiers of AI, a distinct objective of the new lab is to encourage MIT faculty and students to launch companies that will focus on commercializing AI inventions and technologies that are developed at the lab. The lab’s scientists also will publish their work, contribute to the release of open source material, and foster an adherence to the ethical application of AI.

“The field of artificial intelligence has experienced incredible growth and progress over the past decade. Yet today’s AI systems, as remarkable as they are, will require new innovations to tackle increasingly difficult real-world problems to improve our work and lives,” says John Kelly III, IBM senior vice president, Cognitive Solutions and Research. “The extremely broad and deep technical capabilities and talent at MIT and IBM are unmatched, and will lead the field of AI for at least the next decade.”

“I am delighted by this new collaboration,” MIT President L. Rafael Reif says. “True breakthroughs are often the result of fresh thinking inspired by new kinds of research teams. The combined MIT and IBM talent dedicated to this new effort will bring formidable power to a field with staggering potential to advance knowledge and help solve important challenges.”

Both MIT and IBM have been pioneers in artificial intelligence research, and the new AI lab builds on a decades-long research relationship between the two. In 2016, IBM Research announced a multiyear collaboration with MIT’s Department of Brain and Cognitive Sciences to advance the scientific field of machine vision, a core aspect of artificial intelligence. The collaboration has brought together leading brain, cognitive, and computer scientists to conduct research in the field of unsupervised machine understanding of audio-visual streams of data, using insights from next-generation models of the brain to inform advances in machine vision. In addition, IBM and the Broad Institute of MIT and Harvard have established a five-year, $50 million research collaboration on AI and genomics.

MIT researchers were among those who helped coin and popularize the very phrase “artificial intelligence” in the 1950s. MIT pushed several major advances in the subsequent decades, from neural networks to data encryption to quantum computing to crowdsourcing. Marvin Minsky, a founder of the discipline, collaborated on building the first artificial neural network and he, along with Seymour Papert, advanced learning algorithms. Currently, the Computer Science and Artificial Intelligence Laboratory, the Media Lab, the Department of Brain and Cognitive Sciences, and the MIT Institute for Data, Systems, and Society serve as connected hubs for AI and related research at MIT.

For more than 20 years, IBM has explored the application of AI across many areas and industries. IBM researchers invented and built Watson, which is a cloud-based AI platform being used by businesses, developers, and universities to fight cancer, improve classroom learning, minimize pollution, enhance agriculture and oil and gas exploration, better manage financial investments, and much more. Today, IBM scientists across the globe are working on fundamental advances in AI algorithms, science and technology that will pave the way for the next generation of artificially intelligent systems.

For information about employment opportunities with IBM at the new AI Lab, please visit MITIBMWatsonAILab.mit.edu.
