
Using language to give robots a better grasp of an open-ended world

Feature Fields for Robotic Manipulation (F3RM) enables robots to interpret open-ended text prompts using natural language, helping the machines manipulate unfamiliar objects. The system’s 3D feature fields could be helpful in environments that contain thousands of objects, such as warehouses. Images courtesy of the researchers.

By Alex Shipps | MIT CSAIL

Imagine you’re visiting a friend abroad, and you look inside their fridge to see what would make for a great breakfast. Many of the items initially appear foreign to you, with each one encased in unfamiliar packaging and containers. Despite these visual distinctions, you begin to understand what each one is used for and pick them up as needed.

Inspired by humans’ ability to handle unfamiliar objects, a group from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) designed Feature Fields for Robotic Manipulation (F3RM), a system that blends 2D images with foundation model features into 3D scenes to help robots identify and grasp nearby items. F3RM can interpret open-ended language prompts from humans, making the method helpful in real-world environments that contain thousands of objects, like warehouses and households.

F3RM offers robots the ability to interpret open-ended text prompts using natural language, helping the machines manipulate objects. As a result, the machines can understand less-specific requests from humans and still complete the desired task. For example, if a user asks the robot to “pick up a tall mug,” the robot can locate and grab the item that best fits that description.

“Making robots that can actually generalize in the real world is incredibly hard,” says Ge Yang, postdoc at the National Science Foundation AI Institute for Artificial Intelligence and Fundamental Interactions and MIT CSAIL. “We really want to figure out how to do that, so with this project, we try to push for an aggressive level of generalization, from just three or four objects to anything we find in MIT’s Stata Center. We wanted to learn how to make robots as flexible as ourselves, since we can grasp and place objects even though we’ve never seen them before.”

Learning “what’s where by looking”

The method could assist robots with picking items in large fulfillment centers with inevitable clutter and unpredictability. In these warehouses, robots are often given a description of the inventory that they’re required to identify. The robots must match the text provided to an object, regardless of variations in packaging, so that customers’ orders are shipped correctly.

For example, the fulfillment centers of major online retailers can contain millions of items, many of which a robot will have never encountered before. To operate at such a scale, robots need to understand the geometry and semantics of different items, with some being in tight spaces. With F3RM’s advanced spatial and semantic perception abilities, a robot could become more effective at locating an object, placing it in a bin, and then sending it along for packaging. Ultimately, this would help factory workers ship customers’ orders more efficiently.

“One thing that often surprises people with F3RM is that the same system also works on a room and building scale, and can be used to build simulation environments for robot learning and large maps,” says Yang. “But before we scale up this work further, we want to first make this system work really fast. This way, we can use this type of representation for more dynamic robotic control tasks, hopefully in real-time, so that robots that handle more dynamic tasks can use it for perception.”

The MIT team notes that F3RM’s ability to understand different scenes could make it useful in urban and household environments. For example, the approach could help personalized robots identify and pick up specific items. The system aids robots in grasping their surroundings — both physically and perceptively.

“Visual perception was defined by David Marr as the problem of knowing ‘what is where by looking,’” says senior author Phillip Isola, MIT associate professor of electrical engineering and computer science and CSAIL principal investigator. “Recent foundation models have gotten really good at knowing what they are looking at; they can recognize thousands of object categories and provide detailed text descriptions of images. At the same time, radiance fields have gotten really good at representing where stuff is in a scene. The combination of these two approaches can create a representation of what is where in 3D, and what our work shows is that this combination is especially useful for robotic tasks, which require manipulating objects in 3D.”

Creating a “digital twin”

F3RM begins to understand its surroundings by taking pictures with a camera mounted on a selfie stick. The camera snaps 50 images at different poses, enabling the system to build a neural radiance field (NeRF), a deep learning method that uses 2D images to construct a 3D scene. This collage of RGB photos creates a “digital twin” of the surroundings in the form of a 360-degree representation of what’s nearby.

In addition to a highly detailed neural radiance field, F3RM also builds a feature field to augment geometry with semantic information. The system uses CLIP, a vision foundation model trained on hundreds of millions of images to efficiently learn visual concepts. By reconstructing the 2D CLIP features for the images taken by the selfie stick, F3RM effectively lifts the 2D features into a 3D representation.
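The lifting step can be pictured with the same volume-rendering rule a NeRF uses for color, applied to feature vectors instead of RGB values. Below is a minimal numpy sketch (the function name, shapes, and sampling scheme are ours, not from the paper); in training, the feature rendered along each camera ray would be matched against the 2D CLIP feature at the corresponding pixel.

```python
import numpy as np

def render_feature_along_ray(densities, features, deltas):
    # densities: (N,) volume density at N samples along one camera ray
    # features:  (N, D) feature vector (e.g., a CLIP embedding) per sample
    # deltas:    (N,) spacing between consecutive samples
    alphas = 1.0 - np.exp(-densities * deltas)                       # per-sample opacity
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))   # transmittance
    weights = trans * alphas                                         # each sample's contribution
    return (weights[:, None] * features).sum(axis=0)                 # rendered (D,) feature
```

The weights concentrate on the first opaque surface the ray hits, so the rendered feature describes whatever object is visible at that pixel.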

Keeping things open-ended

After receiving a few demonstrations, the robot applies what it knows about geometry and semantics to grasp objects it has never encountered before. Once a user submits a text query, the robot searches through the space of possible grasps to identify those most likely to succeed in picking up the object requested by the user. Each potential option is scored based on its relevance to the prompt, similarity to the demonstrations the robot has been trained on, and if it causes any collisions. The highest-scored grasp is then chosen and executed.
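That scoring step can be pictured as a weighted sum over a few simple terms. The sketch below is illustrative only, assuming cosine similarity in a shared feature space and hand-picked weights; the system's actual scoring details differ.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def score_grasps(grasp_feats, demo_feats, text_feat, collides, w_text=1.0, w_demo=1.0):
    # grasp_feats: (G, D) feature at each candidate grasp pose
    # demo_feats:  (M, D) features from the demonstration grasps
    # text_feat:   (D,) embedding of the user's text query
    # collides:    length-G booleans, True if a grasp is in collision
    scores = []
    for g, feat in enumerate(grasp_feats):
        if collides[g]:
            scores.append(-np.inf)                           # colliding grasps are ruled out
            continue
        relevance = cosine(feat, text_feat)                  # match to the language prompt
        demo_sim = max(cosine(feat, d) for d in demo_feats)  # match to demonstrations
        scores.append(w_text * relevance + w_demo * demo_sim)
    return int(np.argmax(scores))                            # best grasp to execute
```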

To demonstrate the system’s ability to interpret open-ended requests from humans, the researchers prompted the robot to pick up Baymax, a character from Disney’s “Big Hero 6.” While F3RM had never been directly trained to pick up a toy of the cartoon superhero, the robot used its spatial awareness and vision-language features from the foundation models to decide which object to grasp and how to pick it up.

F3RM also enables users to specify which object they want the robot to handle at different levels of linguistic detail. For example, if there is a metal mug and a glass mug, the user can ask the robot for the “glass mug.” If the bot sees two glass mugs and one of them is filled with coffee and the other with juice, the user can ask for the “glass mug with coffee.” The foundation model features embedded within the feature field enable this level of open-ended understanding.

“If I showed a person how to pick up a mug by the lip, they could easily transfer that knowledge to pick up objects with similar geometries such as bowls, measuring beakers, or even rolls of tape. For robots, achieving this level of adaptability has been quite challenging,” says MIT PhD student, CSAIL affiliate, and co-lead author William Shen. “F3RM combines geometric understanding with semantics from foundation models trained on internet-scale data to enable this level of aggressive generalization from just a small number of demonstrations.”

Shen and Yang wrote the paper under the supervision of Isola, with MIT professor and CSAIL principal investigator Leslie Pack Kaelbling and undergraduate students Alan Yu and Jansen Wong as co-authors. The team was supported, in part, by Services, the National Science Foundation, the Air Force Office of Scientific Research, the Office of Naval Research’s Multidisciplinary University Research Initiative, the Army Research Office, the MIT-IBM Watson AI Lab, and the MIT Quest for Intelligence. Their work will be presented at the 2023 Conference on Robot Learning.

New technique helps robots pack objects into a tight space

MIT researchers are using generative AI models to help robots more efficiently solve complex object manipulation problems, such as packing a box with different objects. Image: courtesy of the researchers.

By Adam Zewe | MIT News

Anyone who has ever tried to pack a family-sized amount of luggage into a sedan-sized trunk knows this is a hard problem. Robots struggle with dense packing tasks, too.

For the robot, solving the packing problem involves satisfying many constraints, such as stacking luggage so suitcases don’t topple out of the trunk, heavy objects aren’t placed on top of lighter ones, and collisions between the robotic arm and the car’s bumper are avoided.

Some traditional methods tackle this problem sequentially, guessing a partial solution that meets one constraint at a time and then checking to see if any other constraints were violated. With a long sequence of actions to take, and a pile of luggage to pack, this process can be impractically time consuming.   
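One naive variant of that guess-and-check approach can be caricatured in a few lines (the structure and names here are illustrative, not any specific planner):

```python
import random

def sequential_solve(variables, constraints, domain, tries=1000, seed=0):
    # propose a full assignment, test every constraint, restart on failure;
    # with many constraints and long action sequences this becomes impractical
    rng = random.Random(seed)
    for _ in range(tries):
        assignment = {v: rng.choice(domain) for v in variables}
        if all(c(assignment) for c in constraints):
            return assignment
    return None
```

The cost comes from discarding the whole guess whenever any one constraint fails, which is what motivates solving for all constraints jointly instead.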

MIT researchers used a form of generative AI, called a diffusion model, to solve this problem more efficiently. Their method uses a collection of machine-learning models, each of which is trained to represent one specific type of constraint. These models are combined to generate global solutions to the packing problem, taking into account all constraints at once.

Their method was able to generate effective solutions faster than other techniques, and it produced a greater number of successful solutions in the same amount of time. Importantly, their technique was also able to solve problems with novel combinations of constraints and larger numbers of objects that the models did not see during training.

Due to this generalizability, their technique can be used to teach robots how to understand and meet the overall constraints of packing problems, such as the importance of avoiding collisions or a desire for one object to be next to another object. Robots trained in this way could be applied to a wide array of complex tasks in diverse environments, from order fulfillment in a warehouse to organizing a bookshelf in someone’s home.

“My vision is to push robots to do more complicated tasks that have many geometric constraints and more continuous decisions that need to be made — these are the kinds of problems service robots face in our unstructured and diverse human environments. With the powerful tool of compositional diffusion models, we can now solve these more complex problems and get great generalization results,” says Zhutian Yang, an electrical engineering and computer science graduate student and lead author of a paper on this new machine-learning technique.

Her co-authors include MIT graduate students Jiayuan Mao and Yilun Du; Jiajun Wu, an assistant professor of computer science at Stanford University; Joshua B. Tenenbaum, a professor in MIT’s Department of Brain and Cognitive Sciences and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); Tomás Lozano-Pérez, an MIT professor of computer science and engineering and a member of CSAIL; and senior author Leslie Kaelbling, the Panasonic Professor of Computer Science and Engineering at MIT and a member of CSAIL. The research will be presented at the Conference on Robot Learning.

Constraint complications

Continuous constraint satisfaction problems are particularly challenging for robots. These problems appear in multistep robot manipulation tasks, like packing items into a box or setting a dinner table. They often involve achieving a number of constraints, including geometric constraints, such as avoiding collisions between the robot arm and the environment; physical constraints, such as stacking objects so they are stable; and qualitative constraints, such as placing a spoon to the right of a knife.

There may be many constraints, and they vary across problems and environments depending on the geometry of objects and human-specified requirements.

To solve these problems efficiently, the MIT researchers developed a machine-learning technique called Diffusion-CCSP. Diffusion models learn to generate new data samples that resemble samples in a training dataset by iteratively refining their output.

To do this, diffusion models learn a procedure for making small improvements to a potential solution. Then, to solve a problem, they start with a random, very bad solution and then gradually improve it.

Using generative AI models, MIT researchers created a technique that could enable robots to efficiently solve continuous constraint satisfaction problems, such as packing objects into a box while avoiding collisions, as shown in this simulation. Image: Courtesy of the researchers.

For example, imagine randomly placing plates and utensils on a simulated table, allowing them to physically overlap. The collision-free constraints between objects will result in them nudging each other away, while qualitative constraints will drag the plate to the center, align the salad fork and dinner fork, etc.

Diffusion models are well-suited for this kind of continuous constraint-satisfaction problem because the influences from multiple models on the pose of one object can be composed to encourage the satisfaction of all constraints, Yang explains. By starting from a random initial guess each time, the models can obtain a diverse set of good solutions.
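In spirit, the composition works like the minimal sketch below, where each learned constraint model is stood in for by the analytic gradient of a constraint-violation penalty, and the diffusion schedule is mimicked by annealed noise. All names, constants, and constraints here are illustrative, not from the paper.

```python
import numpy as np

def no_overlap(x):
    # gradient of max(0, 1 - (x1 - x0))^2: pushes the two objects >= 1 apart
    m = max(0.0, 1.0 - (x[1] - x[0]))
    return np.array([2.0 * m, -2.0 * m])

def anchor_first(x):
    # gradient of x0^2: drags the first object toward position 0
    return np.array([2.0 * x[0], 0.0])

def compose_and_refine(x0, constraint_grads, steps=500, lr=0.05, noise=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for t in range(steps):
        grad = sum(g(x) for g in constraint_grads)   # compose all constraints at once
        temp = noise * (1.0 - t / steps)             # anneal the noise to zero
        x = x - lr * grad + temp * rng.normal(size=x.shape)
    return x
```

Starting from an overlapping random layout, the summed directions settle into positions that satisfy both constraints simultaneously; a fresh random start yields a different valid solution.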

Working together

For Diffusion-CCSP, the researchers wanted to capture the interconnectedness of the constraints. In packing, for instance, one constraint might require a certain object to be next to another object, while a second constraint might specify where one of those objects must be located.

Diffusion-CCSP learns a family of diffusion models, with one for each type of constraint. The models are trained together, so they share some knowledge, like the geometry of the objects to be packed.

The models then work together to find solutions, in this case locations for the objects to be placed, that jointly satisfy the constraints.

“We don’t always get to a solution at the first guess. But when you keep refining the solution and some violation happens, it should lead you to a better solution. You get guidance from getting something wrong,” she says.

Training individual models for each constraint type and then combining them to make predictions greatly reduces the amount of training data required, compared to other approaches.

However, training these models still requires a large amount of data that demonstrate solved problems. Humans would need to solve each problem with traditional slow methods, making the cost to generate such data prohibitive, Yang says.

Instead, the researchers reversed the process by coming up with solutions first. They used fast algorithms to generate segmented boxes and fit a diverse set of 3D objects into each segment, ensuring tight packing, stable poses, and collision-free solutions.

“With this process, data generation is almost instantaneous in simulation. We can generate tens of thousands of environments where we know the problems are solvable,” she says.
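One way to picture this solution-first pipeline is a guillotine partition of the container: split the box into non-overlapping cells and treat each cell as an object's footprint, so the resulting "packing" is tight and collision-free by construction. This is a 2D toy version; the actual generator fits diverse 3D objects and checks stability.

```python
import random

def split_rect(rect, depth, rng):
    # recursively guillotine-split a rectangle into non-overlapping cells
    x, y, w, h = rect
    if depth == 0 or min(w, h) < 2:
        return [rect]
    if w >= h:
        cut = rng.randint(1, w - 1)                  # vertical cut
        return (split_rect((x, y, cut, h), depth - 1, rng)
                + split_rect((x + cut, y, w - cut, h), depth - 1, rng))
    cut = rng.randint(1, h - 1)                      # horizontal cut
    return (split_rect((x, y, w, cut), depth - 1, rng)
            + split_rect((x, y + cut, w, h - cut), depth - 1, rng))

def generate_packing(width=10, height=10, depth=3, seed=0):
    # every cell is an object's footprint: the "solved" packing needs no checking
    return split_rect((0, 0, width, height), depth, random.Random(seed))
```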

Trained using these data, the diffusion models work together to determine where the robotic gripper should place objects so that the packing task is achieved while meeting all of the constraints.

They conducted feasibility studies, and then demonstrated Diffusion-CCSP with a real robot solving a number of difficult problems, including fitting 2D triangles into a box, packing 2D shapes with spatial relationship constraints, stacking 3D objects with stability constraints, and packing 3D objects with a robotic arm.

This figure shows examples of 2D triangle packing. These are collision-free configurations. Image: courtesy of the researchers.

This figure shows 3D object stacking with stability constraints, in which at least one object is supported by multiple other objects. Image: courtesy of the researchers.

Their method outperformed other techniques in many experiments, generating a greater number of effective solutions that were both stable and collision-free.

In the future, Yang and her collaborators want to test Diffusion-CCSP in more complicated situations, such as with robots that can move around a room. They also want to enable Diffusion-CCSP to tackle problems in different domains without the need to be retrained on new data.

“Diffusion-CCSP is a machine-learning solution that builds on existing powerful generative models,” says Danfei Xu, an assistant professor in the School of Interactive Computing at the Georgia Institute of Technology and a Research Scientist at NVIDIA AI, who was not involved with this work. “It can quickly generate solutions that simultaneously satisfy multiple constraints by composing known individual constraint models. Although it’s still in the early phases of development, the ongoing advancements in this approach hold the promise of enabling more efficient, safe, and reliable autonomous systems in various applications.”

This research was funded, in part, by the National Science Foundation, the Air Force Office of Scientific Research, the Office of Naval Research, the MIT-IBM Watson AI Lab, the MIT Quest for Intelligence, the Center for Brains, Minds, and Machines, Boston Dynamics Artificial Intelligence Institute, the Stanford Institute for Human-Centered Artificial Intelligence, Analog Devices, JPMorgan Chase and Co., and Salesforce.

Finger-shaped sensor enables more dexterous robots

MIT researchers have developed a camera-based touch sensor that is long, curved, and shaped like a human finger. Their device, which provides high-resolution tactile sensing over a large area, could enable a robotic hand to perform multiple types of grasps. Image: Courtesy of the researchers

By Adam Zewe | MIT News

Imagine grasping a heavy object, like a pipe wrench, with one hand. You would likely grab the wrench using the full length of your fingers, not just your fingertips. Sensory receptors in your skin, which run along the entire length of each finger, would send information to your brain about the tool you are grasping.

In a robotic hand, tactile sensors that use cameras to obtain information about grasped objects are small and flat, so they are often located in the fingertips. These robots, in turn, use only their fingertips to grasp objects, typically with a pinching motion. This limits the manipulation tasks they can perform.

MIT researchers have developed a camera-based touch sensor that is long, curved, and shaped like a human finger. Their device provides high-resolution tactile sensing over a large area. The sensor, called the GelSight Svelte, uses two mirrors to reflect and refract light so that one camera, located in the base of the sensor, can see along the entire finger’s length.

In addition, the researchers built the finger-shaped sensor with a flexible backbone. By measuring how the backbone bends when the finger touches an object, they can estimate the force being placed on the sensor.

They used GelSight Svelte sensors to produce a robotic hand that was able to grasp a heavy object like a human would, using the entire sensing area of all three of its fingers. The hand could also perform the same pinch grasps common to traditional robotic grippers.

This gif shows a robotic hand that incorporates three finger-shaped GelSight Svelte sensors. The sensors, which provide high-resolution tactile sensing over a large area, enable the hand to perform multiple grasps, including pinch grasps that use only the fingertips and a power grasp that uses the entire sensing area of all three fingers. Credit: Courtesy of the researchers

“Because our new sensor is human finger-shaped, we can use it to do different types of grasps for different tasks, instead of using pinch grasps for everything. There’s only so much you can do with a parallel jaw gripper. Our sensor really opens up some new possibilities on different manipulation tasks we could do with robots,” says Alan (Jialiang) Zhao, a mechanical engineering graduate student and lead author of a paper on GelSight Svelte.

Zhao wrote the paper with senior author Edward Adelson, the John and Dorothy Wilson Professor of Vision Science in the Department of Brain and Cognitive Sciences and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL). The research will be presented at the IEEE Conference on Intelligent Robots and Systems.

Mirror mirror

Cameras used in tactile sensors are limited by their size, the focal distance of their lenses, and their viewing angles. Therefore, these tactile sensors tend to be small and flat, which confines them to a robot’s fingertips.

With a longer sensing area, one that more closely resembles a human finger, the camera would need to sit farther from the sensing surface to see the entire area. This is particularly challenging due to size and shape restrictions of a robotic gripper.

Zhao and Adelson solved this problem using two mirrors that reflect and refract light toward a single camera located at the base of the finger.

GelSight Svelte incorporates one flat, angled mirror that sits across from the camera and one long, curved mirror that sits along the back of the sensor. These mirrors redistribute light rays from the camera in such a way that the camera can see along the entire length of the finger.

To optimize the shape, angle, and curvature of the mirrors, the researchers designed software to simulate reflection and refraction of light.

“With this software, we can easily play around with where the mirrors are located and how they are curved to get a sense of how well the image will look after we actually make the sensor,” Zhao explains.
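The core geometric operation such a simulator needs is ordinary ray reflection (refraction follows the same pattern via Snell's law). A minimal sketch, not the researchers' actual software:

```python
import numpy as np

def reflect(direction, normal):
    # mirror a ray direction about a surface normal: r = d - 2 (d . n) n
    d = np.asarray(direction, dtype=float)
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    return d - 2.0 * np.dot(d, n) * n
```

For example, a 45-degree mirror turns a downward-pointing ray sideways (the periscope trick that lets a single base camera look down the length of the finger): `reflect([0, -1], [1, 1])` gives `[1, 0]`.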

The mirrors, camera, and two sets of LEDs for illumination are attached to a plastic backbone and encased in a flexible skin made from silicone gel. The camera views the back of the skin from the inside; based on the deformation, it can see where contact occurs and measure the geometry of the object’s contact surface.

A breakdown of the components that make up the finger-like touch sensor. Image: Courtesy of the researchers

In addition, the red and green LED arrays give a sense of how deeply the gel is being pressed down when an object is grasped, due to the saturation of color at different locations on the sensor.

The researchers can use this color saturation information to reconstruct a 3D depth image of the object being grasped.
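As a toy version of that mapping, one could calibrate a monotone saturation-to-depth curve and interpolate along it per pixel. The calibration points below are made up; the actual reconstruction is more sophisticated.

```python
import numpy as np

def depth_from_saturation(saturation, calib_sat, calib_depth):
    # look up indentation depth on a calibration curve: deeper presses
    # saturate the red/green illumination more at that pixel
    return np.interp(saturation, calib_sat, calib_depth)
```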

The sensor’s plastic backbone enables it to determine proprioceptive information, such as the twisting torques applied to the finger. The backbone bends and flexes when an object is grasped. The researchers use machine learning to estimate how much force is being applied to the sensor, based on these backbone deformations.
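The simplest stand-in for such a learned force model is ordinary least squares from deformation features to measured force; the feature choice and calibration data below are hypothetical.

```python
import numpy as np

def fit_force_model(deformations, forces):
    # deformations: (N, D) backbone-deformation features from the camera image
    # forces:       (N,) forces measured on a calibration rig (assumed data)
    X = np.hstack([deformations, np.ones((len(deformations), 1))])  # add a bias column
    w, *_ = np.linalg.lstsq(X, forces, rcond=None)                  # ordinary least squares
    return w

def predict_force(w, deformation):
    x = np.append(np.asarray(deformation, dtype=float), 1.0)
    return float(x @ w)
```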

However, combining these elements into a working sensor was no easy task, Zhao says.

“Making sure you have the correct curvature for the mirror to match what we have in simulation is pretty challenging. Plus, I realized there are some kinds of superglue that inhibit the curing of silicone. It took a lot of experiments to make a sensor that actually works,” he adds.

Versatile grasping

Once they had perfected the design, the researchers tested the GelSight Svelte by pressing objects, like a screw, to different locations on the sensor to check image clarity and see how well it could determine the shape of the object.

They also used three sensors to build a GelSight Svelte hand that can perform multiple grasps, including a pinch grasp, lateral pinch grasp, and a power grasp that uses the entire sensing area of the three fingers. Most robotic hands, which are shaped like parallel jaw grippers, can only perform pinch grasps.

A three-finger power grasp enables a robotic hand to hold a heavier object more stably. However, pinch grasps are still useful when an object is very small. Being able to perform both types of grasps with one hand would give a robot more versatility, Zhao says.

Moving forward, the researchers plan to enhance the GelSight Svelte so the sensor is articulated and can bend at the joints, more like a human finger.

“Optical-tactile finger sensors allow robots to use inexpensive cameras to collect high-resolution images of surface contact, and by observing the deformation of a flexible surface the robot estimates the contact shape and forces applied. This work represents an advancement on the GelSight finger design, with improvements in full-finger coverage and the ability to approximate bending deflection torques using image differences and machine learning,” says Monroe Kennedy III, assistant professor of mechanical engineering at Stanford University, who was not involved with this research. “Improving a robot’s sense of touch to approach human ability is a necessity and perhaps the catalyst problem for developing robots capable of working on complex, dexterous tasks.”

This research is supported, in part, by the Toyota Research Institute.

Making life friendlier with personal robots

Sharifa Alghowinem, a research scientist in the Media Lab’s Personal Robots Group, poses with Jibo, a friendly robot companion developed by Professor Cynthia Breazeal. Credits: Gretchen Ertl

By Dorothy Hanna | Department of Mechanical Engineering

“As a child, I wished for a robot that would explain others’ emotions to me,” says Sharifa Alghowinem, a research scientist in the Media Lab’s Personal Robots Group (PRG). Growing up in Saudi Arabia, Alghowinem says she dreamed of coming to MIT one day to develop Arabic-based technologies, and of creating a robot that could help herself and others navigate a complex world.

In her early life, Alghowinem faced difficulties with understanding social cues and never scored well on standardized tests, but her dreams carried her through. She earned an undergraduate degree in computing before leaving home to pursue graduate education in Australia. At the Australian National University, she discovered affective computing for the first time and began working to help AI detect human emotions and moods. But it wasn’t until she came to MIT as a postdoc with the Ibn Khaldun Fellowship for Saudi Arabian Women, which is housed in the MIT Department of Mechanical Engineering, that she was finally able to work on a technology with the potential to explain others’ emotions in English and Arabic. Today, she says her work is so fun that she calls the lab “my playground.”

Alghowinem can’t say no to an exciting project. She found one with great potential to make robots more helpful to people by working with Jibo, a friendly robot companion developed by MIT Professor and Dean for Digital Learning Cynthia Breazeal, founder of the Personal Robots Group and the social robot startup Jibo Inc. Breazeal’s research explores the potential for companion robots to go far beyond assistants that obey transactional commands, like requests for the daily weather, adding items to shopping lists, or controlling lighting. At the MIT Media Lab, the PRG team designs Jibo to be an insightful coach and companion that advances social robotics technologies and research. Visitors to the MIT Museum can experience Jibo’s charming personality.

Alghowinem’s research has focused on mental health care and education, often working with other graduate students and Undergraduate Research Opportunity Program students in the group. In one study, Jibo coached young and older adults via positive psychology, adapting his interventions based on the verbal and non-verbal responses he observed in the participants. For example, Jibo took in the verbal content of a participant’s speech and combined it with non-verbal information like prolonged pauses and self-hugs. If he concluded that deep emotions had been disclosed, Jibo responded with empathy. When a participant didn’t disclose, Jibo asked a gentle follow-up question like, “Can you tell me more?”
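As a toy illustration of that cue fusion, a rule-based version might look like the following; the thresholds, cue names, and logic are ours, not the study's.

```python
def choose_response(verbal_disclosure, pause_seconds, self_hug):
    # fuse a verbal cue with non-verbal ones; thresholds are illustrative
    nonverbal = (pause_seconds > 2.0) or self_hug
    if verbal_disclosure and nonverbal:
        return "empathize"            # deep emotions likely disclosed
    return "follow_up"                # e.g., "Can you tell me more?"
```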

Another project studied how a robot can effectively support high-quality parent and child interactions while reading a storybook together. Multiple PRG studies work together to learn what types of data are needed for a robot to understand people’s social and emotional states.

Research Scientist Sharifa Alghowinem (left) and visiting students Deim Alfozan and Tasneem Burghleh from Saudi Arabia’s Prince Sultan University, interact with Jibo. Credits: Gretchen Ertl

“I would like to see Jibo become a companion for the whole household,” says Alghowinem. Jibo can take on different roles with different family members, such as a companion who reminds elders to take medication or a playmate for children. Alghowinem is especially motivated by the unique role Jibo could play in emotional wellness, including playing a preventative role in depression or even suicide. Integrating Jibo into daily life would give him the opportunity to detect emerging concerns and intervene, acting as a confidential resource or mental health coach.

Alghowinem is also passionate about teaching and mentoring others, and not only via robots. She makes sure to meet individually with the students she mentors every week and she was instrumental earlier this year in bringing two visiting undergraduate students from Prince Sultan University in Saudi Arabia. Mindful of their social-emotional experience, she worked hard to create the opportunity for the two students, together, to visit MIT so they could support each other. One of the visiting students, Tasneem Burghleh, says she was curious to meet the person who went out of her way to make opportunities for strangers and discovered in her an “endless passion that makes her want to pass it on and share it with everyone else.”

Next, Alghowinem is working to create opportunities for children who are refugees from Syria. Still in the fundraising stage, the plan is to equip social robots to teach the children English and social-emotional skills, and to provide activities that preserve their cultural heritage and Arabic abilities.

“We’ve laid the groundwork by making sure Jibo can speak Arabic as well as several other languages,” says Alghowinem. “Now I hope we can learn how to make Jibo really useful to kids like me who need some support as they learn how to interact with the world around them.”

MIT engineers use kirigami to make ultrastrong, lightweight structures

MIT researchers used kirigami, the art of Japanese paper cutting and folding, to develop ultrastrong, lightweight materials that have tunable mechanical properties, like stiffness and flexibility. These materials could be used in airplanes, automobiles, or spacecraft. Image: Courtesy of the researchers

By Adam Zewe | MIT News

Cellular solids are materials composed of many cells that have been packed together, such as a honeycomb. The shape of those cells largely determines the material’s mechanical properties, including its stiffness or strength. Bones, for instance, are filled with a natural material that enables them to be lightweight but stiff and strong.

Inspired by bones and other cellular solids found in nature, humans have used the same concept to develop architected materials. By changing the geometry of the unit cells that make up these materials, researchers can customize the material’s mechanical, thermal, or acoustic properties. Architected materials are used in many applications, from shock-absorbing packing foam to heat-regulating radiators.

Using kirigami, the ancient Japanese art of folding and cutting paper, MIT researchers have now manufactured a type of high-performance architected material known as a plate lattice, on a much larger scale than scientists have previously been able to achieve by additive fabrication. This technique allows them to create these structures from metal or other materials with custom shapes and specifically tailored mechanical properties. 

“This material is like steel cork. It is lighter than cork, but with high strength and high stiffness,” says Professor Neil Gershenfeld, who leads the Center for Bits and Atoms (CBA) at MIT and is senior author of a new paper on this approach.

The researchers developed a modular construction process in which many smaller components are formed, folded, and assembled into 3D shapes. Using this method, they fabricated ultralight and ultrastrong structures and robots that, under a specified load, can morph and hold their shape.

Because these structures are lightweight but strong, stiff, and relatively easy to mass-produce at larger scales, they could be especially useful in architectural, airplane, automotive, or aerospace components.

Joining Gershenfeld on the paper are co-lead authors Alfonso Parra Rubio, a research assistant in the CBA, and Klara Mundilova, an MIT electrical engineering and computer science graduate student; along with David Preiss, a graduate student in the CBA; and Erik D. Demaine, an MIT professor of computer science. The research will be presented at ASME’s Computers and Information in Engineering Conference.

The researchers actuate a corrugated structure by tensioning steel wires across the compliant surfaces and then connecting them to a system of pulleys and motors, enabling the structure to bend in either direction. Image: Courtesy of the researchers

Fabricating by folding

Architected materials, like lattices, are often used as cores for a type of composite material known as a sandwich structure. To envision a sandwich structure, think of an airplane wing, where a series of intersecting, diagonal beams form a lattice core that is sandwiched between a top and bottom panel. This truss lattice has high stiffness and strength, yet is very lightweight.

Plate lattices are cellular structures made from three-dimensional intersections of plates, rather than beams. These high-performance structures are even stronger and stiffer than truss lattices, but their complex shape makes them challenging to fabricate using common techniques like 3D printing, especially for large-scale engineering applications.

The MIT researchers overcame these manufacturing challenges using kirigami, a technique for making 3D shapes by folding and cutting paper that traces its history to Japanese artists in the 7th century.

Kirigami has been used to produce plate lattices from partially folded zigzag creases. But to make a sandwich structure, one must attach flat plates to the top and bottom of this corrugated core onto the narrow points formed by the zigzag creases. This often requires strong adhesives or welding techniques that can make assembly slow, costly, and challenging to scale.

The MIT researchers modified a common origami crease pattern, known as a Miura-ori pattern, so the sharp points of the corrugated structure are transformed into facets. The facets, like those on a diamond, provide flat surfaces to which the plates can be attached more easily, with bolts or rivets.


“Plate lattices outperform beam lattices in strength and stiffness while maintaining the same weight and internal structure,” says Parra Rubio. “Reaching the Hashin-Shtrikman (H-S) upper bound for theoretical stiffness and strength has been demonstrated through nanoscale production using two-photon lithography. Plate lattice construction has been so difficult that there has been little research on the macro scale. We think folding is a path to easier utilization of this type of plate structure made from metals.”

Customizable properties

Moreover, the way the researchers design, fold, and cut the pattern enables them to tune certain mechanical properties, such as stiffness, strength, and flexural modulus (the tendency of a material to resist bending). They encode this information, as well as the 3D shape, into a creasing map that is used to create these kirigami corrugations.

For instance, based on the way the folds are designed, some cells can be shaped so they hold their shape when compressed while others can be modified so they bend. In this way, the researchers can precisely control how different areas of the structure will deform when compressed.

Because the flexibility of the structure can be controlled, these corrugations could be used in robots or other dynamic applications with parts that move, twist, and bend.

To craft larger structures like robots, the researchers introduced a modular assembly process. They mass produce smaller crease patterns and assemble them into ultralight and ultrastrong 3D structures. Smaller structures have fewer creases, which simplifies the manufacturing process.

Using the adapted Miura-ori pattern, the researchers create a crease pattern that will yield their desired shape and structural properties. Then they utilize a unique machine — a Zund cutting table — to score a flat, metal panel that they fold into the 3D shape.
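As a rough illustration of what a crease map like this might look like in code — not the researchers' actual tooling, and with cell sizes and offsets that are purely invented — a flat Miura-ori pattern can be generated as a grid of vertices in which every other column is shifted vertically, so the horizontal creases zigzag while the vertical creases stay straight:

```python
import numpy as np

def miura_vertices(rows, cols, a=1.0, b=1.0, offset=0.3):
    """Return an (rows, cols, 2) array of crease-pattern vertex coordinates.

    Parameters a, b (cell sizes) and offset (zigzag shift) are illustrative
    placeholders, not values from the paper.
    """
    verts = np.zeros((rows, cols, 2))
    for i in range(rows):
        for j in range(cols):
            # Odd-numbered columns are shifted up by `offset`, which makes
            # each row of horizontal creases a zigzag polyline.
            verts[i, j] = (j * a, i * b + (j % 2) * offset)
    return verts

def horizontal_creases(verts):
    """List the zigzag creases along each row, alternating mountain/valley."""
    rows, cols, _ = verts.shape
    return [
        (tuple(verts[i, j]), tuple(verts[i, j + 1]), "M" if i % 2 == 0 else "V")
        for i in range(rows)
        for j in range(cols - 1)
    ]
```

In a real workflow, a map like this would be what drives the cutting table's scoring paths; here it only conveys the idea that the fold geometry, and therefore the mechanical response, is fully specified by a small set of pattern parameters.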

“To make things like cars and airplanes, a huge investment goes into tooling. This manufacturing process is without tooling, like 3D printing. But unlike 3D printing, our process can set the limit for record material properties,” Gershenfeld says.

Using their method, they produced aluminum structures with a compression strength of more than 62 kilonewtons, but a weight of only 90 kilograms per square meter. (Cork weighs about 100 kilograms per square meter.) Their structures were so strong they could withstand three times as much force as a typical aluminum corrugation.


The versatile technique could be used for many materials, such as steel and composites, making it well-suited for producing lightweight, shock-absorbing components for airplanes, automobiles, or spacecraft.

However, the researchers found that their method can be difficult to model. So, in the future, they plan to develop user-friendly CAD design tools for these kirigami plate lattice structures. In addition, they want to explore methods to reduce the computational costs of simulating a design that yields desired properties. 

“Kirigami corrugations hold exciting potential for architectural construction,” says James Coleman MArch ’14, SM ’14, co-founder of the design for fabrication and installation firm SumPoint, and former vice president for innovation and R&D at Zahner, who was not involved with this work. “In my experience producing complex architectural projects, current methods for constructing large-scale curved and doubly curved elements are material intensive and wasteful, and thus deemed impractical for most projects. While the authors’ technology offers novel solutions to the aerospace and automotive industries, I believe their cell-based method can also significantly impact the built environment. The ability to fabricate various plate lattice geometries with specific properties could enable higher performing and more expressive buildings with less material. Goodbye heavy steel and concrete structures, hello lightweight lattices!”

Parra Rubio, Mundilova and other MIT graduate students also used this technique to create three large-scale, folded artworks from aluminum composite that are on display at the MIT Media Lab. Despite the fact that each artwork is several meters in length, the structures only took a few hours to fabricate.

“At the end of the day, the artistic piece is only possible because of the math and engineering contributions we are showing in our papers. But we don’t want to ignore the aesthetic power of our work,” Parra Rubio says.

This work was funded, in part, by the Center for Bits and Atoms Research Consortia, an AAUW International Fellowship, and a GWI Fay Weber Grant.

AI helps robots manipulate objects with their whole bodies

MIT researchers developed an AI technique that enables a robot to develop complex plans for manipulating an object using its entire hand, not just the fingertips. This model can generate effective plans in about a minute using a standard laptop. Here, a robot attempts to rotate a bucket 180 degrees. Image: Courtesy of the researchers

By Adam Zewe | MIT News

Imagine you want to carry a large, heavy box up a flight of stairs. You might spread your fingers out and lift that box with both hands, then hold it on top of your forearms and balance it against your chest, using your whole body to manipulate the box. 

Humans are generally good at whole-body manipulation, but robots struggle with such tasks. To the robot, each spot where the box could touch any point on the carrier’s fingers, arms, and torso represents a contact event that it must reason about. With billions of potential contact events, planning for this task quickly becomes intractable.

Now MIT researchers found a way to simplify this process, known as contact-rich manipulation planning. They use an AI technique called smoothing, which summarizes many contact events into a smaller number of decisions, to enable even a simple algorithm to quickly identify an effective manipulation plan for the robot.

While still in its early days, this method could potentially enable factories to use smaller, mobile robots that can manipulate objects with their entire arms or bodies, rather than large robotic arms that can only grasp using fingertips. This may help reduce energy consumption and drive down costs. In addition, this technique could be useful in robots sent on exploration missions to Mars or other solar system bodies, since they could adapt to the environment quickly using only an onboard computer.      

“Rather than thinking about this as a black-box system, if we can leverage the structure of these kinds of robotic systems using models, there is an opportunity to accelerate the whole procedure of trying to make these decisions and come up with contact-rich plans,” says H.J. Terry Suh, an electrical engineering and computer science (EECS) graduate student and co-lead author of a paper on this technique.

Joining Suh on the paper are co-lead author Tao Pang PhD ’23, a roboticist at Boston Dynamics AI Institute; Lujie Yang, an EECS graduate student; and senior author Russ Tedrake, the Toyota Professor of EECS, Aeronautics and Astronautics, and Mechanical Engineering, and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL). The research appears this week in IEEE Transactions on Robotics.

Learning about learning

Reinforcement learning is a machine-learning technique where an agent, like a robot, learns to complete a task through trial and error with a reward for getting closer to a goal. Researchers say this type of learning takes a black-box approach because the system must learn everything about the world through trial and error.

It has been used effectively for contact-rich manipulation planning, where the robot seeks to learn the best way to move an object in a specified manner.

In these figures, a simulated robot performs three contact-rich manipulation tasks: in-hand manipulation of a ball, picking up a plate, and manipulating a pen into a specific orientation. Image: Courtesy of the researchers

But because there may be billions of potential contact points that a robot must reason about when determining how to use its fingers, hands, arms, and body to interact with an object, this trial-and-error approach requires a great deal of computation.

“Reinforcement learning may need to go through millions of years in simulation time to actually be able to learn a policy,” Suh adds.

On the other hand, if researchers specifically design a physics-based model using their knowledge of the system and the task they want the robot to accomplish, that model incorporates structure about this world that makes it more efficient.

Yet physics-based approaches aren’t as effective as reinforcement learning when it comes to contact-rich manipulation planning — Suh and Pang wondered why.

They conducted a detailed analysis and found that a technique known as smoothing is what enables reinforcement learning to perform so well.

Many of the decisions a robot could make when determining how to manipulate an object aren’t important in the grand scheme of things. For instance, each infinitesimal adjustment of one finger, whether or not it results in contact with the object, doesn’t matter very much. Smoothing averages away many of those unimportant, intermediate decisions, leaving a few important ones.

Reinforcement learning performs smoothing implicitly by trying many contact points and then computing a weighted average of the results. Drawing on this insight, the MIT researchers designed a simple model that performs a similar type of smoothing, enabling it to focus on core robot-object interactions and predict long-term behavior. They showed that this approach could be just as effective as reinforcement learning at generating complex plans.
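The averaging idea can be sketched with a toy one-dimensional contact model (the model, the spring stiffness, and the noise level below are all invented for illustration, not taken from the paper):

```python
import numpy as np

def contact_force(x):
    # Toy nonsmooth contact: a stiff one-sided spring that pushes back only
    # when the finger penetrates the object surface (x > 0). Its gradient is
    # 0 on one side of the contact and 100 on the other -- no useful signal
    # exactly at the contact boundary.
    return 100.0 * np.maximum(0.0, x)

def smoothed_force(x, sigma=0.1, n_samples=10_000, seed=0):
    # Randomized smoothing: average the force over random perturbations of x.
    # This is a Monte Carlo estimate of E[f(x + w)] with w ~ N(0, sigma^2),
    # which varies smoothly in x even though f itself does not.
    w = np.random.default_rng(seed).normal(0.0, sigma, n_samples)
    return contact_force(x + w).mean()
```

Right at the contact (x = 0), the raw force gradient jumps from 0 to 100, but a finite difference through `smoothed_force` yields a gradient near 50 — the weighted average of the two regimes — which is the kind of informative signal that lets a simple planner reason through making-and-breaking contact.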

“If you know a bit more about your problem, you can design more efficient algorithms,” Pang says.

A winning combination

Even though smoothing greatly simplifies the decisions, searching through the remaining decisions can still be a difficult problem. So, the researchers combined their model with an algorithm that can rapidly and efficiently search through all possible decisions the robot could make.

With this combination, the computation time was cut down to about a minute on a standard laptop.

They first tested their approach in simulations where robotic hands were given tasks like moving a pen to a desired configuration, opening a door, or picking up a plate. In each instance, their model-based approach achieved the same performance as reinforcement learning, but in a fraction of the time. They saw similar results when they tested their model in hardware on real robotic arms.

“The same ideas that enable whole-body manipulation also work for planning with dexterous, human-like hands. Previously, most researchers said that reinforcement learning was the only approach that scaled to dexterous hands, but Terry and Tao showed that by taking this key idea of (randomized) smoothing from reinforcement learning, they can make more traditional planning methods work extremely well, too,” Tedrake says.

However, the model they developed relies on a simpler approximation of the real world, so it cannot handle very dynamic motions, such as objects falling. While effective for slower manipulation tasks, their approach cannot create a plan that would enable a robot to toss a can into a trash bin, for instance. In the future, the researchers plan to enhance their technique so it could tackle these highly dynamic motions.

“If you study your models carefully and really understand the problem you are trying to solve, there are definitely some gains you can achieve. There are benefits to doing things that are beyond the black box,” Suh says.

This work is funded, in part, by Amazon, MIT Lincoln Laboratory, the National Science Foundation, and the Ocado Group.

A faster way to teach a robot

Researchers from MIT and elsewhere have developed a technique that enables a human to efficiently fine-tune a robot that failed to complete a desired task — like picking up a unique mug — with very little effort on the part of the human. Image: Jose-Luis Olivares/MIT with images from iStock and The Coop

By Adam Zewe | MIT News Office

Imagine purchasing a robot to perform household tasks. This robot was built and trained in a factory on a certain set of tasks and has never seen the items in your home. When you ask it to pick up a mug from your kitchen table, it might not recognize your mug (perhaps because this mug is painted with an unusual image, say, of MIT’s mascot, Tim the Beaver). So, the robot fails.

“Right now, the way we train these robots, when they fail, we don’t really know why. So you would just throw up your hands and say, ‘OK, I guess we have to start over.’ A critical component that is missing from this system is enabling the robot to demonstrate why it is failing so the user can give it feedback,” says Andi Peng, an electrical engineering and computer science (EECS) graduate student at MIT.

Peng and her collaborators at MIT, New York University, and the University of California at Berkeley created a framework that enables humans to quickly teach a robot what they want it to do, with a minimal amount of effort.

When a robot fails, the system uses an algorithm to generate counterfactual explanations that describe what needed to change for the robot to succeed. For instance, maybe the robot would have been able to pick up the mug if the mug were a certain color. It shows these counterfactuals to the human and asks for feedback on why the robot failed. Then the system utilizes this feedback and the counterfactual explanations to generate new data it uses to fine-tune the robot.

Fine-tuning involves tweaking a machine-learning model that has already been trained to perform one task, so it can perform a second, similar task.

The researchers tested this technique in simulations and found that it could teach a robot more efficiently than other methods. The robots trained with this framework performed better, while the training process consumed less of a human’s time.

This framework could help robots learn faster in new environments without requiring a user to have technical knowledge. In the long run, this could be a step toward enabling general-purpose robots to efficiently perform daily tasks for the elderly or individuals with disabilities in a variety of settings.

Peng, the lead author, is joined by co-authors Aviv Netanyahu, an EECS graduate student; Mark Ho, an assistant professor at the Stevens Institute of Technology; Tianmin Shu, an MIT postdoc; Andreea Bobu, a graduate student at UC Berkeley; and senior authors Julie Shah, an MIT professor of aeronautics and astronautics and the director of the Interactive Robotics Group in the Computer Science and Artificial Intelligence Laboratory (CSAIL), and Pulkit Agrawal, a professor in CSAIL. The research will be presented at the International Conference on Machine Learning.

On-the-job training

Robots often fail due to distribution shift — the robot is presented with objects and spaces it did not see during training, and it doesn’t understand what to do in this new environment.

One way to retrain a robot for a specific task is imitation learning. The user could demonstrate the correct task to teach the robot what to do. If a user tries to teach a robot to pick up a mug, but demonstrates with a white mug, the robot could learn that all mugs are white. It may then fail to pick up a red, blue, or “Tim-the-Beaver-brown” mug.

Training a robot to recognize that a mug is a mug, regardless of its color, could take thousands of demonstrations.

“I don’t want to have to demonstrate with 30,000 mugs. I want to demonstrate with just one mug. But then I need to teach the robot so it recognizes that it can pick up a mug of any color,” Peng says.

To accomplish this, the researchers’ system determines what specific object the user cares about (a mug) and what elements aren’t important for the task (perhaps the color of the mug doesn’t matter). It uses this information to generate new, synthetic data by changing these “unimportant” visual concepts. This process is known as data augmentation.

The framework has three steps. First, it shows the task that caused the robot to fail. Then it collects a demonstration from the user of the desired actions and generates counterfactuals by searching over all features in the space that show what needed to change for the robot to succeed.

The system shows these counterfactuals to the user and asks for feedback to determine which visual concepts do not impact the desired action. Then it uses this human feedback to generate many new augmented demonstrations.

In this way, the user could demonstrate picking up one mug, but the system would produce demonstrations showing the desired action with thousands of different mugs by altering the color. It uses these data to fine-tune the robot.
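A minimal sketch of that augmentation step might look like the following; the feature names, values, and grasp coordinates are invented stand-ins, not the researchers' actual data format:

```python
import random

# A single user demonstration, with the features the robot observed.
demo = {"object": "mug", "color": "white", "grasp": (0.42, 0.17, 0.08)}

# Feedback distilled from the counterfactual step: color does not matter.
irrelevant = {"color": ["white", "red", "blue", "green", "brown", "black"]}

def augment(demo, irrelevant, n=1000, seed=0):
    """Generate n synthetic demonstrations by resampling only the features
    the user marked as irrelevant; the demonstrated action is kept intact."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        synthetic = dict(demo)
        for feature, values in irrelevant.items():
            synthetic[feature] = rng.choice(values)
        out.append(synthetic)
    return out
```

The point of the sketch is the division of labor: the human supplies one demonstration plus a judgment about which features are incidental, and the system multiplies that into a large fine-tuning set without any further demonstrations.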

Creating counterfactual explanations and soliciting feedback from the user are critical for the technique to succeed, Peng says.

From human reasoning to robot reasoning

Because their work seeks to put the human in the training loop, the researchers tested their technique with human users. They first conducted a study in which they asked people if counterfactual explanations helped them identify elements that could be changed without affecting the task.

“It was so clear right off the bat. Humans are so good at this type of counterfactual reasoning. And this counterfactual step is what allows human reasoning to be translated into robot reasoning in a way that makes sense,” she says.

Then they applied their framework to three simulations where robots were tasked with: navigating to a goal object, picking up a key and unlocking a door, and picking up a desired object then placing it on a tabletop. In each instance, their method enabled the robot to learn faster than with other techniques, while requiring fewer demonstrations from users.

Moving forward, the researchers hope to test this framework on real robots. They also want to focus on reducing the time it takes the system to create new data using generative machine-learning models.

“We want robots to do what humans do, and we want them to do it in a semantically meaningful way. Humans tend to operate in this abstract space, where they don’t think about every single property in an image. At the end of the day, this is really about enabling a robot to learn a good, human-like representation at an abstract level,” Peng says.

This research is supported, in part, by a National Science Foundation Graduate Research Fellowship, Open Philanthropy, an Apple AI/ML Fellowship, Hyundai Motor Corporation, the MIT-IBM Watson AI Lab, and the National Science Foundation Institute for Artificial Intelligence and Fundamental Interactions.

Magnetic robots walk, crawl, and swim

MIT professor of materials science and engineering and brain and cognitive sciences Polina Anikeeva in her lab. Photo: Steph Stevens

By Jennifer Michalowski | McGovern Institute for Brain Research

MIT scientists have developed tiny, soft-bodied robots that can be controlled with a weak magnet. The robots, formed from rubbery magnetic spirals, can be programmed to walk, crawl, and swim — all in response to a simple, easy-to-apply magnetic field.

“This is the first time this has been done, to be able to control three-dimensional locomotion of robots with a one-dimensional magnetic field,” says Professor Polina Anikeeva, whose team published an open-access paper on the magnetic robots in the journal Advanced Materials. “And because they are predominantly composed of polymer and polymers are soft, you don’t need a very large magnetic field to activate them. It’s actually a really tiny magnetic field that drives these robots,” adds Anikeeva, who is a professor of materials science and engineering and brain and cognitive sciences at MIT, a McGovern Institute for Brain Research associate investigator, as well as the associate director of MIT’s Research Laboratory of Electronics and director of MIT’s K. Lisa Yang Brain-Body Center.

The new robots are well suited to transport cargo through confined spaces and their rubber bodies are gentle on fragile environments, opening the possibility that the technology could be developed for biomedical applications. Anikeeva and her team have made their robots millimeters long, but she says the same approach could be used to produce much smaller robots.

Magnetically actuated fiber-based soft robots

Engineering magnetic robots

Anikeeva says that until now, magnetic robots have moved in response to moving magnetic fields. She explains that for these models, “if you want your robot to walk, your magnet walks with it. If you want it to rotate, you rotate your magnet.” That limits the settings in which such robots might be deployed. “If you are trying to operate in a really constrained environment, a moving magnet may not be the safest solution. You want to be able to have a stationary instrument that just applies magnetic field to the whole sample,” she explains.

Youngbin Lee PhD ’22, a former graduate student in Anikeeva’s lab, engineered a solution to this problem. The robots he developed in Anikeeva’s lab are not uniformly magnetized. Instead, they are strategically magnetized in different zones and directions so a single magnetic field can enable a movement-driving profile of magnetic forces.

Before they are magnetized, however, the flexible, lightweight bodies of the robots must be fabricated. Lee starts this process with two kinds of rubber, each with a different stiffness. These are sandwiched together, then heated and stretched into a long, thin fiber. Because of the two materials’ different properties, one of the rubbers retains its elasticity through this stretching process, but the other deforms and cannot return to its original size. So when the strain is released, one layer of the fiber contracts, tugging on the other side and pulling the whole thing into a tight coil. Anikeeva says the helical fiber is modeled after the twisty tendrils of a cucumber plant, which spiral when one layer of cells loses water and contracts faster than a second layer.

A third material — one whose particles have the potential to become magnetic — is incorporated in a channel that runs through the rubbery fiber. So once the spiral has been made, a magnetization pattern that enables a particular type of movement can be introduced.

“Youngbin thought very carefully about how to magnetize our robots to make them able to move just as he programmed them to move,” Anikeeva says. “He made calculations to determine how to establish such a profile of forces on it when we apply a magnetic field that it will actually start walking or crawling.”

To form a caterpillar-like crawling robot, for example, the helical fiber is shaped into gentle undulations, and then the body, head, and tail are magnetized so that a magnetic field applied perpendicular to the robot’s plane of motion will cause the body to compress. When the field is reduced to zero, the compression is released, and the crawling robot stretches. Together, these movements propel the robot forward. Another robot in which two foot-like helical fibers are connected with a joint is magnetized in a pattern that enables a movement more like walking.
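A toy kinematic sketch of that compress-and-stretch gait is below. The anchoring assumptions — head fixed while the field compresses the body, tail fixed while it re-extends — are mine, added to make the model move; the article does not describe the friction mechanics, and the units are arbitrary:

```python
def crawl(n_cycles, body_length=10.0, stroke=2.0):
    """Toy kinematics of the crawling gait.

    Assumed (not from the article): during compression the head anchors and
    the tail slides forward; during extension the tail anchors and the head
    slides forward. Each field on/off cycle then advances the robot by one
    stroke length.
    """
    head, tail = body_length, 0.0
    for _ in range(n_cycles):
        tail = head - (body_length - stroke)  # field on: body compresses
        head = tail + body_length             # field off: body stretches
    return head, tail
```

Under these assumptions, five field cycles carry a 10-unit body forward by 10 units — the net displacement is simply the stroke length times the number of cycles, which is why a periodic, spatially uniform field suffices to drive locomotion.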

Biomedical potential

This precise magnetization process generates a program for each robot and ensures that once the robots are made, they are simple to control. A weak magnetic field activates each robot’s program and drives its particular type of movement. A single magnetic field can even send multiple robots moving in opposite directions, if they have been programmed to do so. The team found that one minor manipulation of the magnetic field has a useful effect: With the flip of a switch to reverse the field, a cargo-carrying robot can be made to gently shake and release its payload.

Anikeeva says she can imagine these soft-bodied robots — whose straightforward production will be easy to scale up — delivering materials through narrow pipes, or even inside the human body. For example, they might carry a drug through narrow blood vessels, releasing it exactly where it is needed. She says the magnetically actuated devices have biomedical potential beyond robots as well, and might one day be incorporated into artificial muscles or materials that support tissue regeneration.

A step toward safe and reliable autopilots for flying

MIT researchers developed a machine-learning technique that can autonomously drive a car or fly a plane through a very difficult “stabilize-avoid” scenario, in which the vehicle must stabilize its trajectory to arrive at and stay within some goal region, while avoiding obstacles. Image: Courtesy of the researchers

By Adam Zewe | MIT News Office

In the film “Top Gun: Maverick,” Maverick, played by Tom Cruise, is charged with training young pilots to complete a seemingly impossible mission — to fly their jets deep into a rocky canyon, staying so low to the ground they cannot be detected by radar, then rapidly climb out of the canyon at an extreme angle, avoiding the rock walls. Spoiler alert: With Maverick’s help, these human pilots accomplish their mission.

A machine, on the other hand, would struggle to complete the same pulse-pounding task. To an autonomous aircraft, for instance, the most straightforward path toward the target conflicts with what the machine needs to do to avoid colliding with the canyon walls while staying undetected. Many existing AI methods aren’t able to overcome this conflict, known as the stabilize-avoid problem, and would be unable to reach their goal safely.

MIT researchers have developed a new technique that can solve complex stabilize-avoid problems better than other methods. Their machine-learning approach matches or exceeds the safety of existing methods while providing a tenfold increase in stability, meaning the agent reaches and remains stable within its goal region.

In an experiment that would make Maverick proud, their technique effectively piloted a simulated jet aircraft through a narrow corridor without crashing into the ground. 

“This has been a longstanding, challenging problem. A lot of people have looked at it but didn’t know how to handle such high-dimensional and complex dynamics,” says Chuchu Fan, the Wilson Assistant Professor of Aeronautics and Astronautics, a member of the Laboratory for Information and Decision Systems (LIDS), and senior author of a new paper on this technique.

Fan is joined by lead author Oswin So, a graduate student. The paper will be presented at the Robotics: Science and Systems conference.

The stabilize-avoid challenge

Many approaches tackle complex stabilize-avoid problems by simplifying the system so they can solve it with straightforward math, but the simplified results often don’t hold up to real-world dynamics.

More effective techniques use reinforcement learning, a machine-learning method where an agent learns by trial-and-error with a reward for behavior that gets it closer to a goal. But there are really two goals here — remain stable and avoid obstacles — and finding the right balance is tedious.

The MIT researchers broke the problem down into two steps. First, they reframe the stabilize-avoid problem as a constrained optimization problem. In this setup, solving the optimization enables the agent to reach and stabilize to its goal, meaning it stays within a certain region. By applying constraints, they ensure the agent avoids obstacles, So explains. 

Then for the second step, they reformulate that constrained optimization problem into a mathematical representation known as the epigraph form and solve it using a deep reinforcement learning algorithm. The epigraph form lets them bypass the difficulties other methods face when using reinforcement learning. 

“But deep reinforcement learning isn’t designed to solve the epigraph form of an optimization problem, so we couldn’t just plug it into our problem. We had to derive the mathematical expressions that work for our system. Once we had those new derivations, we combined them with some existing engineering tricks used by other methods,” So says.
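The epigraph trick itself is generic and can be shown on a small numeric example — this is not the researchers' formulation (which involves system dynamics and deep reinforcement learning), and the cost f and constraint g below are invented stand-ins for "stabilize to the goal" and "avoid the obstacle":

```python
import numpy as np
from scipy.optimize import minimize

# Invented stand-ins: f penalizes distance from a goal state at (2, -1);
# g(x) <= 0 keeps the agent out of the region x0 < 1 (the "obstacle").
f = lambda x: (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2
g = lambda x: 1.0 - x[0]

# Original constrained problem: minimize f(x) subject to g(x) <= 0.
# (SciPy's "ineq" constraints require fun(x) >= 0, hence the sign flip.)
direct = minimize(f, x0=[0.0, 0.0], method="SLSQP",
                  constraints=[{"type": "ineq", "fun": lambda x: -g(x)}])

# Epigraph form: add a scalar t and minimize t subject to f(x) <= t and
# g(x) <= 0. The decision variable is z = (x0, x1, t); the objective is now
# linear, with all the nonlinearity pushed into the constraints.
epi = minimize(lambda z: z[2], x0=[0.0, 0.0, 5.0], method="SLSQP",
               constraints=[{"type": "ineq", "fun": lambda z: z[2] - f(z[:2])},
                            {"type": "ineq", "fun": lambda z: -g(z[:2])}])
```

Both solvers land on the same point, x ≈ (2, −1), and the optimal t in the epigraph problem equals the optimal cost of the original problem — the two formulations are equivalent, which is what licenses swapping one for the other before handing the problem to a learning algorithm.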

No points for second place

To test their approach, they designed a number of control experiments with different initial conditions. For instance, in some simulations, the autonomous agent needs to reach and stay inside a goal region while making drastic maneuvers to avoid obstacles that are on a collision course with it.

This video shows how the researchers used their technique to effectively fly a simulated jet aircraft in a scenario where it had to stabilize to a target near the ground while maintaining a very low altitude and staying within a narrow flight corridor. Courtesy of the researchers.

When compared with several baselines, their approach was the only one that could stabilize all trajectories while maintaining safety. To push their method even further, they used it to fly a simulated jet aircraft in a scenario one might see in a “Top Gun” movie. The jet had to stabilize to a target near the ground while maintaining a very low altitude and staying within a narrow flight corridor.

This simulated jet model was open-sourced in 2018 and had been designed by flight control experts as a testing challenge. Could researchers create a scenario that their controller could not fly? But the model was so complicated it was difficult to work with, and it still couldn’t handle complex scenarios, Fan says.

The MIT researchers’ controller was able to prevent the jet from crashing or stalling while stabilizing to the goal far better than any of the baselines.

In the future, this technique could be a starting point for designing controllers for highly dynamic robots that must meet safety and stability requirements, like autonomous delivery drones. Or it could be implemented as part of a larger system. Perhaps the algorithm is only activated when a car skids on a snowy road, to help the driver safely navigate back to a stable trajectory.

Navigating extreme scenarios that a human wouldn’t be able to handle is where their approach really shines, So adds.

“We believe that a goal we should strive for as a field is to give reinforcement learning the safety and stability guarantees that we will need to provide us with assurance when we deploy these controllers on mission-critical systems. We think this is a promising first step toward achieving that goal,” he says.

Moving forward, the researchers want to enhance their technique so it is better able to take uncertainty into account when solving the optimization. They also want to investigate how well the algorithm works when deployed on hardware, since there will be mismatches between the dynamics of the model and those in the real world.

“Professor Fan’s team has improved reinforcement learning performance for dynamical systems where safety matters. Instead of just hitting a goal, they create controllers that ensure the system can reach its target safely and stay there indefinitely,” says Stanley Bak, an assistant professor in the Department of Computer Science at Stony Brook University, who was not involved with this research. “Their improved formulation allows the successful generation of safe controllers for complex scenarios, including a 17-state nonlinear jet aircraft model designed in part by researchers from the Air Force Research Lab (AFRL), which incorporates nonlinear differential equations with lift and drag tables.”

The work is funded, in part, by MIT Lincoln Laboratory under the Safety in Aerobatic Flight Regimes program.

Helping robots handle fluids

Researchers created “FluidLab,” a simulation environment with a diverse set of manipulation tasks involving complex fluid dynamics. Image: Alex Shipps/MIT CSAIL via Midjourney

Imagine you’re enjoying a picnic by a riverbank on a windy day. A gust of wind accidentally catches your paper napkin and lands on the water’s surface, quickly drifting away from you. You grab a nearby stick and carefully agitate the water to retrieve it, creating a series of small waves. These waves eventually push the napkin back toward the shore, so you grab it. In this scenario, the water acts as a medium for transmitting forces, enabling you to manipulate the position of the napkin without direct contact.

Humans regularly engage with various types of fluids in their daily lives, but doing so has been a formidable and elusive goal for current robotic systems. Hand you a latte? A robot can do that. Make it? That’s going to require a bit more nuance. 

FluidLab, a new simulation tool from researchers at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), enhances robot learning for complex fluid manipulation tasks like making latte art, ice cream, and even manipulating air. The virtual environment offers a versatile collection of intricate fluid handling challenges, involving both solids and liquids, and multiple fluids simultaneously. FluidLab supports modeling solid, liquid, and gas, including elastic, plastic, rigid objects, Newtonian and non-Newtonian liquids, and smoke and air. 

At the heart of FluidLab lies FluidEngine, an easy-to-use physics simulator capable of seamlessly calculating and simulating various materials and their interactions, all while harnessing the power of graphics processing units (GPUs) for faster processing. The engine is “differentiable,” meaning the simulator can incorporate physics knowledge for a more realistic physical world model, leading to more efficient learning and planning for robotic tasks. In contrast, most existing reinforcement learning methods lack such a world model and depend on trial and error alone. This enhanced capability, say the researchers, lets users experiment with robot learning algorithms and toy with the boundaries of current robotic manipulation abilities.

To set the stage, the researchers tested a set of robot learning algorithms using FluidLab, discovering and overcoming unique challenges in fluid systems. By developing clever optimization methods, they’ve been able to transfer these learnings from simulations to real-world scenarios effectively.

“Imagine a future where a household robot effortlessly assists you with daily tasks, like making coffee, preparing breakfast, or cooking dinner. These tasks involve numerous fluid manipulation challenges. Our benchmark is a first step towards enabling robots to master these skills, benefiting households and workplaces alike,” says visiting researcher at MIT CSAIL and research scientist at the MIT-IBM Watson AI Lab Chuang Gan, the senior author on a new paper about the research. “For instance, these robots could reduce wait times and enhance customer experiences in busy coffee shops. FluidEngine is, to our knowledge, the first-of-its-kind physics engine that supports a wide range of materials and couplings while being fully differentiable. With our standardized fluid manipulation tasks, researchers can evaluate robot learning algorithms and push the boundaries of today’s robotic manipulation capabilities.”

Fluid fantasia

Over the past few decades, scientists in the robotic manipulation domain have mainly focused on manipulating rigid objects, or on very simplistic fluid manipulation tasks like pouring water. Studying these manipulation tasks involving fluids in the real world can also be an unsafe and costly endeavor. 

With fluid manipulation, it’s not always just about fluids, though. In many tasks, such as creating the perfect ice cream swirl, mixing solids into liquids, or paddling through the water to move objects, it’s a dance of interactions between fluids and various other materials. Simulation environments must support “coupling,” or how two different material properties interact. Fluid manipulation tasks usually require pretty fine-grained precision, with delicate interactions and handling of materials, setting them apart from straightforward tasks like pushing a block or opening a bottle. 

FluidLab’s simulator can quickly calculate how different materials interact with each other. 

Helping out the GPUs is “Taichi,” a domain-specific language embedded in Python. The system can compute gradients (rates of change in environment configurations with respect to the robot’s actions) for different material types and their interactions (couplings) with one another. This precise information can be used to fine-tune the robot’s movements for better performance. As a result, the simulator allows for faster and more efficient solutions, setting it apart from its counterparts.
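
As a toy illustration of why such gradients matter (this is a stand-in sketch written for this article, not FluidLab or Taichi code), consider tuning a single action parameter by differentiating through a rolled-out simulation:

```python
# Toy "differentiable simulator": a particle moves with constant velocity v
# for a fixed number of steps; we tune v by gradient descent so the final
# position reaches a target. The hand-derived analytic gradient below stands
# in for the automatic differentiation a GPU engine like FluidEngine provides.

def simulate(v, steps=100, dt=0.01):
    x = 0.0
    for _ in range(steps):
        x += v * dt               # forward dynamics
    return x                      # final position (here, x = v * steps * dt)

def loss_and_grad(v, target, steps=100, dt=0.01):
    x = simulate(v, steps, dt)
    loss = (x - target) ** 2
    grad = 2.0 * (x - target) * steps * dt   # d(loss)/dv via the chain rule
    return loss, grad

v, target = 0.0, 3.0
for _ in range(50):
    _, grad = loss_and_grad(v, target)
    v -= 0.1 * grad               # gradient step *through* the simulation

print(round(simulate(v), 3))      # converges to the target position, 3.0
```

A trial-and-error learner would have to sample many velocities to find one that works; the gradient points directly toward the solution, which is the efficiency gain the FluidLab team exploits at far larger scale.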

The 10 tasks the team put forth fell into two categories: using fluids to manipulate hard-to-reach objects, and directly manipulating fluids for specific goals. Examples included separating liquids, guiding floating objects, transporting items with water jets, mixing liquids, creating latte art, shaping ice cream, and controlling air circulation. 

“The simulator works similarly to how humans use their mental models to predict the consequences of their actions and make informed decisions when manipulating fluids. This is a significant advantage of our simulator compared to others,” says Carnegie Mellon University PhD student Zhou Xian, another author on the paper. “While other simulators primarily support reinforcement learning, ours supports reinforcement learning and allows for more efficient optimization techniques. Utilizing the gradients provided by the simulator supports highly efficient policy search, making it a more versatile and effective tool.”

Next steps

FluidLab’s future looks bright. The current work attempted to transfer trajectories optimized in simulation to real-world tasks directly in an open-loop manner. For next steps, the team is working to develop a closed-loop policy in simulation that takes as input the state or the visual observations of the environments and performs fluid manipulation tasks in real time, and then to transfer the learned policies to real-world scenes.

The platform is publicly available, and researchers hope it will benefit future studies in developing better methods for solving complex fluid manipulation tasks.

“Humans interact with fluids in everyday tasks, including pouring and mixing liquids (coffee, yogurts, soups, batter), washing and cleaning with water, and more,” says University of Maryland computer science professor Ming Lin, who was not involved in the work. “For robots to assist humans and serve in similar capacities for day-to-day tasks, novel techniques for interacting and handling various liquids of different properties (e.g. viscosity and density of materials) would be needed and remains a major computational challenge for real-time autonomous systems. This work introduces the first comprehensive physics engine, FluidLab, to enable modeling of diverse, complex fluids and their coupling with other objects and dynamical systems in the environment. The mathematical formulation of ‘differentiable fluids’ as presented in the paper makes it possible for integrating versatile fluid simulation as a network layer in learning-based algorithms and neural network architectures for intelligent systems to operate in real-world applications.”

Gan and Xian wrote the paper alongside Hsiao-Yu Tung, a postdoc in the MIT Department of Brain and Cognitive Sciences; Antonio Torralba, an MIT professor of electrical engineering and computer science and CSAIL principal investigator; Dartmouth College Assistant Professor Bo Zhu; Columbia University PhD student Zhenjia Xu; and CMU Assistant Professor Katerina Fragkiadaki. The team’s research is supported by the MIT-IBM Watson AI Lab, Sony AI, a DARPA Young Investigator Award, an NSF CAREER award, an AFOSR Young Investigator Award, DARPA Machine Common Sense, and the National Science Foundation.

The research was presented at the International Conference on Learning Representations earlier this month.

Miniscule device could help preserve the battery life of tiny sensors

Researchers from MIT and elsewhere have built a wake-up receiver that communicates using terahertz waves, which enabled them to produce a chip more than 10 times smaller than similar devices. Their receiver, which also includes authentication to protect it from a certain type of attack, could help preserve the battery life of tiny sensors or robots. Image: Jose-Luis Olivares/MIT with figure courtesy of the researchers

By Adam Zewe | MIT News Office

Scientists are striving to develop ever-smaller internet-of-things devices, like sensors tinier than a fingertip that could make nearly any object trackable. These diminutive sensors have miniscule batteries which are often nearly impossible to replace, so engineers incorporate wake-up receivers that keep devices in low-power “sleep” mode when not in use, preserving battery life.

Researchers at MIT have developed a new wake-up receiver that is less than one-tenth the size of previous devices and consumes only a few microwatts of power. Their receiver also incorporates a low-power, built-in authentication system, which protects the device from a certain type of attack that could quickly drain its battery.

Many common types of wake-up receivers are built on the centimeter scale since their antennas must be proportional to the size of the radio waves they use to communicate. Instead, the MIT team built a receiver that utilizes terahertz waves, which are about one-tenth the length of radio waves. Their chip is barely more than 1 square millimeter in size. 

They used their wake-up receiver to demonstrate effective, wireless communication with a signal source that was several meters away, showcasing a range that would enable their chip to be used in miniaturized sensors.

For instance, the wake-up receiver could be incorporated into microrobots that monitor environmental changes in areas that are either too small or hazardous for other robots to reach. Also, since the device uses terahertz waves, it could be utilized in emerging applications, such as field-deployable radio networks that work as swarms to collect localized data.

“By using terahertz frequencies, we can make an antenna that is only a few hundred micrometers on each side, which is a very small size. This means we can integrate these antennas to the chip, creating a fully integrated solution. Ultimately, this enabled us to build a very small wake-up receiver that could be attached to tiny sensors or radios,” says Eunseok Lee, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on the wake-up receiver.

Lee wrote the paper with his co-advisors and senior authors Anantha Chandrakasan, dean of the MIT School of Engineering and the Vannevar Bush Professor of Electrical Engineering and Computer Science, who leads the Energy-Efficient Circuits and Systems Group, and Ruonan Han, an associate professor in EECS, who leads the Terahertz Integrated Electronics Group in the Research Laboratory of Electronics; as well as others at MIT, the Indian Institute of Science, and Boston University. The research is being presented at the IEEE Custom Integrated Circuits Conference. 

Scaling down the receiver

Terahertz waves, found on the electromagnetic spectrum between microwaves and infrared light, have much higher frequencies, and thus much shorter wavelengths, than radio waves. Sometimes called “pencil beams,” terahertz waves travel in a more direct path than other signals, which makes them more secure, Lee explains.

However, the waves have such high frequencies that terahertz receivers often must multiply the terahertz signal by another signal to shift it to a lower frequency, a process known as frequency mixing. Terahertz mixing consumes a great deal of power.

Instead, Lee and his collaborators developed a zero-power-consumption detector that can detect terahertz waves without the need for frequency mixing. The detector uses a pair of tiny transistors as antennas, which consume very little power.

Even with both antennas on the chip, their wake-up receiver was only 1.54 square millimeters in size and consumed less than 3 microwatts of power. This dual-antenna setup maximizes performance and makes it easier to read signals.

Once received, their chip amplifies a terahertz signal and then converts analog data into a digital signal for processing. This digital signal carries a token, which is a string of bits (0s and 1s). If the token corresponds to the wake-up receiver’s token, it will activate the device.

Ramping up security

In most wake-up receivers, the same token is reused multiple times, so an eavesdropping attacker could figure out what it is. Then the hacker could send a signal that would activate the device over and over again, using what is called a denial-of-sleep attack.

“With a wake-up receiver, the lifetime of a device could be improved from one day to one month, for instance, but an attacker could use a denial-of-sleep attack to drain that entire battery life in even less than a day. That is why we put authentication into our wake-up receiver,” he explains.

They added an authentication block that utilizes an algorithm to randomize the device’s token each time, using a key that is shared with trusted senders. This key acts like a password — if a sender knows the password, they can send a signal with the right token. The researchers do this using a technique known as lightweight cryptography, which ensures the entire authentication process only consumes a few extra nanowatts of power. 
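
The article doesn't name the lightweight-cryptography primitive on the chip, but the general pattern it describes, a shared key plus a synchronized counter deriving a fresh token per wake-up, can be sketched with a standard MAC. The HMAC-SHA256 choice, the key, and the 32-bit token width below are our illustrative assumptions, not the chip's actual design:

```python
import hashlib
import hmac

# Rolling-token sketch: sender and receiver hold the same secret key and a
# synchronized message counter, and each derives the wake-up token freshly
# per message. (HMAC-SHA256 and the token width are illustrative choices.)

KEY = b"shared-secret"   # hypothetical key shared with trusted senders

def wake_token(key: bytes, counter: int) -> int:
    """Derive the wake-up token for a given message counter."""
    mac = hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(mac[:4], "big")   # truncate to a short token

# Holding the same key and counter, sender and receiver compute the same
# token, so a legitimate wake-up succeeds...
assert wake_token(KEY, counter=7) == wake_token(KEY, counter=7)

# ...but the token changes on every message, so a replayed (eavesdropped)
# token from an earlier wake-up no longer matches: no denial-of-sleep replay.
print(wake_token(KEY, counter=1) != wake_token(KEY, counter=2))
```

Because only the token derivation changes and the comparison stays a simple equality check, this style of scheme fits the few-extra-nanowatts budget the researchers describe.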

They tested their device by sending terahertz signals to the wake-up receiver as they increased the distance between the chip and the terahertz source. In this way, they tested the sensitivity of their receiver — the minimum signal power needed for the device to successfully detect a signal. Signals that travel farther have less power.

“We achieved 5- to 10-meter longer distance demonstrations than others, using a device with a very small size and microwatt level power consumption,” Lee says.

But to be most effective, terahertz waves need to hit the detector dead-on. If the chip is at an angle, some of the signal will be lost. So, the researchers paired their device with a terahertz beam-steerable array, recently developed by the Han group, to precisely direct the terahertz waves. Using this technique, communication could be sent to multiple chips with minimal signal loss.

In the future, Lee and his collaborators want to tackle this problem of signal degradation. If they can find a way to maintain signal strength when receiver chips move or tilt slightly, they could increase the performance of these devices. They also want to demonstrate their wake-up receiver in very small sensors and fine-tune the technology for use in real-world devices.

“We have developed a rich technology portfolio for future millimeter-sized sensing, tagging, and authentication platforms, including terahertz backscattering, energy harvesting, and electrical beam steering and focusing. Now, this portfolio is more complete with Eunseok’s first-ever terahertz wake-up receiver, which is critical to save the extremely limited energy available on those mini platforms,” Han says.

Additional co-authors include Muhammad Ibrahim Wasiq Khan PhD ’22; Xibi Chen, an EECS graduate student; Utsav Banerjee PhD ’21, an assistant professor at the Indian Institute of Science; Nathan Monroe PhD ’22; and Rabia Tugce Yazicigil, an assistant professor of electrical and computer engineering at Boston University.

Drones navigate unseen environments with liquid neural networks

Makram Chahine, a PhD student in electrical engineering and computer science and an MIT CSAIL affiliate, leads a drone used to test liquid neural networks. Photo: Mike Grimmett/MIT CSAIL

By Rachel Gordon | MIT CSAIL

In the vast, expansive skies where birds once ruled supreme, a new crop of aviators is taking flight. These pioneers of the air are not living creatures, but rather a product of deliberate innovation: drones. But these aren’t your typical flying bots, humming around like mechanical bees. Rather, they’re avian-inspired marvels that soar through the sky, guided by liquid neural networks to navigate ever-changing and unseen environments with precision and ease.

Inspired by the adaptable nature of organic brains, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have introduced a method for robust flight navigation agents to master vision-based fly-to-target tasks in intricate, unfamiliar environments. The liquid neural networks, which can continuously adapt to new data inputs, showed prowess in making reliable decisions in unknown domains like forests, urban landscapes, and environments with added noise, rotation, and occlusion. These adaptable models, which outperformed many state-of-the-art counterparts in navigation tasks, could enable potential real-world drone applications like search and rescue, delivery, and wildlife monitoring.

The researchers’ recent study, published in Science Robotics, details how this new breed of agents can adapt to significant distribution shifts, a long-standing challenge in the field. The team’s new class of machine-learning algorithms, however, captures the causal structure of tasks from high-dimensional, unstructured data, such as pixel inputs from a drone-mounted camera. These networks can then extract crucial aspects of a task (i.e., understand the task at hand) and ignore irrelevant features, allowing acquired navigation skills to transfer seamlessly to new environments.

“We are thrilled by the immense potential of our learning-based control approach for robots, as it lays the groundwork for solving problems that arise when training in one environment and deploying in a completely distinct environment without additional training,” says Daniela Rus, CSAIL director and the Andrew (1956) and Erna Viterbi Professor of Electrical Engineering and Computer Science at MIT. “Our experiments demonstrate that we can effectively teach a drone to locate an object in a forest during summer, and then deploy the model in winter, with vastly different surroundings, or even in urban settings, with varied tasks such as seeking and following. This adaptability is made possible by the causal underpinnings of our solutions. These flexible algorithms could one day aid in decision-making based on data streams that change over time, such as medical diagnosis and autonomous driving applications.”

A daunting challenge was at the forefront: Do machine-learning systems understand the task they are given from data when flying drones to an unlabeled object? And, would they be able to transfer their learned skill and task to new environments with drastic changes in scenery, such as flying from a forest to an urban landscape? What’s more, unlike the remarkable abilities of our biological brains, deep learning systems struggle with capturing causality, frequently over-fitting their training data and failing to adapt to new environments or changing conditions. This is especially troubling for resource-limited embedded systems, like aerial drones, that need to traverse varied environments and respond to obstacles instantaneously. 

The liquid networks, in contrast, offer promising preliminary indications of their capacity to address this crucial weakness in deep learning systems. The team’s system was first trained on data collected by a human pilot, to see how the networks transferred learned navigation skills to new environments under drastic changes in scenery and conditions. Unlike traditional neural networks, which only learn during the training phase, the liquid neural net’s parameters can change over time, making the networks not only interpretable, but also more resilient to unexpected or noisy data.
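
One way to picture this adaptability is through the liquid time-constant dynamics introduced by Hasani and colleagues, in which an input-dependent gate changes how quickly the neuron's state evolves. Below is a heavily simplified, single-neuron sketch; the weights are illustrative values, not trained parameters from the study:

```python
import math

# Single-neuron sketch of liquid time-constant (LTC) dynamics in the style
# of Hasani et al.: an input-dependent gate f changes the neuron's effective
# time constant, so its behavior keeps adapting to the incoming stream.
# All weights here are illustrative, not trained values.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def ltc_step(x, I, dt=0.01, tau=1.0, w=2.0, b=-1.0, A=1.0):
    """One Euler step of dx/dt = -(1/tau + f(I)) * x + f(I) * A."""
    f = sigmoid(w * I + b)        # input-dependent gate
    return x + dt * (-(1.0 / tau + f) * x + f * A)

x = 0.0
for _ in range(500):              # under constant input, the state settles
    x = ltc_step(x, I=1.0)        # at the fixed point f*A / (1/tau + f)

print(round(x, 3))                # ≈ 0.422 for these illustrative weights
```

Because the input appears inside the rate term, not just the drive term, a shift in the input distribution changes the dynamics themselves, which is the intuition behind the networks' resilience to scenery changes.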

In a series of quadrotor closed-loop control experiments, the drones underwent range tests, stress tests, target rotation and occlusion, hiking with adversaries, triangular loops between objects, and dynamic target tracking. They tracked moving targets, and executed multi-step loops between objects in never-before-seen environments, surpassing performance of other cutting-edge counterparts. 

The team believes that the ability to learn from limited expert data and understand a given task while generalizing to new environments could make autonomous drone deployment more efficient, cost-effective, and reliable. Liquid neural networks, they noted, could enable autonomous air mobility drones to be used for environmental monitoring, package delivery, autonomous vehicles, and robotic assistants. 

“The experimental setup presented in our work tests the reasoning capabilities of various deep learning systems in controlled and straightforward scenarios,” says MIT CSAIL Research Affiliate Ramin Hasani. “There is still so much room left for future research and development on more complex reasoning challenges for AI systems in autonomous navigation applications, which has to be tested before we can safely deploy them in our society.”

“Robust learning and performance in out-of-distribution tasks and scenarios are some of the key problems that machine learning and autonomous robotic systems have to conquer to make further inroads in society-critical applications,” says Alessio Lomuscio, professor of AI safety in the Department of Computing at Imperial College London. “In this context, the performance of liquid neural networks, a novel brain-inspired paradigm developed by the authors at MIT, reported in this study is remarkable. If these results are confirmed in other experiments, the paradigm here developed will contribute to making AI and robotic systems more reliable, robust, and efficient.”

Clearly, the sky is no longer the limit, but rather a vast playground for the boundless possibilities of these airborne marvels. 

Hasani and PhD student Makram Chahine; Patrick Kao ’22, MEng ’22; and PhD student Aaron Ray SM ’21 wrote the paper with Ryan Shubert ’20, MEng ’22; MIT postdocs Mathias Lechner and Alexander Amini; and Daniela Rus.

This research was supported, in part, by Schmidt Futures, the U.S. Air Force Research Laboratory, the U.S. Air Force Artificial Intelligence Accelerator, and the Boeing Co.

Robotic hand can identify objects with just one grasp

MIT researchers developed a soft-rigid robotic finger that incorporates powerful sensors along its entire length, enabling them to produce a robotic hand that could accurately identify objects after only one grasp. Image: Courtesy of the researchers

By Adam Zewe | MIT News Office

Inspired by the human finger, MIT researchers have developed a robotic hand that uses high-resolution touch sensing to accurately identify an object after grasping it just one time.

Many robotic hands pack all their powerful sensors into the fingertips, so an object must be in full contact with those fingertips to be identified, which can take multiple grasps. Other designs use lower-resolution sensors spread along the entire finger, but these don’t capture as much detail, so multiple regrasps are often required.

Instead, the MIT team built a robotic finger with a rigid skeleton encased in a soft outer layer that has multiple high-resolution sensors incorporated under its transparent “skin.” The sensors, which use a camera and LEDs to gather visual information about an object’s shape, provide continuous sensing along the finger’s entire length. Each finger captures rich data on many parts of an object simultaneously.

Using this design, the researchers built a three-fingered robotic hand that could identify objects after only one grasp, with about 85 percent accuracy. The rigid skeleton makes the fingers strong enough to pick up a heavy item, such as a drill, while the soft skin enables them to securely grasp a pliable item, like an empty plastic water bottle, without crushing it.

These soft-rigid fingers could be especially useful in an at-home-care robot designed to interact with an elderly individual. The robot could lift a heavy item off a shelf with the same hand it uses to help the individual take a bath.

“Having both soft and rigid elements is very important in any hand, but so is being able to perform great sensing over a really large area, especially if we want to consider doing very complicated manipulation tasks like what our own hands can do. Our goal with this work was to combine all the things that make our human hands so good into a robotic finger that can do tasks other robotic fingers can’t currently do,” says mechanical engineering graduate student Sandra Liu, co-lead author of a research paper on the robotic finger.

Liu wrote the paper with co-lead author and mechanical engineering undergraduate student Leonardo Zamora Yañez and her advisor, Edward Adelson, the John and Dorothy Wilson Professor of Vision Science in the Department of Brain and Cognitive Sciences and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL). The research will be presented at the RoboSoft Conference.

A human-inspired finger

The robotic finger consists of a rigid, 3D-printed endoskeleton that is placed in a mold and encased in a transparent silicone “skin.” Making the finger in a mold removes the need for fasteners or adhesives to hold the silicone in place.

The researchers designed the mold with a curved shape so the robotic fingers are slightly curved when at rest, just like human fingers.

“Silicone will wrinkle when it bends, so we thought that if we have the finger molded in this curved position, when you curve it more to grasp an object, you won’t induce as many wrinkles. Wrinkles are good in some ways — they can help the finger slide along surfaces very smoothly and easily — but we didn’t want wrinkles that we couldn’t control,” Liu says.

The endoskeleton of each finger contains a pair of detailed touch sensors, known as GelSight sensors, embedded into the top and middle sections, underneath the transparent skin. The sensors are placed so the range of the cameras overlaps slightly, giving the finger continuous sensing along its entire length.

The GelSight sensor, based on technology pioneered in the Adelson group, is composed of a camera and three colored LEDs. When the finger grasps an object, the camera captures images as the colored LEDs illuminate the skin from the inside.

Image: Courtesy of the researchers

Using the illuminated contours that appear in the soft skin, an algorithm performs backward calculations to map the contours on the grasped object’s surface. The researchers trained a machine-learning model to identify objects using raw camera image data.
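
The backward calculation resembles classic photometric stereo: with three known light directions and three measured intensities at a pixel, a Lambertian reflectance model lets you solve a small linear system for the surface normal. The light directions and "measured" intensities below are made-up illustrative values, not GelSight calibration data:

```python
# Photometric-stereo-style inversion: stack three known LED directions in a
# matrix L; a Lambertian model gives pixel intensities I = L @ n, so the
# surface normal is n = L^-1 @ I. Values below are illustrative only.

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(L, I):
    """Solve the 3x3 linear system L @ n = I by Cramer's rule."""
    d = det3(L)
    n = []
    for col in range(3):
        Lk = [row[:] for row in L]
        for r in range(3):
            Lk[r][col] = I[r]           # replace column `col` with I
        n.append(det3(Lk) / d)
    return n

# Three LED directions, and the pixel intensities they would produce for a
# surface normal tilted slightly toward +x:
L = [[ 0.50,  0.000, 0.866],
     [-0.25,  0.433, 0.866],
     [-0.25, -0.433, 0.866]]
n_true = [0.2, 0.0, 0.98]
I = [sum(L[r][c] * n_true[c] for c in range(3)) for r in range(3)]

n = solve3(L, I)                        # recovers the normal from intensities
print(all(abs(n[i] - n_true[i]) < 1e-9 for i in range(3)))
```

Doing this at every pixel yields a dense normal map of the contact patch, which is the kind of rich geometric signal the machine-learning model consumes.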

As they fine-tuned the finger fabrication process, the researchers ran into several obstacles.

First, silicone has a tendency to peel off surfaces over time. Liu and her collaborators found they could limit this peeling by adding small curves along the hinges between the joints in the endoskeleton.

When the finger bends, the bending of the silicone is distributed along the tiny curves, which reduces stress and prevents peeling. They also added creases to the joints so the silicone is not squashed as much when the finger bends.

While troubleshooting their design, the researchers realized that wrinkles in the silicone prevent the skin from ripping.

“The usefulness of the wrinkles was an accidental discovery on our part. When we synthesized them on the surface, we found that they actually made the finger more durable than we expected,” she says.

Getting a good grasp

Once they had perfected the design, the researchers built a robotic hand using two fingers arranged in a Y pattern with a third finger as an opposing thumb. The hand captures six images when it grasps an object (two from each finger) and sends those images to a machine-learning algorithm which uses them as inputs to identify the object.

Because the hand has tactile sensing covering all of its fingers, it can gather rich tactile data from a single grasp.

“Although we have a lot of sensing in the fingers, maybe adding a palm with sensing would help it make tactile distinctions even better,” Liu says.

In the future, the researchers also want to improve the hardware to reduce the amount of wear and tear in the silicone over time and add more actuation to the thumb so it can perform a wider variety of tasks.

This work was supported, in part, by the Toyota Research Institute, the Office of Naval Research, and the SINTEF BIFROST project.

A four-legged robotic system for playing soccer on various terrains

Researchers created DribbleBot, a system for in-the-wild dribbling on diverse natural terrains including sand, gravel, mud, and snow using onboard sensing and computing. In addition to these football feats, such robots may someday aid humans in search-and-rescue missions. Photo: Mike Grimmett/MIT CSAIL

By Rachel Gordon | MIT CSAIL

If you’ve ever played soccer with a robot, it’s a familiar feeling. Sun glistens down on your face as the smell of grass permeates the air. You look around. A four-legged robot is hustling toward you, dribbling with determination. 

While the bot doesn’t display a Lionel Messi-like level of ability, it’s an impressive in-the-wild dribbling system nonetheless. Researchers from MIT’s Improbable Artificial Intelligence Lab, part of the Computer Science and Artificial Intelligence Laboratory (CSAIL), have developed a legged robotic system that can dribble a soccer ball under the same conditions as humans. The bot used a mixture of onboard sensing and computing to traverse different natural terrains such as sand, gravel, mud, and snow, and adapt to their varied impact on the ball’s motion. Like every committed athlete, “DribbleBot” could get up and recover the ball after falling. 

Programming robots to play soccer has been an active research area for some time. However, the team wanted to automatically learn how to actuate the legs during dribbling, to enable the discovery of hard-to-script skills for responding to diverse terrains like snow, gravel, sand, grass, and pavement. Enter, simulation. 

A robot, ball, and terrain are inside the simulation — a digital twin of the natural world. You can load in the bot and other assets and set physics parameters, and then it handles the forward simulation of the dynamics from there. Four thousand versions of the robot are simulated in parallel in real time, enabling data collection 4,000 times faster than using just one robot. That’s a lot of data. 
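The throughput argument can be made concrete in a few lines. This sketch is purely illustrative and is not the lab's simulator; the one-line "dynamics" is a placeholder for real physics:

```python
# Illustrative sketch (not the lab's simulator): stepping N simulated
# robots per tick yields N state transitions per tick, which is the
# source of the 4,000x speedup in data collection described above.

def step_all(states, actions):
    """Advance every simulated robot one tick, collecting transitions."""
    new_states, transitions = [], []
    for s, a in zip(states, actions):
        s_next = s + a  # placeholder dynamics
        new_states.append(s_next)
        transitions.append((s, a, s_next))
    return new_states, transitions

N = 4000
states = [0.0] * N
states, batch = step_all(states, [0.1] * N)
# one simulated tick produced 4,000 transitions instead of 1
```

In practice, GPU-based simulators vectorize this loop so that all copies advance in a single batched operation, which is what makes real-time parallel simulation feasible.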


The robot starts without knowing how to dribble the ball — it just receives a reward when it does, or negative reinforcement when it messes up. So, it’s essentially trying to figure out what sequence of forces it should apply with its legs. “One aspect of this reinforcement learning approach is that we must design a good reward to facilitate the robot learning a successful dribbling behavior,” says MIT PhD student Gabe Margolis, who co-led the work along with Yandong Ji, research assistant in the Improbable AI Lab. “Once we’ve designed that reward, then it’s practice time for the robot: In real time, it’s a couple of days, and in the simulator, hundreds of days. Over time it learns to get better and better at manipulating the soccer ball to match the desired velocity.” 
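One common shape for such a velocity-tracking reward is an exponential bonus that peaks when the ball's velocity matches the commanded velocity. This is a hedged sketch only; the exponential shaping and the scale `sigma` are illustrative assumptions, not the paper's exact reward terms:

```python
import math

# Illustrative velocity-tracking reward of the kind described in the text.
# The exponential shaping and the scale sigma are assumptions made for
# this sketch, not the paper's exact formulation.

def dribble_reward(ball_vel, target_vel, sigma=0.5):
    """Reward in (0, 1]: equals 1 when ball velocity matches the command."""
    err2 = sum((b - t) ** 2 for b, t in zip(ball_vel, target_vel))
    return math.exp(-err2 / (2 * sigma ** 2))

perfect = dribble_reward([1.0, 0.0], [1.0, 0.0])  # exact tracking -> 1.0
off = dribble_reward([0.0, 0.0], [1.0, 0.0])      # ball lagging -> smaller
```

Smooth, bounded shaping like this is a standard choice in legged-robot RL because it gives the policy a useful gradient even when tracking is poor.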

The bot could also navigate unfamiliar terrains and recover from falls due to a recovery controller the team built into its system. This controller lets the robot get back up after a fall and switch back to its dribbling controller to continue pursuing the ball, helping it handle out-of-distribution disruptions and terrains. 
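That switching behavior amounts to a small state machine. The sketch below is a hypothetical rendering of the logic described; the controller names and the fall signal are made up for illustration, and the real system switches between learned controllers rather than strings:

```python
# Hypothetical sketch of the recovery/dribble switch described above.
# Controller names and the fall signal are illustrative stand-ins.

def select_controller(is_fallen, current):
    """Pick which controller runs this tick."""
    if is_fallen:
        return "recovery"   # get back on four feet first
    if current == "recovery":
        return "dribble"    # upright again: resume chasing the ball
    return current          # otherwise keep the current controller

mode = "dribble"
mode = select_controller(True, mode)   # robot falls -> "recovery"
mode = select_controller(False, mode)  # back upright -> "dribble"
```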

“If you look around today, most robots are wheeled. But imagine that there’s a disaster scenario, flooding, or an earthquake, and we want robots to aid humans in the search-and-rescue process. We need the machines to go over terrains that aren’t flat, and wheeled robots can’t traverse those landscapes,” says Pulkit Agrawal, MIT professor, CSAIL principal investigator, and director of Improbable AI Lab. “The whole point of studying legged robots is to go over terrains outside the reach of current robotic systems,” he adds. “Our goal in developing algorithms for legged robots is to provide autonomy in challenging and complex terrains that are currently beyond the reach of robotic systems.” 

The fascination with robot quadrupeds and soccer runs deep — Canadian professor Alan Mackworth first noted the idea in a paper entitled “On Seeing Robots,” presented at VI-92, 1992. Japanese researchers later organized a workshop on “Grand Challenges in Artificial Intelligence,” which led to discussions about using soccer to promote science and technology. The project was launched as the Robot J-League a year later, and global fervor quickly ensued. Shortly after that, “RoboCup” was born. 

Compared to walking alone, dribbling a soccer ball imposes more constraints on DribbleBot’s motion and on what terrains it can traverse. The robot must adapt its locomotion to apply forces to the ball in order to dribble. The interaction between the ball and the landscape can differ from the interaction between the robot and the landscape, as on thick grass or smooth pavement. For example, a soccer ball experiences a drag force on grass that is absent on pavement, and an incline applies an acceleration force, changing the ball’s typical path. However, the bot’s ability to traverse different terrains is often less affected by these differences in dynamics — as long as it doesn’t slip — so the soccer test can be sensitive to variations in terrain that locomotion alone isn’t. 

“Past approaches simplify the dribbling problem, making a modeling assumption of flat, hard ground. The motion is also designed to be more static; the robot isn’t trying to run and manipulate the ball simultaneously,” says Ji. “That’s where more difficult dynamics enter the control problem. We tackled this by extending recent advances that have enabled better outdoor locomotion into this compound task which combines aspects of locomotion and dexterous manipulation together.”

On the hardware side, the robot has a set of sensors that let it perceive the environment, allowing it to feel where it is, “understand” its position, and “see” some of its surroundings. It has a set of actuators that lets it apply forces and move itself and objects. In between the sensors and actuators sits the computer, or “brain,” tasked with converting sensor data into actions, which it will apply through the motors. When the robot is running on snow, it doesn’t see the snow but can feel it through its motor sensors. But soccer is a trickier feat than walking — so the team leveraged cameras on the robot’s head and body for a new sensory modality of vision, in addition to the new motor skill. And then — we dribble. 

“Our robot can go in the wild because it carries all its sensors, cameras, and compute on board. That required some innovations in terms of getting the whole controller to fit onto this onboard compute,” says Margolis. “That’s one area where learning helps because we can run a lightweight neural network and train it to process noisy sensor data observed by the moving robot. This is in stark contrast with most robots today: Typically a robot arm is mounted on a fixed base and sits on a workbench with a giant computer plugged right into it. Neither the computer nor the sensors are in the robotic arm! So, the whole thing is weighty, hard to move around.”

There’s still a long way to go in making these robots as agile as their counterparts in nature, and some terrains were challenging for DribbleBot. Currently, the controller is not trained in simulated environments that include slopes or stairs. The robot isn’t perceiving the geometry of the terrain; it’s only estimating its material contact properties, like friction. If there’s a step up, for example, the robot will get stuck — it won’t be able to lift the ball over the step, an area the team wants to explore in the future. The researchers are also excited to apply lessons learned during development of DribbleBot to other tasks that involve combined locomotion and object manipulation, such as quickly transporting diverse objects from place to place using the legs or arms.

The research is supported by the DARPA Machine Common Sense Program, the MIT-IBM Watson AI Lab, the National Science Foundation Institute of Artificial Intelligence and Fundamental Interactions, the U.S. Air Force Research Laboratory, and the U.S. Air Force Artificial Intelligence Accelerator. The paper will be presented at the 2023 IEEE International Conference on Robotics and Automation (ICRA).

Resilient bug-sized robots keep flying even after wing damage

MIT researchers have developed resilient artificial muscles that can enable insect-scale aerial robots to effectively recover flight performance after suffering severe damage. Photo: Courtesy of the researchers

By Adam Zewe | MIT News Office

Bumblebees are clumsy fliers. It is estimated that a foraging bee bumps into a flower about once per second, which damages its wings over time. Yet despite having many tiny rips or holes in their wings, bumblebees can still fly.

Aerial robots, on the other hand, are not so resilient. Poke holes in the robot’s wing motors or chop off part of its propeller, and odds are pretty good it will be grounded.

Inspired by the hardiness of bumblebees, MIT researchers have developed repair techniques that enable a bug-sized aerial robot to sustain severe damage to the actuators, or artificial muscles, that power its wings — and still fly effectively.

They optimized these artificial muscles so the robot can better isolate defects and overcome minor damage, like tiny holes in the actuator. In addition, they demonstrated a novel laser repair method that can help the robot recover from severe damage, such as a fire that scorches the device.

Using their techniques, a damaged robot could maintain flight-level performance after one of its artificial muscles was jabbed by 10 needles, and the actuator was still able to operate after a large hole was burnt into it. Their repair methods enabled a robot to keep flying even after the researchers cut off 20 percent of its wing tip.

This could make swarms of tiny robots better able to perform tasks in tough environments, like conducting a search mission through a collapsing building or dense forest.

“We spent a lot of time understanding the dynamics of soft, artificial muscles and, through both a new fabrication method and a new understanding, we can show a level of resilience to damage that is comparable to insects,” says Kevin Chen, the D. Reid Weedon, Jr. Assistant Professor in the Department of Electrical Engineering and Computer Science (EECS), the head of the Soft and Micro Robotics Laboratory in the Research Laboratory of Electronics (RLE), and the senior author of the paper on these latest advances. “We’re very excited about this. But the insects are still superior to us, in the sense that they can lose up to 40 percent of their wing and still fly. We still have some catch-up work to do.”

Chen wrote the paper with co-lead authors Suhan Kim and Yi-Hsuan Hsiao, who are EECS graduate students; Younghoon Lee, a postdoc; Weikun “Spencer” Zhu, a graduate student in the Department of Chemical Engineering; Zhijian Ren, an EECS graduate student; and Farnaz Niroui, the EE Landsman Career Development Assistant Professor of EECS at MIT and a member of the RLE. The article appeared in Science Robotics.

Robot repair techniques

Using the repair techniques developed by MIT researchers, this microrobot can still maintain flight-level performance even after the artificial muscles that power its wings were jabbed by 10 needles and 20 percent of one wing tip was cut off. Credit: Courtesy of the researchers.

The tiny, rectangular robots being developed in Chen’s lab are about the same size and shape as a microcassette tape, though one robot weighs barely more than a paper clip. Wings on each corner are powered by dielectric elastomer actuators (DEAs), which are soft artificial muscles that use mechanical forces to rapidly flap the wings. These artificial muscles are made from layers of elastomer that are sandwiched between two razor-thin electrodes and then rolled into a squishy tube. When voltage is applied to the DEA, the electrodes squeeze the elastomer, which flaps the wing.

But microscopic imperfections can cause sparks that burn the elastomer and cause the device to fail. About 15 years ago, researchers found they could prevent DEA failures from one tiny defect using a physical phenomenon known as self-clearing. In this process, applying high voltage to the DEA disconnects the local electrode around a small defect, isolating that failure from the rest of the electrode so the artificial muscle still works.

Chen and his collaborators employed this self-clearing process in their robot repair techniques.

First, they optimized the concentration of carbon nanotubes that make up the electrodes in the DEA. Carbon nanotubes are super-strong but extremely tiny rolls of carbon. Having fewer carbon nanotubes in the electrode improves self-clearing, since it reaches higher temperatures and burns away more easily. But this also reduces the actuator’s power density.

“At a certain point, you will not be able to get enough energy out of the system, but we need a lot of energy and power to fly the robot. We had to find the optimal point between these two constraints — optimize the self-clearing property under the constraint that we still want the robot to fly,” Chen says.

However, even an optimized DEA will fail if it suffers from severe damage, like a large hole that lets too much air into the device.

Chen and his team used a laser to overcome major defects. They carefully cut along the outer contours of a large defect with a laser, which causes minor damage around the perimeter. Then, they can use self-clearing to burn off the slightly damaged electrode, isolating the larger defect.

“In a way, we are trying to do surgery on muscles. But if we don’t use enough power, then we can’t do enough damage to isolate the defect. On the other hand, if we use too much power, the laser will cause severe damage to the actuator that won’t be clearable,” Chen says.

The team soon realized that, when “operating” on such tiny devices, it is very difficult to observe the electrode and confirm that a defect has been isolated. Drawing on previous work, they incorporated electroluminescent particles into the actuator. Now, if they see light shining, they know that part of the actuator is operational; dark patches mean they successfully isolated those areas.

The new research could make swarms of tiny robots better able to perform tasks in tough environments, like conducting a search mission through a collapsing building or dense forest. Photo: Courtesy of the researchers

Flight test success

Once they had perfected their techniques, the researchers conducted tests with damaged actuators — some had been jabbed by many needles while others had holes burned into them. They measured how well the robot performed in flapping-wing, take-off, and hovering experiments.

Even with damaged DEAs, the repair techniques enabled the robot to maintain its flight performance, with altitude, position, and attitude errors that deviated only very slightly from those of an undamaged robot. With laser surgery, a DEA that would have been broken beyond repair was able to recover 87 percent of its performance.

“I have to hand it to my two students, who did a lot of hard work when they were flying the robot. Flying the robot by itself is very hard, not to mention now that we are intentionally damaging it,” Chen says.

These repair techniques make the tiny robots much more robust, so Chen and his team are now working on teaching them new functions, like landing on flowers or flying in a swarm. They are also developing new control algorithms so the robots can fly better, teaching the robots to control their yaw angle so they can keep a constant heading, and enabling the robots to carry a tiny circuit, with the longer-term goal of carrying its own power source.

“This work is important because small flying robots — and flying insects! — are constantly colliding with their environment. Small gusts of wind can be huge problems for small insects and robots. Thus, we need methods to increase their resilience if we ever hope to be able to use robots like this in natural environments,” says Nick Gravish, an associate professor in the Department of Mechanical and Aerospace Engineering at the University of California at San Diego, who was not involved with this research. “This paper demonstrates how soft actuation and body mechanics can adapt to damage and I think is an impressive step forward.”

This work is funded, in part, by the National Science Foundation (NSF) and a MathWorks Fellowship.
