New technique helps robots pack objects into a tight space
By Adam Zewe | MIT News
Anyone who has ever tried to pack a family-sized amount of luggage into a sedan-sized trunk knows this is a hard problem. Robots struggle with dense packing tasks, too.
For the robot, solving the packing problem involves satisfying many constraints, such as stacking luggage so suitcases don’t topple out of the trunk, heavy objects aren’t placed on top of lighter ones, and collisions between the robotic arm and the car’s bumper are avoided.
Some traditional methods tackle this problem sequentially, guessing a partial solution that meets one constraint at a time and then checking to see if any other constraints were violated. With a long sequence of actions to take, and a pile of luggage to pack, this process can be impractically time-consuming.
MIT researchers used a form of generative AI, called a diffusion model, to solve this problem more efficiently. Their method uses a collection of machine-learning models, each of which is trained to represent one specific type of constraint. These models are combined to generate global solutions to the packing problem, taking into account all constraints at once.
Their method was able to generate effective solutions faster than other techniques, and it produced a greater number of successful solutions in the same amount of time. Importantly, their technique was also able to solve problems with novel combinations of constraints and larger numbers of objects that the models did not see during training.
Due to this generalizability, their technique can be used to teach robots how to understand and meet the overall constraints of packing problems, such as the importance of avoiding collisions or a desire for one object to be next to another object. Robots trained in this way could be applied to a wide array of complex tasks in diverse environments, from order fulfillment in a warehouse to organizing a bookshelf in someone’s home.
“My vision is to push robots to do more complicated tasks that have many geometric constraints and more continuous decisions that need to be made — these are the kinds of problems service robots face in our unstructured and diverse human environments. With the powerful tool of compositional diffusion models, we can now solve these more complex problems and get great generalization results,” says Zhutian Yang, an electrical engineering and computer science graduate student and lead author of a paper on this new machine-learning technique.
Her co-authors include MIT graduate students Jiayuan Mao and Yilun Du; Jiajun Wu, an assistant professor of computer science at Stanford University; Joshua B. Tenenbaum, a professor in MIT’s Department of Brain and Cognitive Sciences and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); Tomás Lozano-Pérez, an MIT professor of computer science and engineering and a member of CSAIL; and senior author Leslie Kaelbling, the Panasonic Professor of Computer Science and Engineering at MIT and a member of CSAIL. The research will be presented at the Conference on Robot Learning.
Constraint complications
Continuous constraint satisfaction problems are particularly challenging for robots. These problems appear in multistep robot manipulation tasks, like packing items into a box or setting a dinner table. They often involve satisfying a number of constraints, including geometric constraints, such as avoiding collisions between the robot arm and the environment; physical constraints, such as stacking objects so they are stable; and qualitative constraints, such as placing a spoon to the right of a knife.
There may be many constraints, and they vary across problems and environments depending on the geometry of objects and human-specified requirements.
To solve these problems efficiently, the MIT researchers developed a machine-learning technique called Diffusion-CCSP. Diffusion models learn to generate new data samples that resemble samples in a training dataset by iteratively refining their output.
To do this, diffusion models learn a procedure for making small improvements to a potential solution. To solve a problem, they start with a random, very bad solution and gradually improve it.
For example, imagine randomly placing plates and utensils on a simulated table, allowing them to physically overlap. The collision-free constraints between objects will result in them nudging each other away, while qualitative constraints will drag the plate to the center, align the salad fork and dinner fork, etc.
Diffusion models are well-suited for this kind of continuous constraint-satisfaction problem because the influences from multiple models on the pose of one object can be composed to encourage the satisfaction of all constraints, Yang explains. By starting from a random initial guess each time, the models can obtain a diverse set of good solutions.
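The idea of composing several constraint models while refining a random initial guess can be illustrated with a toy sketch. This is not the researchers' implementation: the three "models" below are hand-written penalty functions standing in for learned diffusion models, and every name and parameter (`collision_nudges`, `in_box_nudges`, `left_of_nudges`, the circle radius, the box size, the noise schedule) is an assumption made for illustration only.

```python
import random

def collision_nudges(pos, radius):
    """Push overlapping circles apart along the line joining their centres."""
    nudges = [[0.0, 0.0] for _ in pos]
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            dx, dy = pos[i][0] - pos[j][0], pos[i][1] - pos[j][1]
            dist = (dx * dx + dy * dy) ** 0.5 or 1e-9
            overlap = 2 * radius - dist
            if overlap > 0:
                for k, u in enumerate((dx / dist, dy / dist)):
                    nudges[i][k] += 0.5 * overlap * u
                    nudges[j][k] -= 0.5 * overlap * u
    return nudges

def in_box_nudges(pos, radius, size):
    """Pull circles back inside the square container [0, size] x [0, size]."""
    lo, hi = radius, size - radius
    return [[max(lo - c, 0.0) - max(c - hi, 0.0) for c in p] for p in pos]

def left_of_nudges(pos, a, b, gap):
    """Qualitative constraint: object a sits at least `gap` to the left of b."""
    nudges = [[0.0, 0.0] for _ in pos]
    violation = pos[a][0] + gap - pos[b][0]
    if violation > 0:
        nudges[a][0] -= 0.5 * violation
        nudges[b][0] += 0.5 * violation
    return nudges

def max_violation(pos, radius, size, gap):
    """Largest remaining constraint violation (values <= 0 mean all satisfied)."""
    worst = pos[0][0] + gap - pos[1][0]
    for i, (x, y) in enumerate(pos):
        worst = max(worst, radius - x, x - (size - radius),
                    radius - y, y - (size - radius))
        for j in range(i + 1, len(pos)):
            d = ((x - pos[j][0]) ** 2 + (y - pos[j][1]) ** 2) ** 0.5
            worst = max(worst, 2 * radius - d)
    return worst

def solve(n=4, radius=1.0, size=10.0, steps=600, seed=0):
    rng = random.Random(seed)
    gap = 2 * radius
    # Start from a random arrangement that probably violates constraints.
    pos = [[rng.uniform(0, size), rng.uniform(0, size)] for _ in range(n)]
    for t in range(steps):
        # Noise shrinks to zero halfway through, then pure refinement.
        noise = 0.3 * max(0.0, 1.0 - 2.0 * t / steps)
        models = [
            collision_nudges(pos, radius),
            in_box_nudges(pos, radius, size),
            left_of_nudges(pos, 0, 1, gap),
        ]
        for i in range(n):
            for k in range(2):
                # Compose the constraint models by summing their nudges.
                pos[i][k] += 0.5 * sum(m[i][k] for m in models)
                if noise > 0:
                    pos[i][k] += rng.gauss(0.0, noise)
    return pos, max_violation(pos, radius, size, gap)
```

Calling `solve(seed=...)` with different seeds starts from different random guesses, which is one simple way to see how such a procedure can return a variety of valid arrangements rather than a single one.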
Working together
For Diffusion-CCSP, the researchers wanted to capture the interconnectedness of the constraints. In packing, for instance, one constraint might require a certain object to be next to another object, while a second constraint might specify where one of those objects must be located.
Diffusion-CCSP learns a family of diffusion models, with one for each type of constraint. The models are trained together, so they share some knowledge, like the geometry of the objects to be packed.
The models then work together to find solutions, in this case locations for the objects to be placed, that jointly satisfy the constraints.
“We don’t always get to a solution at the first guess. But when you keep refining the solution and some violation happens, it should lead you to a better solution. You get guidance from getting something wrong,” she says.
Training individual models for each constraint type and then combining them to make predictions greatly reduces the amount of training data required, compared to other approaches.
However, training these models still requires a large amount of data that demonstrate solved problems. Humans would need to solve each problem with traditional slow methods, making the cost to generate such data prohibitive, Yang says.
Instead, the researchers reversed the process by coming up with solutions first. They used fast algorithms to generate segmented boxes and fit a diverse set of 3D objects into each segment, ensuring tight packing, stable poses, and collision-free solutions.
“With this process, data generation is almost instantaneous in simulation. We can generate tens of thousands of environments where we know the problems are solvable,” she says.
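A minimal 2D sketch of this "solution first" idea follows. It is only an illustration under stated assumptions: the researchers worked with 3D objects and also checked tight packing and stability, whereas this sketch cuts a rectangle into segments and drops one smaller rectangle into each, so the result is collision-free by construction. The function names, split ratios, and size ranges are all hypothetical.

```python
import random

def split_box(x, y, w, h, depth, rng):
    """Recursively cut a rectangle into smaller segments (x, y, w, h)."""
    if depth == 0 or min(w, h) < 2.0:
        return [(x, y, w, h)]
    cut = rng.uniform(0.3, 0.7)
    if w >= h:  # cut across the longer side
        w1 = w * cut
        return (split_box(x, y, w1, h, depth - 1, rng)
                + split_box(x + w1, y, w - w1, h, depth - 1, rng))
    h1 = h * cut
    return (split_box(x, y, w, h1, depth - 1, rng)
            + split_box(x, y + h1, w, h - h1, depth - 1, rng))

def make_problem(width=10.0, height=6.0, depth=3, seed=0):
    """Return objects (x, y, w, h) that are known to fit in the box."""
    rng = random.Random(seed)
    objects = []
    for sx, sy, sw, sh in split_box(0.0, 0.0, width, height, depth, rng):
        # Each object is slightly smaller than its segment and placed inside it.
        ow = sw * rng.uniform(0.7, 0.95)
        oh = sh * rng.uniform(0.7, 0.95)
        ox = sx + rng.uniform(0.0, sw - ow)
        oy = sy + rng.uniform(0.0, sh - oh)
        objects.append((ox, oy, ow, oh))
    return objects

def overlaps(a, b):
    """True if two axis-aligned rectangles share interior area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah
```

The object sizes from `make_problem` can then be handed to a solver as a packing problem, with the generated poses kept as one known-valid answer.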
Trained using these data, the diffusion models work together to determine where the robotic gripper should place objects so that the packing task is achieved while all of the constraints are met.
They conducted feasibility studies, and then demonstrated Diffusion-CCSP with a real robot solving a number of difficult problems, including fitting 2D triangles into a box, packing 2D shapes with spatial relationship constraints, stacking 3D objects with stability constraints, and packing 3D objects with a robotic arm.
Their method outperformed other techniques in many experiments, generating a greater number of effective solutions that were both stable and collision-free.
In the future, Yang and her collaborators want to test Diffusion-CCSP in more complicated situations, such as with robots that can move around a room. They also want to enable Diffusion-CCSP to tackle problems in different domains without the need to be retrained on new data.
“Diffusion-CCSP is a machine-learning solution that builds on existing powerful generative models,” says Danfei Xu, an assistant professor in the School of Interactive Computing at the Georgia Institute of Technology and a Research Scientist at NVIDIA AI, who was not involved with this work. “It can quickly generate solutions that simultaneously satisfy multiple constraints by composing known individual constraint models. Although it’s still in the early phases of development, the ongoing advancements in this approach hold the promise of enabling more efficient, safe, and reliable autonomous systems in various applications.”
This research was funded, in part, by the National Science Foundation, the Air Force Office of Scientific Research, the Office of Naval Research, the MIT-IBM Watson AI Lab, the MIT Quest for Intelligence, the Center for Brains, Minds, and Machines, Boston Dynamics Artificial Intelligence Institute, the Stanford Institute for Human-Centered Artificial Intelligence, Analog Devices, JPMorgan Chase and Co., and Salesforce.
Interview with Marek Šuppa: insights into RoboCupJunior
A RoboCupJunior soccer match in action.
In July this year, 2500 participants congregated in Bordeaux for RoboCup 2023. The competition comprises a number of leagues, and among them is RoboCupJunior, which is designed to introduce RoboCup to school children, with the focus being on education. There are three sub-leagues: Soccer, Rescue and OnStage.
Marek Šuppa serves on the Executive Committee for RoboCupJunior, and he told us about the competition this year and the latest developments in the Soccer league.
What is your role in RoboCupJunior and how long have you been involved with this league?
I started with RoboCupJunior quite a while ago: my first international competition was in 2009 in Graz, where I was lucky enough to compete in Soccer for the first time. Our team didn’t do all that well in that event but RoboCup made a deep impression and so I stayed around: first as a competitor and later to help organise the RoboCupJunior Soccer league. Right now I am serving as part of the RoboCupJunior Execs who are responsible for the organisation of RoboCupJunior as a whole.
How was the event this year? What were some of the highlights?
I guess this year’s theme or slogan, if we were to give it one, would be “back to normal”, or something like that. Although RoboCup 2022 already took place in-person in Thailand last year after two years of a pandemic pause, it was in a rather limited capacity, as COVID-19 still affected quite a few regions. It was great to see that the RoboCup community was able to persevere and even thrive throughout the pandemic, and that RoboCup 2023 was once again an event where thousands of robots and roboticists meet.
It would also be difficult to do this question justice without thanking the local French organisers. They were actually ready to organise the event in 2020 but it got cancelled due to COVID-19. But they did not give up on the idea and managed to put together an awesome event this year, for which we are very thankful.
Examples of the robots used by the RoboCupJunior Soccer teams.
Turning to RoboCupJunior Soccer specifically, could you talk about the mission of the league and how you, as organisers, go about realising that mission?
The mission of RoboCupJunior consists of two competing objectives: on one hand, it needs to be a challenge that’s approachable, interesting and relevant for (mostly) high school students and at the same time it needs to be closely related to the RoboCup “Major” challenges, which are tackled by university students and their mentors. We are hence continuously trying to both make it more compelling and captivating for the students and at the same time ensure it is technical enough to help them grow towards the RoboCup “Major” challenges.
One of the ways we do that is by introducing what we call “SuperTeam” challenges, in which teams from respective countries form a so-called “SuperTeam” and compete against another “SuperTeam” as if these were distinct teams. In RoboCupJunior Soccer the “SuperTeams” are composed of four to five teams and they compete on a field that is six times larger than the “standard” fields that are used for the individual games. While in the individual matches each team can play with two robots at most (resulting in a 2v2 game) in a SuperTeam match each SuperTeam fields five robots, meaning there are 10 robots that play on the SuperTeam field during a SuperTeam match. The setup is very similar to the Division B of the Small Size League of RoboCup “Major”.
The SuperTeam games have existed in RoboCupJunior Soccer since 2013, so for quite a while, and the feedback we received on them was overwhelmingly positive: it was a lot of fun for both the participants and the spectators. But compared to the Small Size League games there were still two noticeable differences: the robots did not have a way of communicating with one another and additionally, the referees did not have a way of communicating with the robots. The result was that not only was there little coordination among robots of the same SuperTeam, but whenever the game needed to be stopped, the referees also had to physically run after the robots on the field to catch them and do a kickoff after a goal was scored. Although hilarious, it’s far from how we would imagine the SuperTeam games to look.
The RoboCupJunior Soccer Standard Communication Modules aim to address both. The module itself is a small device that is attached to each robot on the SuperTeam field. These devices are all connected via Bluetooth to a single smartphone, through which the referee can send commands to all robots on the field. The devices themselves also support direct message exchange between robots on a single SuperTeam, meaning the teams do not have to invest in figuring out how to communicate with the other robots but can make use of a common platform. The devices, as well as their firmware, are open source, meaning not only that everyone can build their own Standard Communication Module if they’d like but also that the community can participate in its development, which makes it an interesting addition to RoboCupJunior Soccer.
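The two communication paths described here, referee to all robots and robot to teammates, can be pictured with a small simulation. This is purely a hypothetical sketch: it does not reflect the real module's firmware, message format, or Bluetooth handling, and the class names, the `"STOP"`/`"START"` command strings, and the inbox structure are all invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Module:
    """Hypothetical stand-in for one communication module on a robot."""
    robot_id: int
    superteam: str
    playing: bool = True
    inbox: list = field(default_factory=list)

class SuperTeamField:
    """Plays the role of the referee's phone plus the wireless links."""

    def __init__(self, modules):
        self.modules = list(modules)

    def referee_command(self, command):
        # Referee commands reach every robot, regardless of SuperTeam.
        if command not in ("STOP", "START"):
            raise ValueError(f"unknown command: {command}")
        for module in self.modules:
            module.playing = command == "START"

    def team_message(self, sender, text):
        # Robot-to-robot messages stay within the sender's SuperTeam.
        for module in self.modules:
            if module.superteam == sender.superteam and module is not sender:
                module.inbox.append((sender.robot_id, text))

# Ten robots on the field: five per SuperTeam, as in a SuperTeam match.
modules = [Module(i, "A" if i < 5 else "B") for i in range(10)]
game = SuperTeamField(modules)
game.team_message(modules[0], "ball on left wing")
game.referee_command("STOP")  # e.g. after a goal, before the kickoff
```

After the `"STOP"` command every robot's `playing` flag is false, which is the behaviour the referees previously had to enforce by catching robots by hand.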
RoboCupJunior Soccer teams getting ready for the competition.
How did this new module work out in the competition? Did you see an improvement in experience for the teams and organisers?
In this first big public test we focused on exploring how (and whether) these modules can improve the gameplay – especially the “chasing robots at kickoff”. Although we’ve done “lab experiments” in the past and had some empirical evidence that it should work rather well, this was the first time we tried it in a real competition.
All in all, I would say that it was a very positive experiment. The modules themselves did work quite well and for some of us, who happened to have experience with “robot chasing” mentioned above, it was sort of a magical feeling to see the robots stop right on the main referee’s whistle.
We also found potential areas for improvement in the future. The modules themselves do not have a power source of their own and were powered by the robots themselves. We didn’t think this would be a problem but in the “real world” test it transpired that the voltage levels the robots are capable of providing fluctuate significantly – for instance when the robot decides to aggressively accelerate – which in turn means some of the modules disconnect when the voltage is lowered significantly. However, it ended up being a nice lesson for everyone involved, one that we can certainly learn from when we design the next iterations.
The livestream from Day 4 of RoboCupJunior Soccer 2023. This stream includes the SuperTeam finals and the technical challenges. You can also view the livestream of the semifinals and finals from day three here.
Could you tell us about the emergence of deep-learning models in the RoboCupJunior leagues?
This is something we started to observe in recent years which surprised us organisers, to some extent. In our day-to-day jobs (that is, when we are not organising RoboCup), many of us, the organisers, work in areas related to robotics, computer science, and engineering in general – with some of us also doing research in artificial intelligence and machine learning. And while we always thought that it would be great to see more of the cutting-edge research being applied at RoboCupJunior, we always dismissed it as something too advanced and/or difficult to set up for the high school students that comprise the majority of RoboCupJunior students.
Well, to our great surprise, some of the more advanced teams have started to utilise methods and technologies that are very close to the current state-of-the-art in various areas, particularly computer vision and deep learning. A good example would be object detectors (usually based on the YOLO architecture), which are now used across all three Junior leagues: in OnStage to detect various props, robots and humans who perform on the stage together, in Rescue to detect the victims the robots are rescuing and in Soccer to detect the ball, the goals, and the opponents. And while the participants generally used off-the-shelf implementations, they still needed to do all the steps necessary for a successful deployment of this technology: gather a dataset, finetune the deep-learning model and deploy it on their robots – all of which is far from trivial and is very close to how these technologies get used in both research and industry.
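To give a flavour of what deploying such a detector involves, here is a sketch of one post-processing step that many YOLO-style detectors rely on: non-maximum suppression, which removes duplicate boxes around the same object. The interview does not say which steps the teams implemented themselves, so this is an assumption for illustration, and the labels, scores, and box coordinates below are made up.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(detections, iou_threshold=0.5):
    """Keep the highest-scoring box in each cluster of overlapping boxes.

    detections: list of (label, score, box) tuples.
    """
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        label, _, box = det
        # Drop a box only if a better box of the same class overlaps it.
        if all(k[0] != label or iou(k[2], box) < iou_threshold for k in kept):
            kept.append(det)
    return kept

# Made-up raw detections from a single camera frame on a soccer robot.
raw = [
    ("ball", 0.9, (10, 10, 30, 30)),
    ("ball", 0.6, (12, 11, 31, 32)),   # near-duplicate of the first ball box
    ("goal", 0.8, (100, 0, 200, 50)),
]
filtered = non_max_suppression(raw)
```

Here the second ball box overlaps the first heavily and has a lower score, so only one ball and one goal survive in `filtered`.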
Although we have seen only the more advanced teams use deep-learning models at RoboCupJunior, we expect that in the future we will see it become much more prevalent, especially as the technology and the tooling around it become more mature and robust. It does show, however, that despite their age, the RoboCupJunior students are very close to cutting-edge research and state-of-the-art technologies.
Action from RoboCupJunior Soccer 2023.
How can people get involved in RCJ (either as a participant or an organiser)?
A very good question!
The best place to start would be the RoboCupJunior website where one can find many interesting details about RoboCupJunior, the respective leagues (such as Soccer, Rescue and OnStage), and the relevant regional representatives who organise regional events. Getting in touch with a regional representative is by far the easiest way of getting started with RoboCupJunior.
Additionally, I can certainly recommend the RoboCupJunior forum, where many RoboCupJunior participants, past and present, as well as the organisers, discuss many related topics in the open. The community is very beginner friendly, so if RoboCupJunior sounds interesting, do not hesitate to stop by and say hi!
About Marek Šuppa
Marek stumbled upon AI as a teenager when building soccer-playing robots and quickly realised he was not smart enough to do all the programming by himself. Since then, he’s been figuring out ways to make machines learn by themselves, particularly from text and images. He currently serves as the Principal Data Scientist at Slido (part of Cisco), improving the way meetings are run around the world. Staying true to his roots, he tries to provide others with a chance to have a similar experience by organising the RoboCupJunior competition as part of the Executive Committee.
Rethinking the Role of PPO in RLHF
TL;DR: In RLHF, there’s tension between the reward learning phase, which uses human preference in the form of comparisons, and the RL fine-tuning phase, which optimizes a single, non-comparative reward. What if we performed RL in a comparative way?
Figure 1: This diagram illustrates the difference between reinforcement learning from absolute feedback and relative feedback. By incorporating a new component, pairwise policy gradient, we can unify the reward modeling stage and RL stage, enabling direct updates based on pairwise responses.
Large Language Models (LLMs) have powered increasingly capable virtual assistants, such as GPT-4, Claude-2, Bard and Bing Chat. These systems can respond to complex user queries, write code, and even produce poetry. The technique underlying these amazing virtual assistants is Reinforcement Learning from Human Feedback (RLHF). RLHF aims to align the model with human values and eliminate unintended behaviors, which can often arise due to the model being exposed to a large quantity of low-quality data during its pretraining phase.
Proximal Policy Optimization (PPO), the dominant RL optimizer in this process, has been reported to exhibit instability and implementation complications. More importantly, there’s a persistent discrepancy in the RLHF process: despite the reward model being trained using comparisons between various responses, the RL fine-tuning stage works on individual responses without making any comparisons. This inconsistency can exacerbate issues, especially in the challenging language generation domain.
Given this backdrop, an intriguing question arises: Is it possible to design an RL algorithm that learns in a comparative manner? To explore this, we introduce Pairwise Proximal Policy Optimization (P3O), a method that harmonizes the training processes in both the reward learning stage and RL fine-tuning stage of RLHF, providing a satisfactory solution to this issue.
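To make the "comparative" side of this tension concrete, the sketch below shows the Bradley-Terry style loss that is commonly used to train reward models from pairwise preferences. It is not the P3O algorithm, whose details are not given in this excerpt; the function name and the example reward values are assumptions for illustration.

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Negative log-likelihood that the chosen response beats the rejected one.

    Under a Bradley-Terry model, P(chosen wins) = sigmoid(margin), where
    margin is the difference between the two scalar rewards.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(margin)), written in a numerically stable form.
    return math.log1p(math.exp(-abs(margin))) + max(-margin, 0.0)

# The loss depends only on the difference between the two rewards, so
# adding the same constant to both leaves it unchanged.
base = preference_loss(2.0, 0.5)
shifted = preference_loss(2.0 + 10.0, 0.5 + 10.0)
```

Because `base` and `shifted` are equal, comparisons can only pin the reward down up to an additive shift, whereas an RL stage that optimizes the reward of a single response at a time sees the absolute value. That mismatch is one way to read the discrepancy described above.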
Easing job jitters in the digital revolution
Professor Steven Dhondt has a reassurance of sorts for people in the EU worried about losing their jobs to automation: relax.
Dhondt, an expert in work and organisational change at the Catholic University Leuven in Belgium, has studied the impact of technology on jobs for the past four decades. Fresh from leading an EU research project on the issue, he stresses opportunities rather than threats.
Right vision
‘We need to develop new business practices and welfare support but, with the right vision, we shouldn’t see technology as a threat,’ Dhondt said. ‘Rather, we should use it to shape the future and create new jobs.’
The rapid and accelerating advance in digital technologies across the board is regarded as the world’s fourth industrial revolution, ushering in fundamental shifts in how people live and work.
If the first industrial revolution was powered by steam, the second by electricity and the third by electronics, the latest will be remembered for automation, robotics and artificial intelligence, or AI. It’s known as “Industry 4.0”.
‘Whether it was the Luddite movement in the 1800s through the introduction of automatic spinning machines in the wool industry or concerns about AI today, questions about technology’s impact on jobs really reflect wider ones about employment practices and the labour market,’ said Dhondt.
He is also a senior scientist at a Netherlands-based independent research organisation called TNO.
The EU project that Dhondt led explored how businesses and welfare systems could better adapt to support workers in the face of technological changes. The initiative, called BEYOND4.0, began in January 2019 and wrapped up in June 2023.
While self-driving cars and AI-assisted robots hold big potential for economic growth and social progress, they also sound alarm bells.
More than 70% of EU citizens fear that new technologies will “steal” people’s jobs, according to a 2019 analysis by the European Centre for the Development of Vocational Training.
Local successes
The BEYOND4.0 researchers studied businesses across Europe that have taken proactive and practical steps to empower employees.
One example is a family-run Dutch glass company called Metaglas, which decided that staying competitive in the face of technological changes required investing more in its own workforce.
Metaglas offered workers greater openness with management and a louder voice on the company’s direction and product development.
The move, which the company named “MetaWay”, has helped it retain workers while turning a profit that is being reinvested in the workforce, according to Dhondt.
He said the example shows the importance in the business world of managers’ approach to the whole issue.
‘The technology can be an enabler, not a threat, but the decision about that lies with management in organisations,’ Dhondt said. ‘If management uses technology to downgrade the quality of jobs, then jobs are at risk. If management uses technology to enhance jobs, then you can see workers and organisations learn and improve.’
The Metaglas case has fed into a “knowledge bank” meant to inform business practices more broadly.
Dhondt also highlighted the importance of regions in Europe where businesses and job trainers join forces to support people.
BEYOND4.0 studied the case of the Finnish city of Oulu – once a leading outpost of mobile-phone giant Nokia. In the 2010s, the demise of Nokia’s handset business threatened Oulu with a “brain drain” as the company’s engineers were laid off.
But collaboration among Nokia, local universities and policymakers helped grow new businesses including digital spin-offs and kept hundreds of engineers in the central Finnish region, once a trading centre for wood tar, timber and salmon.
Some Nokia engineers went to the local hospital to work on electronic healthcare services – “e-health” – while others moved to papermaker Stora Enso, according to Dhondt.
Nowadays there are more high-tech jobs in Oulu than during Nokia’s heyday. The BEYOND4.0 team held the area up as a successful “entrepreneurial ecosystem” that could help inform policies and practices elsewhere in Europe.
Income support
In cases where people were out of work, the project also looked to new forms of welfare support.
Dhondt’s Finnish colleagues examined the impact of a two-year trial in Finland of a “universal basic income” – or UBI – and used this to assess the feasibility of a different model called “participation income.”
In the UBI experiment, participants each received a monthly €560 sum, which was paid unconditionally. Although UBI is often touted as an answer to automation, BEYOND4.0’s evaluation of the Finnish trial was that it could weaken the principle of solidarity in society.
The project’s participation income approach requires recipients of financial support to undertake an activity deemed useful to society. This might include, for example, care for the elderly or for children.
While detailed aspects are still being worked out, the BEYOND4.0 team discussed participation income with the government of Finland and the Finnish parliament has put the idea on the agenda for debate.
Dhondt hopes the project’s findings, including on welfare support, will help other organisations better navigate the changing tech landscape.
Employment matchmakers
Another researcher keen to help people adapt to technological changes is Dr Aisling Tuite, a labour-market expert at the South East Technological University in Ireland.
Tuite has looked at how digital technologies can help job seekers find suitable work.
She coordinated an EU-funded project to help out-of-work people find jobs or develop new skills through a more open online system.
Called HECAT, the project ran from February 2020 through July 2023 and brought together researchers from Denmark, France, Ireland, Slovenia, Spain and Switzerland.
In recent years, many countries have brought in active labour-market policies that deploy computer-based systems to profile workers and help career counsellors target people most in need of help.
While this sounds highly targeted, Tuite said that in reality it often pushes people into employment that might be unsuitable for them and is creating job-retention troubles.
‘Our current employment systems often fail to get people to the right place – they just move people on,’ she said. ‘What people often need is individualised support or new training. We wanted to develop a product that could be as useful for people looking for work as for those supporting them.’
Ready to run
HECAT’s online system combines new vacancies with career counselling and current labour-market data.
The system was tested during the project and a beta version is now available via My Labour Market and can be used in all EU countries where data is available.
It can help people figure out where there are jobs and how to be best positioned to secure them, according to Tuite.
In addition to displaying openings by location and quality, the system offers detailed information about career opportunities and labour-market trends including the kinds of jobs on the rise in particular areas and the average time it takes to find a position in a specific sector.
Tuite said feedback from participants in the test was positive.
She recalled one young female job seeker saying it had made her more confident in exploring new career paths and another who said knowing how long the average “jobs wait” would be eased the stress of hunting.
Looking ahead, Tuite hopes the HECAT researchers can demonstrate the system in governmental employment-services organisations in numerous EU countries over the coming months.
‘There is growing interest in this work from across public employment services in the EU and we’re excited,’ she said.
(This article was updated on 21 September 2023 to include a reference to Steven Dhondt’s role at TNO in the Netherlands)
Research in this article was funded by the EU.
This article was originally published in Horizon, the EU Research and Innovation magazine.
ep.366: Deep Learning Meets Trash: Amp Robotics’ Revolution in Materials Recovery, with Joe Castagneri
In this episode, Abate flew to Denver, Colorado, to get a behind-the-scenes look at the future of recycling with Joe Castagneri, the head of AI at Amp Robotics. With Materials Recovery Facilities (MRFs) processing a staggering 25 tons of trash per hour, robotic sorting is the clear long-term solution.
Recycling is a for-profit industry. When the margins don’t make sense, the items will not be recycled. This is why Amp’s mission to use robotics and AI to bring down the cost of recycling and increase the number of items that can be sorted for recycling is so impactful.
Joe Castagneri
Joe Castagneri graduated with his Master of Science in Applied Mathematics, with an undergrad degree in Physics. While still in university, he first joined the team at Amp Robotics in 2016, where he worked on machine learning models to identify recyclables in video streams of trash in Materials Recovery Facilities (MRFs). Today, he is the Head of AI at Amp Robotics, where he is changing the economics of recycling through automation.
Robot Talk Episode 57 – Kate Devlin
Claire chatted to Kate Devlin from King’s College London about the social and ethical implications of robotics and AI.
Kate Devlin is Reader in Artificial Intelligence & Society in the Department of Digital Humanities, King’s College London. She is an interdisciplinary computer scientist investigating how people interact with and react to technologies, both past and future. Kate is the author of Turned On: Science, Sex and Robots, which examines the ethical and social implications of technology and intimacy. She is Creative and Outreach lead for the UKRI Responsible Artificial Intelligence UK programme — an international research and innovation ecosystem for responsible AI.