
Vote-based model developed for more accurate hand-held object pose estimation

Many robotic applications rely on robotic arms or hands to handle different types of objects. Estimating the pose of such hand-held objects is an important yet challenging task in robotics, computer vision and even augmented reality (AR). A promising direction is to utilize multi-modal data, such as color (RGB) and depth (D) images. With the increasing availability of 3D sensors, many machine learning approaches have emerged that leverage such multi-modal data.

Robot Talk Episode 119 – Robotics for small manufacturers, with Will Kinghorn

Claire chatted to Will Kinghorn from Made Smarter about how to increase adoption of new tech by small manufacturers.

Will Kinghorn is an automation and robotics specialist for the Made Smarter Adoption Programme in the UK. With a background as a chartered manufacturing engineer in the aerospace industry, Will has extensive experience in developing and implementing automation and robotic solutions. He now works with smaller manufacturing companies, assessing their needs, identifying suitable technologies, and guiding them through the adoption process. Last year he released a book called ‘Digital Transformation in Your Manufacturing Business – A Made Smarter Guide’.

Multi-agent path finding in continuous environments

By Kristýna Janovská and Pavel Surynek

Imagine if all of our cars could drive themselves – autonomous driving is becoming possible, but to what extent? Getting a vehicle somewhere by itself may not seem so tricky if the route is clear and well defined, but what if there are more cars, each trying to get to a different place? And what if we add pedestrians, animals and other unaccounted-for elements? This problem has recently been studied intensively, and is already applied in scenarios such as warehouse logistics, where a group of robots moves boxes around a warehouse, each robot with its own goal, all moving while making sure not to collide and keeping their routes – paths – as short as possible. But how do we formalize such a problem? The answer is MAPF – multi-agent path finding [Silver, 2005].

Multi-agent path finding describes a problem where we have a group of agents – robots, vehicles or even people – who are each trying to get from their starting positions to their goal positions all at once without ever colliding (being in the same position at the same time).

Typically, this problem has been solved on graphs. Graphs are structures that simplify an environment to its focal points and the interconnections between them. These points, called vertices, can represent, for example, coordinates. They are connected by edges, which link neighbouring vertices and represent the distances between them.

If, however, we are trying to solve a real-life scenario, we strive to get as close to simulating reality as possible. A discrete representation (using a finite number of vertices) may therefore not suffice. But how do we search an environment that is continuous – one with, in effect, an infinite number of vertices connected by edges of infinitesimal length?

This is where sampling-based algorithms come into play. Algorithms such as RRT* [Karaman and Frazzoli, 2011], which we used in our work, randomly select (sample) coordinates in the coordinate space and use them as vertices. The more points that are sampled, the more accurate the representation of the environment. Each newly sampled vertex is connected to the nearby vertex that minimizes the length of the path from the starting point to the new vertex, where a path is a sequence of vertices and its length is the sum of the lengths of the edges between them.
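To make the sampling idea concrete, here is a minimal, obstacle-free Python sketch of plain RRT (without the rewiring step that gives RRT* its asymptotic optimality); all function names and parameters are illustrative, not the authors' implementation:

```python
import math
import random

def rrt_plan(start, goal, bounds, step=0.5, iters=2000, goal_tol=0.5, seed=0):
    """Minimal RRT sketch in an obstacle-free 2D space.

    parents maps each vertex to its predecessor in the tree, so a path
    can be recovered by walking back from the vertex that reaches the
    goal. RRT* would additionally rewire the tree so that each vertex
    keeps the cheapest known parent.
    """
    rng = random.Random(seed)
    parents = {start: None}
    for i in range(iters):
        # Occasionally sample the goal itself to bias growth toward it.
        q = goal if i % 20 == 0 else (rng.uniform(*bounds[0]),
                                      rng.uniform(*bounds[1]))
        near = min(parents, key=lambda v: math.dist(v, q))
        d = math.dist(near, q)
        if d == 0:
            continue
        # Extend the tree a fixed step from the nearest vertex toward q.
        new = (near[0] + step * (q[0] - near[0]) / d,
               near[1] + step * (q[1] - near[1]) / d)
        parents[new] = near
        if math.dist(new, goal) < goal_tol:
            path, v = [], new
            while v is not None:  # walk back to the start
                path.append(v)
                v = parents[v]
            return path[::-1]
    return None

path = rrt_plan((0.0, 0.0), (9.0, 9.0), bounds=((0.0, 10.0), (0.0, 10.0)))
```

The returned path is exactly the vertex sequence described above; a real planner would also reject samples and edges that collide with obstacles.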

Figure 1: Two examples of paths connecting starting positions (blue) and goal positions (green) of three agents. Once an obstacle is present, agents plan smooth curved paths around it, successfully avoiding both the obstacle and each other.

We can get a close-to-optimal path this way, though one problem remains. Paths created this way are still somewhat bumpy, as the transitions between segments of a path are sharp. If a vehicle were to take such a path, it would have to turn in place each time it reaches the end of a segment, as some robotic vacuum cleaners do when moving around. This slows a vehicle or robot down significantly. We can solve this by smoothing the paths, so that the transitions are no longer sharp but smooth curves. Robots or vehicles can then travel along them without stopping or slowing down significantly whenever they need to turn.

Our paper [Janovská and Surynek, 2024] proposed a method for multi-agent path finding in continuous environments, where agents move along sets of smooth paths without colliding. Our algorithm is inspired by Conflict-Based Search (CBS) [Sharon et al., 2014]. Our extension to continuous space, called Continuous-Environment Conflict-Based Search (CE-CBS), works on two levels:

Figure 2: Comparison of paths found with discrete CBS algorithm on a 2D grid (left) and CE-CBS paths in a continuous version of the same environment. Three agents move from blue starting points to green goal points. These experiments are performed in the Robotic Agents Laboratory at Faculty of Information Technology of the Czech Technical University in Prague.

Firstly, each agent searches for a path individually, using the RRT* algorithm as described above. The resulting path is then smoothed using B-spline curves – piecewise polynomial curves fitted to the vertices of the path. This removes sharp turns and makes the path easier for a physical agent to traverse.
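The smoothing step can be illustrated with Chaikin corner cutting, a simple scheme that converges to a quadratic B-spline; this is a stand-in for illustration, as the paper's exact B-spline construction may differ:

```python
def chaikin(points, iterations=3):
    """Chaikin corner cutting: each pass replaces every corner with two
    points at 1/4 and 3/4 along the adjoining segments. Repeated passes
    converge to a quadratic B-spline, so the sharp turns of an RRT path
    become smooth arcs. Endpoints are kept fixed."""
    for _ in range(iterations):
        smoothed = [points[0]]
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            smoothed.append((0.75 * x0 + 0.25 * x1, 0.75 * y0 + 0.25 * y1))
            smoothed.append((0.25 * x0 + 0.75 * x1, 0.25 * y0 + 0.75 * y1))
        smoothed.append(points[-1])
        points = smoothed
    return points

# A path with two 90-degree corners, as RRT might produce:
smooth = chaikin([(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (4.0, 2.0)])
```

Each pass doubles the number of vertices while rounding off the corners, trading a slightly longer vertex list for a path a physical robot can follow at speed.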

Individual paths are then passed to the higher level of the algorithm, where paths are compared and conflicts are found. A conflict arises if two agents (represented as rigid circular bodies) overlap at any time. If so, constraints are created that forbid one of the agents from passing through the conflicting space during the time interval in which the conflict occurs. Both options – constraining one agent or the other – are tried: a tree of possible constraint settings and their solutions is constructed and expanded with each conflict found. When a new constraint is added, the affected agents re-plan their paths to avoid the constrained time and space. The paths are then checked again for validity, and this repeats until a conflict-free solution, which aims to be as short as possible, is found.
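A sketch of the high-level conflict check might look as follows, with a hypothetical timed-waypoint path representation and time sampled at a fixed resolution (the actual CE-CBS conflict detection is described in the paper):

```python
import math

def position(path, t):
    """Linearly interpolate an agent's position along a timed path,
    given as (time, x, y) waypoints sorted by time."""
    if t <= path[0][0]:
        return (path[0][1], path[0][2])
    for (t0, x0, y0), (t1, x1, y1) in zip(path, path[1:]):
        if t0 <= t <= t1:
            a = (t - t0) / (t1 - t0)
            return (x0 + a * (x1 - x0), y0 + a * (y1 - y0))
    return (path[-1][1], path[-1][2])

def first_conflict(path_a, path_b, radius, dt=0.1):
    """Return the first sampled time at which two circular agents of
    the given radius overlap, or None if they never do."""
    t_end = max(path_a[-1][0], path_b[-1][0])
    for i in range(round(t_end / dt) + 1):
        t = i * dt
        if math.dist(position(path_a, t), position(path_b, t)) < 2 * radius:
            return t
    return None

# Two agents whose straight-line routes cross at (2, 0):
path_a = [(0.0, 0.0, 0.0), (4.0, 4.0, 0.0)]   # moves along the x-axis
path_b = [(0.0, 2.0, -2.0), (4.0, 2.0, 2.0)]  # moves along the line x = 2
t = first_conflict(path_a, path_b, radius=0.3)
```

Once such a time is found, the high level would branch, constraining one agent or the other away from the conflicting space-time region and re-planning.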

This way, agents can effectively move without losing speed while turning and without colliding with each other. Although there are environments such as narrow hallways where slowing down or even stopping may be necessary for agents to safely pass, CE-CBS finds solutions in most environments.

This research is supported by the Czech Science Foundation, 22-31346S.

You can read our paper here.

References

Scientists use virtual reality for fish to teach robots how to swarm

Fish are masters of coordinated motion. Schools of fish have no leader, yet individuals manage to stay in formation, avoid collisions, and respond with liquid flexibility to changes in their environment. Reproducing this combination of robustness and flexibility has been a long-standing challenge for human-engineered systems like robots.

‘Robotability score’ ranks NYC streets for future robot deployment

For delivery robots, not all sidewalks are created equal—some are uneven or clogged with people and bus shelters—so researchers at Cornell Tech developed a "robotability score" and rated every street in New York City on how hospitable it would be to robots.

Forecast demand with precision using advanced AI for SAP IBP


Supply chain leaders know the pain of misaligned inventory. A high-demand product sells out just as a promotion goes live. Meanwhile, pallets of obsolete stock gather dust in the warehouse. 

These aren’t just planning hiccups — they’re costly missteps that eat into margins, tie up working capital, and erode customer trust.

But what if your forecasts could finally keep pace with today’s volatility? 

Imagine accurately anticipating demand – even when trends shift overnight, new SKUs launch, or external shocks like tariffs, inflation, or weather disrupt the plan.

If you’re already using SAP Integrated Business Planning (IBP), you’re in a strong position to take forecasting further. By layering advanced AI into your SAP IBP workflows, you can dramatically improve forecast accuracy and planning agility—without changing how your teams work.

The forecast gap that’s costing you millions


No matter your industry, the margin for error in demand planning is razor-thin:

  • Retailers lose revenue and customer loyalty when popular products sell out, while slow-moving stock ties up capital.
  • Distributors face freight chaos and missed service levels during seasonal peaks, unable to predict and respond quickly to shifting volumes.
  • CPG brands risk under- or over-producing when promotions or regional shifts aren’t accounted for.
  • All planners struggle with cold starts: new or low-history SKUs that traditional models can’t forecast reliably. 


These challenges are only intensifying.

Economic volatility, shifting consumer behaviors, tariffs, and climate-related disruptions introduce new variables that make demand harder to predict — and even harder to respond to in time. 

Even in robust, enterprise platforms like SAP IBP, traditional forecasting models often rely too heavily on limited historical data and rigid assumptions. They struggle to adapt quickly to new signals or scale across complex planning environments with thousands of SKUs, channels, and geographic regions.

Staying competitive requires more than minor tuning. It calls for intelligent systems that can learn from change, adapt to complexity, and deliver insights at scale.

That’s where the DataRobot Demand Planning App adds value. 

It enhances your SAP IBP workflows with AI to provide a more agile, intelligent, and precise forecasting engine.

Add demand planning precision to your SAP IBP 


The DataRobot Demand Planning App integrates directly with SAP IBP, infusing your existing planning environment with advanced AI forecasting. It works within your existing workflows, requiring no complex implementation, retraining, or disruption to planning processes. 

Here’s how it helps:

  • Improve accuracy where it matters most: Confidently forecast volatile demand patterns, including new product launches, promotional spikes, and macroeconomic disruptions.
  • Right-size inventory and reduce waste: Align supply chain with actual demand to minimize stockouts and overstocks, freeing up capital and reducing excess carrying costs.
  • Drive adoption with seamless integration: Forecasts appear directly within SAP IBP, so planners don’t need to learn a new tool or adjust their process.
  • Accelerate impact without overwhelming AI teams: Prebuilt templates automate data prep, model training, and deployment so your data science resources can stay focused on strategic priorities.

How AI is changing demand planning   


Organizations across industries are already using DataRobot for demand planning to deliver measurable ROI and reduce forecast risk:

  • Global fashion retailer: Improved forecast accuracy by 9% at the SKU level, saving hundreds of thousands from overstock losses through better inventory alignment.
  • European CPG company: Maintained 99% stock availability and reduced held stock by £500K per month, freeing up working capital and improving service levels.
  • Fortune 100 retailer: Increased Black Friday forecast accuracy to 95%, saving $2M per week through optimized workforce and inventory planning.
  • Fortune 500 tech company: Predicted sell-out volumes four weeks in advance, enabling more strategic inventory placement around retailer promotions. 

These aren’t edge cases. They’re evidence that the right AI solution, when integrated into the tools your team already uses, can deliver fast and sustainable value.

What sets the Demand Planning App apart


The DataRobot Demand Planning App wasn’t built to replace SAP IBP. It was built to enhance and extend it. It complements your existing investment with differentiated AI capabilities purpose-built for the real-world complexity of demand planning.

Why it’s different:

  • Prebuilt, customizable forecasting templates designed for cold starts, promotions, seasonality, and high-SKU environments help you get started quickly and adjust as needed without starting from scratch.
  • The app adapts in near real time, providing flexibility to easily incorporate broader factors from other data sources to identify the impact of emerging trends, geographic variation, and external factors like weather, inflation, or regulatory changes.
  • Built-in monitoring, drift detection, version control, and auditability support enterprise-grade accuracy, compliance, and governance.
  • Forecast outputs flow directly into SAP IBP, so planners can see and use improved forecasts in the tools they already know, without retraining or manual work. 

Four ways to strengthen demand planning in a volatile market 


For supply chain and AI leaders looking to reduce risk, drive agility, and future-proof your operations, here are four practical actions to take now:

  1. Incorporate external signals: Go beyond historical data by integrating real-time signals like inflation, weather, or social sentiment to anticipate shifts in demand.
  2. Tackle cold starts with proxy data: Use intelligent modeling to accurately forecast demand for new SKUs so they don’t become blind spots in your planning process.
  3. Align on shared KPIs: Strengthen collaboration by ensuring supply chain, finance, and AI teams are working toward common goals and measurable outcomes.
  4. Automate what slows you down: Improve accuracy and speed by automating data prep and model deployment using tools designed for scale and repeatability.

Ready to enhance SAP IBP with AI?


In volatile markets, every planning decision carries more weight. Forecasting missteps can ripple across your entire supply chain, eroding trust, tying up cash, and compromising customer satisfaction.

The DataRobot Demand Planning App enhances SAP IBP with advanced AI-driven forecasting that learns, adapts, and scales as fast as your business moves — without disrupting the systems or workflows you rely on.

Want to see what this could look like in your SAP IBP environment? Schedule a demo.

The post Forecast demand with precision using advanced AI for SAP IBP appeared first on DataRobot.

Flying squirrel-inspired drone with foldable wings demonstrates high maneuverability

Unmanned aerial vehicles (UAVs), commonly known as drones, have already proved to be valuable tools for a wide range of applications, ranging from film and entertainment production to defense and security, agriculture, logistics, construction and environmental monitoring. While these technologies are already widely used in many countries worldwide, engineers have been trying to enhance their capabilities further so that they can be used to tackle even more complex problems.

Interview with Yuki Mitsufuji: Improving AI image generation


Yuki Mitsufuji is a Lead Research Scientist at Sony AI. Yuki and his team presented two papers at the recent Conference on Neural Information Processing Systems (NeurIPS 2024). These works tackle different aspects of image generation and are entitled: GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping and PaGoDA: Progressive Growing of a One-Step Generator from a Low-Resolution Diffusion Teacher. We caught up with Yuki to find out more about this research.

There are two pieces of research we’d like to ask you about today. Could we start with the GenWarp paper? Could you outline the problem that you were focused on in this work?

The problem we aimed to solve is called single-shot novel view synthesis, which is where you have one image and want to create another image of the same scene from a different camera angle. There has been a lot of work in this space, but a major challenge remains: when an image angle changes substantially, the image quality degrades significantly. We wanted to be able to generate a new image based on a single given image, as well as improve the quality, even in very challenging angle change settings.

How did you go about solving this problem – what was your methodology?

The existing works in this space tend to take advantage of monocular depth estimation, which means only a single image is used to estimate depth. This depth information enables us to change the angle and change the image according to that angle – we call it “warp.” Of course, there will be some occluded parts in the image, and there will be information missing from the original image on how to create the image from a new angle. Therefore, there is always a second phase where another module can interpolate the occluded region. Because of these two phases, in the existing work in this area, geometrical errors introduced in warping cannot be compensated for in the interpolation phase.

We solve this problem by fusing everything together. Rather than taking a two-phase approach, we do it all at once in a single diffusion model. To preserve the semantic meaning of the image, we created another neural network that extracts semantic information from the given image as well as monocular depth information. We inject this into the main base diffusion model using a cross-attention mechanism. Since warping and interpolation are done in one model, and the occluded parts can be reconstructed well together with the injected semantic information, the overall quality improved. We saw improvements in image quality both subjectively and objectively, using metrics such as FID and PSNR.
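As a rough illustration of the cross-attention mechanism mentioned here (a generic sketch, not Sony's actual architecture), a single scaled dot-product cross-attention step takes queries from one feature stream and keys/values from another:

```python
import numpy as np

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention. In a GenWarp-style setup the
    queries would come from the base diffusion model's features and the
    keys/values from the auxiliary network's semantic + depth features
    (shapes here are toy-sized for illustration)."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ values

# Identical queries against orthonormal keys give uniform weights,
# so each output row is the mean of the value rows.
out = cross_attention(np.ones((2, 4)), np.eye(4), np.arange(16.0).reshape(4, 4))
```

The attention weights determine how much of the injected semantic/depth information flows into each location of the base model's features.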

Can people see some of the images created using GenWarp?

Yes, we actually have a demo, which consists of two parts. One shows the original image and the other shows the warped images from different angles.

Moving on to the PaGoDA paper, here you were addressing the high computational cost of diffusion models? How did you go about addressing that problem?

Diffusion models are very popular, but it’s well-known that they are very costly for training and inference. We address this issue by proposing PaGoDA, our model which addresses both training efficiency and inference efficiency.

It’s easy to talk about inference efficiency, which directly connects to the speed of generation. Diffusion usually takes a lot of iterative steps towards the final generated output – our goal was to skip these steps so that we could quickly generate an image in just one step. People call it “one-step generation” or “one-step diffusion.” It doesn’t always have to be one step; it could be two or three steps, for example, “few-step diffusion”. Basically, the target is to solve the bottleneck of diffusion, which is a time-consuming, multi-step iterative generation method.

In diffusion models, generating an output is typically a slow process, requiring many iterative steps to produce the final result. A key trend in advancing these models is training a “student model” that distills knowledge from a pre-trained diffusion model. This allows for faster generation—sometimes producing an image in just one step. These are often referred to as distilled diffusion models. Distillation means that, given a teacher (a diffusion model), we use this information to train another one-step efficient model. We call it distillation because we can distill the information from the original model, which has vast knowledge about generating good images.
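The distillation idea can be sketched with a toy example (illustrative only, not PaGoDA's training procedure): a "teacher" that needs many iterative refinement steps is matched by a one-step "student" fit to reproduce its outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def teacher(z, steps=100):
    """Stand-in for a pretrained diffusion sampler: refines noise z
    toward a target over many small steps (here the target is just
    2*z; a real teacher runs a learned denoiser at every step)."""
    x = z.copy()
    for _ in range(steps):
        x += (2.0 * z - x) / steps  # one small refinement step
    return x

# Distillation: fit a one-step student to reproduce the teacher's
# many-step outputs. The student here is linear, so least squares
# suffices; real distilled models are networks trained by SGD.
Z = rng.standard_normal((256, 8))
X = np.stack([teacher(z) for z in Z])
W, *_ = np.linalg.lstsq(Z, X, rcond=None)

z_new = rng.standard_normal(8)
one_step = z_new @ W           # a single matrix multiply
many_step = teacher(z_new)     # 100 iterative steps
```

Because this toy teacher happens to be linear in its input, the student matches it almost exactly; the point is only that the student collapses an iterative procedure into one evaluation.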

However, both classic diffusion models and their distilled counterparts are usually tied to a fixed image resolution. This means that if we want a higher-resolution distilled diffusion model capable of one-step generation, we would need to retrain the diffusion model and then distill it again at the desired resolution.

This makes the entire pipeline of training and generation quite tedious. Each time a higher resolution is needed, we have to retrain the diffusion model from scratch and go through the distillation process again, adding significant complexity and time to the workflow.

The uniqueness of PaGoDA is that we train across different resolution models in one system, which allows it to achieve one-step generation, making the workflow much more efficient.

For example, if we want to distill a model for 128×128 images, we can do that. But for another scale, say 256×256, the teacher would have to be trained on 256×256, and extending to even higher resolutions would require doing this multiple times. This can be very costly, so to avoid it we use the idea of progressive growing training, which has already been studied for generative adversarial networks (GANs), but not so much in the diffusion space. The idea is that, given a teacher diffusion model trained on 64×64, we can distill information and train a one-step model for any resolution. For many resolutions we obtain state-of-the-art performance using PaGoDA.

Could you give a rough idea of the difference in computational cost between your method and standard diffusion models? What kind of saving do you make?

The idea is very simple – we just skip the iterative steps. It is highly dependent on the diffusion model you use, but typical standard diffusion models historically used about 1,000 steps, and modern, well-optimized diffusion models require 79 steps. With our model that goes down to one step, so we are looking at roughly 80 times faster, in theory. Of course, it all depends on how you implement the system, and if there’s a parallelization mechanism on chips, people can exploit it.

Is there anything else you would like to add about either of the projects?

Ultimately, we want to achieve real-time generation, and not just have this generation be limited to images. Real-time sound generation is an area that we are looking at.

Also, as you can see in the animation demo of GenWarp, the images change rapidly, making it look like an animation. However, the demo was created with many images generated with costly diffusion models offline. If we could achieve high-speed generation, let’s say with PaGoDA, then theoretically, we could create images from any angle on the fly.

Find out more:

About Yuki Mitsufuji

Yuki Mitsufuji is a Lead Research Scientist at Sony AI. In addition to his role at Sony AI, he is a Distinguished Engineer for Sony Group Corporation and the Head of Creative AI Lab for Sony R&D. Yuki holds a PhD in Information Science & Technology from the University of Tokyo. His groundbreaking work has made him a pioneer in foundational music and sound work, such as sound separation and other generative models that can be applied to music, sound, and other modalities.

High-wire act: Soft robot can carry cargo up and down steep aerial wires

Researchers have created a light-powered soft robot that can carry loads through the air along established tracks, similar to cable cars or aerial trams. The soft robot operates autonomously, can climb slopes at angles of up to 80 degrees, and can carry loads up to 12 times its weight.