News


AI search robot uses 3D maps and internet knowledge to find lost items

A robot that can locate lost items on command, the latest development at the Technical University of Munich (TUM), combines knowledge from the internet with a spatial map of its surroundings to efficiently find the objects being sought. The new robot from Prof. Angela Schoellig's TUM Learning Systems and Robotics Lab looks like a broomstick on wheels with a camera mounted at the top. It is one of the first robots that not only integrates image understanding but also applies it to a clearly defined task.

Build enterprise-ready Agentic AI with DataRobot using NVIDIA Nemotron 3 Super 

With the arrival of NVIDIA Nemotron 3 Super, organizations now have access to a high-accuracy reasoning model purpose-built for collaborative, multi-agent enterprise workloads. Being fully open, Nemotron 3 Super can be customized and deployed securely anywhere. However, a powerful large language model (LLM) like Nemotron 3 Super is just the starting line. The real challenge is quickly turning that reasoning engine into a production-grade system your enterprise can trust for building AI agents and applications.

That is where DataRobot comes in. In this post, we will walk through how DataRobot’s Agent Workforce Platform, co-engineered with NVIDIA, makes it straightforward and quick to take Nemotron 3 Super from a standalone LLM to a fully deployed, evaluated, monitored, and governed production system that enterprises can trust when building their AI agents and applications. We will also explore why mastering each of these steps is critical to successfully deploying specialized agentic AI systems.

A great LLM alone isn’t enough

Nemotron 3 Super is a highly capable 120-billion-parameter hybrid Mamba-Transformer MoE model, optimized for enterprise multi-agent tasks like IT automation and supply chain orchestration, boasting a 1-million-token context window. However, the move from pilot to reliable production is challenging; MIT research shows 95% of GenAI pilots fail, not due to the model’s capabilities, but due to issues in the surrounding deployment infrastructure.

Before deploying any LLM for enterprise applications and agents, organizations must address five critical areas:

  1. Evaluation and Comparison: Thoroughly assess models on behavioral metrics (accuracy, hallucination) and operational metrics (cost, latency). Use LLM-as-a-judge techniques; proprietary, standard, or synthetic datasets; and comparative evaluations, often augmented with human review.
  2. Efficient Hosting/Inferencing: Implement scalable, reliable, and elastic hosting infrastructure to ensure continuity for the LLM at the core of Generative and Agentic AI systems.
  3. Observability: Continuously monitor the deployed model’s behavior, both standalone and within agents, with instrumentation to detect and alert on drifts from desired performance.
  4. Real-Time Intervention and Moderation: Establish strong guardrails for real-time intervention to prevent undesirable or toxic behavior, such as PII leakage, which could compound quickly across interactions.
  5. Governance, Security, and Compliance: Enforce rigorous governance via authentication, authorization, approval workflows for updates, and comprehensive testing and reporting against enterprise, industry, and regulatory compliance standards.

DataRobot’s Agent Workforce Platform, co-engineered with NVIDIA, provides a unified solution for all these challenges with NVIDIA Nemotron 3 Super.

Launch Nemotron 3 Super NIM on your infrastructure with a few clicks

Your AI team wants Nemotron 3 Super in production. Your security team wants hardened containers with signed images. Your compliance team wants an audit trail from day one. And you want all of this to run without a month of configuration and a stack of support tickets.

NVIDIA NIM microservices are available directly within the DataRobot platform, pre-configured and optimized for NVIDIA AI Infrastructure. For Nemotron 3 Super — which uses NVFP4 quantization to deliver high performance while keeping compute costs predictable — this means your deployment comes production-ready out of the box. No inference engine tuning. No GPU parameter research. No guesswork.


Here’s what the workflow looks like:

  • Browse and select. Open the NVIDIA NIM model gallery inside DataRobot. Each model comes with a clear description of its capabilities, supported GPU configurations, and resource requirements. Select Nemotron 3 Super and import it into your registry. DataRobot automatically tracks the version, tags it, and begins a full lineage record — so when your compliance team asks “which exact model version is running in production?”, the answer is already documented. 
  • Let the platform handle GPU sizing. DataRobot recommends the optimal GPU configuration for your deployment — whether you’re running on NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs or other supported hardware — so you can focus on testing rather than troubleshooting infrastructure. You don’t need to understand the model’s internal architecture to get this right. The platform matches the model to your hardware and tells you what to provision. If your AI team later asks why you chose a particular configuration, the recommendation is logged and auditable.
  • Deploy with one click. Select your configuration and deploy. Here’s what makes this different from downloading a model container and figuring out the rest yourself: DataRobot deploys the model with monitoring and access controls already wired in. There’s no separate step to “add observability later.” The moment your Nemotron 3 Super endpoint goes live, it’s already reporting health metrics, latency, throughput, and token consumption to your monitoring dashboard — giving you immediate visibility into how the deployment is performing.

Your AI team gets a live API endpoint they can start building immediately. You get a deployment that’s observable and auditable from minute one. 
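Calling the live endpoint follows the OpenAI-compatible chat-completions interface that NIM microservices expose. A minimal sketch of the request shape — the endpoint URL, API key, and model name below are hypothetical placeholders, not DataRobot-specific values:

```python
import json

# Hypothetical values: substitute your deployment's endpoint and credentials.
ENDPOINT = "https://example.datarobot.com/deployments/nemotron-3-super/v1/chat/completions"
API_KEY = "YOUR_API_KEY"

def build_chat_request(prompt: str, max_tokens: int = 512) -> dict:
    """Build an OpenAI-compatible chat-completions payload for the NIM endpoint."""
    return {
        "model": "nvidia/nemotron-3-super",  # model identifier is an assumption
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Summarize last quarter's IT incident tickets.")
print(json.dumps(payload, indent=2))

# To send it (requires the `requests` package and a live endpoint):
# requests.post(ENDPOINT, json=payload,
#               headers={"Authorization": f"Bearer {API_KEY}"})
```

Because the interface is OpenAI-compatible, existing client libraries and agent frameworks can point at the deployment with only a base-URL change.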

Multiple teams, one endpoint — without the free-for-all

Once Nemotron 3 Super is live, the next problem lands fast: multiple teams and applications all hitting the same deployment, with no way to prevent one team’s spike from degrading everyone else’s experience. Without controls, you’re back to fielding “why is the model so slow?” tickets.


DataRobot’s built-in quota management lets you set default access limits for each endpoint, then apply overrides for specific users, groups, or agents that need more (or less) capacity. Your production agent gets priority allocation; the experimentation team gets enough to stay productive without impacting production traffic. The platform enforces limits automatically — no more arbitrating access over email or diagnosing mystery slowdowns caused by a runaway agent on another team.
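The general mechanics of per-consumer rate limiting can be illustrated with a token-bucket sketch. This is an illustration of the technique, not DataRobot's implementation; the team names and limits are made up:

```python
import time

class TokenBucket:
    """Allow up to `rate` requests per second, with bursts up to `capacity`."""
    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# Per-team overrides on top of a default limit (values are hypothetical).
quotas = {
    "default": TokenBucket(rate=5, capacity=10),
    "prod-agent": TokenBucket(rate=50, capacity=100),  # priority allocation
}

def admit(team: str) -> bool:
    """Admit a request under the team's quota, falling back to the default."""
    return quotas.get(team, quotas["default"]).allow()
```

A burst from one consumer drains only its own bucket, which is why a runaway experiment cannot starve production traffic.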

Built-in cost visibility

Not every task needs the same level of reasoning — and Nemotron 3 Super is equipped with a configurable thinking budget that lets you match inference cost to task complexity. The difference is dramatic: on the Finance Reasoning Hard benchmark, Nemotron 3 Super at its highest thinking budget reaches ~86% accuracy but consumes over 1.4 million output tokens, while the lowest thinking setting still delivers ~74% accuracy on roughly 100,000 tokens — a 14x reduction in token spend, according to tests conducted by DataRobot. For straightforward classification or routing tasks, the low setting is more than enough. For complex financial analysis or multi-step reasoning, you dial it up.


This means you can run a single model across multiple use cases and tune the cost-accuracy tradeoff per task, rather than deploying separate models for simple versus complex workloads. DataRobot surfaces this through its monitoring dashboard — giving you and your leadership clear visibility into token consumption per team, and per deployment. When your CFO asks “what are we spending on AI inference?”, you’ll have the numbers ready.
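The arithmetic behind that tradeoff is easy to sanity-check. Using the benchmark figures from the text and a hypothetical price per million output tokens (the price is an assumption for illustration):

```python
# Benchmark figures from the text; the per-token price is a hypothetical assumption.
high = {"accuracy": 0.86, "tokens": 1_400_000}  # highest thinking budget
low = {"accuracy": 0.74, "tokens": 100_000}     # lowest thinking budget
price_per_m = 2.00  # hypothetical $ per 1M output tokens

for name, cfg in (("high budget", high), ("low budget", low)):
    cost = cfg["tokens"] / 1_000_000 * price_per_m
    print(f"{name}: ~{cfg['accuracy']:.0%} accuracy at ${cost:.2f} per run")

ratio = high["tokens"] / low["tokens"]
print(f"token reduction: {ratio:.0f}x")  # 14x, matching the text
```

A 12-point accuracy gain costs 14x the tokens here, which is why routing simple tasks to the low setting dominates on cost without a meaningful quality hit.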

Rigorous evaluation before production

Deployment without evaluation is a recipe for failure. DataRobot provides comprehensive evaluation capabilities that let you rigorously test Nemotron 3 Super before it reaches production.

LLM-as-a-Judge and out-of-the-box metrics

DataRobot’s evaluation framework spans the full range of metrics that matter:

  • Functional metrics and automated compliance tests measure correctness, faithfulness, relevance, bias, toxicity, etc., giving teams a rigorous, multi-dimensional view of model quality. 
  • Security and safety metrics provide real-time guards evaluating whether outputs comply with safety expectations — including detection of toxic language, PII exposure prevention, prompt-injection resistance, topic boundary adherence, and emotional tone classification.
  • Economic metrics track token usage and cost, ensuring that your Nemotron 3 Super deployment remains economically sustainable at scale.
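A minimal sketch of how such metrics feed an automated evaluation loop. The scoring function below is a toy token-overlap proxy standing in for a real faithfulness metric (a production setup would use an LLM-as-a-judge call instead); the dataset and threshold are made up:

```python
# Toy evaluation dataset (hypothetical).
test_set = [
    {"prompt": "What is our refund window?", "reference": "30 days"},
]

def faithfulness(answer: str, reference: str) -> float:
    """Toy proxy: fraction of reference tokens present in the answer."""
    ref, ans = set(reference.lower().split()), set(answer.lower().split())
    return len(ref & ans) / len(ref) if ref else 0.0

def evaluate(generate) -> dict:
    """Score a generation function over the test set and aggregate."""
    scores = [faithfulness(generate(ex["prompt"]), ex["reference"])
              for ex in test_set]
    return {"faithfulness": sum(scores) / len(scores), "n": len(scores)}

# A stub generator stands in for the deployed model here.
report = evaluate(lambda p: "Refunds are accepted within 30 days of purchase.")
assert report["faithfulness"] >= 0.5  # gate a CI pipeline on such thresholds
```

The same pattern — dataset, metric, aggregate, threshold — is what a CI/CD integration automates, failing the pipeline when a candidate model regresses.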

Playground comparison and the Evaluation API

DataRobot’s LLM Playground lets you set up side-by-side comparisons — running Nemotron 3 Super against other models, different prompt strategies, or alternative vector database configurations. You can configure up to three workflows at a time, run queries, and analyze results using LLM-as-a-judge alongside human-in-the-loop reviews with custom or synthetic test data.

For teams that want programmatic control, the Evaluation API supports the same full set of metrics, enabling automated evaluation pipelines that integrate with your existing CI/CD workflows.

Execution tracing for deep debugging

Evaluation without explainability is incomplete. DataRobot’s tracing capabilities expose the full execution path of every interaction: the sequence and latency of each step, the tools or functions invoked, and the inputs and outputs at each stage. This is especially important for Nemotron 3 Super-powered agents because the model’s reasoning capabilities — including its configurable reasoning trace — mean that understanding how the agent arrived at a result is as important as whether the result was correct.

Tracing extends relevant metrics like accuracy and latency to both the input and output of each step, enabling you to pinpoint exactly where an issue originated in a multi-step workflow. This visibility makes debugging faster, iteration safer, and refinement more confident.
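The core idea — record name, attributes, and latency for each step of a workflow — can be sketched with a small context manager (a conceptual illustration, not the platform's tracer; step names and attributes are invented):

```python
import time
from contextlib import contextmanager

TRACE: list[dict] = []

@contextmanager
def span(name: str, **attrs):
    """Record the name, attributes, and wall-clock latency of one step."""
    t0 = time.perf_counter()
    try:
        yield
    finally:
        TRACE.append({"step": name,
                      "ms": (time.perf_counter() - t0) * 1e3,
                      **attrs})

# Instrumenting a two-step agent workflow (names are illustrative).
with span("retrieve", query="Q3 revenue"):
    docs = ["doc-17", "doc-42"]
with span("generate", model="nemotron-3-super", n_docs=len(docs)):
    answer = "Q3 revenue grew 12%."

for s in TRACE:
    print(f"{s['step']:>10}: {s['ms']:.2f} ms")
```

With per-step records like these, a slow or incorrect answer can be attributed to the retrieval step versus the generation step rather than debugged end to end.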


Scalable deployment and production monitoring

Once evaluation confirms Nemotron 3 Super is performing as expected, DataRobot ensures it stays that way in production.

Scalable infrastructure management

The Agent Workforce Platform handles the operational complexity of running Nemotron 3 Super at enterprise scale. With NVIDIA AI Enterprise natively embedded, the platform manages containerization, resource allocation, and scaling automatically. Whether you’re handling hundreds or thousands of concurrent requests, the infrastructure adapts — scaling GPU resources up and down based on demand without requiring manual intervention.

For organizations with strict data sovereignty requirements, this extends to on-premises and air-gapped deployments using the NVIDIA AI Factory for Government reference architecture.

Continuous monitoring with out-of-the-box metrics

DataRobot’s observability framework delivers comprehensive visibility across health, quality, usage, and resource dimensions through a unified console:

  • Real-time performance & resource tracking monitors latency, throughput, token consumption, CPU utilization, memory, and concurrency across every deployment — with quota rates and alerts to catch degradation and enforce cost governance before either impacts users.
  • OTel tracing captures the full execution path of every system interaction — from initial prompt through each tool call, retrieval step, and model invocation — with timing and payload visibility at each node. Trace correlation links a quality degradation signal directly to the offending step, so root cause analysis takes minutes rather than hours.
  • Custom alerting lets you define thresholds across any metric and route notifications to your preferred channels, enabling proactive intervention rather than reactive firefighting.
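Threshold-based alerting boils down to comparing live metrics against declared limits. A minimal sketch with hypothetical thresholds — a real setup would route notifications to Slack, PagerDuty, or email rather than printing:

```python
# Hypothetical per-deployment thresholds.
THRESHOLDS = {"p95_latency_ms": 2000, "error_rate": 0.01, "tokens_per_min": 50_000}

def check(metrics: dict) -> list[str]:
    """Return the names of metrics that breached their thresholds."""
    return [name for name, limit in THRESHOLDS.items()
            if metrics.get(name, 0) > limit]

# A sample metrics snapshot (invented values).
alerts = check({"p95_latency_ms": 2450,
                "error_rate": 0.004,
                "tokens_per_min": 61_000})
for name in alerts:
    print(f"ALERT: {name} exceeded threshold")
```

Running such checks on every metrics interval is what turns passive dashboards into proactive intervention.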

The monitoring system works seamlessly across all deployment environments, providing a single pane of glass whether your NVIDIA Nemotron 3 Super NIM microservices are running in the cloud, on-premises, or in a hybrid configuration.

Enterprise governance and real-time intervention

Governance isn’t a checkbox at the end of a deployment — it’s an operational discipline that spans the entire model lifecycle. DataRobot provides governance capabilities across three critical dimensions for NVIDIA Nemotron 3 Super deployments.

Security risk governance

DataRobot enforces role-based access controls (RBAC) aligned with your organizational policies for all tools and enterprise systems that agents can access. This means your Nemotron 3 Super-powered agents interact only with the data and systems they’re explicitly authorized to use.

Robust, auditable approval workflows prevent unauthorized or unintended deployments and updates. Every change to the system — from prompt modifications to configuration updates — is tracked and requires appropriate authorization.

Operational risk governance with real-time intervention

This is where DataRobot’s capabilities become particularly critical. Beyond monitoring and alerting, the platform provides real-time moderation and intervention capabilities that can catch and address undesired inputs or outputs as they happen.

Multi-layer safety guardrails — including NVIDIA NeMo Guardrails for topic control, content safety, and jailbreak detection — operate in real time during model execution. You can configure these guardrails directly within the DataRobot Model Workshop, customizing thresholds and adding additional protections specific to your NVIDIA Nemotron 3 Super deployment.

Lineage and versioning

Lineage and versioning capabilities track every version of your NVIDIA Nemotron 3-powered AI system’s components (models, prompts, vector databases, and datasets), creating an auditable record of how decisions were made and preventing behavioral drift across deployments.

Regulatory risk governance

DataRobot supports validation against applicable regulatory frameworks — including the EU AI Act, NIST RMF, and country- or state-level guidelines — identifying risks including bias, hallucinations, toxicity, prompt injection, and PII leakage.

Automated compliance documentation is generated as part of the deployment process, reducing audit effort and manual work while ensuring your NVIDIA Nemotron 3 Super deployment maintains ongoing compliance as regulations evolve.


From model to impact

The NVIDIA Nemotron 3 family of open models represents a significant step forward for enterprise agentic AI. Nemotron 3 Super, with its high-accuracy reasoning optimized for collaborative multi-agent workloads, is purpose-built for the kind of enterprise applications that drive real business outcomes.

But the organizations that will succeed with Nemotron 3 Super are not the ones with the most impressive demos. They’re the ones that rigorously evaluate behavior, monitor systems continuously in production, and embed governance across the entire agent lifecycle. Reliability, safety, and scale are not accidental outcomes — they are engineered through disciplined metrics, observability, and control.

DataRobot’s Agent Workforce Platform, co-engineered with NVIDIA, provides the complete foundation to make that happen. From one-click deployment to comprehensive evaluation, from continuous monitoring to real-time governance — we make the hard part of enterprise AI manageable.

Ready to build with NVIDIA Nemotron 3 Super on DataRobot? Request a demo and see how quickly you can move from model to production.


Robots that learn everyday tasks can free humans from repetitive work

A robot task AI capable of learning and performing everyday repetitive tasks in a human-like manner has been developed. The AI learns tasks through human demonstrations and executes complex tasks step by step based on a hierarchical task execution framework. The technology is expected to contribute to the automation of labor-intensive repetitive work and reduce human workload in homes, offices, and retail and logistics environments.

A bicycle robot that can drive fast and jump over obstacles

Experienced human cyclists can perform a wide range of maneuvers and acrobatics while riding their bicycle, from balancing in place to riding on a single wheel or hopping over obstacles. Reproducing these agile maneuvers in two-wheeled robots could open new opportunities both for entertainment or robot sports and for the completion of complex missions in rough terrain.

Coding for underwater robotics

Screenshot from video showing underwater robotic vehicle. Credit: Tim Briggs/MIT Lincoln Laboratory.

During a summer internship at MIT Lincoln Laboratory, Ivy Mahncke, an undergraduate student of robotics engineering at Olin College of Engineering, took a hands-on approach to testing algorithms for underwater navigation. She first discovered her love for working with underwater robotics as an intern at the Woods Hole Oceanographic Institution in 2024. Drawn by the chance to tackle new problems and cutting-edge algorithm development, Mahncke began an internship with Lincoln Laboratory’s Advanced Undersea Systems and Technology Group in 2025. 

Mahncke spent the summer developing and troubleshooting an algorithm that would help a human diver and robotic vehicle collaboratively navigate underwater. The lack of traditional localization aids — such as the Global Positioning System, or GPS — in an underwater environment posed challenges for navigation that Mahncke and her mentors sought to overcome. Her work in the laboratory culminated in field tests of the algorithm on an operational underwater vehicle. Accompanying group staff to field test sites in the Atlantic Ocean, Charles River, and Lake Superior, Mahncke had the opportunity to see her software in action in the real world.

“One of the lead engineers on the project had split off to go do other work. And she said, ‘Here’s my laptop. Here are the things that you need to do. I trust you to go do them.’ And so I got to be out on the water as not just an extra pair of hands, but as one of the lead field testers,” Mahncke says. “I really felt that my supervisors saw me as the future generation of engineers, either at Lincoln Lab or just in the broader industry.”

Says Madeline Miller, Mahncke’s internship supervisor: “Ivy’s internship coincided with a rigorous series of field tests at the end of an ambitious program. We figuratively threw her right in the water, and she not only floated, but played an integral part in our program’s ability to hit several reach goals.”

Lincoln Laboratory’s summer research program runs from mid-May to August. Applications are now open. 

Video by Tim Briggs/MIT Lincoln Laboratory | 2 minutes, 59 seconds

Hybrid AI planner turns images into robot action plans

MIT researchers have developed a generative artificial intelligence-driven approach for planning long-term visual tasks, like robot navigation, that is about twice as effective as some existing techniques. Their method uses a specialized vision-language model to perceive the scenario in an image and simulate actions needed to reach a goal. Then a second model translates those simulations into a standard programming language for planning problems, and refines the solution.

Spatial Intelligence for Next-Generation Indoor Navigation

Powered by our proprietary geomagnetic engine and our hardware suite, ION combines high-precision positioning with accurate tracking of assets and people. That’s what allows people, assets, and autonomous systems to interact with space reliably, without heavy reliance on traditional infrastructure.

Poultry processing robotics advances with ChicGrasp

What started out as a response to labor shortages in poultry processing plants during the COVID-19 pandemic has turned into a robotics system that can learn by imitating human movements to handle chickens. Using an advanced imitation learning algorithm and camera perceptions, researchers with the Arkansas Agricultural Experiment Station have developed ChicGrasp, a dual-jaw robotic gripper with pinchers that can grasp a chicken carcass by the legs, lift and hang it on a shackle conveyor to be moved on for further processing.

Robot hands so sensitive they can grab a potato chip

A new type of robotic hand developed at The University of Texas at Austin demonstrates such sensitive touch that it can grasp objects as fragile as a potato chip or a raspberry without crushing them. The technology, called Fragile Object Grasping with Tactile Sensing (FORTE), combines advanced tactile sensing with soft robotics. The breakthrough could improve robot performance when a light touch is needed, such as in health care and manufacturing.

Restoring surgeons’ sense of touch with robotic fingertips

By Anthony King

Modern surgery has gone from long incisions to tiny cuts guided by robots and AI. In the process, however, surgeons have lost something vital: the chance to feel inside the body directly. Without palpation, it becomes harder to detect tissue abnormalities during an operation.

A group of surgeons and engineers across Europe is now trying to bring back this vital aspect of surgery.

Working within an EU-funded research collaboration called PALPABLE, they are developing a soft robotic “fingertip” that can sense how firm or soft tissue is during minimally invasive and robotic surgery. The research runs until the end of 2026, with a first prototype expected to be tested by surgeons around March 2026.

By combining optical sensing, soft robotics and AI, the team is designing a probe that mimics the way a fingertip presses and feels during surgery. It would gently probe organs and create a visual map of tissue stiffness, displayed on a screen to guide surgeons as they operate.

Losing the surgeon’s touch

For many surgeons, the loss of direct touch has been one of the quiet trade-offs of modern surgery.

Sooner or later, I believe the vast majority of surgeries will be robotic.

Prof Alberto Arezzo, PALPABLE

“We started 30 years ago with open surgery and using our fingers,” said Professor Alberto Arezzo from the University of Turin, Italy. He specialises in minimally invasive and robotic surgery and mostly treats patients with colorectal cancer.

“Then we moved into the era of keyhole surgery, which reduced tactile feedback because we began to use long instruments,” he said.

From the 1990s, keyhole surgery became increasingly common, allowing surgeons to operate through small incisions with the help of a camera. Patients benefited from less trauma, shorter hospital stays and faster recovery.

But this came at the expense of physical touch. That matters because tumours often feel different from healthy tissue – stiffer, less pliable or irregular – important differences that experienced hands can detect.

Finding tumour margins

When operating on cancer, surgeons walk a fine line: remove too much tissue and function may suffer; remove too little and cancer may remain, and then spread again, requiring more surgery. 

“We don’t want to do that. We want it done in one shot,” said Dr Gadi Marom at Hadassah Medical Centre in Jerusalem, one of the clinicians involved in the research, who specialises in minimally invasive and robotic surgery on patients with stomach and oesophagus diseases. 

This is where sensing technology could help. By translating physical contact into visual information, such as a colour-coded map showing softer and firmer areas, surgeons could regain a functional equivalent of touch. 

“With a new instrument, we want to be able to determine the margins around a tumour,” said Marom.

Using light to feel

To do that, engineers on the team are turning to light.

The probe they are developing contains fibre-optic cables embedded in a soft, flexible tip. When pressed against tissue, the tip deforms and the light travelling through the fibres changes.

“A silicone dome presses against soft tissue, allowing us to map both the direction and the magnitude of the applied force,” explained Dr Georgios Violakis at Hellenic Mediterranean University in Heraklion, Crete. 

Those tiny shifts in light intensity and wavelength are then translated into information about tissue stiffness.

In the lab, the team has already built and calibrated early versions of the soft membrane and light-based sensors, with partners contributing across the system.

Queen Mary University of London (UK) is helping design and refine the membranes, the Fraunhofer Institute (Germany) is developing the functional films, while Bendabl (Greece), Tech Hive Labs (Greece) and the University of Essex (UK) are advancing the software needed to visualise stiffness and tactile maps.

The prototype will be validated in lab tests before it is used on patients.

The bottom line is that we will be able to give better care to our patients.

Dr Gadi Marom, PALPABLE

The fibre‑optic cables are each about the width of a human hair. Similar sensing technology has long been used to detect small movements in large structures such as aircraft, skyscrapers and nuclear reactors. Here it is being applied on a much smaller scale to detect subtle differences in human tissue. 

“For touching organs inside an anaesthetised patient, the device needs to be both highly accurate and high resolution,” said Professor Panagiotis Polygerinos, a soft robotics researcher at Hellenic Mediterranean University. 

“Something like this might have been possible sooner, but the technology would have been far more expensive and less precise, making it impractical for clinical use.” 

Bringing touch to robots

As surgery grows increasingly robotic, the loss of tactile feedback is becoming more pressing – and restoring a sense of touch even more vital.

“When I operate with a robot I have the advantage of 3D vision,” said Marom. “And I don’t have to stand for the entire surgery.” That matters in long procedures, such as removing a patient’s oesophagus, which can take up to eight hours.

Robotic surgery also raises new possibilities. Marom hopes it may eventually allow surgeons, in carefully selected cases, to remove small tumours from the oesophagus without removing the entire organ.

But there is a downside.

“In robotic surgery, tactile feedback is largely absent,” said Arezzo. “That’s why this work is so important.”

Both surgeons believe robotics will continue to expand in operating theatres, but only if surgeons are given better sensory information.

“Sooner or later, I believe the vast majority of surgeries will be robotic,” said Arezzo.

For Marom, working closely with engineers has been essential. “I am exposed to soft robotics and many new technologies,” he said. “I see how new instruments can be developed.”

“The bottom line is that we will be able to give better care to our patients,” he added.

Research in this article was funded by the EU’s Horizon Programme. The views of the interviewees don’t necessarily reflect those of the European Commission. If you liked this article, please consider sharing it on social media.


This article was originally published in Horizon, the EU Research and Innovation magazine.

When Factory Logistics Falls Out of Sync with Production: Three Operational Turning Points in Automotive and Chemical Industries

As global manufacturing continues to upgrade, capacity expansion and automation are advancing in parallel. Yet as production takt time accelerates, a critical question emerges: Can factory logistics reliably sustain higher operational intensity?

Sneaker-sized ‘Electronic Dolphin’ robot could transform oil spill cleanup

RMIT University engineers in Australia have built a remote-controlled minibot that hoovers up oil spills using an innovative filtering system inspired by sea urchins. Oil spills are still a serious problem around the world. They can badly damage oceans and coasts, kill or injure sea animals and birds, and cost billions of dollars to clean up and repair the damage.