NRL Demonstrates First Reinforcement Learning Robot Control in Space

Wed, 09/24/2025

Using NVIDIA Omniverse, Naval researchers trained Astrobee to perform zero-gravity maneuvers with real-world precision.

Written by: Ross Gianfortune

Samantha Chapin, U.S. Naval Research Laboratory (NRL) space roboticist, performs a reinforcement learning control test on free-flyer robots aboard the International Space Station from NRL headquarters in Washington, D.C., May 27, 2025. Photo Credit: U.S. Navy photo by Sarah Peterson

A Naval Research Laboratory (NRL) research team successfully conducted the first reinforcement learning (RL) control of a free-flying robot in space. In a three-month sprint, an NRL team of scientists demonstrated that RL algorithms can control a robot in the zero-gravity environment of the International Space Station (ISS) using the Astrobee robotic platform.

“The [Autonomous Planning In-space Assembly Reinforcement-learning free-flYer (APIARY)] team’s achievement will allow us to rapidly adapt robotic systems to new tasks and environments,” said NRL Senior Scientist for Robotics and Autonomous Systems Glen Henshaw in an NRL press release. “What has previously taken years, took only three months, proving that further advanced development is on the horizon. This swift innovation means more affordable robotic solutions for the U.S. Navy.”

The team concentrated on foundational robotic maneuvers critical for In-Space Assembly, Manufacturing and Servicing (ISAM) missions: undocking, translation, rotation and docking. Undocking and docking are central to these operations, according to NRL, enabling robots to become free-flying, perform tasks and then reliably re-dock for essential recharging, maintenance or data transfer.

“This research is significant because it marks, to our knowledge, the first autonomous robotic control in space using reinforcement learning algorithms,” NRL Computer Research Scientist Kenneth Stewart said in the press release. “We believe this breakthrough will build confidence in these algorithms for space applications and generate further interest in expanding this research.”

Simulating Space with NVIDIA Omniverse

NRL used NVIDIA’s Omniverse platform to simulate a high-fidelity environment capable of modeling zero-gravity conditions, according to members of the research team.

“NVIDIA Omniverse provided a highly accurate physics simulator that precisely modeled zero-gravity, including effectively ‘turning off’ gravity, which let us train our system in a simulated zero-gravity environment,” Stewart told GovCIO Media & Research. “This let the RL policy transfer directly to on-orbit Astrobee control, effectively bridging the sim-to-real gap.”

Stewart emphasized the platform’s ability to run vast numbers of simulations in parallel, allowing the RL system to explore a wide range of dynamics.

“Omniverse allows thousands to hundreds of thousands of simulations of a task to be conducted in parallel. This lets the reinforcement learning system ‘explore’ situations with different dynamics — such as differences in mass, friction, lighting and so on — so that when the robot encounters those differences in the real world it is ready for them.”
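The practice Stewart describes is commonly called domain randomization. As a rough illustration only, and not NRL's or NVIDIA's actual code, the sketch below draws a different set of physical and visual parameters for each parallel simulation. The `SimParams` fields, the numeric ranges and the function names are all hypothetical, and no Omniverse API is used.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class SimParams:
    """One simulated environment's randomized settings (hypothetical fields)."""
    mass_kg: float
    friction: float
    light_intensity: float


def sample_params(rng: random.Random) -> SimParams:
    # Ranges are illustrative assumptions, not values from the NRL project.
    return SimParams(
        mass_kg=rng.uniform(8.0, 12.0),
        friction=rng.uniform(0.0, 0.2),
        light_intensity=rng.uniform(0.5, 1.5),
    )


def make_parallel_envs(num_envs: int, seed: int = 0) -> List[SimParams]:
    """Return one independently randomized parameter set per environment."""
    rng = random.Random(seed)
    return [sample_params(rng) for _ in range(num_envs)]


# Each training episode would run against a different draw, so the policy
# cannot overfit to a single mass, friction or lighting condition.
envs = make_parallel_envs(4096)
```

A policy trained across all of these draws has to succeed for the whole range of conditions, which is what makes it more likely to cope with the mismatch between the simulator and the real robot.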

Despite these advances, Stewart acknowledged the challenges of deploying RL in space, including the compute power the algorithms need. He said that limitations remain in sim-to-real transfer for future RL deployments.

“RL algorithms often require more processing and memory capacity than spacecrafts usually have, and there are no good ways to mathematically prove that RL algorithms work correctly, which is usually a prerequisite for spacecraft control algorithms,” said Stewart. “The opportunity to try new algorithms out on spacecraft just to see how well they work is quite rare, because spacecraft and time on a spacecraft are so valuable.”

Safety Measures for RL in Orbit

To ensure safe operation of Astrobee using RL, the team implemented multiple safeguards to adapt to unexpected mission conditions without compromising safety or asset integrity, NRL Computer Research Scientist Roxana Leontie told GovCIO Media & Research.

“The RL training incorporated negative rewards (penalties) for high linear and angular velocities, which encouraged a slower, more controlled navigation approach,” Leontie said. “Second, we implemented a fail-safe mechanism to switch off the RL policy and revert to the standard Astrobee controller if the robot’s position or orientation errors exceeded the acceptable bounds during maneuver execution.”
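The two safeguards Leontie describes can be pictured with a small sketch. This is not the APIARY team's code: the weights, the error bounds, the `RobotState` fields and the choice to latch the fail-safe once it trips are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


def norm(v: Vec3) -> float:
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


@dataclass
class RobotState:
    pos_err: Vec3   # position error relative to the target
    att_err: float  # orientation error as a single angle
    lin_vel: Vec3   # linear velocity
    ang_vel: Vec3   # angular velocity


def shaped_reward(s: RobotState, w_lin: float = 0.2, w_ang: float = 0.2) -> float:
    """Tracking reward plus negative rewards for high linear and angular speed."""
    tracking = -(norm(s.pos_err) + abs(s.att_err))
    speed_penalty = -(w_lin * norm(s.lin_vel) + w_ang * norm(s.ang_vel))
    return tracking + speed_penalty


class FailSafe:
    """Hands control to the standard controller once errors leave the bounds."""

    def __init__(self, max_pos_err: float = 0.5, max_att_err: float = 0.5):
        self.max_pos_err = max_pos_err  # hypothetical bound
        self.max_att_err = max_att_err  # hypothetical bound
        self.tripped = False

    def active_controller(self, s: RobotState) -> str:
        if norm(s.pos_err) > self.max_pos_err or abs(s.att_err) > self.max_att_err:
            self.tripped = True  # assumption: stay on the fallback once tripped
        return "standard" if self.tripped else "rl"
```

With these definitions, a state with the same tracking error but higher speed scores a lower reward, which is what nudges the learned policy toward slower, more controlled motion.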

Beyond these RL-specific measures, extensive simulation and ground testing were conducted using NASA’s Astrobee simulators and NASA Ames ground testing platforms in collaboration with the space agency.

“RL training in complex environments can be constrained with additional safety objectives, such as maintaining a safe distance from hazards,” Leontie added. “Software safety safeguards like ‘control barrier functions’ can mathematically define safe operating regions for the robot, preventing it from entering unsafe states.”
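To make the idea of a control barrier function concrete, here is a minimal sketch under strong simplifying assumptions: the robot is treated as a point whose velocity can be commanded directly, there is a single spherical hazard, and the function names and gains are hypothetical. It is a textbook-style safety filter, not something the NRL team is reported to have flown. The safe region is defined by h(x) = ||x - x_h||^2 - d^2 >= 0, and the filter makes the smallest change to the requested velocity that keeps dh/dt + alpha * h >= 0.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return sum(x * y for x, y in zip(a, b))


def cbf_filter(pos: Vec3, u_nom: Vec3, hazard: Vec3,
               min_dist: float, alpha: float = 1.0) -> Vec3:
    """Minimally adjust a velocity command so the barrier condition holds.

    Assumes single-integrator dynamics (velocity is the control input).
    """
    rel = tuple(p - q for p, q in zip(pos, hazard))
    h = dot(rel, rel) - min_dist ** 2        # h >= 0 means "safe"
    grad = tuple(2.0 * r for r in rel)       # gradient of h with respect to pos
    slack = dot(grad, u_nom) + alpha * h     # barrier condition: slack >= 0
    if slack >= 0.0:
        return u_nom                         # requested command is already safe
    g2 = dot(grad, grad)
    if g2 == 0.0:
        return (0.0, 0.0, 0.0)               # degenerate case: sitting on the hazard
    scale = slack / g2
    # Closed-form projection onto the half-space grad . u + alpha * h >= 0
    return tuple(u - scale * g for u, g in zip(u_nom, grad))


# A command heading straight at the hazard is slowed; one heading away passes through.
safe_u = cbf_filter((2.0, 0.0, 0.0), (-5.0, 0.0, 0.0), (0.0, 0.0, 0.0), 1.0)
away_u = cbf_filter((2.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), 1.0)
```

A filter like this sits between the learned policy and the actuators, so the policy's output is only overridden when it would carry the robot toward the unsafe region.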

From CubeSats to Robotic Arms

With the success of the demonstration, NRL researchers are looking ahead to the next frontier for RL in space, they said. Team member Samantha Chapin, NRL space roboticist, said that NRL is looking at new, longer-duration or multi-robot missions in orbit that would use the technology.

“We hope to demonstrate RL on a CubeSat or other small spacecraft in the near future; this would add another important example of RL working successfully in space, this time outside the pressurized volume of ISS,” said Chapin in an interview with GovCIO Media & Research. “We plan to progress from foundational maneuvers — such as translation, rotation, undocking, and docking — to using RL to control robotic arms that can repair and upgrade existing spacecraft, refuel spacecraft and build new structures in space.”

Chapin noted that ISAM missions, long in development, are finally nearing launch.

“ISAM missions have been under development for decades now, but they are finally getting close to launch,” said Chapin. “Many more robotic arms in space could be used to demonstrate human-level manipulation.”

Leontie added that the potential of RL extends well beyond space-based missions.

“We are currently developing robots that may be able to do routine maintenance on ships or aircraft,” she added. “RL is also being used to fly drones and submersible robots in order to improve energy efficiency and to allow human operators to control more robots at the same time. Some day we may have robotic stewards, robotic mechanics and robotic corpsmen.”

A New Vision for Space Logistics

Henshaw added that RL could be a game changer for national security space policy and could influence acquisition strategies or autonomy standards within the War Department.

“In the spacecraft community, demonstrating that a technique works, allows it to be used more broadly,” Henshaw said to GovCIO Media & Research. “ISAM missions are now being flown, and we hope that the combination of robots in space and demonstrated RL will let us start thinking about space as a domain in which logistics — building structures, sending new experiments, replacement parts, extra fuel, and so on — becomes a native part of spacecraft mission design.”

The breakthrough also points, Henshaw said, to a future in which space technology is less disposable, more durable and more efficient.

“Today, nearly all spacecraft are built on the ground as monoliths, and once they reach orbit they are never touched again. They are disposed of when they break or run out of fuel,” Henshaw added. “We hope that in the future we’ll be able to think about spacecraft the way we think about ships or aircraft, as items that we constantly repair, maintain and upgrade and which have service lives that are over only when we decide they are.”
