Robotics and physical AI
Keeping an open mind on artificial intelligence
Robots are using physical AI software to support a range of adaptive and autonomous tasks, and Hyundai is securing an end-to-end supply chain for more sophisticated deployment in its factories
The traditional assembly line robot arm has come a long way since its introduction in 1961, when the Unimate made its debut at GM’s plant in Trenton, New Jersey. Robots are now capable of a wide range of adaptive tasks in picking and placing different parts. The technology has advanced to collaborative robots (cobots) that work safely alongside humans for lineside assembly or logistics support. Work is also underway to trial fully humanoid robots for a range of applications alongside humans on the assembly line, such as that being done by Boston Dynamics in collaboration with Hyundai Motor Group, which owns 80% of the robot maker.
An increasing proportion of these robots are controlled by artificial intelligence (AI), which enables them to safely and efficiently adapt to and interact with the physical world. Data is fed into ‘physical AI’ platforms from a range of sensors, including cameras and lidar remote sensing devices. Physical AI also includes world model reasoning: the use of neural networks that understand the dynamics of the real world, including physics and spatial properties. These models again draw on data inputs including text, image, video and movement. With that information the technology can simulate realistic physical environments.
By putting state-of-the-art models up in the open we allow the community to start working at the frontier of the industry rather than at the starting line
Building physical AI robotics stacks is a complex process involving layering software with hardware architecture to enable a robot to perceive, plan and act. The stacks build in localisation, mapping, motion planning, manipulation and sensing. At this year’s global AI conference for developers held by US technology firm Nvidia in San Jose, California – Nvidia GTC – speakers from a range of technology start-ups and manufacturers discussed how important open collaboration is across robotics ecosystems.
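The perceive-plan-act layering described above can be sketched as a simple control loop. This is an illustrative toy, not any vendor’s stack; the proportional controller and all names here are invented for the example.

```python
# Minimal perceive-plan-act loop, the layering pattern a robotics stack builds on.
# All names and values are illustrative, not a real robotics API.
from dataclasses import dataclass

@dataclass
class Observation:
    position: float   # e.g. distance travelled towards a target part, in metres

def perceive(sensor_reading: float) -> Observation:
    """Sensing layer: turn raw sensor data into a structured observation."""
    return Observation(position=sensor_reading)

def plan(obs: Observation, target: float) -> float:
    """Planning layer: decide a velocity command that closes the gap."""
    gain = 0.5                       # simple proportional controller
    return gain * (target - obs.position)

def act(position: float, command: float, dt: float = 0.1) -> float:
    """Actuation layer: apply the command and return the new position."""
    return position + command * dt

position = 0.0
for _ in range(100):                 # 100 control-loop ticks
    obs = perceive(position)
    command = plan(obs, target=1.0)
    position = act(position, command)
print(round(position, 3))            # converges towards the 1.0 target
```

A real stack distributes these layers across hardware and middleware, but the loop structure is the same.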
Sharing the load
It wasn’t always that way, according to Steven Palma, robotics AI engineer at Hugging Face, the collaborative machine-learning platform. He said that working with AI used to be very fragmented.
“Every lab or company was just building their own things and reinventing the wheel, with pipelines and training architectures from scratch,” said Palma. “Hugging Face changed that by creating a centralised and open hub where people could go and share models and datasets. That practically lowered the barrier to entry so the entire community could build on top of each other’s work.”
The strategy to overcome that fragmentation is to provide a standard library and infrastructure for real-world robotics and make it “ridiculously easy for anyone to experiment with the new technology” according to Palma. Users can easily share a dataset, pull down a state-of-the-art robot model (called a policy) to train on and apply it to a physical product.
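That pull-a-policy-and-fine-tune workflow can be illustrated with a toy behaviour-cloning loop. The “hub” and the policy below are stand-ins invented for this example, not Hugging Face’s actual API; a real policy would be a neural network rather than a linear model.

```python
# Illustrative sketch of the hub workflow: download a pretrained policy,
# fine-tune it on a shared demonstration dataset, then redeploy it.
# All names are hypothetical stand-ins, not Hugging Face's real API.

def download_pretrained_policy():
    """Stand-in for pulling a state-of-the-art policy from a shared hub."""
    return {"weight": 0.8, "bias": 0.0}   # action = weight * obs + bias

# A shared dataset of (observation, expert_action) pairs; here the expert
# behaviour happens to follow action = 0.5 * obs + 0.1.
dataset = [(x / 10, 0.5 * x / 10 + 0.1) for x in range(10)]

def fine_tune(policy, data, lr=0.1, epochs=500):
    """Behaviour cloning: per-sample gradient descent on squared error."""
    w, b = policy["weight"], policy["bias"]
    for _ in range(epochs):
        for obs, expert in data:
            err = (w * obs + b) - expert
            w -= lr * err * obs       # gradient of squared error w.r.t. w
            b -= lr * err             # gradient of squared error w.r.t. b
    return {"weight": w, "bias": b}

tuned = fine_tune(download_pretrained_policy(), dataset)
print(round(tuned["weight"], 2), round(tuned["bias"], 2))  # recovers 0.5, 0.1
```

The point of the pattern, as Palma describes it, is that the pretrained starting point and the dataset are both shared, so only the fine-tuning step is local work.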
Having the collaborative tools to quickly experiment with robotics learning is helping to overcome the existing delays in building physical AI.
“By putting state-of-the-art models up in the open we allow the community to start working at the frontier of the industry rather than at the starting line,” said Palma.
That creates an environment in which different users can collectively upload datasets, fine-tune models on them and figure out how to deploy the robot more efficiently.
“Suddenly the whole ecosystem levels up overnight and it becomes a collective effort to push the field forward,” said Palma.
Data is critical
Robotics learning is only as good as the data it is fed with, and there are still gaps to bridge. Nicolas Keller, co-founder of a relatively new Germany-based robotics and automation start-up called Poke & Wiggle, said real-world data is crucial in providing a secure foundation on which to build physical AI models. Keller pointed out that there are roughly 5m industrial robots in the real world, but most of them do not collect data that can actually be used to feed physical AI. Data from real robot deployment is essential in solving real-world use cases with a robust model that comes within the necessary cost parameters, something Poke & Wiggle was set up to solve, according to Keller.
“Deployment is the big milestone that we have to achieve and we believe in real-world robot data because this is needed for a simulator [and we] need way more data before we can create [that] simulator,” said Keller.
Flexion Robotics is another new start-up focused on the next generation of software for humanoid robotics. Its co-founder and chief technology officer, David Höller, said the company’s goal is to leverage parallel simulation, synthetic data and large data sets – training robots in simulation with reinforcement learning.
“In doing so you can leverage the full capability of the robot, train much faster than real time and it becomes a cost-effective solution that can adapt to different humanoid morphologies,” he said.
According to Höller, when it comes to simulation there are challenges in bridging the simulation-to-reality (sim-to-real) gap in humanoid robot training, where a safe, fast, virtual environment is created before deploying robots in the real world. For one thing, the inertial measurement unit (IMU) can suffer distortion from the heavy impact of the robot’s feet when walking. The IMU is a sensor module that measures data related to orientation, velocity and gravity. Camera data can also be blurry when training policies in simulation, according to Höller. To overcome this, Flexion runs a real-to-sim analysis of the platform it is working with, where real environments are digitised to create better training simulations.
“We have a batch of tests that we do on the robot even before deploying anything to assess the likelihood of the specific sim-to-real transfer we are attempting,” explained Höller, adding that Flexion works to quantify everything in a motion-capture system to ensure accurate data, which indicates how ready a specific platform is for deployment.
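One common response to sensor problems like the IMU distortion Höller mentions, sketched here as a general technique rather than Flexion’s actual method, is to randomise the distortion in simulation (domain randomisation) so a trained policy tolerates the noisy readings it will see on hardware. The noise magnitudes below are made up for illustration.

```python
# Domain randomisation sketch: inject randomised bias, impact spikes and
# background noise into a simulated IMU pitch reading, so a policy trained
# on these readings is robust to real-sensor distortion.
# All noise magnitudes are invented for illustration.
import random

random.seed(0)  # deterministic for the example

def simulated_imu(true_pitch_deg: float) -> float:
    """Return a pitch reading with randomised distortion applied."""
    bias = random.uniform(-0.5, 0.5)             # per-reading sensor bias
    spike = random.choice([0.0, 0.0, 0.0, 5.0])  # occasional foot-impact spike
    noise = random.gauss(0.0, 0.2)               # background measurement noise
    return true_pitch_deg + bias + spike + noise

readings = [simulated_imu(2.0) for _ in range(1000)]
mean = sum(readings) / len(readings)
print(round(mean, 2))   # roughly the true pitch plus the average spike
```

A training pipeline would resample these distortion parameters per episode so the policy never overfits to one sensor model.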
The biggest challenge beyond sim-to-real, according to Höller, is the production planning side of things rather than demonstrations of dextrous manipulation. A robot can fold cloth and open doors using a key, tasks that are no longer very difficult for robots. It is the sequencing of tasks over a longer horizon, as required by vehicle and parts manufacturing, that is the challenge, and something big data sets will help with.
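At its simplest, sequencing tasks under precedence constraints is a topological sort over a dependency graph. The assembly steps below are invented for illustration:

```python
# Order manufacturing tasks so every prerequisite is completed first.
# The steps and dependencies are made-up examples.
from graphlib import TopologicalSorter

# task -> set of tasks that must be completed before it
steps = {
    "fit door seal":   {"hang door"},
    "hang door":       {"paint body"},
    "fit wiring loom": {"paint body"},
    "paint body":      set(),
    "install battery": {"fit wiring loom"},
}

order = list(TopologicalSorter(steps).static_order())
print(order[0])   # "paint body" has no prerequisites, so it comes first
```

Real long-horizon planning adds timing, resource contention and uncertainty on top of this ordering problem, which is where the large data sets come in.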
Tools for deployment
Simulation is one part of a three-part problem in physical AI, along with training and edge deployment. Nebius, the AI cloud infrastructure platform, has been working to make it easier for developers to access Nvidia’s Isaac digital tools, which are designed to accelerate the development, simulation and deployment of AI-powered robots.
“In terms of IsaacLab or IsaacSim… the goal is first to make them super easy to deploy on our cloud,” said Evan Helda, head of physical AI at Nebius. He said that rather than a developer having to assemble the robotics stack and deal with drivers, libraries and configurations – which over time can tax a company’s iteration loops – Nebius works to abstract away and prepackage the tools on its platform, ready to deploy on a customer’s individual architecture.
“What we are trying to build is a full-stack AI platform that meets customers where they are, based upon their ability to either build on their own, or just buy something off the shelf if their resources are constrained,” he said.
The next tier of the platform Nebius is working on with Nvidia covers orchestration, or the ability to create workflows across disparate environments, removing the dependencies and differing runtimes of each. Using Nvidia Osmo, a cloud orchestration platform used for scaling complex robotics workloads across on-premises, private and public clouds, users can build workflows spanning training, simulation and edge deployment.
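The idea of one workflow spanning training, simulation and edge deployment can be sketched as a chain of stages, each feeding its output to the next. This is a generic illustration of orchestration, not Osmo’s API; every name below is hypothetical.

```python
# Toy orchestration: run training, simulation and edge-deployment stages
# as one workflow, threading state between them. Stage contents are
# placeholders, not any real platform's API.
from typing import Callable

def train(state: dict) -> dict:
    return {**state, "policy": "policy-v1"}

def simulate(state: dict) -> dict:
    return {**state, "sim_passed": True}

def deploy_to_edge(state: dict) -> dict:
    return {**state, "deployed": state["sim_passed"]}

def run_workflow(stages: list[Callable[[dict], dict]], config: dict) -> dict:
    """The orchestrator's job: run each stage in order on shared state."""
    state = config
    for stage in stages:
        state = stage(state)
    return state

result = run_workflow([train, simulate, deploy_to_edge], {"robot": "arm-01"})
print(result["deployed"])   # True
```

An orchestration platform adds what this toy omits: scheduling each stage onto different infrastructure and handling its environment and runtime.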
“Behind the scenes Nebius is going to handle the orchestration, network, authentication and all the scaling,” said Helda. “Developers can just work at the workflow layer.”
Partnering with an Asian vehicle manufacturer can be very useful to advance your robotics solution into the field. I am just encouraging people to be open and reach out
Flexion is also using IsaacLab for accurate edge deployment, according to Höller. “We will try to analyse the different characteristics of the [robot] actuators on the real platform and then with scripts that are running in the simulator we will try to fit these specific parameters so that they replicate the real-world behaviour,” he explained.
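Fitting simulator parameters to measured actuator behaviour is a form of system identification. As a sketch of the general idea, a simple viscous-friction model can be fitted by least squares; the model, coefficients and “measurements” below are synthetic, not Flexion’s actual scripts.

```python
# System-identification sketch: fit a simulator's actuator model,
# torque = k * velocity + c, to measurements from the real actuator.
# The measurements are synthetic, generated from k = 0.3, c = 0.05.
measurements = [(v, 0.3 * v + 0.05) for v in [0.5, 1.0, 1.5, 2.0, 2.5]]

# Closed-form least squares for the line torque = k * velocity + c
n = len(measurements)
sx = sum(v for v, _ in measurements)
sy = sum(t for _, t in measurements)
sxx = sum(v * v for v, _ in measurements)
sxy = sum(v * t for v, t in measurements)
k = (n * sxy - sx * sy) / (n * sxx - sx * sx)
c = (sy - k * sx) / n
print(round(k, 3), round(c, 3))   # recovers 0.3 and 0.05
```

With noisy real data and richer models (stiction, backlash, temperature effects), the fit becomes an optimisation problem, but the goal is the same: make the simulated actuator replicate the measured one.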
Höller also pointed to Nvidia’s Newton Physics Engine possibly playing a key role in the future. Developed with Google DeepMind and Disney Research, it is an open-source, GPU-accelerated physics simulation engine designed for robotics research, AI training and simulation.
“It allows you to customise your simulation to the finest degree that can be done on an actuator level [and include] specific weird friction parameters that you might identify,” said Höller.
The message from the start-ups at Nvidia GTC this year is that development of physical AI for advanced robotics deployment is a collaborative effort and that off-the-shelf tools can accelerate timelines. If you are going to do it, build it in the open.