NVIDIA releases open blueprint for physical AI data processing
NVIDIA (NASDAQ: NVDA) announced the Physical AI Data Factory Blueprint, an open reference architecture designed to automate training data generation for robotics, autonomous vehicles and vision AI systems.
The blueprint lets developers process and curate large-scale datasets, generate synthetic data and evaluate physical AI models. NVIDIA said the system reduces the cost and complexity of training AI systems at scale by transforming limited training data into diverse datasets that include rare scenarios that are difficult to capture in real-world conditions.
Microsoft Azure and Nebius are integrating the blueprint with their cloud infrastructure services. Companies including FieldAI, Hexagon Robotics, Linker Vision, Milestone Systems, Skild AI, Teradyne Robotics and Uber are using the system for development projects.
"Physical AI is the next frontier of the AI revolution, where success depends on the ability to generate massive amounts of data," said Rev Lebaredian, vice president of Omniverse and simulation technologies at NVIDIA.
The blueprint consists of three main components: NVIDIA Cosmos Curator for processing and annotating datasets, NVIDIA Cosmos Transfer for expanding and diversifying data and NVIDIA Cosmos Evaluator for scoring and filtering generated data.
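As a rough illustration of how a three-stage flow of this kind fits together, the Python sketch below chains a curation step, a diversification step and a scoring-and-filtering step. It is a toy model only and does not use the blueprint's software: the `Clip` type, the `curate`, `diversify` and `evaluate` functions, the condition names and the scoring rule are all hypothetical, chosen for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Clip:
    """Hypothetical stand-in for one training sample (not an NVIDIA type)."""
    clip_id: str
    condition: str            # e.g. "clear", "rain", "fog"
    labels: tuple = ()
    score: float = 0.0


def curate(raw: Sequence[Clip]) -> List[Clip]:
    """Stage 1 (curation): drop duplicate IDs and attach a placeholder label."""
    seen = set()
    curated = []
    for clip in raw:
        if clip.clip_id in seen:
            continue
        seen.add(clip.clip_id)
        curated.append(replace(clip, labels=("annotated",)))
    return curated


def diversify(clips: Sequence[Clip], conditions: Sequence[str]) -> List[Clip]:
    """Stage 2 (diversification): one variant per condition the clip lacks."""
    return [
        replace(clip, clip_id=f"{clip.clip_id}-{cond}", condition=cond)
        for clip in clips
        for cond in conditions
        if cond != clip.condition
    ]


def evaluate(
    clips: Sequence[Clip],
    scorer: Callable[[Clip], float],
    threshold: float,
) -> List[Clip]:
    """Stage 3 (evaluation): score each variant and keep those at or above threshold."""
    scored = [replace(clip, score=scorer(clip)) for clip in clips]
    return [clip for clip in scored if clip.score >= threshold]


def toy_scorer(clip: Clip) -> float:
    """Assumed scoring rule for the demo: fog variants are treated as low quality."""
    return 0.4 if clip.condition == "fog" else 0.9


def run_pipeline(raw: Sequence[Clip]) -> List[Clip]:
    """Chain the three stages in order."""
    curated = curate(raw)
    variants = diversify(curated, ["clear", "rain", "fog"])
    return evaluate(variants, toy_scorer, threshold=0.5)
```

Feeding `run_pipeline` two distinct clips plus one duplicate yields four generated variants, of which the two fog variants are filtered out by the assumed scoring rule. In a real deployment each stage would be a far heavier job, which is where an orchestration layer comes in.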
NVIDIA OSMO, an open source orchestration framework, manages workflows across compute environments and integrates with coding agents including Claude Code, OpenAI Codex and Cursor.
Microsoft Azure is incorporating the blueprint into a physical AI toolchain that integrates with Azure services including IoT Operations, Microsoft Fabric and GitHub Copilot. Nebius has integrated OSMO into its AI Cloud platform.
The Physical AI Data Factory Blueprint is expected to be available on GitHub in April, according to the company's press release.
