Archetype AI was started with a fundamental question: what does it mean for an AI to "understand" the world? And how can AI augment human intelligence in a physical environment?
The Generative AI boom started with the launch of ChatGPT and the proliferation of LLMs that work with online text and images. At the time, many industry experts and futurists were asking—what would AI designed for physical world problems look like? How would AI shape the future of our homes, cars, workspaces, factories, and cities?
A couple of years later, Physical AI is here, but the vision behind it is often reduced to robotics powered by multimodal AI models, and we believe this view is too narrow. In this article, we will outline an approach that focuses on enabling AI to independently uncover the underlying principles of the physical world and to augment human intelligence rather than replace humans with machines.
Atoms Over Bits: AI Meets the Real World
The question of understanding physical reality—and whether it can even be fully understood—has been debated for centuries. It has been a topic of deep philosophical inquiry and scientific research, explored by Aristotle, Newton, Einstein, and others. This question holds profound implications for humanity: to successfully operate in, build within, and control the real world, we must first understand it.
In the past couple of decades, we have made impressive progress in building technologies that are "aware" of the world around them thanks to advanced sensors. From consumer electronics to industrial equipment and self-driving cars, we encounter more and more devices and machines that can interpret the environment without human intervention. But more often than not, their capabilities are narrow. Smart home assistants have been available for more than a decade, yet the tasks they can accomplish remain simple, and the promises of the IoT boom are still out of reach.
In manufacturing, sensor data can unlock predictive maintenance, process optimization, and real-time decision-making. But predictive models often struggle to uncover the larger patterns hidden in that data and to turn them into actionable insights.
Enter Physical AI. At CES 2025, Jensen Huang, founder and CEO of NVIDIA, shared his vision for the future. He said, "Physical AI will revolutionize the $50 trillion manufacturing and logistics industries. Everything that moves—from cars and trucks to factories and warehouses—will be robotic and embodied by AI." After chatbots dominated the AI market over the last couple of years, many companies took the leap beyond the digital world and started building tools to deploy AI in the physical environment. This new vision is enabled by new sources of data: Physical AI will fuse sensor, camera, audio data, and more, rather than relying only on the online digital data that powers today's LLMs.
While robotics is an obvious application, we believe that Physical AI's potential extends much further—it can revolutionize how we understand and interact with the world around us.
Robots, World Models, and More
At CES, NVIDIA announced Cosmos World Foundation Models that help developers create digital twins and virtual environments for training Physical AI models. The company also introduced four new blueprints for specific applications including warehouse robotics, autonomous vehicle simulation, Apple Vision Pro streaming, and real-time digital twins for engineering.
In addition to NVIDIA, startups like Physical Intelligence, Figure AI, Field AI, and Spot AI are developing robotics hardware and software with a focus on industrial use cases. By integrating multiple types of sensors and data streams and using AI to interpret them, they create robots that can "work" at factories, mines, construction sites, and similar environments.
World Labs is taking a different approach from the robotics companies mentioned above, but like NVIDIA, its long-term vision is about enabling robots with AI. World Labs is building Large World Models (LWMs) that can generate interactive 3D worlds, in effect digital twins. For example, the company recently demonstrated the ability to generate consistent 3D environments from everyday images and photos. World Labs is aiming to build a product for both designers and engineers; for the latter, that means tools that teach robots to navigate and manipulate objects in the physical world.
Augmenting Human Intelligence in the Real World
Archetype AI takes an orthogonal yet complementary approach to Physical AI. Since founding the company, we've focused on expanding human expertise by augmenting our skills with AI rather than trying to automate them away.
We see AI as a tool that makes professionals across different industries more efficient and more capable when solving complex real-world problems. As we piloted Newton, our Physical AI foundation model, with customers in 2024 across construction, manufacturing, automotive, smart home devices, and other industries, we focused on putting human needs first as the most effective way to transform these industries now and in the near future.
Despite significant advances in robotics and AI, the broad integration of robots into all aspects of human life and work is likely decades away, and there are many areas of human endeavor where robots may not be necessary at all. For example, a recent HBR article highlights the challenges of building fully robotic factories: there are very few successful examples of "lights-out" facilities that operate without substantial human oversight. And outside of manufacturing, in more heavily regulated industries like construction, automation can take decades to implement due to high costs and government regulation.
That is why Newton, our Physical AI model, is built for humans, not for robots or self-driving cars. It can be deployed and make a difference today, turning vast repositories of existing data into insights and knowledge that help us work more effectively, make better decisions, stay safe, and predict and avoid accidents. Newton, an AI foundation model that understands the physical world from sensor data, is built to generalize across tasks, use cases, and industrial verticals beyond those it was trained on.
Here are some of the key features that make it possible to deploy Newton in the physical world:
- Multimodality. Unlike many other solutions that focus on camera data alone, Newton works across different sensor modalities, turning the huge amounts of sensor data generated in the real world into insights and solutions. Fusing several sensors allows for contextual understanding that wouldn't be possible with just one type of data (see the sketch after this list). Newton is also capable of multimodal output, which unlocks use cases that require objective measurements or other types of non-text output.
- On-prem deployment. This allows Newton to be your private AI, deployed in remote locations or in environments with limited internet access while maintaining strict data protection protocols. Companies can process sensitive sensor data on-site in real time and make the outputs available internally for further analysis. Newton is not just "local" or "in the cloud"; it can be a distributed intelligence at scale.
- Semantic Lenses. We're introducing a different interaction paradigm for Physical AI, one that emphasizes the transformation of objective sensor data into meaningful insights. We are pioneering Semantic Lenses, which function as instruments that help humans better understand the physical world. Unlike traditional AI agents, which suggest autonomy, a Lens amplifies human capabilities, much as a microscope reveals hidden physical structures, enabling people to make more informed decisions.
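To make the multimodality point above more concrete, here is a minimal, hypothetical sketch of timestamp-based sensor fusion: camera-derived events are paired with the nearest accelerometer reading, so each detected event carries an objective, non-text measurement alongside it. The data shapes, names, and thresholds are illustrative assumptions made for this article, not Newton's actual API.

```python
# Illustrative sketch only: a toy example of fusing two sensor modalities
# by timestamp. Names and data shapes are hypothetical, not Newton's API.
from dataclasses import dataclass
from bisect import bisect_left

@dataclass
class CameraEvent:
    timestamp: float   # seconds since stream start
    label: str         # e.g. "forklift_entered_zone"

@dataclass
class VibrationSample:
    timestamp: float   # seconds since stream start
    rms_g: float       # RMS acceleration, in g

def fuse(events: list[CameraEvent],
         vibration: list[VibrationSample],
         window_s: float = 0.5) -> list[dict]:
    """Attach the nearest vibration reading to each camera event.

    Assumes `vibration` is sorted by timestamp. Pairing the two streams
    gives context neither modality has alone: the camera stream says
    *what* happened, the vibration stream says *how hard*.
    """
    times = [v.timestamp for v in vibration]
    fused = []
    for event in events:
        i = bisect_left(times, event.timestamp)
        # Consider the samples just before and after the insertion point,
        # keeping only those within the fusion window.
        candidates = [v for v in vibration[max(i - 1, 0):i + 1]
                      if abs(v.timestamp - event.timestamp) <= window_s]
        if candidates:
            nearest = min(candidates,
                          key=lambda v: abs(v.timestamp - event.timestamp))
            fused.append({"time": event.timestamp,
                          "event": event.label,
                          "vibration_rms_g": nearest.rms_g})
    return fused
```

Even this toy pairing illustrates why fusion matters: each output record combines a semantic label with an objective measurement, which is the kind of non-text, multimodal output that a single sensor stream cannot provide on its own.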
Physical AI is often associated with robotics and specialized solutions, but Archetype AI is taking a different approach. We're building a single, versatile model capable of solving many real-world use cases across different sensors and situations: a general-purpose intelligence for the physical world.