Unlike machines, humans have a sixth sense: a unique ability to instantly connect and interpret scattered signals and turn them into a cohesive narrative. Yet our smart devices, despite access to billions of sensors, haven't been able to match this human ability. By fusing simple sensor data with contextual awareness, Archetype AI's large behavior model can achieve a contextual understanding of events in the physical world, much like humans do. Read on to learn how Newton "learns" to make sense of complex situations and deliver better experiences for people, all without relying on cameras.
Learning from Human Intelligence
Humans have an extraordinary ability to derive meaning from scattered cues and incomplete information. For example, seeing someone biking with children on a weekday morning naturally leads us to infer that they are likely heading to school. Similarly, the sound of breaking glass followed by footsteps immediately suggests an accident or an intruder, depending on the time and place where the event occurs.
This capability to "connect the dots" is central to human intelligence. It emerges from our brain's ability to effortlessly fuse scattered details, such as perceptual cues (e.g., seeing kids on bikes) and high-level contextual knowledge (e.g., weekday mornings typically involve school runs), into a cohesive understanding of the physical world in the present moment. Importantly, this capability allows us to predict, make decisions, and plan our next steps even when confronted with specific situations that we have never seen before.
Using data from just two sensors, a microphone that detects a few predefined sounds and a radar that captures human presence, Newton can integrate spatial and temporal context to analyze sequences of events and generate open-ended interpretations of reality.
Working with Infineon, a global semiconductor manufacturer and leader in sensing and IoT, we are exploring how such powerful human-like functions can be developed and deployed in real-world applications using generative physical AI models like Newton. These models seamlessly integrate real-time events captured by simple, ubiquitous sensors — such as radars, microphones, proximity sensors, and environmental sensors — with high-level contextual information to generate rich and detailed interpretations of real-world behaviors. Importantly, this is achieved without requiring developers to explicitly define such interpretations or relying on complex, expensive, and privacy-invasive sensors like cameras.
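To make the idea concrete, here is a minimal sketch of how fused sensor and context data might look before it reaches a generative model. The event schema, field names, and JSON layout are our own illustrative assumptions; the article does not describe Newton's actual input format.

```python
import json

# Illustrative only: the article does not specify Newton's input schema,
# so this event format is an assumption, not the real API.
sensor_events = [
    {"t": "2024-06-03T07:42:10", "sensor": "radar",      "reading": "person_present"},
    {"t": "2024-06-03T07:44:02", "sensor": "microphone", "reading": "smoke_alarm_on"},
]

# High-level context fused with the raw events.
context = {"location": "kitchen", "time_of_day": "morning", "day_of_week": "Monday"}

# A generative physical AI model can condition on the whole fused timeline
# at once, rather than interpreting each sensor stream in isolation.
model_input = json.dumps({"events": sensor_events, "context": context}, indent=2)
print(model_input)
```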
Understanding the Sensor Fusion Challenge
The real world — buildings, appliances, vehicles, factory floors, and electrical grids — runs on sensor data. Hundreds of billions of sensors operate around us, capturing various aspects of physical reality. While interpreting a single sensor signal in isolation is relatively straightforward, fusing data from a multitude of sensors distributed across a physical space into a single actionable interpretation remains a significant challenge.
Today, sensor signals are typically interpreted in isolation, missing dynamic spatial patterns of behavior. Fusing sensor data is key to unlocking the full potential of smart environments.
The challenge lies in the exponential growth of possible interpretations as the number of sensors, locations, and event histories increases. Even for very simple systems, the number of potential interpretations rapidly becomes overwhelming, far beyond what humans can handle. For example, the system of two binary sensors shown in the figure below would generate 1,536* possible scenarios. Adding just one more binary sensor skyrockets the number of interpretable contexts to 24,576!
This happens because most real-world processes are non-Markovian: they depend not only on current observations but also on past events, so identical events can mean entirely different things based on what happened before, as shown in the figure above. Interpreting sensor data therefore requires combining it with its history, which greatly multiplies the number of possible interpretations. In short, manually programming such systems to be robust and comprehensive has been practically impossible until now.
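The quoted figures can be reproduced by one plausible counting, sketched below. The exact enumeration is defined by the figure, so the factors here (a binary state and a location per sensor event, a two-step history, and the times of day and locations as global context) should be read as assumptions that happen to be consistent with the totals above.

```python
# One counting consistent with the article's totals; the individual factors
# are assumptions, since the exact enumeration is defined by the figure.
def num_interpretations(sensors: int,
                        states: int = 2,        # binary sensors
                        locations: int = 2,     # e.g., kitchen and living room
                        times_of_day: int = 3,  # morning, midday, night
                        history: int = 2) -> int:
    per_event = states * locations                # what fired, and where
    sequences = per_event ** (sensors * history)  # events across the history window
    return locations * times_of_day * sequences   # times the global context

print(num_interpretations(sensors=2))  # 1536
print(num_interpretations(sensors=3))  # 24576
```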
Unlocking Sensor and Context Fusion with AI
Generative physical AI models, such as Newton, are able to overcome these challenges for the first time, unlocking a boundless range of applications. We explored Newton's ability to interpret real-world context and human activities by combining radar and microphone data. In our demo scenarios, Newton powers a home assistant in a kitchen setting, helping a user through their morning routine in one scenario and keeping residents safe when the smoke alarm goes off in another.
The model can infer an unlimited number of interpretations directly from sensor data across timelines of any length. By combining multiple simple, privacy-friendly sensors into a larger network, it can generate detailed descriptions of dynamic scenes and contexts.
When fused with additional contextual data — such as location, time, day of the week, weather, news, or user preferences — Newton can provide personalized and relevant recommendations or services. This capability makes it possible to go beyond basic sensor interpretations, offering meaningful insights tailored to the needs of individual users or organizations.
When Newton is provided with time-of-day context, it can recognize, for example, that a nighttime alarm needs a different interpretation than a daytime alarm. In this video, when someone leaves the kitchen without turning off the alarm, Newton suggests notifying other residents.
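As a hypothetical illustration of this kind of context conditioning (the helper below is our own sketch, not Newton's API), the same event stream can be serialized together with time-of-day context, so a generative model's interpretation, and its suggested action, can differ between night and day.

```python
# Hypothetical sketch, not Newton's real API: the same events are paired
# with different time-of-day context before reaching a generative model.
def build_prompt(events: list[str], time_of_day: str) -> str:
    lines = [f"[context] time_of_day={time_of_day}"]
    lines += [f"[event] {e}" for e in events]
    lines.append("[task] interpret the situation and suggest an action")
    return "\n".join(lines)

events = ["microphone: smoke_alarm_on", "radar: person_left_kitchen"]

# Daytime: the user likely stepped away from cooking; nighttime: residents
# may be asleep, so the interpretation should escalate and notify everyone.
print(build_prompt(events, "morning"))
print(build_prompt(events, "night"))
```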
Newton's ability to seamlessly fuse large amounts of sensor data and contextual information at scale opens the door to a wide range of exciting applications. Here are a few examples:
- Smart Homes: Detect human activities in a privacy-respecting way without relying on cameras, and deliver personalized services such as security, safety, wellness, and entertainment.
- Automotive: Monitor driver behavior, such as signs of drowsiness or health emergencies. Understand in-car context to provide passengers with mobility-related services tailored to their needs.
- Manufacturing: Improve safety, efficiency, and adherence to best practices by integrating outputs from equipment, occupancy, and environmental sensors. This approach can scale from individual machines to entire factory floors and facilities.
These examples illustrate Newton's potential to revolutionize industries by leveraging sensor data to create intelligent, context-aware solutions.
Looking forward, given Newton's fundamental capability to interpret physical event data, a key question arises: can Newton also perform "next event prediction," much as LLMs perform "next word prediction"? Can Newton predict the future evolution of the physical world from current and past observations?
This project was completed in collaboration with Infineon. To learn more about the partnership, check out how Infineon and Archetype AI are unlocking the future of AI-powered sensors.
*We assume here that the system is deployed in two locations (e.g., kitchen and living room) and that three times of day are provided to Newton as additional context: morning, midday, and night.