Implementing AI in an industrial setting presents significant challenges: tracking employee safety, estimating the productivity of vast lines of equipment, and monitoring complex hazards. Each of these challenges demands real-time processing, analysis, and action. However, streaming real-time sensor data to AI in the cloud often falls short when split-second insights are needed. These models demand gigabytes of sensitive sensor and video data sent over the network, bringing latency and security concerns as well as added costs.
Fortunately, there’s a better solution—eliminating the cloud. Running AI at the edge, on-premises, offers low latency, enhanced security, and reduced cost. And it already exists. Newton, Archetype AI’s foundation model for any type of sensor data, can run on a single, off-the-shelf GPU on local machines in any environment—be it manufacturing, logistics, transportation, construction, or beyond.
Here's how Archetype AI's Newton is transforming industrial AI deployment.
What it Takes to Run AI at the Edge
In July 2024, we showcased Newton at Khasm Lab’s Summer Social in Kirkland, WA. The demo was straightforward—categorizing traffic at an intersection—but the real focus was on demonstrating edge deployment.
First, the hardware. We used a single Dell PowerEdge XR8000 server coupled with an NVIDIA L4 GPU, a compact GPU designed for AI workloads as well as applications like video, gaming, and streaming. This off-the-shelf hardware is part of a new breed of GPUs designed for local use. Despite its compact size compared to the H100s common in cloud compute environments, the L4 can run sophisticated AI foundation models while requiring far less power and integrating seamlessly with other local hardware.
The NVIDIA L4, while less powerful than the Jetson AGX Orin (NVIDIA's edge-specific GPU), is still ideal for industrial AI applications (and easily available off the shelf from Amazon). This type of GPU is designed for embedded systems, which is what most industrial AI applications will use. The entire Archetype AI platform runs on the single L4 and server described above. The platform consists of:
- A lightweight version of our physical AI foundation model, Newton.
- All the necessary tooling and services to run, deploy, and interact with Newton.
Optimized for the edge, this lightweight model runs on low-power hardware like the NVIDIA L4 and delivers the performance a given industrial environment requires. At roughly 20 GB, it fits and runs exceptionally well on this cost-effective GPU.
What’s more, the Archetype AI platform supports hot-swapping of GPUs, allowing developers to dynamically add additional GPUs to create a virtual GPU cluster simply by plugging in additional hardware units to the same network. This unlocks physical AI use cases with a single GPU while supporting more complex applications with thousands of sensors using a fleet of edge GPU nodes—all with the same platform and APIs.
Setting up the entire Archetype AI platform on an edge device requires just one line of code. The platform, consisting of a set of microservices, provides all the tooling required to run physical AI applications, such as managing real-time sensor streams, providing dashboards for monitoring platform status, and most importantly, interfacing with the Newton foundation model. Once installed, the platform handles all sensor integration, data processing, real-time inference, and API interactions.
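To make the interaction concrete, here is a minimal sketch of how an application might hand a sensor frame and an instruction to a local inference service. The endpoint URL and every field name below are illustrative assumptions, not Archetype AI's actual API:

```python
import json

# Assumed local endpoint for the on-prem platform (hypothetical, for illustration).
LOCAL_PLATFORM_URL = "http://localhost:8080/v1/infer"

def build_inference_request(stream_id: str, frame_ts: float, prompt: str) -> str:
    """Package one sensor-frame reference plus a natural-language prompt
    as a JSON payload for a hypothetical local inference service."""
    payload = {
        "stream_id": stream_id,  # which registered sensor stream to read
        "timestamp": frame_ts,   # frame time, seconds since epoch
        "prompt": prompt,        # natural-language instruction for the model
    }
    return json.dumps(payload)

request_body = build_inference_request("camera-01", 1721940000.0, "Count eastbound cars")
print(request_body)
```

Because everything runs on the local network, a request like this never leaves the premises; the same pattern would apply whether the sensor is a camera, a lidar, or a vibration probe.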
This entirely local solution eliminates the need for complex cloud integration, additional middleware, or specialized AI and engineering expertise, making AI deployment in industrial environments straightforward and accessible. Users can build and deploy an application in a day, just as our team did for this demo at Khasm Lab’s Summer Social.
Speed, Privacy, and Cost: The Edge AI Advantage
On-prem AI provides everything you need for an industrial AI solution.
First, low latency. In industrial applications, milliseconds matter. Whether dealing with fast-moving machinery, real-time video feeds, or time-sensitive data, on-prem AI processes data at the edge, enabling near-instantaneous decision-making and responses. In our demo, local network latency was further minimized by deploying a high-bandwidth private 5G network.
With low latency, end users can access:
- Real-time monitoring: Detect anomalies or potential safety hazards as they occur.
- Immediate action: Trigger alerts or automated responses without delay.
- Improved safety: React to potential dangers before they escalate.
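A rough latency-budget sketch shows why the edge matters. The numbers below are illustrative assumptions, not measurements from the demo:

```python
def round_trip_ms(network_ms_one_way: float, inference_ms: float) -> float:
    """Total latency budget: sensor -> model -> response."""
    return 2 * network_ms_one_way + inference_ms

# Illustrative figures (assumed, not measured):
cloud = round_trip_ms(network_ms_one_way=40.0, inference_ms=30.0)  # WAN hop to a cloud region
edge = round_trip_ms(network_ms_one_way=2.0, inference_ms=30.0)    # local 5G / LAN hop
print(f"cloud: {cloud} ms, edge: {edge} ms")
```

With identical inference time, the network round trip dominates the cloud path; moving the model on-prem removes that term almost entirely, which is what makes split-second responses feasible.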
Second, high security. Industrial data is often sensitive and proprietary. On-prem AI keeps your data within your physical premises, significantly reducing the risk of data breaches or unauthorized access, as you aren’t sharing your data with cloud providers. High security makes compliance more straightforward, as you can adhere to industry-specific data protection regulations like HIPAA, or broader regulations like GDPR on your own terms.
For the most secure use cases, the entire system can run as an air-gapped platform, with no need for a connection to the public internet, further minimizing cloud transmission and storage vulnerabilities.
Finally, reduced cost. Cloud-based AI solutions often come with ongoing fees. In contrast, on-prem AI can be more cost-effective, especially for high-volume, continuous applications. After the initial capital outlay, the running costs for servers and GPUs are minimal, and edge AI devices will only become more affordable over time.
With local AI, you benefit from:
- Predictable costs: No surprise bills for data transfer or API calls.
- Scalability: Expand your AI capabilities by simply adding more local hardware.
- Reduced bandwidth needs: Save on network infrastructure costs by processing data locally.
On-prem AI offers further advantages, like customization. You can rapidly fine-tune Archetype AI's Newton to fit specific industrial needs and environments, use natural language to interact with it, and instruct the model to perform exact use cases. This solution allows you to easily integrate with existing systems and test and deploy updates without relying on external providers.
On-prem deployment also ensures reliability and availability. It continues to function even if internet connectivity is lost, supporting continuous operation in critical industrial settings.
Going East or West?
How does this all come together? Let's break down our demo at the 5G Open Innovation Lab's Summer Social.
With a 5G camera connected over a private 5G network, all behind the same on-prem security firewall, we monitored an intersection in Bellevue, WA, outside the Khasm Lab and across from a Meta office building. This is an ideal location for demonstrating Newton's capabilities amid the ebb and flow of traffic and digital congestion.
In the video below, you'll see the complexity of the environment: vehicles and pedestrians moving in all different directions and, though not seen, the "network congestion" of thousands of tech workers using digital devices on the public LTE cellular network.
The lightweight Newton model is pre-trained to understand vehicles, pedestrians, and roads, and only requires simple "semantic hints" to begin monitoring the intersection in real time. These hints give Newton simple cues about the specifics of the situation, such as distinguishing the westbound lanes from the eastbound lanes, or calling out crosswalks in the area.
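To illustrate the idea, a semantic hint can be thought of as a small label-to-cue mapping that the model uses to ground its pre-trained concepts in this particular scene. The structure and names below are a hypothetical sketch, not Archetype AI's actual hint format:

```python
# Hypothetical "semantic hints" for one intersection: labels map to simple
# scene-specific cues (illustrative structure, not the real format).
SEMANTIC_HINTS = {
    "eastbound_lanes": {"heading_range_deg": (45, 135)},   # roughly toward the east
    "westbound_lanes": {"heading_range_deg": (225, 315)},  # roughly toward the west
    "crosswalk": {"region": "south side of intersection"},
}

def lane_for_heading(heading_deg: float) -> str:
    """Map a detected vehicle heading onto a hinted lane label."""
    for label, hint in SEMANTIC_HINTS.items():
        rng = hint.get("heading_range_deg")
        if rng and rng[0] <= heading_deg % 360 < rng[1]:
            return label
    return "other"

print(lane_for_heading(90))  # a vehicle heading due east
```

The point of hints is that no retraining happens: the model already knows what a vehicle or a crosswalk is, and the hints only tell it which labels matter at this intersection.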
We can ask Newton natural language questions such as "How many cars went east today?" and get detailed, analytical answers. Newton generates a chart breaking down eastbound traffic by lane. There are no pre-programmed queries or complex SQL—just natural language and instant insights.
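Behind a question like "How many cars went east today?" is a simple aggregation over the model's detections. The toy data and function below are an illustrative sketch of that aggregation, not Newton's internals:

```python
from collections import Counter

# Toy detections: (direction, lane_number) pairs — illustrative data only.
detections = [
    ("east", 1), ("east", 1), ("east", 2),
    ("west", 1), ("east", 2), ("east", 2),
]

def eastbound_by_lane(events):
    """Aggregate eastbound vehicle counts per lane, in chart-ready form."""
    return dict(Counter(lane for direction, lane in events if direction == "east"))

print(eastbound_by_lane(detections))  # e.g. {1: 2, 2: 3}
```

The value of the natural-language interface is that the user never writes this aggregation; the model translates the question into the breakdown and returns the chart directly.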
This demo was built and deployed in a single day. With one line of code and a few semantic hints, Newton was operational. Zooming out from this transportation scenario to a factory floor, construction site, warehouse or shipping port, Newton can provide real-time insights, safety monitoring, and efficiency tracking—all running locally, keeping data secure and latency low.
- Factories: Newton could monitor production lines in real time, identifying bottlenecks, tracking equipment efficiency, and alerting supervisors to potential safety hazards before they escalate.
- Construction: AI could monitor worker movements, equipment usage, and material flow to enhance safety protocols and optimize resources with a bird's-eye view of the entire operation.
- Warehouses: Newton could manage inventory by tracking item locations, predicting restocking needs, and optimizing picking routes, all while maintaining worker safety in high-traffic areas.
- Ports: AI could orchestrate cargo containers, cranes, and vehicles to maximize throughput and minimize idle time while maintaining strict security protocols.
Each scenario can run a specialized, local model tailored to its environment. A construction model grasps trucks and builders; a warehouse model understands forklifts and pallets; and a port model comprehends containers and cranes. These elements are pre-existing in the AI model—no extra training required. You simply provide semantic hints about what to look for, and Newton handles the rest. The result? Real-time insights into your operations, safety monitoring, and efficiency tracking—all running locally, keeping your data secure and your latency low.
Interestingly, though the demo was in Kirkland, the "local AI" was five miles away in Bellevue. This highlights Newton's flexibility—while the AI and data processing is local, insights don't have to stay local. Companies can process sensitive sensor data on-site in real-time and make the outputs available to HQ for further analysis. Newton is not just "local" or "in the cloud"—it can be a distributed intelligence that scales with your business.
Ready to Bring AI to Your Sensor Data?
The future of industrial AI isn't in the cloud—it's right where you need it, on the edge. With Archetype AI's Newton, you can harness cutting-edge physical AI that is fast, private, and entirely under your control.
If you are integrating AI into your workflow, talk to us about how on-prem AI and private networks can transform your operations with AI that runs on your terms.