AI for Apple silicon.

Run your AI workflows locally on Apple silicon devices running iOS, macOS, tvOS, visionOS, or watchOS.

Download on the App Store
main.rs
use onde::inference::{ChatEngine, GgufModelConfig};

let engine = ChatEngine::new();
engine
    .load_gguf_model(
        GgufModelConfig::platform_default(),
        Some("You are a helpful assistant.".into()),
        None,
    )
    .await?;

let result = engine.send_message("Hello!").await?;
println!("{}", result.text);
// completed in 85 ms, 100% on device

Benchmark

The fastest API call is the one you don't make.

While others wait for a handshake from a data center in Virginia, Onde is already finished.

Inference Layer | Latency (ms) | Server Cost | Privacy
Cloud API       | 1,200+       | $$$$        | Leaves device
Onde on-device  | 85           | $0          | Stays on device

First-class citizen

Developer developer developer developer developer developer developer developer developer developer developer developer developer developer - Steve

Reliable and efficient

Your app does not stall at one million users: more devices mean more distributed compute at the edge.

Ergonomic API

Two lines to load a model. One line to run it. The SDK gets out of your way so you can ship.

Hardware Optimized

Hand-tuned for Apple Neural Engine and Apple silicon. Onde speaks closer to the metal than generic wrappers.

Actions speak (c)louder

In Production.

One More Thing

The World's Intelligence. On Your Terms.