Reliable and efficient
Your app does not stall at one million users. More devices means more distributed compute at the edge.
use onde::inference::{ChatEngine, GgufModelConfig};

let engine = ChatEngine::new();
engine
    .load_gguf_model(
        GgufModelConfig::platform_default(),
        Some("You are a helpful assistant.".into()),
        None,
    )
    .await?;

let result = engine.send_message("Hello!").await?;
println!("{}", result.text);
// completed in 85ms, 100% on device

Benchmark
While others wait for a handshake from a data center in Virginia, Onde is already finished.
| Inference Layer | Latency (ms) | Server Cost | Privacy |
|---|---|---|---|
| Standard Cloud AI | 1,200+ | $$$$ | Public |
| Generic Mobile AI | 450 | $0 | Local |
| Onde Engine | 85 | $0 | Encrypted/Local |
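
Latency figures like these are easiest to trust once you have reproduced them on your own hardware. The sketch below is a minimal timing harness built only on the Rust standard library. The helpers `timed` and `median_ms` are hypothetical names introduced for illustration and are not part of the Onde SDK, and the summing loop is a stand-in workload that you would swap for your own inference call.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of the Onde SDK): runs `work` once and
/// returns its result together with the elapsed wall-clock time.
fn timed<T>(work: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = work();
    (out, start.elapsed())
}

/// Hypothetical helper: sorts the samples and returns the median in
/// milliseconds (the upper median for an even count), or `None` if empty.
fn median_ms(samples: &mut [Duration]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    samples.sort();
    let mid = samples.len() / 2;
    Some(samples[mid].as_secs_f64() * 1000.0)
}

fn main() {
    // Stand-in workload; replace the closure body with a real inference call.
    let mut samples: Vec<Duration> = (0..5)
        .map(|_| timed(|| (0..100_000u64).sum::<u64>()).1)
        .collect();

    if let Some(ms) = median_ms(&mut samples) {
        println!("median latency: {:.3} ms", ms);
    }
}
```

Reporting the median over several runs, rather than a single measurement, keeps one slow cold start from skewing the number.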
First-class citizen
Two lines to load a model. One line to run it. The SDK gets out of your way so you can ship.
Hand-tuned for Apple Neural Engine and Apple silicon. Onde speaks closer to the metal than generic wrappers.
Action speaks (c)louder
We did not just build an engine; we built a champion. Splitfire uses Onde Inference to deliver studio-grade audio separation 100% offline. It is the fastest audio splitter on the App Store. Period.
One More Thing
Apple, App Store, iOS, and macOS are trademarks of Apple Inc.