AI Technology — AirMouse AI

Architecture

Six models. One unified pipeline.

AirMouse Aurora ships a custom neural stack: voice ASR, gesture CNN, screen OCR, context LLM, workflow compiler, and a predictive action engine — all running concurrently on-device.

On-device inference: No round-trips to the cloud. Your data never leaves your hardware.
Under 12 ms total latency: Faster than a human blink from input to executed action.
Continual learning: Models fine-tune on your usage patterns privately, on-device.

airmouse — ai pipeline

# Aurora AI stack — boot sequence
loading  voice_asr_v4        ✓ 3.2ms
loading  gesture_cnn_v3      ✓ 1.8ms
loading  screen_ocr_v2       ✓ 2.1ms
loading  context_llm_mini    ✓ 4.6ms
loading  workflow_compiler   ✓ 0.9ms
loading  predict_engine_v5   ✓ 0.7ms

aurora   all models ready — 13.3ms
aurora   listening for input

Context Engine

Active context snapshot

Focused app: VS Code (main.ts)
Open windows: Slack · Figma · Linear
Clipboard intent: Code snippet
Predicted next action: Run tests
Workflow suggestion: Deploy + notify team

Context engine

It knows what you're about to do next.

A lightweight LLM continuously interprets your active app, clipboard, and usage history to build a real-time context graph — so commands like "reply to that" just work.

Semantic window tracking: Understands what's visible, not just what app is open.
Pronoun resolution: "Her", "that file", "earlier" are all resolved from context.
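
One way to picture the context snapshot is as a small typed record that a resolver consults. The sketch below is illustrative only: the `ContextSnapshot` fields mirror the snapshot card shown above, while `lastMessageFrom`, `resolveReference`, and its matching rule are hypothetical and are not AirMouse's actual resolver.

```typescript
// Hypothetical shape mirroring the "Active context snapshot" card.
interface ContextSnapshot {
  focusedApp: string;
  focusedDocument: string;
  openWindows: string[];
  clipboardIntent: string;
  // Assumed extra field: the most recent incoming message, if any.
  lastMessageFrom?: { app: string; sender: string };
}

// Assumed toy rule: "that" refers to the most recent incoming message when
// one is known; otherwise "that" and "that file" refer to the focused document.
function resolveReference(phrase: string, ctx: ContextSnapshot): string {
  const normalised = phrase.trim().toLowerCase();
  if (normalised === "that" && ctx.lastMessageFrom) {
    return `message from ${ctx.lastMessageFrom.sender} in ${ctx.lastMessageFrom.app}`;
  }
  if (normalised === "that file" || normalised === "that") {
    return `${ctx.focusedDocument} in ${ctx.focusedApp}`;
  }
  return phrase; // unresolved: hand the phrase back unchanged
}

const snapshot: ContextSnapshot = {
  focusedApp: "VS Code",
  focusedDocument: "main.ts",
  openWindows: ["Slack", "Figma", "Linear"],
  clipboardIntent: "Code snippet",
};

console.log(resolveReference("that file", snapshot)); // main.ts in VS Code
```

A real resolver would rank several candidates by recency and relevance; the fixed rule here only shows where the snapshot data enters the decision.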

Voice ASR v4

Voice that understands intent, not commands.

Our on-device ASR model was trained on 280,000 hours of speech across 42 languages. It understands accents, ambient noise, and natural speech patterns with 98.7% accuracy.

42 languages supported: Code-switching and mixed-language commands work natively.
Noise-robust inference: Performs at 94% accuracy even in open-plan offices.
Intent composition: Chains multi-step intents from a single spoken sentence.

Whisper-based · On-device · 42 languages · 98.7% accuracy

Live waveform

# transcript

"open my standup and summarise overnight messages"

intent → multi-step composition

actions → open_workspace · summarise_slack

Gesture CNN v3

Gestures that feel like muscle memory.

A convolutional neural network trained on 80 million gestures classifies your touch patterns with 99.2% accuracy and maps them to AI-powered actions in real time.

24 preset gestures: Plus unlimited custom mappings to any workflow or shortcut.
Adaptive sensitivity: Learns your personal touch pressure and motion style.

99.2% accuracy · 80M training gestures · Custom maps

Workflow Intelligence

From spoken intent to running automation.

The workflow compiler parses natural language into an executable action graph — no scripts, no YAML, no clicking through settings panels.

NL Compiler

Say it once. Run it forever.

Describe a workflow in plain English. The NL compiler converts it into a validated, optimised action graph instantly.

# you said:

"every morning at 9, open slack, fetch my tickets, summarise, post to standup"

# compiled to:

trigger cron("09:00")

action[1] open_app("Slack")

action[2] fetch_linear(assignee="me")

action[3] ai_summarise(context="standup")

action[4] post_to_slack(channel="#standup")
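
The compiled output above can be read as a typed action graph. The page shows the compiled steps but not their schema, so every type and field name below (`WfAction`, `CompiledWorkflow`, `validateWorkflow`) is an assumption made for illustration, as are the validation rules.

```typescript
// Hypothetical types: the source shows the compiled output, not its schema.
type WfAction =
  | { kind: "open_app"; app: string }
  | { kind: "fetch_linear"; assignee: string }
  | { kind: "ai_summarise"; context: string }
  | { kind: "post_to_slack"; channel: string };

interface CompiledWorkflow {
  trigger: { kind: "cron"; at: string }; // "HH:MM", 24-hour clock
  actions: WfAction[];
}

const standupWorkflow: CompiledWorkflow = {
  trigger: { kind: "cron", at: "09:00" },
  actions: [
    { kind: "open_app", app: "Slack" },
    { kind: "fetch_linear", assignee: "me" },
    { kind: "ai_summarise", context: "standup" },
    { kind: "post_to_slack", channel: "#standup" },
  ],
};

// Minimal validation pass: returns a list of problems, empty when valid.
function validateWorkflow(wf: CompiledWorkflow): string[] {
  const problems: string[] = [];
  if (!/^([01]\d|2[0-3]):[0-5]\d$/.test(wf.trigger.at)) {
    problems.push(`invalid trigger time: ${wf.trigger.at}`);
  }
  if (wf.actions.length === 0) {
    problems.push("workflow has no actions");
  }
  wf.actions.forEach((action, index) => {
    if (action.kind === "post_to_slack" && action.channel.charAt(0) !== "#") {
      problems.push(`action[${index + 1}]: channel must start with "#"`);
    }
  });
  return problems;
}

console.log(validateWorkflow(standupWorkflow)); // empty list: no problems found
```

Modelling actions as a discriminated union lets the type checker reject a step with the wrong parameters before the workflow ever runs.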

Trigger types

Trigger anything

Voice: "Open my project"
Gesture: Custom swipe pattern
Schedule: Time-based cron
Context: App or URL change

200+ Integrations

Works with your stack

Native connectors for Slack, Notion, Linear, GitHub, Figma, Jira, and 195 more apps — with an open REST API for custom integrations.

Smart chains

Conditional branches & loops

Workflows branch on AI-evaluated conditions. "If the PR is failing, notify the team and open the error log" compiles to a fully conditional execution graph.

Trigger → Fetch data → AI evaluate → if true → Notify + act
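
A conditional execution graph of this kind can be sketched as a tree of action and branch nodes. The node types, the action names, and the injected `evaluate` callback below are all hypothetical; in the product the condition is described as AI-evaluated, which this sketch replaces with a plain function. Loops are left out for brevity.

```typescript
// Hypothetical node types for a conditional execution graph.
type GraphNode =
  | { kind: "action"; name: string }
  | { kind: "branch"; condition: string; ifTrue: GraphNode[]; ifFalse: GraphNode[] };

// The evaluator is injected so the sketch does not depend on any model.
type ConditionEvaluator = (condition: string) => boolean;

// Walks the graph depth-first and returns the actions that would run, in order.
function planExecution(nodes: GraphNode[], evaluate: ConditionEvaluator): string[] {
  const planned: string[] = [];
  nodes.forEach((node) => {
    if (node.kind === "action") {
      planned.push(node.name);
    } else {
      const next = evaluate(node.condition) ? node.ifTrue : node.ifFalse;
      planExecution(next, evaluate).forEach((name) => planned.push(name));
    }
  });
  return planned;
}

// "If the PR is failing, notify the team and open the error log"
const prGraph: GraphNode[] = [
  { kind: "action", name: "fetch_pr_status" },
  {
    kind: "branch",
    condition: "pr_is_failing",
    ifTrue: [
      { kind: "action", name: "notify_team" },
      { kind: "action", name: "open_error_log" },
    ],
    ifFalse: [],
  },
];

console.log(planExecution(prGraph, () => true));
```

Separating planning from execution makes it possible to show the user which steps would run before anything is actually triggered.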

Smart Automation

The AI that automates itself.

Aurora watches for repetitive patterns and proactively suggests automations — then builds them the moment you approve.

Pattern Detection

After 3 repetitions, Aurora recognises a routine and quietly builds an automation draft for your review.
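
As a rough illustration of routine detection, the sketch below counts how often each run of consecutive actions appears in a usage log and reports the runs seen at least three times. The function name, the fixed window size, and the action names are assumptions; the page does not describe how Aurora actually detects routines.

```typescript
// Counts how often each run of `windowSize` consecutive actions occurs and
// returns the runs seen at least `minRepeats` times.
// Assumes action names never contain the separator " > ".
function findRoutines(actionLog: string[], windowSize: number, minRepeats: number = 3): string[][] {
  const counts: Record<string, number> = {};
  for (let i = 0; i + windowSize <= actionLog.length; i++) {
    const key = actionLog.slice(i, i + windowSize).join(" > ");
    counts[key] = (counts[key] || 0) + 1;
  }
  return Object.keys(counts)
    .filter((key) => counts[key] >= minRepeats)
    .map((key) => key.split(" > "));
}

const sampleLog = [
  "open_slack", "fetch_tickets", "post_standup",
  "open_figma",
  "open_slack", "fetch_tickets", "post_standup",
  "open_slack", "fetch_tickets", "post_standup",
];

// The three-step standup routine appears three times in the log.
console.log(findRoutines(sampleLog, 3));
```

Windows overlap, so a log that repeats a single action many times in a row would be counted more than once per run; a production detector would also need to account for timing and context.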

Auto-suggests · Opt-in

Instant Execution

Once a workflow is compiled, it executes in under 50 ms. Background, foreground, or silent — your choice.

<50ms exec · Parallel steps

Workflow Library

500+ community-built workflow templates. One tap to install, fork, and customise for your own stack.

500+ templates · Open source

Prediction Engine v5

One step ahead, always.

The prediction engine builds a Markov-inspired action graph from your personal usage data, locally. It surfaces the next likely action as a gentle suggestion — tap to accept, swipe to dismiss.

84% acceptance rate: Users accept the AI's next-action suggestion 84% of the time within 2 weeks of use.
Fully private: The action graph is stored only on your device and never transmitted.
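
The page calls the action graph "Markov-inspired" without giving details. A minimal sketch of the plain first-order version, under that assumption, counts which action followed which and suggests the most frequent follower. The function names and the sample log are hypothetical.

```typescript
// Transition counts: counts[from][to] = how often `to` followed `from`.
type TransitionCounts = Record<string, Record<string, number>>;

function buildTransitions(actionLog: string[]): TransitionCounts {
  const counts: TransitionCounts = {};
  for (let i = 0; i + 1 < actionLog.length; i++) {
    const from = actionLog[i];
    const to = actionLog[i + 1];
    if (!counts[from]) counts[from] = {};
    counts[from][to] = (counts[from][to] || 0) + 1;
  }
  return counts;
}

// Returns the most frequent follower of `current` with its estimated
// probability, or null when `current` has never been followed by anything.
function predictNext(
  counts: TransitionCounts,
  current: string
): { action: string; probability: number } | null {
  const followers = counts[current];
  if (!followers) return null;
  let best = "";
  let bestCount = 0;
  let total = 0;
  Object.keys(followers).forEach((candidate) => {
    total += followers[candidate];
    if (followers[candidate] > bestCount) {
      best = candidate;
      bestCount = followers[candidate];
    }
  });
  return { action: best, probability: bestCount / total };
}

const usageLog = ["edit_code", "run_tests", "edit_code", "run_tests", "edit_code", "commit"];
const transitions = buildTransitions(usageLog);

// "edit_code" was followed by "run_tests" twice and by "commit" once.
console.log(predictNext(transitions, "edit_code"));
```

Because the table is built only from the local log, it can stay on the device, which matches the privacy claim above.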

ML Pipeline

How we build models that get smarter over time.

Our five-stage ML pipeline combines large-scale pre-training with privacy-preserving on-device fine-tuning.

1. Foundation pre-training

Base models are trained on 280K hours of speech and 80M gestures from consenting, opt-in users, with differential privacy applied.

2. Federated fine-tuning

Model gradients — never raw data — are aggregated across devices with noise injection to preserve privacy while improving accuracy.
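
A common way to aggregate gradients with noise injection is to clip each device's gradient, sum the clipped gradients, add Gaussian noise, and average. The sketch below follows that general recipe; it is not AirMouse's implementation, and `maxNorm`, `noiseStd`, and the function names are illustrative parameters.

```typescript
// Scales a gradient so its L2 norm is at most `maxNorm`.
function clipGradient(gradient: number[], maxNorm: number): number[] {
  const norm = Math.sqrt(gradient.reduce((sum, g) => sum + g * g, 0));
  const scale = norm > maxNorm ? maxNorm / norm : 1;
  return gradient.map((g) => g * scale);
}

// Standard normal sample via the Box-Muller transform.
function sampleStandardNormal(): number {
  const u1 = 1 - Math.random(); // in (0, 1], so Math.log(u1) is finite
  const u2 = Math.random();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Sums clipped per-device gradients, adds Gaussian noise with standard
// deviation `noiseStd` to each coordinate of the sum, then averages.
function aggregateGradients(
  deviceGradients: number[][],
  maxNorm: number,
  noiseStd: number,
  sampleNoise: () => number = sampleStandardNormal
): number[] {
  if (deviceGradients.length === 0) throw new Error("no gradients to aggregate");
  const dim = deviceGradients[0].length;
  const sum: number[] = [];
  for (let i = 0; i < dim; i++) sum.push(0);
  deviceGradients.forEach((gradient) => {
    if (gradient.length !== dim) throw new Error("gradient dimensions differ");
    clipGradient(gradient, maxNorm).forEach((g, i) => {
      sum[i] += g;
    });
  });
  return sum.map((total) => (total + noiseStd * sampleNoise()) / deviceGradients.length);
}

// Two devices, clipping bound 1, noise standard deviation 0.1.
console.log(aggregateGradients([[3, 4], [0, 1]], 1, 0.1));
```

Clipping bounds how much any single device can move the average, which is what makes the added noise meaningful as a privacy measure. The noise source is injectable so the aggregation can be tested deterministically.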

3. Compression & quantisation

INT8 and INT4 quantisation reduce model size by up to 8× while preserving 98% of accuracy, enabling real-time on-device inference.
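
To make the idea concrete, here is a minimal sketch of symmetric per-tensor INT8 quantisation, one standard scheme. The page does not say which scheme Aurora uses, so the function names and the choice of a single scale per tensor are assumptions. Relative to 32-bit floats, 8-bit storage is 4× smaller and 4-bit storage is 8× smaller.

```typescript
// Symmetric per-tensor INT8 quantisation: w is approximated by scale * q,
// with q an integer in [-127, 127].
function quantiseInt8(weights: number[]): { values: number[]; scale: number } {
  let maxAbs = 0;
  weights.forEach((w) => {
    maxAbs = Math.max(maxAbs, Math.abs(w));
  });
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;
  const values = weights.map((w) =>
    Math.max(-127, Math.min(127, Math.round(w / scale)))
  );
  return { values, scale };
}

// Maps the integers back to approximate floating-point weights.
function dequantiseInt8(q: { values: number[]; scale: number }): number[] {
  return q.values.map((v) => v * q.scale);
}

const originalWeights = [0.5, -1.27, 0.003, 1.0];
const quantised = quantiseInt8(originalWeights);
console.log(quantised.values);          // integers in [-127, 127]
console.log(dequantiseInt8(quantised)); // approximate reconstruction
```

The rounding error per weight is at most half of `scale`, which is why small weights such as 0.003 collapse to zero when one large weight dominates the tensor.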

4. On-device personalisation

A tiny adapter layer learns your specific patterns locally — accent, gestures, workflows — in under 50 epochs with no data leaving your device.

5. Continuous evaluation & rollback

Every model update is A/B tested silently on a 1% canary cohort. Automated accuracy benchmarks gate every deployment. Any model that regresses rolls back in under 90 seconds.
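
The gating logic described above can be sketched in two small functions: one that assigns a device to the canary cohort and one that decides whether to promote or roll back. The hashing scheme, the `tolerance` default, and all names are assumptions made for illustration; the page states only the 1% cohort, the benchmark gate, and the rollback.

```typescript
interface CanaryReport {
  baselineAccuracy: number;  // accuracy of the currently deployed model
  candidateAccuracy: number; // accuracy of the update on the canary cohort
}

type RolloutDecision = "promote" | "rollback";

// Assumed rule: roll back when the candidate is worse than the baseline by
// more than `tolerance` (absolute accuracy, e.g. 0.002 = 0.2 points).
function decideRollout(report: CanaryReport, tolerance: number = 0.002): RolloutDecision {
  const regression = report.baselineAccuracy - report.candidateAccuracy;
  return regression > tolerance ? "rollback" : "promote";
}

// Deterministically places a device in the canary cohort using a simple
// string hash, so the same device always gets the same answer.
function inCanaryCohort(deviceId: string, percent: number = 1): boolean {
  let hash = 0;
  for (let i = 0; i < deviceId.length; i++) {
    hash = (hash * 31 + deviceId.charCodeAt(i)) >>> 0; // keep as unsigned 32-bit
  }
  return hash % 100 < percent;
}

console.log(decideRollout({ baselineAccuracy: 0.987, candidateAccuracy: 0.984 })); // rollback
console.log(decideRollout({ baselineAccuracy: 0.987, candidateAccuracy: 0.986 })); // promote
```

Hash-based assignment keeps a device in the same cohort across updates without storing any per-device state on a server.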

Performance

Numbers that make latency invisible.

End-to-end latency: under 12 ms
Voice accuracy: 98.7%
Gesture accuracy: 99.2%
Concurrent AI models: 6

AI Safety

Smart enough to help. Safe enough to trust.

Every AI capability is governed by explicit user consent layers, hardware-backed isolation, and continuous red-team audits.

Hardware Isolation

AI models run inside a dedicated secure enclave. Neither the OS nor other apps can access model inputs or outputs.

Secure Enclave · TrustZone

Differential Privacy

Federated learning adds calibrated noise to all gradient updates, which mathematically bounds how much any update can reveal about an individual's data.

ε = 1.0 · δ < 10⁻⁵
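
For a sense of scale, the classical Gaussian mechanism sets the noise standard deviation to σ = Δ · sqrt(2 ln(1.25/δ)) / ε, where Δ is the L2 sensitivity (for example, a gradient clipping bound). That bound is usually stated for ε below 1, so applying it at ε = 1.0 is an approximation, and the page does not say which mechanism or privacy accounting AirMouse uses. The sketch below evaluates the formula with δ = 10⁻⁵ and an assumed sensitivity of 1.

```typescript
// Noise standard deviation for the classical Gaussian mechanism.
// `sensitivity` is the L2 sensitivity of the released quantity.
function gaussianNoiseStd(epsilon: number, delta: number, sensitivity: number): number {
  if (epsilon <= 0 || delta <= 0 || delta >= 1 || sensitivity <= 0) {
    throw new Error("epsilon, delta and sensitivity must be positive, with delta < 1");
  }
  return (sensitivity * Math.sqrt(2 * Math.log(1.25 / delta))) / epsilon;
}

// Assumed sensitivity of 1.0; epsilon and delta taken from the figures above.
const sigma = gaussianNoiseStd(1.0, 1e-5, 1.0);
console.log(sigma.toFixed(2)); // about 4.84
```

Halving ε doubles the required noise, which is the basic trade-off between privacy and model accuracy.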

Consent Ledger

Every AI feature that accesses app context or screen data has a granular toggle. Revoke any permission instantly.

Granular controls · Auditable

Roadmap

The operating system for human intent.

We're building toward a world where your devices anticipate your needs before you voice them — a persistent AI co-pilot woven seamlessly into every tool you use.

Proactive AI (Q3 2025): Aurora surfaces suggestions before you ask, based on calendar and context signals.
Multi-device orchestration (Q4 2025): A single voice command cascades actions across phone, laptop, TV, and smart home.
Open model fine-tuning (2026): We plan to publish base model weights so enterprises can fine-tune for domain-specific workflows.

Unified AI operating layer — always on, always private

Developer APIs

Build on the AirMouse AI stack.

Open REST and WebSocket APIs give developers direct access to voice, gesture, workflow, and context streams. Build custom integrations in minutes.

Read the docs · Get API key

WebSocket API: voice stream

// Subscribe to the voice intent stream
const ws = new WebSocket(
  "wss://api.airmouseai.com/v2/voice"
);

// Placeholder handler: replace with your own logic
function handleIntent(event) {
  console.log(event.action, event.payload);
}

ws.onmessage = ({ data }) => {
  const event = JSON.parse(data);
  // { type: "intent", action: "open_app",
  //   payload: { app: "VS Code" },
  //   confidence: 0.97 }

  if (event.type === "intent") {
    handleIntent(event);
  }
};

// Also available: gesture, workflow,
// context, prediction streams
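
Stream messages arrive as untyped JSON, so a client will usually want to validate them before acting. The sketch below parses a raw message into a typed intent event and drops anything malformed or below a confidence threshold. The `IntentEvent` shape is taken from the example payload above; the `parseIntentEvent` helper and the 0.9 default threshold are assumptions, not part of the documented API.

```typescript
// Event shape taken from the example payload; other fields may exist.
interface IntentEvent {
  type: "intent";
  action: string;
  payload: Record<string, unknown>;
  confidence: number;
}

// Returns the message as an IntentEvent only if it is well-formed and at or
// above `minConfidence`; otherwise returns null.
function parseIntentEvent(raw: string, minConfidence: number = 0.9): IntentEvent | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (_err) {
    return null; // not valid JSON
  }
  if (typeof parsed !== "object" || parsed === null) return null;

  const { type, action, payload, confidence } = parsed as Record<string, unknown>;
  if (
    type !== "intent" ||
    typeof action !== "string" ||
    typeof confidence !== "number" ||
    typeof payload !== "object" ||
    payload === null
  ) {
    return null;
  }
  if (confidence < minConfidence) return null;

  return { type: "intent", action, payload: payload as Record<string, unknown>, confidence };
}

const exampleMessage =
  '{"type":"intent","action":"open_app","payload":{"app":"VS Code"},"confidence":0.97}';
console.log(parseIntentEvent(exampleMessage));
```

Inside `ws.onmessage`, calling `parseIntentEvent(data)` and ignoring a `null` result keeps low-confidence or unexpected messages from triggering actions.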

Try AirMouse AI

Experience the AI engine firsthand.

Download free. Explore every AI model live on your device. No subscription required to start.

Download free · Developer docs

The neural engine that makes everything intelligent.