SelfReason AI Gateway
Build once, then route each request through the best execution path for the job: SelfReason on-device models for privacy and speed, cloud models for larger context or deeper reasoning, and automatic fallbacks when reliability matters more than provider loyalty.
Why this gateway exists
- One OpenAI-compatible surface instead of separate local and cloud integrations
- Local-first routing so private and latency-sensitive requests stay close to the user
- The same gateway that powers SelfReason's own client apps
- Shared sessions, prompts, tools, and schemas across Web, Desktop, Android, extensions, and bots
- Fallback policies so one model outage does not become your product outage
- A cleaner path for products already built around SelfReason workflows
What you get
Flexible routing
- local: force SelfReason's on-device runtime
- auto: try local first, then escalate only when limits or quality thresholds require it
- cloud: go directly to hosted models for the largest context windows or strongest reasoning
- fallback: define backup models or providers for reliability-sensitive flows
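As a sketch, the execution mode could be passed per request alongside an otherwise standard chat payload. The routing field name, mode values, and model alias below are illustrative assumptions, not a documented schema; check your deployment's API reference:

```python
# Sketch: selecting an execution path per request.
# The "routing" field, its values, and the model alias are assumptions
# for illustration, not a confirmed gateway schema.

def make_request(messages, mode="auto", fallback_models=None):
    """Build an OpenAI-compatible chat request with a gateway routing hint."""
    body = {
        "model": "selfreason-auto",   # hypothetical model alias
        "messages": messages,
        "routing": {"mode": mode},    # "local" | "auto" | "cloud" | "fallback"
    }
    if fallback_models:
        # Ordered backups, tried only when the primary route fails
        body["routing"]["fallback_models"] = fallback_models
    return body

req = make_request(
    [{"role": "user", "content": "Summarize this note."}],
    mode="auto",
    fallback_models=["cloud-large", "cloud-small"],  # hypothetical names
)
```

Keeping the routing hint inside the request body (rather than in separate endpoints) is what lets one application contract cover all four modes.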
One contract across execution modes
- OpenAI-compatible chat completion patterns
- Real-time streaming for chat, summarization, and agent UX
- Tool calling for workflow actions, database lookups, and API orchestration
- Structured outputs for typed JSON responses
- Shared observability for latency, safety actions, and route decisions
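To make the contract concrete, here is a minimal sketch of a tool-calling request with a typed JSON response, in the familiar OpenAI function-calling shape. The tool name, its schema, and the model alias are hypothetical examples, not part of any SelfReason specification:

```python
# Sketch of an OpenAI-style tool definition plus structured output.
# Tool name, parameter schema, and model alias are illustrative assumptions.
lookup_tool = {
    "type": "function",
    "function": {
        "name": "lookup_order",                     # hypothetical workflow action
        "description": "Fetch an order record by id",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}

request = {
    "model": "selfreason-auto",                     # hypothetical alias
    "messages": [{"role": "user", "content": "Where is order 1042?"}],
    "tools": [lookup_tool],                         # workflow actions, lookups, APIs
    "response_format": {"type": "json_object"},     # typed JSON response
}
```

Because the same shape works in every execution mode, a tool-calling flow does not need to change when a request moves between device and cloud.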
A practical integration shape
The gateway is designed to feel familiar if you already use modern AI SDK patterns. The exact base URL, models, and policy fields depend on your deployment, but the request pattern stays consistent:
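A minimal sketch of that pattern, assuming a standard chat-completions endpoint; the base URL, model alias, and streaming flag mirror common OpenAI-compatible conventions rather than a confirmed SelfReason contract:

```python
import json

# Sketch of the request shape against an OpenAI-compatible endpoint.
# BASE_URL and the model alias are placeholders for your deployment's values.
BASE_URL = "https://gateway.example.com/v1"         # placeholder

payload = {
    "model": "selfreason-auto",                     # hypothetical model alias
    "messages": [
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Draft a two-line status update."},
    ],
    "stream": True,                                 # stream tokens as with OpenAI
}

body = json.dumps(payload)
# POST `body` to f"{BASE_URL}/chat/completions" with your usual HTTP client,
# or point an OpenAI-compatible SDK at BASE_URL and call chat completions.
```

The point is that only the base URL and model name change per deployment; the message, streaming, and tool-calling shapes stay the same across local and cloud routes.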
Why it differs from a generic LLM proxy
- The gateway is designed around SelfReason's local runtime, not just cloud-to-cloud rerouting
- You can keep one application contract even when requests move between device and cloud
- Route decisions can factor prompt size, safety posture, device capability, energy state, and user quality overrides
- It fits naturally with SelfReason sessions, prompts, and browser-native tooling instead of bolting on another isolated proxy layer
Pick the right path
- Start here if you need routing, fallbacks, streaming, and a shared app contract
- Read SelfReason on-device API if your main concerns are on-device inference, offline behavior, and model warm-up strategy

- Explore SelfReason if you want to see the local runtime in a product-facing form
Build with confidence
- Keep privacy-sensitive work local by default
- Escalate only when larger context or stronger reasoning is worth the tradeoff
- Use fallbacks to turn provider outages into policy decisions instead of user-visible failures
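These three principles can be expressed as a policy rather than scattered conditionals. A minimal sketch, assuming a hypothetical policy shape (field names and route labels are invented for illustration):

```python
# Sketch: a fallback policy as data, so an outage becomes a routing
# decision instead of a user-visible failure. The policy shape and
# route names are assumptions, not a documented format.
policy = {
    "default": "local",                              # privacy-sensitive work stays local
    "escalate_when": {"prompt_tokens_over": 8000},   # escalate only past local limits
    "fallback_chain": ["local", "cloud-primary", "cloud-secondary"],
}

def next_route(policy, failed):
    """Return the first route in the chain not in the failed set, else None."""
    for route in policy["fallback_chain"]:
        if route not in failed:
            return route
    return None
```

For example, if the local runtime is unavailable, `next_route(policy, {"local"})` selects the next backup in the chain instead of surfacing an error.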
Need a capability or model that is not covered yet? Open a ticket and help shape the SelfReason AI Gateway roadmap.
