Every model behind one serious gateway.
One OpenAI-compatible API routes across provider fleets, subscription accounts, BYOK keys, and fallbacks — with policy controls, not a toy marketplace.
A public multi-tenant inference platform on the Foundry engine — identity, cost, isolation, and provider credentials from the first token to the on-chain receipt. One API. Every model.
Performance · measured, not marketed
End-to-end latency against the same local benchmark harness on identical hardware. Foundry is the lowest-latency router we've measured in that harness.
k6, 50–100 VUs sustained · ARM64 Graviton3 · zero failures across all gateways. Microbenchmarks: 1.91µs hot-path overhead, 509ns identity resolution, 61ns rate-limit checks.
00
Marketplaces route calls. Foundry governs model access: identity, cost, isolation, provider credentials, usage logs, and policy from the first token to the on-chain receipt.
The platform
One OpenAI-compatible API routes across provider fleets, subscription accounts, BYOK keys, and fallbacks — with policy controls, not a toy marketplace.
Flagship coding models run on flat USDC subscriptions with internal cost controls. Every other model stays on explicit PAYGO, BYOK, or enterprise capacity.
Tenant identity binds humans, services, wallets, and autonomous agents to spending ceilings. x402 lets agents top up over HTTP in USDC.
Keys, logs, models, provider credentials, budgets, alerts, and billing live in one tenant-aware console. Operators get the admin tools. Customers get the platform.
Foundry Cloud Console
Start now
Drop-in OpenAI compatibility, flat USDC subscriptions for the coding six, and per-token gateway access for everything else.