AI Gateway · Provider routing · Usage control

One door for every model call.

Put Forgeon between your app and AI providers. Control keys, routing, limits, logs, fallbacks, and usage from one production gateway.

POST /v1/gateway/chat

gateway.console

Requests today

84,291

Avg latency

214ms

Fallback rate

0.7%

Blocked keys leaked

0

Live request trace

request → policy → provider → stream

healthy

1. Request received: tenant_acme / production (8ms)
2. Policy checked: rate limit, allowed model, budget (14ms)
3. Provider selected: openai/gpt-4.1-mini (21ms)
4. Response streamed: 1,428 tokens / $0.0031 (318ms)
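The "Policy checked" step names three checks: rate limit, allowed model, and budget. Here is a minimal sketch of how such a check could be composed. The type names, field names, and `checkPolicy` itself are hypothetical illustrations, not Forgeon's actual API, and the 600 requests-per-minute figure is an invented example value.

```typescript
// Hypothetical policy and usage shapes; every name here is an assumption.
interface TenantPolicy {
  allowedModels: string[];
  requestsPerMinute: number;
  budgetLimitUsd: number;
}

interface TenantUsage {
  requestsThisMinute: number;
  spendUsd: number;
}

interface PolicyResult {
  allowed: boolean;
  reason: string | null; // null when the request is allowed
}

// Runs the three checks in the order the trace lists them.
function checkPolicy(
  policy: TenantPolicy,
  usage: TenantUsage,
  model: string,
): PolicyResult {
  if (usage.requestsThisMinute >= policy.requestsPerMinute) {
    return { allowed: false, reason: "rate_limit" };
  }
  if (!policy.allowedModels.includes(model)) {
    return { allowed: false, reason: "model_not_allowed" };
  }
  if (usage.spendUsd >= policy.budgetLimitUsd) {
    return { allowed: false, reason: "budget_exceeded" };
  }
  return { allowed: true, reason: null };
}

// Example values: the $700 limit and $418 spend mirror the budget guard
// shown on this page; the rate limit is made up for illustration.
const examplePolicy: TenantPolicy = {
  allowedModels: ["openai/gpt-4.1-mini"],
  requestsPerMinute: 600,
  budgetLimitUsd: 700,
};

const exampleCheck = checkPolicy(
  examplePolicy,
  { requestsThisMinute: 12, spendUsd: 418 },
  "openai/gpt-4.1-mini",
);
console.log(exampleCheck);
```

Returning a reason alongside the decision keeps a blocked request explainable in the trace.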

Provider priority

OpenAI #1
Anthropic #2
Gemini #3
Mistral #4

Budget guard

$418 used / $700 limit

Fallback

Move to next provider after timeout, 429, or failed response.
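The fallback rule above can be sketched as a loop over providers in priority order. This is an illustration under stated assumptions, not the gateway's real implementation: `Provider`, `withTimeout`, and `callWithFallback` are hypothetical names, and each provider is modeled as a function that rejects on a 429 or any other failed response.

```typescript
// Hypothetical provider adapter: resolves with output, rejects on failure.
interface Provider {
  name: string;
  call: (input: string) => Promise<string>;
}

// Rejects if the wrapped promise does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${ms}ms`)),
      ms,
    );
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Tries each provider in order and returns the first successful result.
async function callWithFallback(
  providers: Provider[],
  input: string,
  timeoutMs: number,
): Promise<{ provider: string; output: string }> {
  const failures: string[] = [];
  for (const provider of providers) {
    try {
      const output = await withTimeout(provider.call(input), timeoutMs);
      return { provider: provider.name, output };
    } catch (error) {
      // Timeout, 429, or failed response: record it and move on.
      const message = error instanceof Error ? error.message : String(error);
      failures.push(`${provider.name}: ${message}`);
    }
  }
  throw new Error(`all providers failed (${failures.join("; ")})`);
}
```

Passing the list as `[openai, anthropic, gemini, mistral]` would reproduce the priority order shown above. The error is only surfaced to the caller once every provider has been tried.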

OpenAI · Anthropic · Gemini · Mistral · Groq · Local

Why it exists

AI gets messy when every app talks directly to every provider.

Keys leak. Costs drift. Logs live everywhere. Switching providers becomes painful. AI Gateway gives you one stable layer before the chaos reaches your product.

01

Hide provider keys

Your frontend and apps call Forgeon. Provider keys stay in the platform layer, not scattered across projects.

02

Route per use case

Cheap model for simple tasks. Strong model for reasoning. Fast model for realtime UX. One gateway, different routes.

03

Fallback when providers fail

If a provider slows down, hits a rate limit, or returns errors, the gateway can move the request to the next provider before users notice.

04

Know the real cost

Track requests, latency, tokens, model usage, and estimated spend per tenant, app, route, and environment.
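To make the per-tenant, per-app, per-route, and per-environment breakdown concrete, here is a small sketch that groups request logs by one dimension. The `RequestLog` shape and `summarizeBy` are hypothetical, not Forgeon's log schema. The first sample record reuses figures from the trace on this page; the second record and the app name are made up.

```typescript
// Hypothetical log record; field names are illustrative assumptions.
interface RequestLog {
  tenant: string;
  app: string;
  route: string;
  environment: string;
  latencyMs: number;
  tokens: number;
  estimatedCostUsd: number;
}

interface UsageSummary {
  requests: number;
  tokens: number;
  estimatedCostUsd: number;
  totalLatencyMs: number;
}

type Dimension = "tenant" | "app" | "route" | "environment";

// Groups logs by the chosen dimension and accumulates totals per group.
function summarizeBy(
  logs: RequestLog[],
  dimension: Dimension,
): Map<string, UsageSummary> {
  const summaries = new Map<string, UsageSummary>();
  for (const log of logs) {
    const key = log[dimension];
    const summary = summaries.get(key) ?? {
      requests: 0,
      tokens: 0,
      estimatedCostUsd: 0,
      totalLatencyMs: 0,
    };
    summary.requests += 1;
    summary.tokens += log.tokens;
    summary.estimatedCostUsd += log.estimatedCostUsd;
    summary.totalLatencyMs += log.latencyMs;
    summaries.set(key, summary);
  }
  return summaries;
}

// Average latency is derived from the totals rather than stored.
function avgLatencyMs(summary: UsageSummary): number {
  return summary.requests === 0 ? 0 : summary.totalLatencyMs / summary.requests;
}

const sampleLogs: RequestLog[] = [
  // Figures taken from the live trace example on this page.
  { tenant: "tenant_acme", app: "helpdesk-web", route: "support-chat",
    environment: "production", latencyMs: 318, tokens: 1428, estimatedCostUsd: 0.0031 },
  // Made-up second record for illustration.
  { tenant: "tenant_acme", app: "helpdesk-web", route: "support-chat",
    environment: "staging", latencyMs: 200, tokens: 600, estimatedCostUsd: 0.0012 },
];

console.log(summarizeBy(sampleLogs, "environment"));
```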

Routing table

Different jobs deserve different models.

Don’t send every request to the same expensive model. Create routes for speed, cost, privacy, and reasoning quality.

per tenant · per environment · per route · per budget
Route           Path          Model                Mode      Status
support-chat    /ai/support   openai/gpt-4.1-mini  balanced  live
content-draft   /ai/content   mistral/small        low-cost  live
reasoning-task  /ai/reason    anthropic/claude     quality   live
private-agent   /ai/internal  local/qwen           private   beta
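The routing table above can be expressed as typed data with a lookup by route name. This is only a sketch: the `Route` shape and `resolveRoute` are hypothetical, and how Forgeon actually stores or resolves routes is not described here. The values are copied from the table.

```typescript
type Mode = "balanced" | "low-cost" | "quality" | "private";
type Status = "live" | "beta";

// Hypothetical route record mirroring the table's columns.
interface Route {
  name: string;
  path: string;
  model: string;
  mode: Mode;
  status: Status;
}

const routes: Route[] = [
  { name: "support-chat", path: "/ai/support", model: "openai/gpt-4.1-mini", mode: "balanced", status: "live" },
  { name: "content-draft", path: "/ai/content", model: "mistral/small", mode: "low-cost", status: "live" },
  { name: "reasoning-task", path: "/ai/reason", model: "anthropic/claude", mode: "quality", status: "live" },
  { name: "private-agent", path: "/ai/internal", model: "local/qwen", mode: "private", status: "beta" },
];

// Returns the route with the given name, or undefined if none matches.
function resolveRoute(name: string): Route | undefined {
  return routes.find((route) => route.name === name);
}

console.log(resolveRoute("reasoning-task")?.model);
```

Because the app only sends a route name, changing the `model` on a route changes what serves it without touching application code.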

Developer surface

Your app calls one endpoint. Forgeon handles the provider layer.

Keep your application simple. The gateway handles keys, routing, retry behavior, fallback rules, rate limits, and request logs.


app/api/chat/route.ts

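export async function POST(req: Request) {
  // Read the user's message from the incoming request body.
  const { message } = await req.json()

  // Call the gateway with a Forgeon token; no provider key lives in the app.
  const response = await fetch("https://api.forgeon.dev/v1/gateway/chat", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.FORGEON_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      route: "support-chat", // named route from the routing table
      input: message,
      fallback: true, // let the gateway try the next provider on failure
      stream: true, // stream the response as it is generated
    }),
  })

  // Pass the gateway's streamed response straight through to the caller.
  return response
}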
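Since the handler above requests `stream: true` and returns the response as-is, the caller needs to read the body incrementally. Here is a minimal sketch using the standard `ReadableStream` and `TextDecoder` web APIs. `readTextStream` is a hypothetical helper, and it assumes the chunks are plain UTF-8 text; the gateway's actual stream format is not specified on this page.

```typescript
// Reads a streamed body chunk by chunk, assuming UTF-8 text chunks.
// Calls onChunk for each decoded piece and resolves with the full text.
async function readTextStream(
  body: ReadableStream<Uint8Array>,
  onChunk: (text: string) => void,
): Promise<string> {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let full = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters split across chunks intact.
    const text = decoder.decode(value, { stream: true });
    if (text) {
      full += text;
      onChunk(text);
    }
  }
  // Flush any bytes the decoder was still holding.
  const tail = decoder.decode();
  if (tail) {
    full += tail;
    onChunk(tail);
  }
  return full;
}
```

A client would typically `fetch` its own chat endpoint, check that `response.body` is not null, and pass it to `readTextStream` with a callback that appends each chunk to the UI.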

No provider key in app

Controlled at the gateway level.

Trace every request

Controlled at the gateway level.

Switch model later

Controlled at the gateway level.

Use cases

Start with one route. Grow into a real AI control plane.

Customer support

Route chat requests through a monitored endpoint with usage limits and safe fallbacks.

Internal copilots

Give your team AI tools without exposing provider tokens across every internal app.

Content workflows

Run generation, rewrite, classification, and moderation through one controlled layer.

AI Gateway

Stop wiring AI providers directly into every app.

Put one controlled gateway in front of model calls, then manage access, routing, fallback, and usage from the platform.