Roadmap

The gateway is built milestone by milestone; each milestone must be fully green (task reset && task test both pass) before the next one starts.

Phase 0 — single-tenant engine

Milestone | Status | What it delivers
--- | --- | ---
M0 Cluster + Higress | ✅ done | reproducible local gateway from code
M0.5 Remote access + in-cluster TLS | ✅ done | Let's Encrypt TLS in-cluster, Cloudflare Tunnel dev ingress
M1 Work-type → model routing | ✅ done | header/tag routing to logical model routes + fallback
M2 Per-group/user token limits | ✅ done | built-in token rate limits, Redis-backed, identity-keyed
M2.5 API keys + USD budgets | ✅ done | key-auth consumers, per-consumer dollar caps via a budget controller
M3 Model allow-list | ✅ done | per-group allowed-model enforcement (403)
M4 Observability | ✅ done | Grafana LGTM + Alloy dashboards
M5 Guardrails | ✅ done | PII masking + prompt-injection blocking (no cloud dependency)
M6 SSO | ✅ done | Google Workspace OIDC, domain-restricted, feeding identity
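
To make the M1 row concrete, here is a minimal sketch of header-based work-type routing with fallback. It is illustrative only and not the gateway's actual configuration: the header name x-work-type, the route table, the model names, and the is_healthy callback are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ModelRoute:
    name: str          # logical route name
    models: list[str]  # ordered candidates: primary first, then fallbacks


# Hypothetical work-type -> logical route table (names are placeholders).
ROUTES = {
    "code": ModelRoute("code", ["model-a", "model-b"]),
    "chat": ModelRoute("chat", ["model-c"]),
}
DEFAULT_ROUTE = ROUTES["chat"]


def select_route(headers: dict[str, str]) -> ModelRoute:
    """Pick a logical route from a (hypothetical) x-work-type header."""
    work_type = headers.get("x-work-type", "").strip().lower()
    return ROUTES.get(work_type, DEFAULT_ROUTE)


def pick_model(route: ModelRoute, is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first healthy model on the route, or None if all are down."""
    for model in route.models:
        if is_healthy(model):
            return model
    return None
```

A request tagged with the work type "code" would resolve to the "code" route and fall back to the second model only when the first is reported unhealthy; untagged requests use the default route.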

Phase 0 (single-tenant engine) is complete. Next is Phase 1 — the multi-tenant control plane.

Hierarchical USD budgets (dollar caps per group/user, M9) land after SSO, because they depend on authenticated-consumer identity. M2.5 already enforces dollar caps per API-key consumer.
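
A minimal sketch of how a group/user dollar cap could be checked, assuming spend is tracked per scope and a request is rejected if it would push any applicable level over its cap. The scope keys, cap values, and function names are hypothetical and not taken from the budget controller.

```python
from decimal import Decimal

# Hypothetical caps in USD, keyed by (level, name).
CAPS = {
    ("group", "eng"): Decimal("100.00"),
    ("user", "alice"): Decimal("20.00"),
}


def scopes(group: str, user: str) -> list[tuple[str, str]]:
    """Scopes a request is charged against, from widest to narrowest."""
    return [("group", group), ("user", user)]


def allowed(spent: dict, caps: dict, group: str, user: str, cost: Decimal) -> bool:
    """True if adding `cost` keeps every capped scope at or under its cap."""
    for scope in scopes(group, user):
        cap = caps.get(scope)
        if cap is None:
            continue  # no cap configured at this level
        if spent.get(scope, Decimal("0")) + cost > cap:
            return False
    return True


def record(spent: dict, group: str, user: str, cost: Decimal) -> None:
    """Charge `cost` to every scope the request belongs to."""
    for scope in scopes(group, user):
        spent[scope] = spent.get(scope, Decimal("0")) + cost
```

Decimal is used instead of float so that repeated small charges do not accumulate rounding error in the running totals.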

Beyond — multi-tenant product

A control plane stores a per-Project spec and reconciles it into Higress config; Higress stays the data plane. On top: enterprise SSO + RBAC, per-tenant budgets / guardrails / API keys, usage dashboards, and FinOps.

  • Control plane — tenancy model + reconcile loop, project-scoped routing API.
  • Per-tenant governance — hierarchical budgets/limits, API keys, multi-tenant usage attribution, project-scoped guardrails.
  • Enterprise access — multi-org SSO + SCIM + RBAC, web UI.
  • Enterprise-grade — FinOps + audit, semantic cache, canary/A-B model rollout, per-org BYOK provider vault, approval workflows, and more.

Everything is designed so today's single-tenant setup is a lift-and-shift into the multi-tenant product — identity is always the {org, project, group, user} tuple, config is rendered from a Project spec, and every policy is per-project overridable.
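
As a rough illustration of that design, the sketch below models identity as the {org, project, group, user} tuple and resolves a policy by layering per-project overrides on top of defaults. The placeholder org/project values, policy keys, and override table are assumptions made for the example.

```python
from typing import NamedTuple


class Identity(NamedTuple):
    org: str
    project: str
    group: str
    user: str


# Hypothetical placeholder org/project for a single-tenant deployment.
SINGLE_TENANT = ("default", "default")

# Hypothetical default policy and per-project overrides.
DEFAULT_POLICY = {"tokens_per_minute": 1000, "pii_masking": True}
PROJECT_OVERRIDES = {
    ("acme", "search"): {"tokens_per_minute": 5000},
}


def single_tenant_identity(group: str, user: str) -> Identity:
    """Single-tenant requests still carry the full tuple, with fixed org/project."""
    return Identity(*SINGLE_TENANT, group, user)


def effective_policy(identity: Identity) -> dict:
    """Defaults first, then any overrides for the identity's project."""
    overrides = PROJECT_OVERRIDES.get((identity.org, identity.project), {})
    return {**DEFAULT_POLICY, **overrides}
```

Because the single-tenant case already uses the full tuple, moving to multiple tenants changes the values of org and project rather than the shape of the identity.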

Self-hosted AI gateway on Higress — Infrastructure-as-Code, on-prem, no cloud lock-in.