Self-hosted Layer-7 AI gateway for Azure — priority queuing, async orchestration, and per-user governance inside your own VNET.
TL;DR
- Run locally: `git clone … && dotnet run --project src/SimpleL7Proxy`
- Deploy to ACA: `./.azure/setup.sh && azd provision && ./.azure/deploy.sh`
- Use async mode for long LLM calls (>60 s); see AsyncOperation.md
Incoming requests are priority-queued and dispatched to healthy backends; degraded backends are isolated automatically.
- Priority queuing — routes high-priority users ahead of batch traffic.
- Per-user validation — blocks callers whose model or header values aren't in their allowlist.
- Entra App ID gating — unknown app IDs rejected at the gate; no backend hit.
- Circuit breaker — progressive back-off; auto-recovery when backends respond.
- Async orchestration — blob + Service Bus hand-off for calls that exceed the sync timeout.
- Hot-reload config — allowlists, routing rules, and profiles update without restart.
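
To make the priority queuing item above concrete, here is a minimal sketch in Python. It is not the proxy's actual implementation (which is a .NET service); the names `QueuedRequest` and `PriorityDispatcher`, and the convention that a lower number means higher priority, are assumptions made for illustration only.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedRequest:
    # Assumed convention: a lower number is dispatched sooner.
    priority: int
    # Arrival counter; keeps first-in, first-out order within one priority level.
    sequence: int
    user_id: str = field(compare=False)
    payload: str = field(compare=False)


class PriorityDispatcher:
    """Hypothetical in-memory queue that releases high-priority requests first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def enqueue(self, user_id, payload, priority):
        item = QueuedRequest(priority, next(self._counter), user_id, payload)
        heapq.heappush(self._heap, item)

    def dequeue(self):
        # Returns None when the queue is empty instead of raising.
        return heapq.heappop(self._heap) if self._heap else None

    def __len__(self):
        return len(self._heap)


def demo():
    dispatcher = PriorityDispatcher()
    dispatcher.enqueue("batch-job", "summarize archive", priority=3)
    dispatcher.enqueue("interactive-user", "chat turn", priority=1)
    dispatcher.enqueue("batch-job", "embed corpus", priority=3)
    order = []
    while len(dispatcher):
        order.append(dispatcher.dequeue().payload)
    return order


if __name__ == "__main__":
    print(demo())
```

In this sketch the interactive request is dispatched before both batch requests even though it arrived second, and the two batch requests keep their arrival order.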
→ Full architecture and use-case analysis
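
The circuit breaker behavior listed above (progressive back-off, then automatic recovery once a backend responds) can be sketched roughly as follows. This is an illustrative Python model, not the proxy's code: the class name, the failure threshold, the doubling schedule, and the delay values are all assumptions. Time is passed in explicitly so the behavior is easy to trace.

```python
class CircuitBreaker:
    """Hypothetical per-backend breaker with a doubling pause after each trip."""

    def __init__(self, failure_threshold=3, base_delay=5.0, max_delay=60.0):
        self.failure_threshold = failure_threshold  # failures before the first trip
        self.base_delay = base_delay                # first pause, in seconds
        self.max_delay = max_delay                  # upper bound on any pause
        self.consecutive_failures = 0
        self.trips = 0
        self.open_until = 0.0                       # backend is skipped before this time

    def allows_request(self, now):
        # Once the pause has elapsed, traffic is allowed again as a probe.
        return now >= self.open_until

    def record_success(self):
        # Auto-recovery: one successful response clears all failure state.
        self.consecutive_failures = 0
        self.trips = 0
        self.open_until = 0.0

    def record_failure(self, now):
        """Record a failure; return the pause applied (0.0 if the breaker stays closed)."""
        self.consecutive_failures += 1
        if self.consecutive_failures < self.failure_threshold:
            return 0.0
        # Progressive back-off: each trip doubles the pause, up to max_delay.
        delay = min(self.base_delay * (2 ** self.trips), self.max_delay)
        self.open_until = now + delay
        self.trips += 1
        return delay


if __name__ == "__main__":
    breaker = CircuitBreaker(failure_threshold=2, base_delay=5.0, max_delay=12.0)
    for t in (0.0, 1.0, 6.0, 16.0):
        print(t, breaker.record_failure(t))
```

A failed probe after a pause trips the breaker again with a longer pause, while a single success returns the backend to normal rotation.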
- .NET 10 SDK
- Docker (container builds)
- Azure Developer CLI (azd) (cloud deployment)
- Azure subscription with Container Apps; optionally AI Foundry / APIM
Local (2 commands):

    git clone https://github.com/your-org/SimpleL7Proxy.git
    dotnet run --project src/SimpleL7Proxy

Azure Container Apps (Windows):

    .\.azure\setup.ps1
    azd provision
    .\.azure\deploy.ps1

Azure Container Apps (Linux / macOS):

    chmod +x .azure/setup.sh .azure/deploy.sh
    ./.azure/setup.sh && azd provision && ./.azure/deploy.sh

See Development & Testing for local mock backends.
See Container Deployment for VNET and high-performance variants.
New here? Start with Quick Start → Overview → Advanced Configuration.
| Topic | Document |
|---|---|
| Overview & Architecture | docs/OVERVIEW.md |
| Backend Host Configuration | docs/BACKEND_HOSTS.md |
| Load Balancing | docs/LOAD_BALANCING.md |
| Priority Queuing & User Governance | docs/ADVANCED_CONFIGURATION.md |
| Circuit Breaker | docs/CIRCUIT_BREAKER.md |
| Health Checking | docs/HEALTH_CHECKING.md |
| Async Operations | docs/AsyncOperation.md |
| User Profiles | docs/USER_PROFILES.md |
| Request Validation | docs/REQUEST_VALIDATION.md |
| Observability & Telemetry | docs/OBSERVABILITY.md |
| Security | docs/SECURITY.md |
| Configuration Settings | docs/CONFIGURATION_SETTINGS.md |
| Azure App Configuration | docs/AZURE_APP_CONFIGURATION.md |
| Environment Variables | docs/ENVIRONMENT_VARIABLES.md |
| AI Foundry Integration | docs/AI_FOUNDRY_INTEGRATION.md |
| APIM Policy | APIM-Policy/readme.md |
| Container Deployment | docs/CONTAINER_DEPLOYMENT.md |
| Development & Testing | docs/DEVELOPMENT.md |
| Response Codes | docs/RESPONSE_CODES.md |
Issues and pull requests are welcome. Open an issue first to discuss significant changes before submitting a PR.
MIT — see LICENSE. Copyright (c) Microsoft Corporation.
