How to Restrict GPT-5 Access to One Team with Virtual Keys
GPT-5 is one of the most capable and most expensive models available through the OpenAI API, and uncontrolled access across an entire organization produces unpredictable token spend and inconsistent usage patterns. Bifrost, the open-source AI gateway built in Go by Maxim AI and free to self-host, is the best overall choice for enterprise teams that need to restrict GPT-5 access to a single team while keeping a unified API across every provider. Using virtual keys, you can scope GPT-5 to one team, attach a budget and rate limits, and reject any request that comes from outside that boundary. This post covers how to restrict GPT-5 access with virtual keys, step by step.
What Are Virtual Keys in Bifrost
Virtual keys are the primary governance entity in Bifrost. A virtual key is a credential that applications use to authenticate, and it carries its own access permissions, budgets, and rate limits. Instead of distributing raw provider API keys to every team, you issue virtual keys that define exactly which models and providers each consumer is allowed to use.
Each virtual key supports several controls that matter for model access:
- Access control: model and provider filtering, so a key can be limited to a specific set of models such as GPT-5.
- Cost management: an independent budget, checked alongside any attached team or customer budget.
- Rate limiting: token-based and request-based throttling at the virtual key level.
- Exclusive attachment: a virtual key belongs to one team, one customer, or neither, but never both at once.
- Active or inactive status: enable or disable a key instantly without rotating provider credentials.
Because the gateway resolves model access at the virtual key layer, restricting GPT-5 to a single team becomes a configuration change rather than a code change in every downstream service.
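To make the controls above concrete, here is a minimal sketch of a virtual key as a plain data structure. It is an illustrative model only, not Bifrost's actual schema: the class name, the `customer_id` field, and the flattened limit fields are assumptions chosen for readability.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VirtualKey:
    """Illustrative model of the controls listed above (not Bifrost's schema)."""

    name: str
    allowed_models: list[str] = field(default_factory=list)
    budget_max_limit: Optional[float] = None   # dollars per reset period
    token_max_limit: Optional[int] = None      # tokens per reset period
    request_max_limit: Optional[int] = None    # requests per reset period
    team_id: Optional[str] = None
    customer_id: Optional[str] = None          # hypothetical field name
    is_active: bool = True

    def __post_init__(self) -> None:
        # Exclusive attachment: one team, one customer, or neither.
        if self.team_id is not None and self.customer_id is not None:
            raise ValueError("a virtual key cannot belong to a team and a customer at once")

    def can_call(self, model: str) -> bool:
        # An inactive key is refused regardless of its model list.
        return self.is_active and model in self.allowed_models
```

In this model, switching `is_active` to `False` blocks the key without touching any provider credential, which mirrors the active/inactive control described above.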
Why Restrict GPT-5 Access to a Single Team
Restricting GPT-5 access to one team is a cost and governance decision. GPT-5 has driven a large increase in coding, agent-building, and reasoning workloads since its release, and reporting on its enterprise rollout noted that the economics of running frontier models remain demanding for both providers and customers. When every team can call the most expensive model freely, spend is hard to forecast and hard to attribute.
Scoping GPT-5 to a single team addresses several concrete problems:
- Predictable spend: only one team can incur GPT-5 token costs, and that team operates under a fixed budget.
- Clear attribution: usage maps to a known group, which simplifies internal chargeback and reporting.
- Reduced blast radius: a misconfigured client elsewhere in the organization cannot route traffic to GPT-5 by accident.
- Staged rollout: a pilot team can validate GPT-5 in production before broader access is granted.
This pattern is common when a frontier model is approved for one use case, such as advanced research or agentic coding, while the rest of the organization continues to use lower-cost models. Bifrost makes the boundary explicit and enforceable. For a broader view of access and cost controls, the Bifrost governance overview describes how virtual keys, teams, and budgets fit together.
How to Restrict GPT-5 Access with Virtual Keys in Bifrost
To restrict GPT-5 access to a single team, create a team, then create a virtual key that allows only GPT-5 from OpenAI and attach it to that team. The steps below use the Bifrost governance API, and the same actions are available in the web UI.
Step 1: Create the team
A team groups virtual keys and supports department-level budget management. Create the team that will own GPT-5 access:
curl -X POST http://localhost:8080/api/governance/teams \
  -H "Content-Type: application/json" \
  -d '{
    "name": "AI Research Team",
    "budget": { "max_limit": 500.00, "reset_duration": "1M" }
  }'
Teams support independent budgets but do not carry rate limits; rate limiting is applied at the virtual key level. See the budget and limits documentation for the full set of options.
Step 2: Create a virtual key that allows only GPT-5
Create a virtual key with a provider configuration that lists GPT-5 as the only allowed model, then attach it to the team using team_id. The allowed_models array is what restricts the key to GPT-5:
curl -X POST http://localhost:8080/api/governance/virtual-keys \
  -H "Content-Type: application/json" \
  -d '{
    "name": "GPT-5 Research Key",
    "description": "GPT-5 access scoped to the AI Research Team",
    "provider_configs": [
      {
        "provider": "openai",
        "weight": 1.0,
        "allowed_models": ["gpt-5"]
      }
    ],
    "team_id": "team-ai-research-001",
    "is_active": true
  }'
With this configuration, the virtual key can only call GPT-5 through OpenAI. A request for any other model, or any other provider, is rejected. As long as no other virtual key lists gpt-5 in its allowed_models, no other team can reach the model.
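The allow-list behavior described here can be sketched as a small check. This is a simplified model, not Bifrost's implementation: the function name is hypothetical, and it assumes that a provider config with no allowed_models entry permits nothing.

```python
def is_request_allowed(provider_configs: list[dict], provider: str, model: str) -> bool:
    """Return True only if some provider config permits this provider/model pair."""
    for config in provider_configs:
        # Assumption: a config without allowed_models permits no models.
        if config.get("provider") == provider and model in config.get("allowed_models", []):
            return True
    return False


# Same provider_configs as the GPT-5 Research Key above.
research_key_configs = [
    {"provider": "openai", "weight": 1.0, "allowed_models": ["gpt-5"]},
]

print(is_request_allowed(research_key_configs, "openai", "gpt-5"))        # True
print(is_request_allowed(research_key_configs, "openai", "gpt-4o-mini"))  # False
print(is_request_allowed(research_key_configs, "anthropic", "gpt-5"))     # False
```

Both the provider and the model must match an entry, which is why a GPT-5 request routed to a different provider is refused just like a request for a different model.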
Step 3: Restrict the key to specific provider credentials (optional)
If you maintain separate OpenAI API keys per cost center, you can pin the virtual key to specific provider credentials with key_ids. This ties GPT-5 usage to a designated billing key:
{
  "provider_configs": [
    {
      "provider": "openai",
      "weight": 1.0,
      "allowed_models": ["gpt-5"],
      "key_ids": ["openai-research-key"]
    }
  ]
}
When key_ids is set, the virtual key can use only those provider keys. An empty array or an omitted field denies all keys, and ["*"] allows every configured key. This level of control over key management keeps GPT-5 spend on a single, auditable credential.
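The three key_ids cases in the paragraph above can be written out as a short function. This is a sketch of the stated rules only; the function name and the second key name in the example are hypothetical, and Bifrost's own resolution logic may differ in detail.

```python
from typing import Optional


def usable_provider_keys(key_ids: Optional[list[str]], configured: list[str]) -> list[str]:
    """Resolve which configured provider keys a virtual key may use."""
    if not key_ids:
        # Omitted (None) or empty list: no provider key is usable.
        return []
    if key_ids == ["*"]:
        # Wildcard: every configured provider key is usable.
        return list(configured)
    # Otherwise only the explicitly listed keys that actually exist.
    return [key for key in configured if key in key_ids]


configured_keys = ["openai-research-key", "openai-general-key"]  # second name is hypothetical

print(usable_provider_keys(["openai-research-key"], configured_keys))  # ['openai-research-key']
print(usable_provider_keys(["*"], configured_keys))                    # both keys
print(usable_provider_keys([], configured_keys))                       # []
print(usable_provider_keys(None, configured_keys))                     # []
```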
Adding Budgets and Rate Limits to the GPT-5 Key
A model restriction controls which model a team can call. A budget and rate limit control how much. Attaching both to the GPT-5 virtual key turns a binary access grant into a bounded one. Budgets and rate limits are configured directly on the virtual key; the request below sets them when the key is created, so it stands in for the Step 2 request rather than following it:
curl -X POST http://localhost:8080/api/governance/virtual-keys \
  -H "Content-Type: application/json" \
  -d '{
    "name": "GPT-5 Research Key",
    "provider_configs": [
      { "provider": "openai", "weight": 1.0, "allowed_models": ["gpt-5"] }
    ],
    "team_id": "team-ai-research-001",
    "budget": { "max_limit": 300.00, "reset_duration": "1M" },
    "rate_limit": {
      "token_max_limit": 200000,
      "token_reset_duration": "1h",
      "request_max_limit": 500,
      "request_reset_duration": "1m"
    },
    "is_active": true
  }'
This configuration applies three independent controls to GPT-5 access:
- Budget: a monthly dollar cap that resets on the chosen duration (1m, 1h, 1d, 1w, 1M, or 1Y).
- Token limit: a ceiling on tokens consumed per period.
- Request limit: a ceiling on requests per period.
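One way to picture the token and request limits is as counters that reset each period. The sketch below assumes a simple fixed-window scheme; the source does not say which algorithm Bifrost uses, so treat the class and its behavior as an illustration rather than a description of the gateway.

```python
import time
from typing import Callable


class FixedWindowLimit:
    """Counts usage in fixed windows; a simplified stand-in, not Bifrost's algorithm."""

    def __init__(self, max_limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_limit = max_limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.window_start = clock()
        self.used = 0

    def try_consume(self, amount: int = 1) -> bool:
        now = self.clock()
        # Start a fresh window once the current one has elapsed.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.used = 0
        # Refuse anything that would push usage past the ceiling.
        if self.used + amount > self.max_limit:
            return False
        self.used += amount
        return True


# Mirrors the example values: 500 requests per minute, 200,000 tokens per hour.
request_limit = FixedWindowLimit(max_limit=500, window_seconds=60)
token_limit = FixedWindowLimit(max_limit=200_000, window_seconds=3600)
```

The two counters are independent, so a burst of small requests can hit the request ceiling while plenty of token allowance remains, and a few very large prompts can do the opposite.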
The virtual key budget is checked together with the team budget, so the GPT-5 key cannot exceed either its own cap or the AI Research Team's department budget. Hierarchical cost control across the virtual key, team, and customer levels is part of how the Bifrost governance model keeps frontier-model spend bounded.
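A short worked example shows how the two budget levels interact, using the $300 key cap and $500 team cap from the requests above. The spend figures and the function name are made up for illustration, and the sketch assumes a request is admitted when it fits at or under every cap; how Bifrost estimates cost and treats the boundary is not specified here.

```python
def within_budgets(request_cost: float, spent: dict[str, float], limits: dict[str, float]) -> bool:
    """True if the request fits under every budget level that applies."""
    return all(spent.get(level, 0.0) + request_cost <= cap for level, cap in limits.items())


limits = {"virtual_key": 300.00, "team": 500.00}  # caps from the examples above

# Hypothetical month-to-date spend: $299.50 on the key, $310.00 across the team.
spent = {"virtual_key": 299.50, "team": 310.00}

print(within_budgets(0.40, spent, limits))  # True: fits under both caps
print(within_budgets(0.75, spent, limits))  # False: the key's own cap would be exceeded
```

The same check fails if other keys on the team have already consumed the team budget, even when the GPT-5 key itself has barely been used.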
Enforcing the Restriction Across Every Request
Creating a scoped virtual key is only effective if every request must present a valid key. Bifrost can require a virtual key on all inference traffic, which closes the path where a client calls the gateway with no key at all. Enable enforcement in the client configuration:
curl -X PUT http://localhost:8080/api/config \
  -H "Content-Type: application/json" \
  -d '{ "client_config": { "enforce_auth_on_inference": true } }'
When enforcement is on, any request that does not present a virtual key, for example in the x-bf-vk header, is rejected. Applications then send their virtual key on each call. The example below passes it in the OpenAI-style Authorization header instead:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer <GPT5_VIRTUAL_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-5", "messages": [{"role": "user", "content": "..."}]}'
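From application code, the same call might look like the following sketch, which uses only the Python standard library. The function name and the BIFROST_VIRTUAL_KEY environment variable are assumptions for illustration; the URL, headers, and body match the curl example above.

```python
import json
import urllib.request


def build_chat_request(base_url: str, virtual_key: str, prompt: str) -> urllib.request.Request:
    """Build a chat completion request that authenticates with a virtual key."""
    body = json.dumps({
        "model": "gpt-5",
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {virtual_key}",
            "Content-Type": "application/json",
        },
    )


# Sending the request (requires a running gateway), with the key read from a
# hypothetical environment variable rather than hard-coded:
#
#   import os
#   req = build_chat_request("http://localhost:8080", os.environ["BIFROST_VIRTUAL_KEY"], "Hello")
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```

Keeping the virtual key out of source code means a key can be disabled or replaced at the gateway without redeploying the application.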
A second behavior reinforces the boundary. When a client lists available models with a virtual key, Bifrost returns only the providers and models that key is allowed to use. Teams without the GPT-5 key never see GPT-5 in the model list, which reduces accidental calls to a model they cannot use. With fewer rejected requests of that kind, error-rate metrics stay meaningful across the supported providers.
Common Questions About Restricting GPT-5 Access
How does Bifrost restrict GPT-5 to one team?
Bifrost restricts GPT-5 to one team through a virtual key whose allowed_models list contains only gpt-5, attached to that team with team_id. No other virtual key includes GPT-5, so no other team can reach the model. Access resolves at the gateway, so the restriction holds regardless of which application sends the request.
Can I set a spending cap on GPT-5 usage?
Yes. Attach a budget to the GPT-5 virtual key with a max_limit and a reset_duration. The budget is checked alongside the team budget, so usage stops when either cap is reached. Token and request rate limits can be applied on the same key for finer control.
Can a team use GPT-5 alongside other approved models?
Yes. Add the additional models to the allowed_models array on that team's virtual key, for example ["gpt-5", "gpt-4o-mini"]. The key then permits exactly those models and rejects everything else.
Does this require self-hosting?
Bifrost is open source and available on GitHub for self-hosting, and governance features including virtual keys, teams, and budgets are part of the gateway. For in-VPC deployments, role-based access control, and immutable audit logs for compliance, the enterprise tier extends the same model.
Getting Started with GPT-5 Access Control in Bifrost
Restricting GPT-5 access to a single team with virtual keys gives platform teams a precise, enforceable boundary around frontier-model spend. You define the allowed model on a virtual key, attach it to one team, add a budget and rate limits, and enforce authentication so every request is governed. Because Bifrost sits in front of every provider through a single OpenAI-compatible API, the same approach extends to any model, any provider, and any number of teams as your access policy grows.
To see how Bifrost can centralize GPT-5 access control and cost governance across your AI infrastructure, book a demo with the Bifrost team.