Azure DevOps Services
Azure DevOps Services uses multi-tenancy to reduce costs and improve performance. This design can cause performance issues or outages when other users of shared resources have spikes in consumption. To help prevent this, Azure DevOps limits the resources each user can consume, and the number of requests they can make to certain commands. If you exceed these limits, future requests can be delayed or blocked.
Learn more in Git limits and Best practices to avoid hitting rate limits.
Global consumption limit
Azure DevOps has a global consumption limit that delays requests from individual users when shared resources are at risk of being overwhelmed, which helps avoid outages. Individual users typically experience delayed requests only when one of the following conditions occurs:
- One of their shared resources is at risk of being overwhelmed.
- Their personal usage exceeds 200 times the consumption of a typical user within a sliding five-minute window.
The delay depends on the user's sustained level of consumption. Delays range from a few milliseconds per request up to 30 seconds. When consumption drops to zero or the resource isn't overwhelmed, the delays stop within five minutes. If consumption stays high, delays can continue indefinitely to protect the resource.
When a user's request is delayed by a significant amount, the user receives an email and sees a warning banner in the web portal. For the build service account and other identities without an email address, members of the Project Collection Administrators group receive the email. For more information, see Usage monitoring.
When an individual user's requests are blocked, the user receives responses with HTTP code 429 (too many requests) and a message similar to the following:
TF400733: The request has been canceled: Request was blocked due to exceeding usage of resource <resource name> in namespace <namespace ID>.
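As a rough illustration, a client could recognize a blocked request by checking for status 429 and then trying to pull the resource name and namespace ID out of the message. This is a minimal sketch: the function name `describe_block` and the regular expression are assumptions based only on the sample message above, and the real wording can differ.

```python
import re

# Pattern derived from the sample TF400733 message above (an assumption;
# the service only promises a message "similar to" that sample).
_BLOCKED_PATTERN = re.compile(
    r"TF400733:.*exceeding usage of resource (?P<resource>.+?) "
    r"in namespace (?P<namespace>[^.\s]+)"
)


def describe_block(status_code: int, body: str):
    """Return (resource, namespace) for a blocked request, or None.

    Hypothetical helper: returns None when the response isn't a 429, and
    ("unknown", "unknown") when it is a 429 but the message has another shape.
    """
    if status_code != 429:
        return None
    match = _BLOCKED_PATTERN.search(body)
    if match is None:
        return ("unknown", "unknown")
    return (match.group("resource"), match.group("namespace"))
```

Treat the extracted values as text for logs or for a human reader, not as stable identifiers.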
Azure DevOps throughput units
Azure DevOps users consume many shared resources, and the level of consumption depends on factors like:
- Uploading a large number of files to version control, which puts load on databases and storage accounts.
- Running complex work item queries, which increases database load based on the number of work items being searched.
- Running builds, which download files from version control and produce log output.
- General operations, which consume CPU and memory across different parts of the service.
To measure this activity, Azure DevOps expresses resource consumption in Azure DevOps throughput units (TSTUs). A TSTU is an abstract unit of load that represents a blend of different resources, including:
- Database usage—measured primarily through Azure SQL Database DTUs.
- Compute usage—CPU, memory, and I/O from application tiers and job agents.
- Storage usage—Azure Storage bandwidth.
Note
TSTUs are intentionally abstract. They aggregate resource consumption across compute, storage, and database layers within a distributed infrastructure. The underlying metrics (CPU, memory, I/O, DTUs) aren't directly exposed or meaningful on their own. TSTUs provide a unified way to represent load, making it easier to manage and monitor usage without exposing the full complexity of individual resource components. You can't calculate usage in TSTUs for an action with a formula, but you can see how many TSTUs an operation consumes on the usage monitoring page. Some operations, like work item queries, vary in consumption as your organization grows and changes, so you might need to benchmark periodically to stay accurate.
Currently, TSTUs focus primarily on Azure SQL Database DTUs because databases are the shared resource most likely to be overwhelmed by excessive consumption.
- One TSTU represents the average load generated by a typical Azure DevOps user over five minutes.
- Normal user activity can generate spikes of 10 TSTUs or fewer per five minutes.
- Larger but less frequent spikes can reach up to 100 TSTUs.
- The global limit is 200 TSTUs within any sliding five-minute window.
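The sliding-window arithmetic behind these figures can be sketched as follows. This is only an illustration of a five-minute sliding sum compared against 200 TSTUs: the class name `SlidingWindowUsage` is hypothetical, the samples are assumed to be TSTU values you noted yourself (for example, from the usage monitoring page), and the service's actual accounting isn't exposed.

```python
from collections import deque

WINDOW_SECONDS = 5 * 60    # sliding five-minute window, from the text
GLOBAL_LIMIT_TSTUS = 200   # global limit, from the text


class SlidingWindowUsage:
    """Sums TSTU samples that fall inside the last five minutes.

    Assumes samples are recorded in non-decreasing timestamp order.
    """

    def __init__(self) -> None:
        self._samples = deque()  # entries are (timestamp_seconds, tstus)
        self._total = 0.0

    def record(self, timestamp: float, tstus: float) -> None:
        """Add one sample and drop any that have left the window."""
        self._samples.append((timestamp, tstus))
        self._total += tstus
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # A sample expires once it is five minutes old or older.
        while self._samples and self._samples[0][0] <= now - WINDOW_SECONDS:
            _, expired = self._samples.popleft()
            self._total -= expired

    def total(self, now: float) -> float:
        """Return the TSTUs consumed within the window ending at `now`."""
        self._evict(now)
        return self._total

    def over_limit(self, now: float) -> bool:
        """Return True when windowed usage exceeds the 200 TSTU limit."""
        return self.total(now) > GLOBAL_LIMIT_TSTUS
```

For example, samples of 150, 40, and 20 TSTUs recorded within two minutes of each other total 210 TSTUs and exceed the limit. Once the first sample is five minutes old, the windowed total falls back to 60.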
Best practices
- Honor the Retry-After header: If you receive it in a response, wait the specified time before sending another request. The response still returns HTTP 200, so retry logic isn't required.
- Monitor X-RateLimit headers: If available, track X-RateLimit-Remaining and X-RateLimit-Limit to approximate how quickly you're approaching the threshold. This lets your client smooth out request bursts and avoid enforced delays.
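One way a client might apply both practices is to compute a pause before its next request. The sketch below is an assumption-laden example, not a prescribed algorithm: the function name `suggested_pause`, the 20% low-water mark, and the five-second cap are all hypothetical tuning choices, and only a numeric Retry-After value is handled.

```python
def suggested_pause(headers, low_water_fraction=0.2, max_pause=5.0):
    """Return how many seconds to wait before the next request.

    `headers` is any mapping of response header names to string values.
    `low_water_fraction` and `max_pause` are illustrative defaults only.
    """
    # Header names are case-insensitive, so normalize the keys first.
    normalized = {name.lower(): value for name, value in headers.items()}

    # 1. Honor Retry-After when it carries a number of seconds.
    retry_after = normalized.get("retry-after")
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Non-numeric value; fall through to the budget check.

    # 2. Otherwise, slow down as the remaining TSTU budget shrinks.
    try:
        remaining = float(normalized["x-ratelimit-remaining"])
        limit = float(normalized["x-ratelimit-limit"])
    except (KeyError, ValueError):
        return 0.0  # Headers absent or unreadable: no pause suggested.
    if limit <= 0:
        return 0.0

    fraction_left = remaining / limit
    if fraction_left >= low_water_fraction:
        return 0.0
    # Scale linearly from 0 seconds at the low-water mark to max_pause at 0.
    return max_pause * (1.0 - fraction_left / low_water_fraction)
```

A caller would typically pass the result to `time.sleep` between requests. The linear ramp is just one simple way to smooth bursts; any schedule that backs off as the remaining budget drops serves the same purpose.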
Note
Identities used by tools and applications to integrate with Azure DevOps can occasionally need higher rate and usage limits beyond the allowed consumption limit. Increase these limits by assigning the Basic + Test Plans access level to the identities your application uses. After you no longer need higher rate limits, revert to the previous access level. You're charged for the Basic + Test Plans access level only for the duration assigned to the identity. Identities already assigned a Visual Studio Enterprise subscription can't be assigned the Basic + Test Plans access level until you remove the subscription.
Pipelines
Rate limiting works the same way for Azure Pipelines. Each pipeline is an individual entity, and its resource consumption is tracked separately. Even if build agents are self-hosted, they generate load by cloning and sending logs.
There's a 200 TSTU limit for each pipeline in a sliding five-minute window. This limit matches the global consumption limit for users. If rate limiting delays or blocks a pipeline, you see a message in the attached logs.
API client experience
When requests are delayed or blocked, Azure DevOps returns response headers to help API clients react. While not fully standardized, these headers are broadly in line with other popular services.
The following table lists the available headers and what they mean.
Except for X-RateLimit-Delay, all these headers are sent before requests start getting delayed. This design lets clients proactively slow down their rate of requests.
| Header name | Description |
|---|---|
| Retry-After | How long to wait before sending your next request. Units: seconds. |
| X-RateLimit-Resource | A custom header that shows the service and type of threshold reached. Threshold types and service names can vary over time and without warning. Display this string to a human, but don't rely on it for computation. |
| X-RateLimit-Delay | How long the request is delayed. Units: seconds with up to three decimal places (milliseconds). |
| X-RateLimit-Limit | Total number of TSTUs allowed before delays are imposed. |
| X-RateLimit-Remaining | Number of TSTUs remaining before delays start. If requests are already delayed or blocked, it's 0. |
| X-RateLimit-Reset | Time when, if all resource consumption stops immediately, tracked usage returns to 0 TSTUs. Expressed in Unix epoch time. |
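To make the units in the table concrete, the following sketch reads the numeric headers into a small record and converts the reset time from Unix epoch seconds to a UTC timestamp. The names `RateLimitInfo` and `parse_rate_limit_headers` are hypothetical, and the code assumes each header, when present, holds a plain number.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    delay_seconds: Optional[float]    # X-RateLimit-Delay
    limit_tstus: Optional[float]      # X-RateLimit-Limit
    remaining_tstus: Optional[float]  # X-RateLimit-Remaining
    reset_at: Optional[datetime]      # X-RateLimit-Reset, as UTC


def _to_float(value) -> Optional[float]:
    """Convert a header value to float, or None if missing or malformed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    """Extract the rate-limit headers from a response header mapping."""
    # Header names are case-insensitive, so normalize the keys first.
    h = {name.lower(): value for name, value in headers.items()}
    reset_epoch = _to_float(h.get("x-ratelimit-reset"))
    return RateLimitInfo(
        delay_seconds=_to_float(h.get("x-ratelimit-delay")),
        limit_tstus=_to_float(h.get("x-ratelimit-limit")),
        remaining_tstus=_to_float(h.get("x-ratelimit-remaining")),
        reset_at=(
            datetime.fromtimestamp(reset_epoch, tz=timezone.utc)
            if reset_epoch is not None
            else None
        ),
    )
```

Missing headers come back as `None` rather than zero, so a caller can tell "no delay reported" apart from "a delay of 0 seconds".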
Work tracking, process, & project limits
Azure DevOps limits the number of projects you can have in an organization and the number of teams you can have in each project. There are also limits for work items, queries, backlogs, boards, dashboards, and more. For more information, see Work tracking, process, and project limits.
Wiki
In addition to the usual repository limits, a wiki file in a project can be up to 25 MB.
Service connections
There aren't any per-project limits on creating service connections. However, limits might be imposed through Microsoft Entra ID. For more information, see the following articles:
- Microsoft Entra service limits and restrictions
- Azure subscription and service limits, quotas, and constraints