2024 Cloud Service Provider API Report

The definitive annual industry report
by: Dr. Paul M. Cray


As part of our API Directory operations, in 2023 APIContext made more than 650 million API calls to more than 10,000 different API endpoints from more than 100 geographically diverse cloud data center locations across major public cloud service providers including AWS, Azure, Google, and IBM. This proprietary dataset gives us unique insights into the cloud API landscape. 

In this report, we analyze the API quality data generated by multiple APIContext services, including the APImetrics platform, our API Directory, and our Supplier Index. Now in its sixth year, this report provides an unbiased, industry-wide baseline for API quality scoring.

As the volume of API calls across the internet continues to grow, our data continues to get both broader and deeper. To create this analysis, we leverage aggregated, anonymized data from leading API services, including those from infrastructure providers, financial services institutions, social networks, search engines, and other key services.

2024 Cloud Service Provider Quality Report Key Findings


  • API availability is getting worse. In 2022, 18% of observed API services achieved 99.99% service availability. In 2023, this dropped to only 7%.
  • AWS US-East (N. Virginia) has been considered the standard bearer for API performance, because so many applications start there. However, after four years of industry-leading Time to Connect, the AWS US-East data center has dropped off precipitously. In 2022, this location averaged 1.23 ms Time to Connect. In 2023, this increased to 2.50 ms. The new fastest cloud location for Time to Connect is AWS Asia-Pacific Northeast 3 (Osaka, Japan) with an average time of 2.28 ms.
    • Four of the top five locations for Time to Connect are AWS locations (in Asia, Europe, North America and Oceania), which speaks to the continuing performance of this cloud service.
    • Rounding out the top five is Google Europe-West 3 (Frankfurt, Germany), which was fourth-fastest for Time to Connect at 2.81 ms.
  • Despite worse availability, other quality metrics improved, and on balance, observed API services had good overall performance, as evidenced by CASC scores above 8.00.
    • 68% of providers had a CASC score between 8.00 and 8.99, indicating good, healthy performance.
    • 32% of providers had a CASC score between 9.00 and 9.99, indicating exceptional performance.
  • IBM is the fastest cloud by Total Time, averaging 443 ms. AWS was second with 450 ms, and Azure landed at the bottom with 529 ms.
  • AWS was (once again) totally dominant on DNS Time with an average for the year of 2 ms; no other cloud managed better than 13 ms, and Azure, at 20 ms, was ten times slower than AWS.

2023 Executive Summary

This is the sixth year of the APIContext Cloud API Performance Report. With years of historical data, we can now start to identify longer term trends in addition to our annual analysis. 

While cloud services become even more important to our personal and professional lives, and now comprise the vast majority of all internet traffic, we see cloud providers struggling to maintain speed and quality in all circumstances.

Baseline performance continues to be good, but corner cases and edge cases with reduced quality are more prevalent. This is at a time when user expectations of reliability and availability have never been higher. Here are some key conclusions.

Continued degradation in cloud API latency

We can now say definitively that the 2020s have seen an increase in latency. This is likely due to two trends that accelerated during the global pandemic, and continued across 2023: Remote Work and APIs Eating Software. 

Remote work requirements have increased cloud loads, and have forced cloud providers to expand edge infrastructure. At the same time, both public and internal APIs have continued to expand, running more applications than ever. 

Although there have been year-on-year increases in capital expenditure by the main cloud service providers (an average of ~30% per year in the 2017-22 period), demand for cloud computing is outstripping supply, in large part because of the phenomenal growth of artificial intelligence and machine learning in 2023 and its associated requirements for immense amounts of compute. This has put pressure on cloud services and may lead to some degradation in network quality.

AWS’s continuing success in improving DNS Time indicates that the other clouds still have some catching up to do in optimizing the performance of different latency components.

Quality is stable

Most services are rated as healthy or better (a CASC score of 8.00+), with no observed services raising quality concerns. Overall quality is similar to 2020, 2021, and 2022, which suggests that improvements in performance may have plateaued. There is no excuse for not having a highly stable and consistently performant API. You can’t blame your infrastructure provider for everything; only for network performance. Once that has been optimized with the most suitable cloud for your API service, the ball is in your court to offer the fastest and most reliable service possible.


Five 9s is a tough target, but we believe that four 9s (99.99%) is a goal that should be achievable for most APIs. Only 7% of the services we studied managed to reach this level, down from 18% in 2022 but up from 6% in 2021.

PagerDuty was again the API service with the highest availability and the only one to achieve four 9s in both 2022 and 2023. No API had previously reached the 99.99% mark two years in a row, making this an exceptional performance by PagerDuty. This is clearly an area of future focus for quality improvements across API services.

Cloud performance varies

We see significant differences in performance between clouds. In 2023, Azure was consistently >75 ms slower globally than AWS. In an API-first economy where every millisecond increasingly counts, can you and your customers afford to be using a slow cloud? A typical phone app operation such as checking in for a flight might use 6–9 APIs or more, which means >400 ms of extra latency owing to cloud choice alone, creating friction for the user.
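
The arithmetic behind that >400 ms figure is a simple back-of-the-envelope calculation using the figures above (the call count is the low end of the quoted 6–9 range):

```python
# Latency penalty from cloud choice for a multi-API operation.
per_call_penalty_ms = 75   # Azure vs. AWS global gap observed in 2023
api_calls = 6              # a flight check-in flow may use 6-9 APIs

total_penalty_ms = per_call_penalty_ms * api_calls
print(f"{total_penalty_ms} ms of extra latency")  # 450 ms
```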

But the crucial thing is to check performance constantly from the cloud and locations you use and, if possible, cross-check with other cloud locations to determine which choice is best for your service and users. 

DNS deterioration

DNS resolution times slowed in two of the four clouds (Google and IBM) and across all regions in 2023 compared to 2022. 

If AWS can have a median DNS Time of 2 ms, so can the other clouds. Individual services can have hundreds of milliseconds of extra latency friction created through non-optimized DNS. In 2024, we want to see DNS Times getting better again across all clouds and regions – and services.

Regional deterioration in latency

The contribution of DNS to Total Time is particularly important when every millisecond of additional latency counts. 

All regions were slower in terms of both DNS Time and Connect Time in 2023 compared to 2022. In the case of Connect Time, South America was slower by 3500%! North America has only improved by 1 ms for DNS Time since 2018, and is no longer the fastest region for that latency component. We recommend that many services look to diversify their hosting locations.

For DNS Time, Europe and South America now tie for fastest region at 8 ms, while East Asia (9 ms) and South Asia (11 ms) are also faster than North America.

Cloud report key recommendation

The API Supply Chain

In 2023, API performance was good across a wide range of popular services. In contrast to 2022, no APIs in the study rated as being of concern. But the problem of the API Supply Chain, as Founder and CEO of ProgrammableWeb John Musser calls it, remains significant. 

There are meaningful geographic differences, such as physical distances across oceans and continents; and cloud performance variations, such as the amount of bandwidth available through fiber optic cables and the capacity of network equipment. DNS lookup times, which have always been a problem, seem to be getting worse, as do Connect Times.

Using an API isn’t just relying on a black box. The API you provide or use exists in a universe of components including their own cloud service, a CDN provider, probably a gateway of their own, a backend server architecture, and potentially a security and identity service. 

Each of those components has its own configuration and cloud dependencies – and a failure could end up costing $200,000 per incident.

DNS is STILL a problem

DNS Lookup Time increased across two clouds and all regions, suggesting there are issues with network infrastructure and configuration for Google and IBM Cloud, particularly in Oceania.

Overall, the average rose to 19 ms in 2023 (10 ms in 2022), with the best DNS Time of 3.4 ms from Box.

In contrast, DocuSign had a DNS Time of 224 ms (up from 199 ms in 2022) and Capital One had a DNS Time of 420 ms (up from 339 ms in 2022). 

Such large differences in DNS Time cost money and add unacceptable friction to the user experience. They can be avoided by optimizing network and cloud configuration and adopting industry best practices. AWS is faster than the other three clouds for DNS and has steadily improved its performance over the past few years. Ensure that your DNS is properly configured and optimized; choosing Azure, Google, or IBM could contribute to ongoing reduced DNS performance for your service, with the additional support costs that entails as well as degraded performance for your users.

Top Achievers

In all of 2023, PagerDuty had 30.0 minutes of measurable downtime on their APIs. To put this performance in perspective, the worst-performing API had more than 8.9 days of downtime. 

At an average rate of 50 calls/second, 8.9 days of downtime would mean nearly 39 million attempted calls lost.
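
Converting downtime into lost calls is straightforward; a minimal sketch of the calculation above (the 50 calls/second rate is the illustrative assumption from the text):

```python
# Downtime-to-lost-calls arithmetic.
downtime_days = 8.9        # worst-performing API in 2023
calls_per_second = 50      # assumed average call rate

lost_calls = downtime_days * 24 * 60 * 60 * calls_per_second
print(f"{lost_calls:,.0f} attempted calls lost")  # ~38.4 million
```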

PagerDuty was also first in 2022, with 21.9 minutes of downtime, so while downtime increased slightly, this was another superb performance. Congratulations to the DevOps team and everyone at PagerDuty involved. They’ve been setting the standard for industry best practices in API availability over the last few years. The DNS host and service host for PagerDuty is AWS, which is consistent with our overall finding that AWS is the most performant cloud.

High-level data for 2023 is provided for free at the APIContext Directory: https://apicontext.com/api-directory. If you’d like to dive deeper into the details, please contact us for licensing access.


API calls were made from 82 data centers around the world using APIContext observer agents running on application servers provided by the cloud computing services of Amazon (AWS), Google, IBM, and Microsoft (Azure). New locations come online and old ones are retired; the 82 cloud locations used here are those that have been in service since we began reporting on the state of the cloud in 2020. For the 2025 report, the number will increase, as a significant number of new locations have recently been added.

The sample sizes for each API are roughly the same and are equivalent to a call from each cloud location made to each endpoint every five minutes throughout the year. 

We logged the data using the APImetrics platform. Latency, pass rates, and quality scores were recorded in the same way for all APIs. For most APIs, data is available for the whole period.

Pass Rates

In calculating the pass rate, we define failures to include the following:

  • 5xx server-side errors
  • Network errors in which no response is returned
  • Content errors where the API did not return the correct content (e.g., an empty JSON body or incorrect data returned)
  • Slow errors in which a response is received after an overly long period
  • Redirect errors in which a 3xx redirect HTTP status code is returned

We ignored call-specific application errors, as well as client-side HTTP 4xx warnings caused by authentication problems such as expired tokens.

If an API call fails, an immediate retry may succeed when the outage is transitory. Even so, our methodology gives a general indication of availability issues.
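
The failure categories above can be sketched as a simple classifier. This is a minimal illustration, not APIContext’s actual implementation: the field names (status_code, error, content_ok, elapsed_ms) and the 10-second slow-call threshold are assumptions.

```python
SLOW_THRESHOLD_MS = 10_000  # assumed cutoff for a "slow error"

def is_failure(result: dict) -> bool:
    """Return True if an API call result counts as a failure."""
    # Network errors: no response was returned at all
    if result.get("error") == "network":
        return True
    status = result.get("status_code")
    # 5xx server-side errors and 3xx redirects count as failures
    if status is not None and (500 <= status < 600 or 300 <= status < 400):
        return True
    # Content errors: empty or incorrect body
    if result.get("content_ok") is False:
        return True
    # Slow errors: response arrived after an overly long period
    if result.get("elapsed_ms", 0) > SLOW_THRESHOLD_MS:
        return True
    # Client-side 4xx warnings and application errors are ignored
    return False
```

For example, a 503 response or a call exceeding the slow threshold is a failure, while a 401 caused by an expired token is not.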

n-9s Reliability

The traditional telecommunications standard for service availability is five 9s – at least 99.999% uptime, or just five minutes of downtime in a year. 

  • Of the 27 services analyzed in this study, no API managed to achieve five 9s. 
  • Two services achieved four 9s, down from six in 2022.
  • In 2023, 19% of major corporate services scored less than three 9s, slightly up from 2022. 
  • There were just 69 minutes’ difference in unscheduled downtime observed between two leading file management services, compared to nearly 18 hours between them in 2022. 
  • In this decade, we have not observed significant improvement in API availability.
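
For reference, the downtime budgets implied by these n-9s targets can be computed directly. A minimal sketch: five 9s allows roughly 5.3 minutes of downtime a year, four 9s roughly 53 minutes.

```python
# Annual downtime budget implied by n-9s availability.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_budget_minutes(nines: int) -> float:
    """Maximum minutes of downtime per year at n-9s availability."""
    unavailability = 10 ** (-nines)   # e.g. five 9s -> 0.00001
    return MINUTES_PER_YEAR * unavailability

for n in (3, 4, 5):
    print(f"{n} nines: {downtime_budget_minutes(n):.2f} min/year")
```

PagerDuty’s 30.0 minutes of downtime (see Top Achievers) fits comfortably inside the four 9s budget but well outside five 9s.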


APIContext uses CASC, our patented quality scoring system, to compare the quality of different APIs. CASC (Cloud API Service Consistency) blends multiple factors to give a “credit rating” for an API, benchmarked against our unmatched historical dataset of API test call records.

  • Scores over 9.00 show exceptional quality of operation.
  • Scores over 8.00 are healthy, well-functioning APIs that will give few problems to users.
  • Scores between 6.00 and 8.00 indicate some significant issues that will lead to a degraded user experience and increased engineering support costs.
  • For a CASC score below 6.00, urgent attention is required.

It is important to note that CASC scores do not fall on a normal curve. The scores are absolute, and we see no engineering reasons why prominent APIs should not consistently reach a CASC score of 8.00+.


Some calls will be faster than others because of backend processing, so total call duration, even over a sample size of tens of millions of calls, can only give a partial view of API behavior. 

  • Over the past three years, the overall trend has broadly been an increase in Total Time.
  • IBM was the fastest cloud provider in 2023 with an average Total Time of 443 ms, slightly ahead of AWS at 450 ms and 86 ms faster than Microsoft.
  • All clouds were slower for Total Time in 2023 than in 2022, which might reflect increased load on cloud ecosystems. IBM slowed by only 52 ms, compared to 93 ms for both AWS and Azure; IBM may benefit from being the least heavily loaded cloud.
  • Azure has been consistently the slowest cloud by median time since 2018.
  • AWS’s median DNS Lookup Time has continued to fall and is now just 2 ms, down from 3 ms in 2022. This is an exceptional performance by AWS’s engineers, as the next-best cloud, Google, had a DNS Time of 13 ms, and Azure, 18 ms.

DNS Matters

All regions had slower DNS in 2023. Oceania was twice as slow (16 ms, compared with 8 ms in 2022). Given that faster performance has been achieved in the past, this is an area of focus in improving service quality for all cloud network engineering teams, who should monitor DNS Times on an ongoing basis and ensure that resolution from all regions and locations is not becoming slower or less stable.


  1. Are you actively monitoring the availability and latency of all APIs you expose and consume? If you’re not, you don’t know how your APIs are performing right now for users in the real world. And if you’re not actively monitoring your APIs, you’re not managing them.
  2. Are you benchmarking the performance and quality of your APIs against those of your peers/competitors? Because you certainly wouldn’t want to find out that they outperform you.
  3. Do you know the differences between cloud locations v. user locations? Your service might be hosted in Virginia, but your users might be in Vienna or Vietnam. Make sure your choice of cloud isn’t wasting valuable milliseconds for your users – and causing you to lose customers who migrate to faster or more reliable services.
  4. Do you know that 70 ms or more of latency can be down to your choice of cloud? Your API users shouldn’t wait tens or hundreds of milliseconds simply because of a decision made years ago.
  5. Do you want to rely on a single cloud service provider? Not all clouds are the same, and they change for different latency components. One cloud service provider might be the right choice for one user segment, but not for another. Build in resilience by using multiple cloud service providers. 
  6. Have you ensured no specific issues affect the DNS Lookup Time for your domain? It should be 12 ms or less. If it isn’t, do something about it – slow DNS is money down the drain!
  7. Do you understand what factors impact call latency and where to focus your efforts? What’s the latency component most impacting user experience, and what can you do to improve it?
  8. Are you tracking performance outliers and determining their causes? Slow outliers can greatly impact user experience. Are some calls taking 30 seconds or more to complete? Are some calls timing out without a response at all? How can you stop that?
  9. Is your organization aware of the impact of API failures and errors on user experience and business costs? All API performance and quality issues cost money. Bad APIs mean lost customers. Can your organization afford not to have the best possible APIs providing the best possible user experience?
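
For question 6, one quick way to spot-check DNS Lookup Time from a given vantage point is to time the resolver call directly. A minimal standard-library sketch (this measures the system resolver rather than an authoritative lookup, and OS-level caching will mask repeat lookups):

```python
import socket
import time

def dns_lookup_ms(hostname: str) -> float:
    """Time one DNS resolution in milliseconds.

    The OS resolver may cache results, so take several samples
    and compare the first (cold) lookup with the rest.
    """
    start = time.perf_counter()
    socket.getaddrinfo(hostname, 443)
    return (time.perf_counter() - start) * 1000

# Spot-check a hostname (use your own domain in practice)
print(f"{dns_lookup_ms('localhost'):.1f} ms")
```

A dedicated monitoring platform will give cleaner per-component numbers, but even this check will reveal a domain that is hundreds of milliseconds off the 12 ms target.
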
For more detailed results, including reports on each cloud and region, download the full report and review the appendix.
