NEW data on the problem of API drift – most production APIs are not built as designed  Learn more >

Understanding APIs
New to the API economy? Learn what APIs are and why they’re so important.
Detailed guides to using APIContext to solve critical business problems.

Defining API KPIs (key performance indicators) being used is a critical part of understanding not just how they work but how well they can work and the impact they have on your services, users or partners. But what do you look at and track when measuring API KPIs?

Some API KPIs are ‘soft’ – your documentation, the formal definitions of how the API works, use metrics and adoption, and others are ‘hard’ – focusing on the underlying functionality of the service being measured.

This article will focus on the ‘hard’ API KPIs, and looking beyond the obvious ones that existing tools tools and solutions capture and at the critical numbers you need to focus on to understand how well an API works and the impact that it is having on the services you deliver, integrate to, or provide.

Pass Rate
0 %
Median Latency
0 ms
DNS Look Up
0 ms
Outlier Rate
0 %
Effective Pass Rate
0 %
CASC Score
0

We will be focusing on measurable metrics for API KPIs:

  • Call Latency – but not just total time, you should understand the networking impact as well as the various server components – some Open Banking regulations like those from Open Banking UK require a specific Time to First Byte received to be reported
  • Total pass and error rates – the measured success rates in terms of HTTP or other codes
  • Effective pass rates – the pass rate when you take performance outliers or ‘passing errors’ (like a HTTP 200 OK! hiding an error) into account
  • The Cloud API Service Consistency score, which takes into account the consistency of service

 

Multiple ways to measure APIs and the impact they may have on your services, customers and critical interdependence. But outside of what you can measure with your current products.

 

Measured Pass Rate

The pass rate is the actual measured success rate for an API call from a specific location.

But APIs can pass at different rates from different locations. So don’t just assume that HTTP-200 means ‘All OK’. Just because your API gateway or APM stack logs show 200 codes doesn’t mean that everything is working well – or even at all.

Imagine you’re in a restaurant where they only measure whether or not you get a drink – but don’t actually care if you get the drink you ordered, or whether or not it has ice in it? 

That can be the difference between being told an API call passed and having a dispute with a supplier, customer or regulator over what you actually delivered.

Effective Pass Rate

It’s entirely possible to have two APIs doing the same thing, but with wildly different effective pass rates.

Even with a 100% pass rate, there may be events and performance issues that cause timeouts and other problems. You need to take into account the performance, including latencies and items that may affect end users.

So thinking back to our restaurant, they might have a time goal where they still don’t care if there’s even a drink in the glass as long as the glass is delivered on time, or, where they check the average time, so while your drink took over an hour, on average,most drinks were served in under 5 minutes.

Location Impact on API KPIs

Latency is complex for APIs. Using a common ‘ping’ tool won’t tell you what you need to know. API calls include multiple steps:

  • Connect Time
  • DNS Look Up
  • Server Side Processing Time
  • Internet Travel Time
  • Total Call Time

 

And each of these vary by geography and even cloud service provider. What works between AWS servers might not work so well if your partner is using Azure or Google. Understanding these issues can save pain and complaints and brand impact later.

Map of the world APImetrics KPIs for APIs global monitoring coverage API KPIs

Statistics matter

A slow API might not be a problem. But an API that is slow only sometimes might well be. Systems measuring average performance can miss significant performance issues lasting many hours.

So it’s important not just to think about the averages here, but also the percentiles, that is the calls that fall into a certain group – like the worst 5% or 1% of calls.

That might not sound like a lot but consider an example. You serve 1,000,000 API calls a day and have an average latency of 500ms, that’s good? Right? But consider this, your 95th percentile, that’s the slowest 5% take over 2 seconds, and your 99th percentile is over 10 seconds. That’s 10,000 calls a day taking over 10 seconds… is that any good?

Learn more about API KPIs

See how APIContext measures performance and SLAS – or check out some of our other articles.

Learn about CASC

CASC scoring takes the guesswork out of your API monitoring by identifying the APIs that need attention.

SLAs

Service Level Agreements – do you meet them? How do you know?

Ready To Start Monitoring?

Want to learn more? Check out our technical documentation, our API directory, or start using the product immediately. Sign up instantly, and monitor your first API call in minutes.