Net API Notes for 2021/11/04 - Issue 180 - State of the API Report

Last week, Postman released its 2021 State of the API Report. There are exciting tidbits throughout, as always. What I wanted to talk about, however, is some new research I helped introduce that has profound implications for how we think about, measure, and shape API productivity.

Net API Notes is a regular, hand-curated digest of impactful news and analysis for busy API practitioners. Are you reading this on the web and not subscribed yet? Sign up today and be the first to get ad-free, actionable info delivered weekly to your inbox.

NOTES

THE FOUR BEHAVIORS OF HIGH PERFORMING API TEAMS

[Lifecycle graphic: Strategy / Design / Documentation / Development & Testing / Deployment / Security / Monitoring / Discovery]

One API ecosystem challenge I've had over the years is quantifying what "good" looks like (I've published that journey on my website, with pieces including "Rethinking NPS for Quality" and "How Reporting Can Backfire"). Despite most organizations' desire to be "data-driven", much of our quality storytelling remains anecdotal.

On the one hand, quantifying abstract concepts like "maturity", "quality", or "maintainability" is problematic. On the other hand, quantifiable metrics like "time-to-hello-world" and "uptime percentile" don't describe the health of an API ecosystem or whether improvement efforts actually increase the speed and safety of overall API development.

In 2013, Nicole Forsgren, Ph.D., Jez Humble, and Gene Kim began researching which capabilities and practices deliver the most significant value to companies. Their work culminated in the book Accelerate: The Science of Lean Software and DevOps: Building and Scaling High Performing Technology Organizations. Their rigorous, academically grounded research methods identified the vital capabilities that drive successful software delivery performance.

I joined Postman in April, just as the company was preparing its latest survey. I saw it as an opportunity to build upon Accelerate’s insights, testing whether they also apply to API program performance.

What we found exceeded even my high expectations. With the largest response of its kind, the 2021 Postman State of the API Report demonstrated a correlation between high-performing, API-First organizations and key deployment behaviors. Taken together, the results show that healthy API ecosystems ship APIs sooner, incorporate changes more often, recover faster after outages, and require fewer hotfixes.

FOUR BEHAVIORS OF API DELIVERY PERFORMANCE

To begin the journey of improving a company's API agility and resilience, organizations should start tracking the following four metrics:

  • Lead Time
  • Deployment Frequency
  • Mean Time to Restore
  • Change Fail Percentage

LEAD TIME

There are two parts to lead time: the time it takes to validate a feature and the time it takes to deliver it to customers. Measuring the design part of lead time can be difficult as acceptance criteria may vary. However, the delivery part of the lead time - the time it takes for work to be implemented, tested, and delivered - is easier to measure and has lower variability.

Shorter API lead times are better for many reasons. As we know from API description creation and mocking, faster feedback provides quicker validation and course correction. Reduced lead times also mean defects or outages are corrected more quickly. Short lead times fulfill the portion of the Agile Manifesto that seeks to "satisfy the customer" through early and continuous delivery.
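
To make the delivery half of lead time concrete, here is a minimal sketch of how it might be computed. The (committed, deployed) records are hypothetical, standing in for whatever a team's CI/CD system of record actually exposes:

    from datetime import datetime
    from statistics import median

    # Hypothetical records: when each change was committed and when it
    # reached production.
    changes = [
        {"committed": datetime(2021, 10, 1, 9, 0), "deployed": datetime(2021, 10, 1, 15, 30)},
        {"committed": datetime(2021, 10, 4, 11, 0), "deployed": datetime(2021, 10, 6, 10, 0)},
    ]

    # Delivery lead time per change: commit-to-production elapsed time, in hours.
    lead_times = [(c["deployed"] - c["committed"]).total_seconds() / 3600 for c in changes]

    # The median resists the occasional outlier better than the mean does.
    print(f"Median delivery lead time: {median(lead_times):.1f} hours")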

DEPLOYMENT FREQUENCY

If not handled appropriately, deploying more often can be a risky proposition. In these situations, deployments are treated as delicate, precarious events that attract oversight and numerous controls. Similarly, conventional wisdom states that consumers want to build on a stable, consistent foundation, not integrate with the API equivalent of a construction site. Given that apparent paradox, how do we create environments capable of change?

The key to satisfying both of these concerns while enabling more rapid releases is to lower batch sizes for additive changes. Reducing the amount delivered at once (the batch size) reduces variability in estimates, accelerates feedback, and lowers risk and overhead, all while increasing urgency. Further, we can achieve both speed and stability if those smaller batches adhere to API Evolution principles and avoid breaking changes.
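
Deployment frequency itself can be read straight off a deployment log. A minimal sketch, assuming a hypothetical list of production deploy timestamps, bucketed by ISO week:

    from collections import Counter
    from datetime import datetime

    # Hypothetical production deployment timestamps from a CI/CD log.
    deploys = [
        datetime(2021, 10, 4), datetime(2021, 10, 5), datetime(2021, 10, 7),
        datetime(2021, 10, 12), datetime(2021, 10, 14),
    ]

    # Bucket deployments by ISO (year, week) to watch cadence over time.
    per_week = Counter(d.isocalendar()[:2] for d in deploys)
    for (year, week), count in sorted(per_week.items()):
        print(f"{year}-W{week:02d}: {count} deploys")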

Feature toggles are another method for enabling the benefits of continuous delivery while synchronizing with timed marketing or product releases. One respondent to the State of the API survey had this to say:

"We deploy APIs to production multiple times a week. We deploy changes as soon as they’re completed and tested, and we keep them feature-flagged off until the whole feature is ready to go live." — Sindhu N., Technical Architect

MEAN TIME TO RESTORE

Creating more APIs, and changing them more quickly, should not come at the expense of API reliability. Usually, reliability is thought of as the time between failures. However, modern API experiences are rapidly changing, complex systems. In these situations, it is not a question of if an implementation will fail, but when. In this light, the critical metric becomes how quickly teams can restore service.

High-performing API organizations recover more quickly when outages do occur. The difference is measured in minutes rather than the hours (or even days) of their contemporaries.
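
In practice, mean time to restore falls out of incident records. A sketch, assuming hypothetical detected/restored timestamps per incident:

    from datetime import datetime

    # Hypothetical incident records: when each outage was detected and
    # when service was restored.
    incidents = [
        {"detected": datetime(2021, 10, 2, 14, 0), "restored": datetime(2021, 10, 2, 14, 25)},
        {"detected": datetime(2021, 10, 9, 3, 10), "restored": datetime(2021, 10, 9, 4, 0)},
    ]

    minutes = [(i["restored"] - i["detected"]).total_seconds() / 60 for i in incidents]
    print(f"Mean time to restore: {sum(minutes) / len(minutes):.0f} minutes")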

CHANGE FAIL PERCENTAGE

A key metric when making changes to a system is what percentage of changes to production fail. Failure may include degraded service or accidental introduction of a breaking change requiring a hotfix, a rollback, or a patch. A metric like ‘change fail percentage’ helps ensure that while we encourage greater delivery tempo, we do not do so by making the system less stable.

High-performing teams' changes to production APIs require a lower percentage of hotfixes, rollbacks, and patches than those of their peers.
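
The arithmetic is the easy part; the discipline is in consistently labeling which changes failed. A sketch, assuming a hypothetical change log where each record is flagged if it later needed a hotfix, rollback, or patch:

    # Hypothetical change log: each production change, flagged if it later
    # required a hotfix, rollback, or patch.
    changes = [
        {"id": "c1", "failed": False},
        {"id": "c2", "failed": True},
        {"id": "c3", "failed": False},
        {"id": "c4", "failed": False},
    ]

    failed = sum(1 for c in changes if c["failed"])
    print(f"Change fail percentage: {100 * failed / len(changes):.0f}%")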

DRIVING HIGH PERFORMANCE WITH API DELIVERY PERFORMANCE METRICS

Tracking these delivery performance metrics has two key benefits over other possible numbers. First, a good metric focuses on a global outcome to ensure teams aren't pitted against each other. An API example of how not to do this is rewarding developers for minimizing response time while goading API program managers to build upon others' prior API work (or maximize reuse). In this situation, developers are reluctant to add subsystem calls outside their control that might increase response times, while API program managers enact harsher oversight to prevent duplication of functionality. Second, these measures focus on outcomes, not output; they avoid rewarding teams for busywork unaligned with delivering actual customer value, like crude counts of APIs produced in a quarter.

Conventional wisdom suggests that companies must choose between speed and stability. However, research like the 2021 Postman State of the API Report indicates that high-performing API teams have both. If anything, the results imply that speed depends on stability, and stability only happens with a more focused, small-batch approach to speed. As Martin Fowler has expressed, good IT practices give you both.

Improving these four API metrics will make a company's API teams more responsive, customer-centric, and optimized for defensible growth in increasingly unpredictable landscapes.

Is your API program measuring these behaviors? Or some variation on these metrics? If so, I'd love to talk with you about what your experience has been.

WRAPPING UP

Want more API news? Check out the LinkedIn API and Web Services Professional Group that I manage. There is also the list of upcoming API meetups and events on NetAPI.events.


Lastly, thanks to my Patreon supporters! Their support keeps this newsletter free of advertising, information selling, or paywalls, for everyone's benefit.

Till next time,
Matthew

@libel_vox and matthewreinbold.com

While I work at Postman, which continues to support significant research like the 2021 State of the API Report, the opinions presented above are mine.
