Correcting Amazon's "Crumbling Engineering Culture"

Net API Notes for 2022/10/21 - Issue 205

When I sit down to write the newsletter, some issues flow like water. Then there are ones, like "message orientated middleware in service mesh deployments", that bore me just retyping the title. When that happens, I've learned it is best to set it aside until a new angle or way to engage with the material manifests and rekindles my interest.

Instead of that topic, I'd like to discuss the recently leaked documents chronicling Amazon's efforts to combat a "crumbling engineering culture". Reported by Business Insider, the news of a new unit tasked with reversing slowdowns in Amazon's engineering culture is relevant to all of us (the original article is behind a paywall, but there's an archived version available).

ALL COMPANIES ARE WRESTLING WITH SLOWER SOFTWARE PACE

Reacting to the article in a Twitter thread, Kellan Elliott-McCrea (former Dropbox VP of Engineering, Etsy CTO, and co-author of the OAuth 1.0 specification) listed several factors contributing to why software development seems to be slowing down. The reasons include the following:

  • Increased Team Sizes - Software has eaten the world, so more people than ever are 'touching' software, which creates increased communication and coordination costs.
  • More Complicated Tech Stacks - The cloud is advantageous for several reasons (speed to provision, elastic scaling, replication, etc.). However, proper usage requires more nuanced tech stacks, each with its own suite of tools. These tools are also software, with their own deployment pipelines, bugs, and comprehension curves.
  • Preference for Complex Architectures - This is more than just "resume engineering" (although there's a facet of that at play here, as well). These architectures are "often in an attempt to address [existing] productivity losses", but end up compounding into even more significant problems. Kellan specifically mentions microservices, event sourcing, and schemaless datastores. These approaches are born of good intentions but, misapplied, compound systemic losses.
  • Turnover - Want to feel old? Most software engineers entered the industry in the last ten years. Further, the average tenure at any given role is less than two years. Old ways of learning, like having to clean up your own mess, don't apply. People barely have time to ship their first product to prod before leaving for a different company, and thus forgo any of the "day 2" lessons on maintenance, operationalization, and graceful evolution. With these realities, it isn't easy to justify increased institutional knowledge sharing and reinvestment if those on the receiving end are off to a competitor in 18 months. The result is that each successive hire receives a smaller and smaller slice of the context of why things are the way they are.

These problems aren't unique to Amazon (although Amazon sees these problems at scale). So what are they doing about it?

AMAZON'S SOLUTION

The team tasked with addressing the slowdown is called the "Amazon Software Builder Experience" (or ASBX). According to the documents leaked to Business Insider, the team has grown to more than 400 employees. Their efforts are focused on:

  • Code Automation
  • Improved Developer Tools
  • Enhanced Tutorials
  • Safety Infrastructure

The work is driven by "mounting frustration" among Amazon engineers:

"they are 'overwhelmed' by mundane software upgrade work, manual testing and deployment, and hard-to-use developer tools that prevent them from engaging in more creative building activities."

Honestly, it sounds like Amazon engineers have to use the same AWS UI as the rest of us.

~* rimshot *

Who wouldn't love new tools? I'm a huge proponent of using smart, environmental tweaks to encourage better behavioral outcomes; or, said another way, make the right thing the easy thing.

"The ASBX team intends to solve those problems by building new tools and educational content, among other things. ASBX is composed of seven smaller units, spanning Builder Tools and Engineering Knowledge Growth teams to Mechanic, Safety Infrastructure, and User Experience. It's also investing in roadshows, surveys, and conferences to hear from the engineers, the documents said."

However, the rest (educational content, roadshows, surveys, and conferences) raises an Information Action Fallacy flag for me.

THE INFORMATION ACTION FALLACY IS EVERY CHANGE EFFORT'S ACHILLES HEEL

I first found the Information Action Fallacy in BJ Fogg's book, Tiny Habits. It is the mistaken belief that if you give people the correct facts, they will change their attitudes and behavior. Sadly, anyone who has ever failed at a New Year's resolution to eat healthier has firsthand experience with this fallacy. I know I should eat more leafy greens. But another study proving vegetables' benefits will not change my tendency to reach for potato chips.

In the case of Amazon, the article makes it clear that the current systems and processes are perceived as inadequate. However, nothing in the list of proposed solutions addresses how those systems and processes came to be. Reducing a deployment by several clicks will save minutes. Is anyone addressing the weeks lost in handoffs?

"Everybody wants autonomy but also alignment, fast moving teams but also no duplication of efforts, shared abstractions but no collaborative modelling, user focused development but also a platform, move fast and break things unless something actually breaks" - Mathias Verraes

This is probably because:

"We collapse systemic problems into personalized narratives, and when we do, we cloud our understanding of politics and confuse our theories of repair." - Ezra Klein

In other words, people who program will look to program a solution.

AN ALTERNATIVE APPROACH TO ADDRESSING SLOWING SOFTWARE DEVELOPMENT

I'm a huge fan of Helen Bevan, a change leader for the UK's NHS. She's pointed out how organizations over-focus on the relationship of change to performance (in this case, the rate of software development) at the expense of participants' emotional experience.

"In a recent internal survey, seen by Insider, 34% of the engineers said they spent 4 to 8 hours on undifferentiated efforts weekly, or roughly 10% to 20% of their week on things that are not related to building new products."

People want to do new stuff. But is the lack of new products what ails Amazon, specifically AWS, from a consumer standpoint?

Earlier, I was glib about the AWS UI. I've written before about how there are something like 200+ products, many of which do similar things in slightly different ways. What is externally visible is not a culture that can't get its DevOps under an acceptable number of manual clicks. What I see is the inability to address human systems.

"It's wild how much we talk about individual productivity when in an organizational context, it's territory battles that send individual effectiveness straight through the floor to Hades." - Erika Hall

Ultimately, the ASBX effort will have to overcome a paradox in its prescribed mission. On the one hand, the company has pledged to "become the Earth's Best Employer". That would require a discourse with real people about their emotional experience in the larger Amazon system. Making "things even faster for our developers" seems like a step in that direction. On the other hand, it puts an easy (and measurable!) performance target ahead of a messy, but much more vital, human one.

WRAPPING UP

Thank you to the Patrons! They help keep these emails free of ads, paywalls, or information selling. If you're interested in joining them, go to my Patron page and sign up.

That's all for now. Till next time,

Matthew

@libel_vox matthewreinbold.com
