The Beautiful COE

Many leaders at Amazon got a Jeffb “?” email at one time or another.

I did.

It was my first week. 

Almost peed myself.

A Jeffb “?” mail was an email from Jeff Bezos with only a “?” in the subject line, forwarding some error or appearance of an error. It trickled down through the hierarchy until it reached a recipient who had no one else to forward it to.

Most of the time, a “?” required some sort of Correction of Error.

Sometimes people treat a COE as a fix to a specific situation - fixing a number, a web page, or a customer or client issue.

But a COE - at least how I’ve used it - identifies the root causes of an error and outlines specific changes across people, process and tools so that it won’t happen again, ever.

The reflexive habit of recognizing operational errors and fixing their root causes was something I fell in love with, refined, and took forward to future roles.

So that no one involved in performing COEs (including me) needed to risk soiling themselves in the face of errors, I made the COE an accepted and repeated part of how I led any product or business going forward.

Here’s how we did it:

  1. Definition of an error: something that does not go as planned, even if it did not cause immediate harm. That original Jeffb “?” mail was in response to a tiny (<$100/year) customer who did not get a response from customer support. But it could easily have been a big customer, or many small customers…

  2. The person or team closest to the error writes the COE, even if the root causes lie elsewhere. When I led the global operations team for Microsoft Store, we wrote over 50 COEs in 4 years, and the root causes were only “ours” 50% of the time. The owner is accountable for involving other dependent or responsible teams. 

  3. Blameless. COEs are filled with facts: dates, intended actions, actual actions, results, and real and potential impact on revenue, customers, trust, efficiency, employee morale, etc. Most COEs have more than one root cause.

  4. Routine. When they are part of the culture, COEs get done routinely by the folks best suited to perform them. No terrifying “?” needed from above.

  5. Done in 72 hours. They may not be 100% complete, but immediacy beats perfection.

  6. Root causes dig deep. COEs are not a “we’ll try harder in the future.” When we had high error rates in published content, “ensuring the QA teams did better” was not a solution because we could not do that without fixing how we tracked and managed QA capacity, defined SLAs to upstream partners, and managed the production queue. We used a template to ensure COEs dug deep enough.

  7. COEs don’t just get filed away. They become part of checklists, training, workback plans, and readiness meetings. This is part of the “correction” - how the fix sticks.

When errors occurred, we always asked ourselves “did we already do a COE on this?” In over 50 COEs, I think we had one recurrence. COEs work. As a result, our conversion rates, Net Promoter scores, revenue, and employee morale all increased.

Want a COE template? I can try to dig one up.

