5-Minute DevOps: I’m Too Smart to Make Mistakes

Bryan Finster
Defense Unicorns
5 min read · Feb 12, 2024


I find joy in platform engineering because the mission is challenging and it helps people: “Make it harder to make mistakes and easier to deliver.” These two things are often at odds, which is why it’s called “platform engineering” instead of “tool assembly.”

When I discuss ways to make it harder to make mistakes, I often get feedback from people who are obviously much smarter than me because they simply don’t make mistakes. Here are some real-life examples of these very smart people.

Pre-commit Hooks are Toxic!

A while ago, I was on Twitter looking for recommended patterns for automatically creating git pre-commit hooks using Make so that new contributors have some minimum quality guardrails in place. There are some good NPM tools for this, but Make is popular where I work, and I wanted a “hard to skip” solution.
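One way to make a guardrail like this “hard to skip” is to keep the hooks in a version-controlled directory and point git at it during project setup. This is only a sketch of that pattern, not necessarily the solution I was after; the `make lint` target and file names are illustrative assumptions, and it runs in a throwaway repo for demonstration:

```shell
#!/bin/sh
set -e
# Sketch: install a tracked pre-commit hook via core.hooksPath.
# In practice this would live in a Makefile "setup" target; here we
# use a throwaway repo so the script is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Version-controlled hooks directory, shared by every clone.
mkdir -p .githooks
cat > .githooks/pre-commit <<'EOF'
#!/bin/sh
# Guardrail: block the commit if linting fails. "make lint" is an
# illustrative target; developers can still bypass with --no-verify,
# but the default path is the safe one.
make lint || { echo "pre-commit: lint failed, commit blocked" >&2; exit 1; }
EOF
chmod +x .githooks/pre-commit

# Tell git to use the tracked hooks directory instead of .git/hooks.
git config core.hooksPath .githooks
echo "hooks installed: $(git config core.hooksPath)"
```

Because `.githooks/` is committed alongside the code, new contributors get the guardrails the first time they run the setup target, with no per-clone copying into `.git/hooks`.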

Within minutes of asking for help, an “experienced” developer accused me of being toxic and of not showing trust, and said that automatically injecting pre-commit hooks would be an unforgivable invasion. Instead, they just make sure they don’t make the kinds of mistakes I’m trying to prevent.

I’m very trusting. I trust that I can make mistakes, and I trust that other people can, too. I also know it takes longer to fix mistakes sent to the CI server than to fix them after the pre-commit checks block me from pushing them. Good pipeline design tries to reject changes that will block the pipeline as close to the source as possible. It’s not micromanaging to help people detect issues early.

You Don’t Understand Operations!

I made a statement online one day that no one should ever log into production except in extreme “break glass” emergencies that get escalated very high. My opinion on this is based on all of the problems I and others have caused by bypassing all controls to “just fix things.” It turns out that I’m probably just incompetent. I quickly had an expert explaining to me all of the amazing diagnostic command-line tools that were created in the ’90s and early 2000s at Sun Microsystems and how their very existence proved me wrong. Besides, all I had to do was not make mistakes, like the time a coworker deleted all of the inventory for a customer distribution center or the time I almost wiped all of the binaries from a system with a small typo.

The person who “corrected” me claimed to be an expert in security while seeming not to understand that “insider threat” is vulnerability #1. Anyone can make mistakes, and if we are logging into production to fix something urgent, we are under stress. We are more likely to make mistakes. We need to make changes through pipelines, not through SSH.

When You’re Good Enough, You Can Cut Corners

We were talking about the proper way to deliver changes safely when someone chimed in that if you were smart enough and understood the risks, it was OK to cut corners on quality and safety. Apparently, the guardrails are only there to protect less skilled people. I wonder if they would feel comfortable flying with a very smart pilot who felt that way.

“Well, that mechanical issue put us 30 minutes behind, so let’s make up some time. We don’t need to waste time going through the checklist. We know what to do.”

If you understand the risks of riding a motorcycle without a helmet, feel free. If it goes wrong, you’re not harming anyone else. If you understand the risks of riding that same motorcycle in heavy traffic at 100 MPH, you’re putting other people at risk, and you’re criminally negligent.

You have the right to add risk without permission only if you own the software you’re writing. Doing otherwise is malpractice because it puts your coworkers and the company’s goals at risk. Grow up.

Conway’s Law Isn’t

What is a scientific law? A scientific law describes an observed phenomenon. In 1968, Melvin Conway published a paper called “How Do Committees Invent?” The central thesis, as summarized by Conway, is:

“Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.”

While the effect may not be as strong as gravity, it’s an observable effect. People would claim gravity isn’t a law if they only lived in microgravity and didn’t pay attention.

During a conversation about how we’d leveraged Conway’s Law and used the “reverse Conway maneuver” to improve system architecture, several people claimed it wasn’t a law because there were no peer-reviewed papers on it, or insisted the effect didn’t exist at all. However, one person, obviously the smartest, said that all you needed to do was be aware of the effect and then not allow it to harm your architecture. “There’s no need to reorganize teams at all.” Brilliant. I could have applied this as a general rule for my daily life if only it had been explained to me this way. The same goes for running with scissors. I just need to be aware that they will stab me if I trip while running with them and then simply not trip! Easy!

Making it harder to make mistakes, and harder to create bleed-over and inappropriate coupling between systems, means we don’t need to spend as much energy ensuring those things don’t happen and can focus more on real work.

Tests Are For Bad Developers

This is a story related to me by a close friend. One of the very senior engineers in their area explained why testing software is a waste of time. “I just don’t write defective software.” Brilliant! I wish I’d thought of that years ago. Think of the time I would have saved on support calls!

If you’re too smart to write tests or too smart to need the value they provide to the less intelligent, then you’re too smart for this job.

Stop Hiring Uneducated Experts

Every one of the people in the examples above had very senior engineering positions. They got those positions either because their constant heroics were rewarded or because they talked a big game very confidently and were not around anymore when reality struck.

Never hire someone too smart to make mistakes. If you have someone working with you who believes they are, make sure they don’t work on anything important.

--

Bryan Finster

Developer, Value Stream Architect, and DevOps insurgent working for Defense Unicorns who optimizes for sleep. All opinions are my own. https://bryanfinster.com