5-Minute DevOps: AI is Taking Our Jobs!!

Bryan Finster
Published in Defense Unicorns
7 min read · Feb 5, 2024


I’ve been running small experiments over the past year or so to see whether AI has made me obsolete yet or whether it’s just more breathless marketing hype. I’ve tried everything from “give me the regex for …” to “write an outline for a 30-minute presentation on …” with various outcomes, some of them surprising. A year ago, the results were unimpressive for anything beyond simple problems. Hallucinated dependencies and invalid syntax were the norm. I asked ChatGPT when to use GitFlow, and it failed miserably. However, recent experiments with LLM-assisted development have convinced me that it’s time to modify my workflow. Am I afraid I’ll be replaced? No. I believe others will be, though. Before I explain why, I want to talk about an AI event I attended.

Recently, I was invited to attend the first DevOps for GenAI Hackathon in NYC, a small event of about 60 people with teams of mostly strangers self-organized around the problems we voted to work on.

John Willis and Patrick Debois hosted the event. Patrick coined “DevOps” in 2009 when he created the first DevOpsDays event. John is a co-author of The DevOps Handbook and an early thought leader in the space. So, we had OG DevOps in the house.

What does “DevOps for GenAI” mean? It definitely wasn’t about how to write YAML better. We wanted to test ideas for applying GenAI solutions to make our jobs easier. We had several interesting use cases proposed. One was to “Identify issues, root causes, and hypotheses during an incident.” I’ve spent too many hours awake doing that, so I was very interested. The team created a design proposal for how information could be ingested to train a model but didn’t have time to create a POC. The scope was just too big for the time we had. Still, I think AI-assisted ops will be a commodity product soon. It’s too big a pain point.

The team I worked on focused on “Expert as a Service.” We wanted to train a model on documentation and training material to make it more discoverable and useful. Documentation is an important tool, unless no one uses it because it’s outdated or too hard to find. In a previous role, we used documentation extensively as L0 support for our internal developer platform, which allowed us to keep our support team lean: eight people supporting 19,000 developers. However, we frequently had to guide people to the relevant documentation. With our hack, we wanted to see how hard it would be to take a set of documentation and change discovery from “I’m searching for this information” into a conversation where a model could suggest solutions to problems.

We started with a seed application that John and Patrick provided and documentation that I’m familiar with from DojoConsortium.org. Everyone on my team was a developer, but none of us were great at Python. However, with a few prompts to ChatGPT, we had what we needed to scrape markdown from GitHub, break pages up by topic, and add the information to the model. After loading the documents, I asked a question teams frequently ask: “How big should a story be?” The result:

Hmmm, that’s not what I wanted. I was looking for guidance on a suggested maximum story duration, which I thought was in the documentation. I need to review the docs to see whether the problem is in the docs themselves or in how we ingested the data. On the plus side, it did suggest a way to make stories smaller. This is why eating our own dog food is so important.
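
For the curious, the whole hack really was small. Here is a minimal sketch of the flow, assuming LangChain with OpenAI embeddings and MongoDB Atlas Vector Search as the vector store; the docs URL and connection string are placeholders, and exact module paths vary by LangChain version. It is illustrative, not the exact hackathon code.

```python
# Sketch: scrape markdown docs, split them by topic, load them into a vector
# store, and ask a question. Illustrative only; not the exact hackathon code.
import requests
from pymongo import MongoClient
from langchain.text_splitter import MarkdownHeaderTextSplitter
from langchain.chains import RetrievalQA
from langchain_community.vectorstores import MongoDBAtlasVectorSearch
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Raw markdown pages to ingest (placeholder URL, not the real docs repo path).
DOC_URLS = [
    "https://raw.githubusercontent.com/example-org/example-docs/main/work-decomposition.md",
]

# Break each page up by heading so every chunk covers a single topic.
splitter = MarkdownHeaderTextSplitter(
    headers_to_split_on=[("#", "title"), ("##", "section")]
)
chunks = []
for url in DOC_URLS:
    chunks.extend(splitter.split_text(requests.get(url, timeout=30).text))

# Embed the chunks and store them in MongoDB as the vector database.
collection = MongoClient("mongodb+srv://user:pass@cluster.example.mongodb.net")["docs"]["chunks"]
vectorstore = MongoDBAtlasVectorSearch.from_documents(
    chunks, OpenAIEmbeddings(), collection=collection, index_name="default"
)

# Turn "searching for information" into a conversation with the docs.
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-4"),
    retriever=vectorstore.as_retriever(),
    return_source_documents=True,
)
answer = qa.invoke({"query": "How big should a story be?"})
print(answer["result"])
```

Most of the interesting work is not in this plumbing at all; it is in how the chunks are classified and indexed, and in whether the docs themselves actually contain the answer.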

After discussing the outcome vs. the expectation, our conversation focused on the development flow for this use case. “How do we get feedback to improve responses?”, “What does the pipeline look like as the information is updated?”, “How do we untrain obsolete information?”, “How can we measure the quality of responses?” etc. The tool is different, but the principles still apply.

Hackathon Takeaways and Lessons

  • It’s fun to mob with other engineers you’ve never worked with who are also professionals trying to solve problems, not 10x prima donnas. We all get to learn from each other and curse at computers together.
  • LLMs can make documentation and training relevant by making them more discoverable and by converting a keyword search into a conversation. They can cross-reference and return information from across the organization, provide richer context, or even surface conflicting information. Imagine getting an automated alert about a compliance policy change that conflicts with another policy. There are so many possibilities.
  • Implementing a simple hack was easier than I imagined. Yes, we had a seed, but the amount of code in the seed was very small, under 100 lines. Most of the heavy lifting was done by LangChain libraries and MongoDB as the vector database. Next, most of my learning will be about how to classify the information so the model can index it better.
  • It’s not magic. If you want an LLM that’s trained on your context, then you need observability to improve the information it returns. What questions are people asking? How accurate are the answers? As information is updated, how do you keep the model current and prevent it from providing obsolete information? (A sketch of one way to start capturing that follows this list.)
  • Consider the information lifecycle before throwing everything into the model. Our takeaway was that relatively static information, such as policies, training, procedures, etc., is a good fit, but rapidly changing or ephemeral information should be avoided.
  • Garbage In, Garbage Out. The model is never smarter than the people contributing information to it; it can only make connections that may sometimes seem surprising. Again, nothing magic.
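
As promised above, here is one way to start answering those observability questions. We did not build this at the hackathon; it is just a sketch of a starting point: wrap the retrieval chain so every question, answer, and source document is logged for later review, with room for user feedback.

```python
# Sketch of basic observability for a docs-trained assistant: record what was
# asked, what was answered, and which chunks were used, so accuracy and
# staleness can be reviewed. Purely illustrative.
import json
import time

LOG_PATH = "qa_log.jsonl"  # could just as easily be a MongoDB collection


def ask_and_log(qa_chain, question: str) -> str:
    """Run the retrieval chain and append an auditable record of the exchange."""
    result = qa_chain.invoke({"query": question})
    record = {
        "ts": time.time(),
        "question": question,
        "answer": result["result"],
        # Which document chunks backed the answer; useful for spotting obsolete info.
        "sources": [doc.metadata for doc in result.get("source_documents", [])],
        "feedback": None,  # filled in later by a thumbs-up/down from the user
    }
    with open(LOG_PATH, "a") as f:
        f.write(json.dumps(record, default=str) + "\n")
    return record["answer"]
```

Reviewing a log like this regularly tells you which questions people actually ask, which answers they reject, and which source documents keep showing up after they should have been retired.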

Broader Observations

Afterward, I attended an industry dinner to talk about AI. I discovered that people are worried about the impact AI will have both socially and across the industry.

One person was quite adamant about the need to remove bias from LLMs. It’s a valid concern but a fool’s errand. You cannot remove bias from an LLM or from anything else humans create. How many tools do you know of that simply assume left-handed people don’t exist? Does your application recognize that color blindness exists? There’s bias in the code, bias in the material the model is trained on, and bias in the people interpreting the output. The tools we create are never better than us. They are only reflections of us that can do some things faster than we typically can. What does scare me is how much trust I see people placing in information LLMs produce, often more than they would give the same statement from a human. That is terrifying. Instead of asking for the impossible, we should educate people to think critically and to expect bias in everything we do.

Another concern was that generative AI is just another blockchain bubble. I strongly disagree. Everyone tried to shoehorn blockchain into everything because it was trendy. There are use cases where blockchain would have been valuable, but those use cases also required broad adoption across an information supply chain. LLMs have numerous use cases and can be scoped to a single application or applied more generally. I can think of several that would make daily work better right now, and I will be helping to implement them where I work. LLMs will become a required platform tool, and not just for coding assistance.

Replacing Developers With AI

Before attending the event, I tried a small experiment. I fed acceptance criteria into ChatGPT 4 and told it to give me a React application based on those testable behaviors. It did, and the application worked. It used a drag-and-drop library I’d never used before that would have required hours to learn. It wrote maintainable code I didn’t hate. I had a usable POC in less than three hours, starting from “I wonder if this will work?” Does this mean that developers can be replaced by someone writing user stories and running the generated code? The dream of so many people has been realized at last! No.
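
I ran the experiment in the ChatGPT interface, but the same idea expressed through the API would look roughly like the sketch below. The acceptance criteria shown are hypothetical stand-ins, not the ones I actually used.

```python
# Sketch: hand acceptance criteria to a model and ask for a React app that
# satisfies them. Hypothetical criteria; requires the openai package and an
# OPENAI_API_KEY in the environment.
from openai import OpenAI

ACCEPTANCE_CRITERIA = """
Given a board with three columns (To Do, Doing, Done),
when I drag a card from one column to another,
then the card appears in the target column and the change survives a page reload.
"""

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a senior React developer."},
        {
            "role": "user",
            "content": "Write a React application that satisfies these acceptance "
            "criteria. Return complete, runnable components.\n" + ACCEPTANCE_CRITERIA,
        },
    ],
)
print(response.choices[0].message.content)
```

The point is not the tooling; it is that the input is a set of testable behaviors rather than a pile of implementation instructions.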

GenAI is a major win for developer platforms. It did the boring, commodity coding part of the job for me and eliminated the context switch of learning a new library. However, development is more than typing code. I had to understand the problem I was solving and how to describe that problem in a testable way. It was like working in a higher-level language that let me spend more time focusing on “what” than “how.” It takes more than “it behaves this way” to engineer large systems. Until we have self-aware machines that understand engineering tradeoffs and can translate fuzzy ideas into testable outcomes, we will need software engineers. Of course, by the time we can do that, no human will be working anyway.

I have one caveat to this. People who say that LLMs will make developers redundant are mostly wrong, but not entirely. Developers who think the job is about coding and that LeetCode problems will make them great developers are already replaceable. They are solving solved problems. The demand for developers who know that the job is about understanding the domain, solving hard problems for that domain, and understanding how to describe solutions in a testable way will grow. Skip LeetCode exercises, use LLMs for mundane chores, and learn how to become a domain expert in solving problems with software.

LLMs are tools; that’s all. Better tools amplify the user’s skill level, and if that skill level is poor, the tool will make that clear to everyone. As for the operational questions we raised during the DevOps for GenAI Hackathon, I don’t have answers to any of them yet. I’m still learning. I have some ideas, but they need to be tested. We need good patterns to emerge and be discussed broadly because these tools will change how we work. If you have answers, or even just ideas, contact me. Let’s learn together.

Many thanks to John and Patrick for hosting this event.


Bryan Finster
Defense Unicorns

Developer, Value Stream Architect, and DevOps insurgent working for Defense Unicorns who optimizes for sleep. All opinions are my own. https://bryanfinster.com