21 Comments
Dave Rooney's avatar

This is so relevant! In our team right now we have a very clearly defined code base. We have made specific design choices, and when we stick to them the code is clean, extensible, and enjoyable to work on. It's great!... but we have to write all the code (sadface). Jump in AI: we have specific AI instructions written with all our practices in mind, and there are examples of gold-standard code for a multitude of scenarios. We also have skills and workflows that help us go from nothing, to a technical plan, to a delivery plan with options and progressions that help us keep in touch and in line.

Here are the issues. If we let the AI go alone it creates a whole load of crap. It's a mess. It cuts corners and forgets things and, to be honest, that's expected as it's a big code base. But if we pair with the AI one task at a time it's brilliant. We almost always have to tweak the code with the first few tasks, but then the rest becomes fluent. We almost always have to remind it that we need tests and ask it to look again at its skills about TDD and the types of tests we like and don't like.

The lessons right now:

1. You can only pair with AI like you pair with anyone else. Enjoy it! But you now mostly read code rather than physically write it (which I'm quite happy about).

2. All those people vibe coding and letting agents do all the work are screwed.

This will all likely change by July...

Ben Martin's avatar

I keep feeling that "A Deepness in the Sky" (A Deepness in the Sky - Wikipedia https://share.google/PTkGmUL6jeJWwXhw4) is the best reference point for this moment. Spoiler: the bad guys first succeed, then fail due to quickly writing spaghetti code.

Dale Hagglund's avatar

You might find some of Jason Gorman's ideas interesting. He's been experimenting with LLM-based coding for a while now, and has been summarizing his experience in a series he calls the "AI Ready Software Developer". His most recent article introduces the acronym CRESS as a summary of his guidelines for the effective use of contexts. Without going into any of the details, the letters stand for contexts that are Current, Refutable, Empirical, Small, and Specific. I've found his stuff to be very interesting reading.

A couple starting points (hope it's ok to post links here):

- https://codemanship.wordpress.com/2026/05/04/c-r-e-s-s-principles-for-context-engineering/

- https://codemanship.wordpress.com/2025/10/30/the-ai-ready-software-developer-index/

Michael K Alber's avatar

The futures axis is where I kept getting burned until I started injecting explicit intent and constraints into every session via a grounded MCP server — one that includes resources about my own coding standards, architectural preferences, and quality expectations. Nate B. Jones puts it well: constraints shape macro-architecture, personas only clean up the micro stuff. Without that grounding, the genie just falls back on whatever mediocre patterns it trained on, and you get exactly what you're describing: passes the tests, kills your optionality.

Jeremy's avatar

Nice article. But your final comment is a little defeatist. One thing no AI now or in the near future has over us "rapidly-obsolescing humans" is lived experience: the living interactions that we have with the world around us could possibly never be available to AI agents. Furthermore, the technology is clever and I find it useful myself. But AI will never be able to think or dream the way we humans can; it can't generate new ideas. When we stop interacting with it, what happens? Nothing! But we humans continue thinking, unstoppably! Usually it's about where to get the next drink or what we might like to eat later, but that in itself is good: the brain needs time to recover and manipulate ideas, concepts... Keep on truckin' Kent.

Steve limb's avatar

I think your diagrams are largely a reflection of what I see when using agents. For simple additions to existing, well-designed, and obvious code they do stay somewhat ‘high and right’. But in areas that have a less well-designed architecture, or in greenfield work, their default output is ‘low and left’. Only after many forced refactorings can I bring it back ‘high and right’. The solution, which I’m actively working on, is probably not what most people want to hear: a new, much stricter programming language/compiler that is designed to constrain, limit, and force automated refactoring by the agent via errors, rich error information, and hints at what to do. The thing is, if you ask an LLM to assess the code it generated wrt SOLID/DRY or ‘clean code’, it will understand and be self-critical, but the human has to trigger that behaviour. What I’m working on is getting the compiler to trigger that behaviour as the code is being written by the LLM. It’s starting to take shape (after many years of work) and I hope to have something releasable in 12-18 months.

Javier Bonnemaison's avatar

This post is brilliant! You make so many insightful and important points in such a concise, clear, and direct way. Unfortunately, it will probably go over most people's heads (Hi Mike).

EDL's avatar
Apr 30 (edited)

This jibes with my lived experience. Of course, we had this before LLM coding, but it’s accelerating. I’m not sure what happens next, but I have a feeling in a couple years all the OG software engineers are going to be called out of retirement to fix it all (think y2k cobol programmers ^x).

Kai's avatar

I’m wondering if we should try to push genies into the top left corner. Let’s call it “disposable code”. Without genies it is ludicrous to consider rewriting everything from scratch every time, but now...?

Joe Bowbeer's avatar

I've seen this discussed by analogy with the industrialization of textiles, which changed the whole pipeline, including the yarn. In the future, will software development be creating disposable t-shirts or t-shirt factories? What happens to the hardware?

Mike's avatar
Apr 29 (edited)

Interesting idea, but I don’t get why you'd call an AI or LLM a 'Genie.'

By AI: "It’s cute shorthand, but it glosses over important realities: genies imply magical, wish-granting agency and no accountability, while these systems are engineered tools with real-world biases, limits, and design choices. Using “Genie” romanticizes their behavior, downplays where responsibility sits, and makes it easier to ignore how they’re built and governed. Call it what it is—models, systems, algorithms—so we stay clear-eyed about both the benefits and the risks."

Kent Beck's avatar

"Genie" emphasizes the combination of power and lack of alignment. Just because you expressed a wish and got it doesn't mean you got what you wanted.

Mike's avatar

AI: "That metaphor still misleads. It suggests mysterious agency and inevitability, when the gap you describe is actually caused by design choices, data limits, and objective trade-offs we can fix or regulate. Better to use language that highlights responsibility and engineering (e.g., “model” or “system”)"

Kent Beck's avatar

I'm not going to have a conversation with a model about models. If you object, tell me why.

Mike's avatar
Apr 29 (edited)

"Genie" implies magic but there is no magic in LLMs, just data, math, and compute.

And an LLM understands that.

Kent Beck's avatar

It's a metaphor.

Mike's avatar
Apr 29 (edited)

IMO just not a good one.

"Extreme Programming" was a much better metaphor.

Joe Bowbeer's avatar

I like "genie" for now. Karpathy arguably has a better understanding of what the coding agents are doing than anyone and he's still probing and evolving and trying to understand how best to think about and explain them. There is a lot of opaque complexity in LLMs that, especially when expressed through a natural language interface, gives them a "genie" quality.

Michael McCulloch's avatar

You can involve yourself in the output product of the genie, and drag the repository, kicking and screaming, up and to the right.

With your own hands.