Genie Tarpit

Apr 29

Genies give you code that’s a degraded facsimile of the mediocre code they trained on.

21 Comments

This is so relevant! On our team right now we have a very clearly defined code base. We have made specific design choices, and when we stick to them the code is clean, extendible, and enjoyable to work on. It's great!... but we have to write all the code (sadface). Enter AI: we have specific AI instructions written with all our practices in mind, and there are examples of gold-standard code for a multitude of scenarios. We also have skills and workflows that help us go from nothing, to a technical plan, to a delivery plan, with options and progressions that help us keep in touch and in line.

Here are the issues. If we let the AI go alone it creates a whole load of crap. It's a mess. It cuts corners and forgets things and, to be honest, that's expected as it's a big code base. But if we pair with the AI one task at a time it's brilliant. We almost always have to tweak the code for the first few tasks, but then the rest becomes fluent. We almost always have to remind it that we need tests and ask it to look again at its skills about TDD and the types of tests we like and don't like.

The lessons right now:

1. You can only pair with AI like you pair with anyone else. Enjoy it! But you now mostly read code rather than physically write it (which I'm quite happy about).

2. All those people vibe coding and letting agents do all the work are screwed.

This will all likely change by July...

Ben Martin

May 11

I keep feeling that "A Deepness in the Sky" (A Deepness in the Sky - Wikipedia https://share.google/PTkGmUL6jeJWwXhw4) is the best reference point for this moment. Spoiler: the bad guys first succeed, then fail due to quickly writing spaghetti code.

Dale Hagglund

May 5 (edited)

You might find some of Jason Gorman's ideas interesting. He's been experimenting with LLM-based coding for a while now, and has been summarizing his experience in a series he calls the "AI Ready Software Developer". His most recent article introduces the acronym CRESS as a summary of his guidelines for the effective use of contexts. Without going into any of the details, the letters stand for contexts that are Current, Refutable, Empirical, Small, and Specific. I've found his stuff to be very interesting reading.

A couple starting points (hope it's ok to post links here):

- https://codemanship.wordpress.com/2026/05/04/c-r-e-s-s-principles-for-context-engineering/

- https://codemanship.wordpress.com/2025/10/30/the-ai-ready-software-developer-index/

Michael K Alber

May 1

The futures axis is where I kept getting burned until I started injecting explicit intent and constraints into every session via a grounded MCP server — one that includes resources about my own coding standards, architectural preferences, and quality expectations. Nate B. Jones puts it well: constraints shape macro-architecture, personas only clean up the micro stuff. Without that grounding, the genie just falls back on whatever mediocre patterns it trained on, and you get exactly what you're describing: passes the tests, kills your optionality.

Jeremy

May 1

Nice article. But your final comment is a little defeatist. One thing we "rapidly-obsolescing humans" have over any AI, now or in the near future, is lived experience: the living interactions that we have with the world around us may never be available to AI agents. Furthermore, the technology is clever and I find it useful myself. But AI will never be able to think or dream the way we humans can; it can't generate new ideas. When we stop interacting with it, what happens? Nothing! But we humans continue thinking, unstoppably! Usually it's about where to get the next drink or what we might like to eat later, but that in itself is good: the brain needs time to recover and manipulate ideas, concepts... Keep on truckin', Kent.

Steve limb

May 1

I think your diagrams are largely a reflection of what I see when using agents. For simple additions to existing, well-designed, and obvious code they do stay somewhat ‘high and right’. But in areas that have a less well-designed architecture, or in greenfield work, their default output is ‘low and left’. Only after many forced refactorings can I bring it back ‘high and right’. The solution, which I’m actively working on, is probably not what most people want to hear: a new, much stricter programming language/compiler that is designed to constrain, limit, and force automated refactoring by the agent via errors, rich error information, and hints at what to do. The thing is, if you ask an LLM to assess the code it generated with respect to SOLID/DRY or ‘clean code’, it will understand and be self-critical, but the human has to trigger that behaviour. What I’m working on is getting the compiler to trigger that behaviour as the code is being written by the LLM. It’s starting to take shape (after many years of work) and I hope to have something releasable in 12-18 months.

Javier Bonnemaison

Apr 30

This post is brilliant! You make so many insightful and important points in such a concise, clear, and direct way. Unfortunately, it will probably go over most people's heads (Hi Mike).

EDL

Apr 30 (edited)

This jibes with my lived experience. Of course, we had this before LLM coding, but it’s accelerating. I’m not sure what happens next, but I have a feeling that in a couple of years all the OG software engineers are going to be called out of retirement to fix it all (think Y2K COBOL programmers ^x)

Kai

Apr 30

I’m wondering if we should try to push genies into the top left corner. Let’s call it “disposable code”. Without genies it is ludicrous to consider rewriting everything from scratch every time, but now...?

Joe Bowbeer

May 4

I've seen this discussed by analogy with the industrialization of textiles, which changed the whole pipeline, including the yarn. In the future, will software development be creating disposable t-shirts or t-shirt factories? What happens to the hardware?

Mike

Apr 29 (edited)

Interesting idea, but I don’t get why you would call an AI or LLM a 'Genie.'

By AI: "It’s cute shorthand, but it glosses over important realities: genies imply magical, wish-granting agency and no accountability, while these systems are engineered tools with real-world biases, limits, and design choices. Using “Genie” romanticizes their behavior, downplays where responsibility sits, and makes it easier to ignore how they’re built and governed. Call it what it is—models, systems, algorithms—so we stay clear-eyed about both the benefits and the risks."

Kent Beck

Apr 29

"Genie" emphasizes the combination of power and lack of alignment. Just because you expressed a wish and got it doesn't mean you got what you wanted.

Mike

Apr 29

AI: "That metaphor still misleads. It suggests mysterious agency and inevitability, when the gap you describe is actually caused by design choices, data limits, and objective trade-offs we can fix or regulate. Better to use language that highlights responsibility and engineering (e.g., “model” or “system”)"

Kent Beck

Apr 29

I'm not going to have a conversation with a model about models. If you object, tell me why.

Mike

Apr 29 (edited)

"Genie" implies magic but there is no magic in LLMs, just data, math, and compute.

And an LLM understands that.

It's a metaphor.

Apr 29 (edited)

IMO just not a good one.

"Extreme Programming" was a much better metaphor.

I like "genie" for now. Karpathy arguably has a better understanding of what the coding agents are doing than anyone and he's still probing and evolving and trying to understand how best to think about and explain them. There is a lot of opaque complexity in LLMs that, especially when expressed through a natural language interface, gives them a "genie" quality.

Michael McCulloch

May 19

You can involve yourself in the output product of the genie, and drag the repository, kicking and screaming, up and to the right.

With your own hands.

Software Design: Tidy First?