9 Comments
Yevhen Viktorov's avatar

+1 I also thought about this, thanks for explaining it so well. I suspect the challenge might be in finding code bases where the quality of the transformations is valued more than the final results. As you mentioned, we often sidestep and cut corners, which obscures the path in retrospect.

Kent Beck's avatar

That's where I like the idea of synthesizing data to train on.

Yevhen Viktorov's avatar

I like the idea of using the Transformation Priority Premise. I was also thinking of trying provable refactorings as a constraint when prompting for changes: https://github.com/digdeeproots/provable-refactorings

Mike VanBeneden's avatar

Hey Kent, great article! I'm curious which tools/genies you are using? I've had pretty good success with prompting rather than training for this kind of directing of foundation models. I believe one could achieve a lot of this by wrapping some general system prompting around the more iterative prompts. For example, you could almost copy-paste this entire article into the Claude.md file (for Claude Code) and the iterative prompts would likely behave more like this.

Thanks for the post. I'm gonna go ask Claude to summarize this article into a system prompt and see what happens. :)

Kent Beck's avatar

Tell us what it says. Also the results of the claude.md experiment.

Mike VanBeneden's avatar

I'm not as well versed or practiced in these best practices as you are, so the claude.md experiment will take me more time, since I need to evaluate the quality of the results in the real world.

In the meantime, this was a quick and dirty first attempt at porting this article to a system prompt.

https://claude.ai/public/artifacts/a71e932f-966e-49af-8691-a959dd73004b

Given this prompt:

https://claude.ai/share/34af3f1a-0699-45e1-9ca8-582fe2171f67

I'm sure it could be tightened up from here as a starting point. Maybe inline Transformation Priority rules. Etc. And probably best broken into multiple custom commands (if in Claude Code) so only rules relevant to the goal are brought into context.

I just re-read this part of your post...

> I can get genies doing a little bit of this stuff sometimes using system prompts. But the genie is astonishingly bad at safe sequencing & willing to abandon it at the first signs of resistance.

Again, I've had decent luck getting Claude Code specifically to follow a sequence, especially now that it manages a task list (sequence).

Which genies do you use?

Denis Čahuk's avatar

I never thought I'd see the re-emergence of TPP in this decade. Wild!

John Vandivier's avatar

If we train on code versions, wouldn't the model learn changes?

Avi Kessner's avatar

Next on genie oddities:

The agent isn't adding any text to my file, it's just updating the git history directly!
