Intentions & Actions

The yawning chasm

Dec 16, 2024

“I’m ready to throw my goddamn phone out the window. I spent an hour setting up a reservation & then the ‘server’ had an ‘error’ & I had to do it again.” I’ve been carefully watching non-geeks use computers & we are failing our users.

LLMs could do something genuinely useful for all of us. First, though, some background.

Windows/Panes/Menus/Items—Actions

Back at the dawn of our current user interface paradigm (early 70s Smalltalk), the interface had a four-level hierarchy:

We had windows of various types, each of which was split into panes, also of various types, each of which had a menu which had several items. That was it. Want to know what you can do with the interface? Draw this tree. Rake up the leaves. Those are your available actions.

(Now, in Smalltalk this is a vast understatement of the capabilities of the interface because one of those menu items was “Do It”, which executed arbitrary code. But for a non-programming user, these were the available actions.) (All users were expected to learn to program btw.)
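
A minimal sketch of that tree in Python, for readers who think better in code. The class names and the sample browser window are my own illustration, not anything taken from a real Smalltalk image; the only point is that the full list of available actions falls out of walking the four levels.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    name: str


@dataclass
class Menu:
    items: List[Item] = field(default_factory=list)


@dataclass
class Pane:
    kind: str
    menu: Menu


@dataclass
class Window:
    kind: str
    panes: List[Pane] = field(default_factory=list)


def available_actions(windows: List[Window]) -> List[str]:
    """Rake up the leaves: every menu item reachable from any window."""
    return [
        f"{window.kind} / {pane.kind} / {item.name}"
        for window in windows
        for pane in window.panes
        for item in pane.menu.items
    ]


# Hypothetical example data, loosely shaped like a code browser.
browser = Window(
    kind="browser",
    panes=[
        Pane("class list", Menu([Item("add class"), Item("remove class")])),
        Pane("code", Menu([Item("accept"), Item("cancel"), Item("do it")])),
    ],
)

actions = available_actions([browser])
for action in actions:
    print(action)
```

Five leaves, five actions. Nothing is hidden anywhere else.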

Scale

Folks soon wanted more actions than fit into this paradigm. Well, folks wanted to add more actions if “folks” meant programmers/product folks. Did users want more actions? Maybe.

Anyway, the desire for more actions led to the great Sub Menu War—were menu items that expanded into more menu items a blessing or just the symptom of poor higher-level user interface design? Would different windows or different panes make one-level menus sufficient?

Sub-menus won (sorry about that y’all). Now we could accommodate the next exponent’s worth of actions.
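
To put a rough number on "the next exponent's worth": a back-of-the-envelope sketch, assuming (unrealistically) that every menu holds the same number of items and that every item at each level opens another full menu. The figure of 8 items per menu is made up for illustration.

```python
def reachable_actions(items_per_menu: int, menu_levels: int) -> int:
    """Leaf actions when each item at every level opens another full menu."""
    return items_per_menu ** menu_levels


one_level = reachable_actions(8, 1)   # 8 actions with flat menus
two_levels = reachable_actions(8, 2)  # 64 actions with one layer of sub-menus
print(one_level, two_levels)
```

The user still sees 8 things at a time, but now has 64 places an action might be hiding.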

Intentions

People come to computer systems with intentions that are separate from the actions those systems provide. I want to:

Add a layover in Minneapolis to my current trip
Watch The Voice
Switch the clauses in a sentence

I don’t want to select things, push buttons, choose menu items, & fill in fields. There’s a mapping from these higher level “intentions” to actions.

When the mapping is direct, 1-to-1, then using a computer is magic.

We even tell our friends these stories—“…and then the computer did exactly what I wanted!!!”

Mismatch

We tell these stories because they are rare. More often the computer does kinda what we want.

Or we have to synthesize a compound action out of the available actions:

Want to watch The Voice? It’s simple:

Select HDMI 2 on remote 1
Press power on remote 2
Navigate to Peacock
Choose The Voice
Oh, you want a recorded episode? Well, if you want that you need remote 3
OFFS (← acronym)

The mapping from Intention to Actions is hard:

You need to have learned the necessary actions exist
You need to remember the actions
You need to find the actions
You need to execute the actions
You need to be confident that executing the actions that you know about/remember/find in a particular order will result in your intention being satisfied
Any failure of the above will result in failure
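
The list above can be sketched as a chain where the first missing link ends the attempt. This is a toy model under my own assumptions: the `User` fields, the `first_failure` helper, and the two sample users are hypothetical, and real people are not sets of strings. The plan reuses the steps for watching The Voice.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class User:
    learned: Set[str] = field(default_factory=set)     # knows the action exists
    remembered: Set[str] = field(default_factory=set)  # can recall it right now
    findable: Set[str] = field(default_factory=set)    # can navigate to it
    executable: Set[str] = field(default_factory=set)  # can carry it out
    # Sequences the user believes will satisfy the intention.
    trusted_plans: Set[Tuple[str, ...]] = field(default_factory=set)


def first_failure(user: User, plan: List[str]) -> Optional[str]:
    """Describe the first thing that goes wrong, or return None on success."""
    stages = [
        ("learn", user.learned),
        ("remember", user.remembered),
        ("find", user.findable),
        ("execute", user.executable),
    ]
    for action in plan:
        for stage, known in stages:
            if action not in known:
                return f"{stage}: {action}"
    if tuple(plan) not in user.trusted_plans:
        return "confidence: not sure this sequence does what I want"
    return None


watch_the_voice = [
    "select HDMI 2 on remote 1",
    "press power on remote 2",
    "navigate to Peacock",
    "choose The Voice",
]
everything = set(watch_the_voice)

# Hypothetical users: one has every link in the chain, one forgot a single step.
kid = User(everything, everything, everything, everything,
           {tuple(watch_the_voice)})
visitor = User(everything, everything - {"select HDMI 2 on remote 1"},
               everything, everything, {tuple(watch_the_voice)})

print(first_failure(kid, watch_the_voice))
print(first_failure(visitor, watch_the_voice))
```

One forgotten step out of four is enough: the visitor never gets past the first remote.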

Kids

Sidebar—Why do kids navigate the digital world so easily?

Learn—they have a social group that rapidly spreads knowledge of new actions
Remember—this is their native language, of course they remember all the actions
Find—they have steered through the action navigation mechanisms since birth
Execute—again, lots of practice
Confidence—again, lots of practice (where earlier generations, even with advanced CS degrees, are only ever kinda sure they can make the computer do what they want)

LLMs

You know that thing where you type “how do I turn off dark mode on my iPhone?” into Perplexity and it says, “easy peasy, go here do this go there do that”? And then you say, “Yeah please do that,” & Perplexity just stares at you?

An LLM is the always-available kid. It knows all the actions (kinda). It knows how to get to them all (kinda). It knows how to execute the actions (kinda). It knows (kinda) how to map from intentions to sequences of actions.

Just (“just,” he says, as if) remove those “kindas” & we have a compelling use for LLMs.

But in the meantime FFS (← another acronym) adding more & more actions is subtracting value from our computers. And the less the user is like a young computer science savant, the more value is subtracted.

Chris Richardson

Dec 16

Open the pod bay doors Hal...

I generally agree with your point that there's a problem with UX. These days, most UXs are terrible. Poorly designed. But I think that's a failing of organizations/designers - their goal seems to be maximizing engagement rather than old school usability (enabling users to successfully complete tasks).

Perhaps an LLM could help. But only if the organization truly cares about UX rather than engagement. And then there's the problem of hallucinations...

Jeff Bailey

Dec 16 (edited)

> “I’m ready to throw my goddamn phone out the window. I spent an hour setting up a reservation & then the ‘server’ had an ‘error’ & I had to do it again.” I’ve been carefully watching non-geeks use computers & we are failing our users.

It's maddening. I wrote about the state of software back in 2019, and now it's even worse. GenAI is exacerbating the problem. I suspect high-quality software will become a unicorn in the coming years.

https://jeffbailey.us/blog/2019/11/09/death-by-1000-cuts/
