First published in 2017. I was already wrestling with the question of “when” as the critical design decision. Also, you can see the roots of the “empirical” in Empirical Software Design here.
Commission me to deliver a custom talk to your development organization. Your choice of format, topic, & location.
My intention in this note is to reclaim the phrase ad hoc from those who use it as a pejorative, especially as applied to infrastructure. Instead, building infrastructure ad hoc is the safest, most efficient strategy. It carefully balances the risks inherent in creating infrastructure, stages investment, and realizes economies of scale.
“Infrastructure”?
Imagine we have lots of code like this:
value = memcache.get(key)
if (! value) {
value = database.get(key)
memcache.put(key, value)
}
...value...
This code requires three network calls in the case of a cache miss, expensive network calls going from the front end servers to the data storage servers. Suppose we replace these 3 calls with a single call:
value = cache.get(key)
(Assuming we set up the cache to know how to fetch its underlying data.)
The front end no longer has to care about how to fill the cache. The cache machines can be located close (in the network sense) to the database machines, so a cache miss no longer incurs as much of a penalty.
Once the cache is in place, we can continually improve cache hit rates, reduce latency on misses, and improve behavior for “hot” keys all without changing the front end code at all. A small improvement to the cache’s behavior results in a large return on investment because of scale.
That’s what infrastructure is: a (mostly) isolated bundle of related functionality achieving economies of scale. Those economies may come from reduced capital expenditure (fewer servers needed for the same data) or reduced operating expenditure (shorter mean time to repair because the functionality can be economically refined).
So far I haven’t said anything controversial (I hope). But…
When should such infrastructure be built? What goes into the infrastructure? These are questions that provoke furious debate.
The Myth
The myth is “build the infrastructure and then build the apps on top of it”.
This trick never works (for some value of “never”). Building the infrastructure is a never-ending project. When is it good enough for apps? Tomorrow. Always tomorrow. What should go into the infrastructure? More. Always more.
Infrastructure is supposed to be built more carefully than apps. Apps gain value from exploration & thus emphasize latency of experiments. Infrastructure gains value from scale. Mistakes or inefficiencies scale too. However, infrastructure is not built with perfect knowledge. Infrastructure needs to grow & adapt.
“When?” should, then, be answered with “some sooner, some later”.
Ad Hoc Infrastructure
I’ve heard people critique as system as “ad hoc”. Hang on!
ad hoc /ˌad ˈhäk/
adverb
when necessary or needed.
adjective
created or done for a particular purpose as necessary.
I like both those definitions. Why would you create infrastructure before it is necessary? Why would you create infrastructure that’s not needed?
The “ad hoc” answers to the infrastructure question:
When? After an application already has the functionality, but in a clearly inferior form.
How much? Just as much as will let the application return to only doing its job.
Ad hoc infrastructure achieves economies of scale by using regret as the primary motivation. “I wish we had built this a different way...” leads to building a small part of the application in a better way, a way that just so happens to be useful for other apps after a bit of tweaking.
It's probably a big lesson to learn for DevOps/platform teams: just build what is necessary. And don't forget the corollary: always build for change.
It's a lesson software developers have learned (or to be exact, are still learning) but have evaded infra developers, probably due to they only have moved recently to more flexible (but more complex) components and their related technics: PaaS has replaced hardware, IaC has replaced switchboard. I just buy this a little however: My background was in CES manufacturing, and building for changes, with interchangeable parts, flexible assembly lines, and other Lean manufacturing concepts was primordial for surviving the crazy passing of the modern market. It makes no sense to me to hear a team telling me it will takes 2 men-months of work to simply update the infra to support a new build platform, when my old team of engineering were able to reconfigure a pipeline in a 1 week to take into account a new form factor with a new supplier.
So I think why I like reading your writing is not because of the great lessons, but because of the humor. "For some value of never" is a witty dry funny statement that because I have lived it, I appreciate. It made me laugh out loud in a public place!