After reading the blog, I realized that Cloud Native is primarily associated with irreversibility.
How so?
It appears that the concept of "Cloud Native" lacks a precise definition. According to the Cloud Native Computing Foundation (CNCF): "Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil."
The scalability of applications requires the reversibility of both deployment and execution.
Resilience pertains to the recovery from temporary errors or resource unavailability, while the ability to make frequent, high-impact changes implies the reversibility of the deployment process.
Looking at the 12 factors, I'll add my comments according to the four heads:
Codebase - One codebase tracked in revision control, many deploys.
- States: Minimize the states of deployments, ensuring consistency across various deployment instances.
Dependencies - Explicitly declare and isolate dependencies.
- Interdependencies: Recognize and manage interdependencies
Configuration - Store configuration in the environment.
- States: keeps the environment configuration immutable
Backing Services - Treat backing services as attached resources.
- Interdependencies: enforce standard, public interfaces
- States: a resource either exists or doesn't exist
- Irreversibility: circuit breakers and retries are about recovering from fault situations
Build, release, run - Strictly separate build and run stages.
- States: similar to the division of labor on a Ford assembly line
Processes - Execute the app as one or more stateless processes.
- States: Implement stateless processes to simplify application management and ensure consistency.
Port binding - Export services via port binding.
- Interdependencies: enforce standard, public interfaces
Concurrency - Scale out via the process model.
- States: requires a stateless, share-nothing design
Dev/prod parity - Keep development, staging, and production as similar as possible.
- Uncertainty: parity mitigates uncertainty between environments
Logs - Treat logs as event streams.
- Maybe not directly related to the 4 heads
Admin processes - Run admin/management tasks as one-off processes.
- States: manage admin processes as code so the state changes are immutable
After this review, what I meant to say is that resilience/zero-downtime is about irreversibility.
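The circuit-breaker/retry point under "Backing Services" above can be sketched in a few lines. This is a minimal illustration, not any particular library's API; the flaky backing service and its failure mode are invented for the example:

```python
import time

def with_retries(operation, max_attempts=3, base_delay=0.1):
    """Retry an operation prone to transient faults, with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            time.sleep(base_delay * (2 ** attempt))  # 0.1s, 0.2s, ...

# Hypothetical flaky backing service: fails twice, then succeeds.
calls = {"n": 0}

def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary outage")
    return "profile-data"

print(with_retries(flaky_fetch))  # prints "profile-data"
```

The retries make the temporary resource unavailability recoverable instead of fatal, which is the sense in which resilience addresses the irreversibility head.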
I think of a complex system as one for which the outputs are _incongruent_ with the inputs. That is to say: it's impossible to attribute outputs or outcomes to the interactions among and between inputs within the system. A system can be made _less_ complex by understanding and modeling interactions therein. Describing a system as complex doesn't imply sophistication; rather, it's a bad thing and should be addressed.
Wouldn’t it be great if the people using the system could decide to reverse or opt-out of some new feature that public web companies provide?
I’m reading Johann Hari’s book “Stolen Focus” and the data he presents on the epidemic of attention span decline makes me wish I could turn off the attention grabbing features being added to all the social media and commerce systems.
Sorry if this is off topic or a stretch of intent but maybe it will spark some interest.
That's definitely a different topic.
I once said I would pay for Twitter if I had the option of locking myself out for a day or a week. I don't think you can expect that kind of feature from tech companies - guarding your focus is really on you as a consumer.
I actually referenced Hari's book in my own post about this: https://renegadeotter.com/2023/08/24/getting-your-focus-back.html
Any chance there is video somewhere of professor Enrico Zaninotto's keynote talk?
https://martinfowler.com/articles/zaninotto.pdf
Thank you, I've read the transcript. I was hoping for the video. I know it's old but I had to ask.
Nope, it was before we invented visuals 🤪
You wrote that the complexity head to cut off, if you're Facebook, is the irreversibility head, because the other three are not approachable. Is that true of any multi-datacenter-scale computing organization? Or was Facebook different because it was harder to predict its future feature set and scale? Do other computing organizations have hope of attacking state, interdependencies, or uncertainty?
Different sources of complexity make sense to address at different scales and in different contexts. For example, an individual programmer might write automated tests to reduce uncertainty. Teams with lots of interdependencies might choose to duplicate effort to reduce them.
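As a tiny illustration of a test reducing uncertainty (the `parse_age` function is made up for the example):

```python
def parse_age(raw):
    """Parse a user-supplied age string, rejecting implausible input."""
    value = int(raw.strip())
    if not 0 <= value <= 150:
        raise ValueError(f"implausible age: {value}")
    return value

# A few assertions pin the behavior down, so a future change that
# breaks it fails loudly instead of surprising us in production.
assert parse_age(" 42 ") == 42
try:
    parse_age("999")
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError for 999")
```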
I work for an online dating company that has ~40 dating properties serving a large, global community. We've focused on the reversibility aspect because, frankly, it's just plain easier to control. As a baseline, we've automated testing and deployment so we can take changes (or fixes) to production multiple times an hour if necessary (although our goal is once a week). We use a combination of feature flags, A/B percentage tests, and predicate-based rules, all controllable by an "admin" app that the business team can use to adjust pretty much any behavior they want. The predicate-based rules use an English-like query language that allows them to target segments of users based on pretty much any attributes of their profile, their purchase history, or their behavior on the site.
It allows us to try out new ideas on smaller sites, in specific geographies, or with subsets of the user population, and either roll them out further if they look good or roll them back if they don't. It dramatically reduces the risk of any changes we make, and it encourages everyone in the company to suggest possible improvements (both to the user experience and to our bottom line) since we can experiment pretty freely and reverse anything that doesn't work.
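The flags-plus-predicates setup described above could look something like this sketch; the rule syntax, attribute names, and hash-based percentage bucketing are all invented for illustration, not the poster's actual system:

```python
import hashlib

def in_rollout(user_id, percent):
    """Deterministically bucket a user into a 0-100 rollout percentage."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    return int(digest, 16) % 100 < percent

def rule_matches(user, rule):
    """Evaluate one predicate rule of the form {'attr', 'op', 'value'}."""
    actual = user.get(rule["attr"])
    if rule["op"] == "eq":
        return actual == rule["value"]
    if rule["op"] == "gte":
        return actual is not None and actual >= rule["value"]
    raise ValueError(f"unknown op {rule['op']!r}")

def feature_enabled(user, flag):
    """A flag is on for a user if all rules match and they fall in the rollout."""
    return (all(rule_matches(user, r) for r in flag["rules"])
            and in_rollout(user["id"], flag["percent"]))

# Hypothetical flag: new matching algorithm for 50% of UK users aged 21+.
flag = {"percent": 50,
        "rules": [{"attr": "country", "op": "eq", "value": "UK"},
                  {"attr": "age", "op": "gte", "value": 21}]}

user = {"id": "user-123", "country": "UK", "age": 30}
print(feature_enabled(user, flag))  # True or False, depending on the hash bucket
```

Reversing a feature is then just setting `percent` to 0 in the admin app - no redeploy needed, which is exactly what makes this kind of change cheap to undo.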
We rely very heavily on observability at all levels to see how any experiment impacts performance across all sorts of dimensions (both infrastructure and business).
We're not even "multi-datacenter scale" and our entire company is less than a dozen people.