I’ve been augmented coding several database projects, since they are 1) highly complex, 2) highly leveraged if successful, & 3) sitting around in my brain.
I'd love to see what you think of my "database": https://blog.screenshotbot.io/2024/08/10/building-a-highly-available-web-service-without-a-database/
(I've also used 3X as a way to describe how my database needs evolve.)
My database definitely does break the constant-ish startup time property, though; the startup time is really bad. Replication with Raft mitigates that to an extent, since it allows some servers to stay in a booting state for a while.
Does the CAP Theorem define the platonic database? (Or rather, 3 different platonic databases?)
I use CAP in distributed settings. The platonic database starts on a single machine.
Kent, I’m wondering why you think the O() performance properties are part of the nature of a platonic database. They make sense but it’s not something I would have considered part of the essential definition. Isn’t a slow database still a database?
Good piece btw. Caught my attention.
If the performance characteristics didn’t matter then I’d just use linear search. When I think “database”, I think, “…and I don’t have to care about scaling.” Until I do 🤪
I feel like a lot of the listed properties are specific to certain, albeit common, use cases. Some properties, like atomicity and durability, may be due to the influence of modern-ish RDBMS systems.
When developing a custom database, I think the standard agile approach is best to keep in mind. Do we need persistence if the data can be quickly regenerated? Do we need atomicity if there is one single-threaded user? Do we need indexes if performance is good enough with O(n) loops? Is it the right optimization? Are we optimizing for the right thing? (Postgres, for example, has a very complicated system for optimizing queries based on various table statistics, whether data fits entirely in memory, and so on.) There’s a lot of opportunity for YAGNI.
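To make the "do we need indexes?" question concrete, here is a minimal sketch of how I'd think about it. Everything in it (`TinyStore`, `add_index`, `find`) is a hypothetical name made up for illustration, not any real library's API. The store starts with plain O(n) scans, and an index is something you bolt on later only if a measurement says you need it.

```python
from collections import defaultdict


class TinyStore:
    """Hypothetical in-memory store: O(n) scans by default, indexes are opt-in."""

    def __init__(self):
        self.rows = []     # every row, in insertion order
        self.indexes = {}  # field name -> {value -> [rows]}

    def insert(self, row):
        self.rows.append(row)
        # Keep any indexes that already exist up to date.
        for field, index in self.indexes.items():
            index[row.get(field)].append(row)

    def add_index(self, field):
        # Build the index once, only when scans on this field are too slow.
        # Assumes the indexed values are hashable.
        index = defaultdict(list)
        for row in self.rows:
            index[row.get(field)].append(row)
        self.indexes[field] = index

    def find(self, field, value):
        if field in self.indexes:
            # Average O(1) lookup through the hash index.
            return list(self.indexes[field].get(value, []))
        # The YAGNI default: a linear scan over every row.
        return [row for row in self.rows if row.get(field) == value]
```

Callers use `find` the same way before and after `add_index`, so the optimization can be deferred without changing any calling code.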
I feel like ACID should be part of it. One reason you don't simply put everything in memory (CAP also applies to a memory cache) or in some arbitrary kind of storage is that you need at least some of the ACID properties.
For those who protest by citing BASE databases: in my experience, BASE is just ACID with a softer time constraint. You can afford to wait for the ACID guarantees, but eventually you still expect those guarantees to hold.
ACID properties are a good place to start. But you may not need all of them. Giving up some of the ACID properties to get performance, scalability, reliability, better "failure state" behavior, etc. can be a good trade-off in some applications.
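As one illustration of that kind of trade-off, here is a rough sketch of an append-only log where durability is a switch. The names (`AppendLog`, `durable`, `read_all`) are hypothetical. With `durable=True` each record is forced to disk with `os.fsync` before `append` returns; with `durable=False` the write is left to the OS cache, which is typically faster but can lose recent records in a crash.

```python
import os


class AppendLog:
    """Hypothetical append-only log with an optional durability guarantee."""

    def __init__(self, path, durable=True):
        self.durable = durable
        self.file = open(path, "a", encoding="utf-8")

    def append(self, record):
        self.file.write(record + "\n")
        # Push Python's buffer down to the operating system.
        self.file.flush()
        if self.durable:
            # Ask the OS to write the data to the storage device before returning.
            os.fsync(self.file.fileno())

    def close(self):
        self.file.close()


def read_all(path):
    """Return every record in the log, one per line."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]
```

Whether skipping the `fsync` is acceptable depends entirely on the application: it is fine for data that can be regenerated, and not fine for anything you promised a user was saved.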