This reminds me of your comment at DDD Europe - paraphrasing from memory: it turned out that 70% of software cost was in the maintenance phase; why not make it 99%?
The original poster implies that the system is “done” after his first set of iterations. That’s really unlikely.
One advantage of TDD (/BDD/etc) is to assume that you’re never done and that behavioural composition goes on for the lifetime of the system: many years or decades.
And that value should be deliverable early: the skateboard not the car.
Start with tiny behaviour, compose, iterate (over those many years or decades).
Hi Kent,
I started TDD deliberately just a year ago, and it has given me all the benefits you have described here and elsewhere. But I think what is often underestimated is TDD as a tool for learning and rapid skill improvement. I don't think my software design skill has ever improved as steeply as in the last year.
This view came up while listening to a conversation between Dave Thomas and Dave Farley, in which Dave Thomas described dropping TDD for 6 or 12 months and seeing no increase in bugs or decrease in design quality. So I started wondering what this means. I think TDD is definitely a learning practice, one you follow while making meaningful progress on the problem you are paid to solve.
Pairing is not always feasible, and AI is not always accurate enough. TDD is a highly available, high-quality feedback loop that supports learning.
It's much easier to tackle a programming task by breaking it down into smaller, actionable steps. Being able to test these smaller steps increases confidence in trying new approaches and algorithms.
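To make that concrete, here's a tiny sketch (all names are invented for illustration) of one task split into two independently testable steps:

```python
# Hypothetical "normalize scores" task, broken into two small steps
# instead of one monolithic function.

def clamp(value, low, high):
    """Step 1: keep a single value inside a range."""
    return max(low, min(high, value))

def normalize(scores, low=0, high=100):
    """Step 2: apply the clamp across a whole list."""
    return [clamp(s, low, high) for s in scores]

# Each step gets its own small test, so a new approach to either step
# can be tried with fast, local feedback.
def test_clamp_caps_values():
    assert clamp(150, 0, 100) == 100

def test_normalize_applies_clamp_to_all():
    assert normalize([-5, 50, 150]) == [0, 50, 100]
```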
I struggled with recognizing this for years. I like to switch back and forth between big-picture thinking and detail thinking. Everyone I found talking about TDD in my first few years of exploring it acted as if you only did little-picture thinking, and then the big thing magically appeared without any intervention from you. I thought I was missing something important. The first place I saw someone acknowledge that you were responsible for how these little things went together to make the big thing was the book "Growing Object-Oriented Software, Guided by Tests".
But I love this post. I think it describes something that those who know what they are doing take for granted and forget to teach. It should be shouted from the rooftops; it should be one of the first principles taught.
Breaking big problems into smaller parts is a key skill in software development. It lets us work on parts separately or in steps, and helps avoid many issues caused by coupling. A lot of developers have trouble making software that is decoupled because they don't know how to identify and separate these parts to work on them one by one and then put them all together. This also makes testing easier, as it allows for testing small parts one at a time, something many people find hard to do with big, complex problems.
> “If I wanted to be supported by tests as I code, how could I have designed & coded this differently?” & I’ve often found a way I could have applied TDD if only I had known.
I think this fundamentally gets down to going from "known to unknowns" (as you once taught me, it's pretty much how I always attack problems these days when I'm stuck.)
Oftentimes the "known" thing is the next behavior change you want to make, especially if you already have testable code. In that case, the thing you can do to get immediate insight is actually writing the test, because writing the test is cheap and exactly describes the "what"; no new insight is needed here.
Sometimes that's not the case: there's too much uncertainty about what you're trying to build. A good example is when you're trying to figure out a function signature. You may not want to start by writing a test case for the new signature; rather, the next piece of code you write will be a call to the stub function from another function, because that's the information you need: you want to understand "what" the function should be.
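A sketch of that move, with invented names: write the caller first, and let the call site dictate the stub's shape.

```python
# We don't yet know what `summarize` should take or return, so we leave
# it as a stub and write its caller first.

def summarize(orders):
    """Stub: signature to be discovered from the call site below."""
    raise NotImplementedError

def render_report(orders):
    # Writing this first is what tells us `summarize` takes the raw
    # orders and should return something with "count" and "total" in it.
    summary = summarize(orders)
    return f"{summary['count']} orders, {summary['total']} total"
```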
Another example: if you want insight into whether something is technically possible at all, you would jump straight into the code and add assertions, exceptions, or a prototype to understand current behavior. Only once you've validated feasibility would you start writing tests. (Would you throw away the prototype code you wrote before writing the test? Unclear; maybe context-dependent.)
I think this can be extended to larger cases. If you're dealing with code that's not designed for testability, then refactoring for testability is the unknown. Perhaps you have to work with the code for a bit before it becomes clear how you need to make the code testable.
For me, TDD is a technical expression of "Explain it to me like I'm five." It's amazing how many experts I've worked with over the decades can't simplify without time to "go think about it for a while." The few edges and corners that are difficult to test are always held up as amusing examples of why not to do TDD at all.
It seems to me that programming without tests is like climbing without a rope; as long as the climb is trivial for your level, it can seem fine.
Hi Kent, great points. It's always a great experience learning with your vision.
In my experience, some people describe TDD only as the "Test First" approach. But it's not; it's a group of techniques and ways of thinking that aim to help you make better decisions while programming.
In the "TDD by example", you mentioned:
> TDD is an awareness of the gap between decision and feedback during programming and techniques to control that gap. "What if I do a paper design for a week, then test-drive the code? Is that TDD?" Sure, it's TDD. You were aware of the gap between decision and feedback and you controlled the gap deliberately.
While I really like the "awareness of the gap" definition of TDD, I think it's been superseded by a more concrete definition of the technique: https://tidyfirst.substack.com/p/canon-tdd
The more I think about it, the more I think this concrete definition ("Canon TDD") is the more helpful one. "Awareness of the gap" is so all-encompassing of how TDD experts think that it can't be falsified or criticized. "Canon TDD" gives us something to talk about. Now we can discuss when TDD is appropriate and why.
My read of Artem's post, and my diagnosis of what's going wrong for him, is a little different. I see his response to TDD as symptomatic of systems where programmers:
- write a lot of glue code
- or are doing UI-heavy work
- or aren't experts in separating I/O from logic, or interfaces from engines
When my job is just gluing pre-existing pieces of unfamiliar software together, I can't effectively write tests for my glue up front. Since I don't know how the other pieces are actually going to behave, I can't write correct mocks for them. Sometimes I write system-level tests instead, but when the glue code is simple, this feels low-value.
UI work is similar, with the added wrinkle that I often can't know exactly what I want the UI to be until I play with it. Bob Martin said somewhere[1] that he often forgoes TDD for UIs and prefers to "fiddle them into place." I agree.
Of course, these problems can be mitigated by pulling out as much logic as possible into code that's disentangled from I/O. The current culture in React JS (which, from skimming Artem's blog, I infer he uses) is to lump state, I/O, and UI together and basically just make a mess. So I think that may be part of the issue here.
[1]: https://blog.cleancoder.com/uncle-bob/2013/03/06/ThePragmaticsOfTDD.html
For the GUI, The Humble Dialog Box is your friend.
https://martinfowler.com/articles/humble-dialog-box.html
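A minimal sketch of the idea (invented names, plain Python instead of a real GUI toolkit): all the decisions live in a presenter that's trivial to test, and the actual dialog is left too thin to get wrong.

```python
# Humble-object sketch: the presenter holds the logic; the real dialog
# would only forward events to it and display what it's told.

class LoginPresenter:
    def __init__(self, view):
        self.view = view

    def password_changed(self, password):
        # Decision-making happens here, away from the GUI toolkit.
        self.view.set_login_enabled(len(password) >= 8)

class FakeView:
    """Test double standing in for the humble dialog."""
    def set_login_enabled(self, enabled):
        self.enabled = enabled

def test_short_password_disables_login():
    view = FakeView()
    LoginPresenter(view).password_changed("short")
    assert view.enabled is False
```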
Well, TDD and all test-driven frameworks are broken by definition (including the prototype-iterate-test method).
It is a little #troll, because to accept TDD we have to accept a "sufficient solution". Testing is an experimental approach, and since the 19th century (from what I remember) we have known that a test can infirm but not confirm. That's why, using a test approach, we cannot have a general solution… we only have pieces of code that work in certain circumstances.
As I say, it is a friendly troll, and TDD is a method that helps you resolve a problem by "saving" all the steps that lead to a sufficient solution. And from my point of view, that is enough.
On the other hand, the prototype-iterate-test thingy only saves what the developer thinks at the very end, with all the trouble our brains have remembering the complexity of our work.
So… #tddForza :P
I mostly agree, but I think I disagree with this: "using a test approach we cannot have a general solution… we only have piece of code that works in certain circumstances."
We *can* often arrive at general solutions by test-driving, *if* we make sure to refactor away duplication and conditionals.
The ability of tests to infirm but not confirm the correctness of our code is analogous to a scientific experiment's ability to falsify but not prove a hypothesis.[1] This is why refactoring is such an important part of TDD. Refactoring is to TDD what Occam's Razor is to science.
The practice of constant refactoring ensures that we're creating general solutions, not just making our tests pass with ad-hoc code (i.e. one if-statement per test).
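A toy illustration of that difference, under an invented example:

```python
# Ad hoc: each passing test bought with another special case.
def fib_adhoc(n):
    if n == 0:
        return 0
    if n == 1:
        return 1
    if n == 2:
        return 1
    return 2  # passes the test for n == 3, and nothing beyond it

# After refactoring away the duplicated conditionals, the same tests
# now force a general solution.
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def test_fib_first_values():
    assert [fib(n) for n in range(5)] == [0, 1, 1, 2, 3]
```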
While there's truth in the idea that testing can only make code less wrong, the situation is not nearly as dire as e.g. Jim Coplien has claimed in "Why Most Unit Testing is Waste,"[2] where he says that in order to adequately test a function `add(a int, b int)` you need 2^64 tests. If you know the program is simple (you can see the code, after all) there's no need to test every possible input.
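For instance, a handful of representative and boundary cases is plenty for code this simple; a sketch, not a prescription:

```python
def add(a, b):
    return a + b

# A few representative and boundary inputs, not all 2^64 of them.
def test_add_representative_cases():
    assert add(2, 3) == 5
    assert add(-1, 1) == 0
    assert add(0, 0) == 0
    assert add(2**62, 2**62) == 2**63  # Python ints don't overflow
```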
[1]: https://github.com/benchristel/benchristel.github.io/wiki/TestDrivenDevelopment#analogy-to-the-scientific-method
[2]: https://github.com/benchristel/benchristel.github.io/files/13542031/Why-Most-Unit-Testing-is-Waste.pdf
That being said, tests work perfectly to find bugs #infirm :P
I'm also more of a prototype-iterate-test person, but I still appreciate you writing this, Kent.
After reading yours and Artem's posts, I find both of you more in agreement than not.
> & I’ve often found a way I could have applied TDD if only I had known
I’ll emphasise the “if only I had known”
Because that’s the crux of the issue. With hindsight almost everything can be TDD.
This is the most important part that you and Artem are in agreement with.
If I'm right about the crux of the issue, I'm less sure behavioral composition really solves that crux.
Behavioral composition actually helps me come up with a better guess at how parts of a larger thing work together.
In that respect, it can be equally applied to TDD or prototype-iterate-test (PIT henceforth).
But it’s never a guarantee that the best guess will turn out right
It was an insightful read! However, I wouldn't use TDD when there is high uncertainty. Sometimes, it is important to experiment and prototype key aspects while moving fast and breaking things.
I think the biggest issue that almost nobody talks about here is scope. TDD was preached to me (and many others) as writing tests for every single line of code you have. This is literally insane.
In my experience, this is why people discount TDD. It means too many tests for still-unsettled APIs, tests that simply serve as barriers while you are exploring your way toward a good solution.
The "unit" in unit test became a class. And because of how people interpret the single responsibility principle, classes tend to become methods. Add all of this up and writing tests first feels like a complete waste of time. After doing this for a while, people associate TDD with this mess, eventually abandoning it altogether.
I don't practice TDD as much anymore for these exact reasons. These issues have come up in multiple code bases across decades of my career. TDD is still a great theory, but every implementation I've seen in the past 10+ years was so corrupted that I now mostly avoid it. Misguided evangelists and incentives not to think often turn this great idea into a burden instead of a power tool.
Hi Luis.
My experience with TDD is perhaps a bit odd. I was interested in it for many years, and read quite a bit about it, but I never used it seriously until I built a ~30kloc greenfield Lua project using it from day one. I personally found it a very useful approach almost right away, although there was certainly a non-trivial learning period of a few months before I felt mostly fluent at it. (Also, I've never really understood the Chicago/London school distinction, so I won't use those terms.)
Several of your statements just don't reflect my personal experience, with TDD and with software design more generally.
- Regarding testing every line of code "being stupid": My practice was to implement _behaviours_ for a class one at a time, with one or more tests, as required, for each behaviour. These behaviours were rarely a single line, but were also rarely more than 10 lines of change.
- Also, by behaviour I definitely do *not* mean a single method: instead, I try to focus on little use cases (i.e., examples) of how I'd use the class, which usually involves two or three method calls to achieve a useful end. I perceive this as similar to Kent's example above, but unfortunately the stack and similar classes are so simple that it's hard to see the distinction.
- I agree completely that a sea of classes with a single method is almost certainly a bad design, but I don't see why that's associated with TDD per se. I have seen code with classes broken down too finely, all with "TDD" tests that were excessively complicated in order to force the required prerequisites to test a particular method call. So, I guess one of my signs that I've broken a class down too finely is that its tests become complicated.
Perhaps a related anecdote. It's been a while since my serious use of TDD, but I recall feeling at that time that TDD was quite beneficial when I had decided on the next fairly concrete direction of coding: I could dive in and use TDD effectively to follow that path. At the end of such a path, I often had the feeling of "popping up out of the gopher hole" to look around at the broader landscape and see what needed to be done next.
- In the many years before I discovered TDD, I still had the habit of pausing frequently in my coding to sort of lift away from the code and look around to see if I was happy with how the code looked, if there were any small tidyings I wanted to do, etc. After such a pause, and whatever tidying I felt like doing, I'd choose the next few steps of progress and drop back into coding again, and I think this habit in part "pre-adapted" me to find TDD comfortable.
I think I may be sympathetic with Kent's description, and TDD in general, because I've always tended to look for ways to decompose a problem fairly early on. Part of my "pauses" described above was always to look for awkward code, and try to find better function / data structure breakdowns that made the code simple instead of complicated.
I still find TDD beneficial, but my critique is that the implementation of it often leads to bad outcomes because of people's obsession and oversimplification.
When you have fine-grained tests for every single class, you (seemingly) inevitably have to rely on a great number of mocks to test all of the classes that sit in the middle of big chains of calls. Once that pattern of writing tests for everything while mocking most collaborators sets in, you end up with brittle, shallow tests (sometimes hundreds of them) that constrict any sort of refactoring and don't really bring any meaningful value. Most of these tests become copies of the implementation.
Now, of course the obvious answer is: they are misusing mocks! Well, yes. But why? Why does this keep happening? I reckon it's because of the obsession of testing every method of every class. You either re-test the same behavior multiple times, or you mock it.
Take a simple example of a class chain: A -> B -> C -> D.
If you wrote a single test for A without mocking, many evangelists will tell you it's not a "unit" test, because you'd be testing B, C and D. For me, most times this is the best way to test it, because it allows me to refactor B, C, D or even completely change the chain, with a test that gives me confidence I won't break anything. Of course there are exceptions if these classes are themselves very complex which would prompt me to test those individually too, but generally, they aren't complex.
If you test A without mocking, you'll have to make sure everything B, C and D does, works. If, let's say, C makes an API call, you now have to mock it. Another consequence is that when testing B, you'll be testing C and D again. When testing C you'll test D too, and so on.
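A sketch of what that can look like (all names invented): the whole chain runs for real, and only the true I/O boundary inside C is faked.

```python
class D:
    def tax(self, amount):
        return amount * 0.2

class C:
    def __init__(self, api, d):
        self.api, self.d = api, d
    def price(self, item):
        base = self.api.fetch_price(item)  # the real I/O boundary
        return base + self.d.tax(base)

class B:
    def __init__(self, c):
        self.c = c
    def quote(self, item):
        return round(self.c.price(item), 2)

class A:
    def __init__(self, b):
        self.b = b
    def checkout(self, item):
        return f"total: {self.b.quote(item)}"

class FakeApi:
    """Only the network call is replaced; B, C and D run for real."""
    def fetch_price(self, item):
        return 100.0

def test_checkout_through_the_whole_chain():
    a = A(B(C(FakeApi(), D())))
    assert a.checkout("book") == "total: 120.0"
```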
If you go with the mocking route (which, in my experience, is the most common one) you now test that A calls B with mocks, B calls C, and so on. You now have tests for every method of every class, which is good, right?
However, following this approach will leave you with rigid systems and unrealistic tests that make it very easy for integration bugs to slip through with everything green.
It's rigid because now to refactor this, almost every line of code you change will break tests, even if your behavior hasn't changed.
The tests are unrealistic because you mock so much that integration issues slip through. B's behavior changes, we still assume it returns or does the old thing, and A's tests still pass, because all we test is that A calls B. This issue is especially common in dynamically typed languages.
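A sketch of that failure mode (invented names, using Python's unittest.mock): suppose the real B.quote was later changed to return a dict instead of a number. A's mocked test never notices.

```python
from unittest.mock import Mock

class A:
    def __init__(self, b):
        self.b = b
    def checkout(self, item):
        return f"total: {self.b.quote(item)}"

def test_a_with_b_mocked_hides_the_break():
    b = Mock()
    b.quote.return_value = 120.0  # stale assumption about B's contract
    a = A(b)
    assert a.checkout("book") == "total: 120.0"  # still green
    b.quote.assert_called_once_with("book")      # checks the call, not the contract
```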
So at the end of the day, I like testing, but I don't like what was preached to me as pure TDD, where testing every single class ever created is expected.
I don't believe tests should only be at the top level either. What to test and how to test it is a conscious decision that has to be made. You can still use the TDD cycle, which I like, but every TDD evangelist I've worked with will tell me "that's not TDD" because you're not writing tests for every method on every class.
So, back to my original point: I believe the TDD theory is sound and beneficial. But in practice it gets reduced to big simplifications like "test every class" that corrupt the incentives, turn testing into a burden, disincentivize refactoring, and eventually push people away from it.
Maybe I've been very unlucky, but in 15 years I've yet to find a TDD implementation that wasn't riddled with these issues.
What you described is the London School, and I think that is the reason your TDD was lame. Start writing bigger classes. And don't test them in isolation: start testing whole modules. That's the Chicago School.
Please reply in a thoughtful and respectful manner.
Maybe I wasn't clear enough. I agree that the "unit" in testing should be bigger than a class. But the issue lies in the team environment: you can't go rogue and break the team's patterns/rules without conflict.
In practice, what's normally preached is the "London school", as you call it, or you're doing it wrong. Maybe I was just unlucky, but I've seen this in 3 very different companies.
I agree that this behavioral composition skill exists. I feel like I can apply it to UI work, but only once I have had experience with the problems of a similar type and mastered the components of the system.
How would you recommend improving this skill without having to fiddle around for several projects and spend many years? Are there books, courses, or exercises that specifically target this particular skill?