Untangling Spaghetti: Debugging Non-Terminating Object Programs
Make an understandable mess, fix the bug, then make it easier to understand
Originally posted February 2009. This is a level of deeply geeky geeking about the tactics & philosophy of programming that I don’t often cover these days. I’m not sure why not. More to come.
Summary: To debug non-terminating object programs, successively inline the methods on the stack until you have a self-referencing method. Seeing the problem in a single method helps you identify the source of the problem and identify possible fixes.
Introduction
I had fun today debugging an infinite loop. Some of the techniques we used build on things I've talked about recently so I thought I'd reflect on the experience a bit. There's no happy ending to my story, though: the defect remained at the end of the session. Still, the techniques are worth thinking about because infinite loops in object programs can be difficult to debug. I wish you better luck with yours.
Infinite Iteration
The classic infinite loop is simple enough:
while (true);
Nothing in the body of the loop affects the termination condition. All infinite iterations can, I think, be reduced to this construct, even if they have a lot of stuff going on in the conditional and loop body that turns out to be irrelevant to loop termination. Eliminate the irrelevant details and you're left with:
while (true);
.
To cure infinite iteration you need to figure out what needs to happen in the loop body that will affect the termination condition. Do you need a conditional break:
if (shouldStop) break;
Do you need to iterate a counter? Advance a pointer? Timeout? Change the loop termination condition? You need to find something that will bring the proceedings to a close.
Infinite Recursion
Recursion adds a wrinkle to non-terminating computations. Spotting an infinite iteration only requres looking at a single stack frame. Spotting infinite recursion requires looking at the whole stack. The reduced infinite recursion:
foo() { foo(); }
results in a repeating stack:
foo()
foo()
foo()
...
The cure for infinite recursion is figuring out under what conditions the recursive call should not be made
(if (stop) return; else foo();)
It can be a challenge to figure out this condition, but you still only have to examine a single procedure to do so.
Mutual Recursion
Mutual recursion makes the story more interesting still. What happens when you have
foo() { bar(); }
bar() { foo(); }
is that you have a repeating stack, but with period 2 instead of 1:
foo()
bar()
foo()
bar()
...
Repairing infinite mutual recursion is harder than repairing simple recursion because you have more options. The problem could be in foo()
, in bar()
, or in the relationships between the two. Which one needs fixing? At what point does the computation have the information needed to forgo yet another call?
Object Recursion
In The Saff Squeeze [ed: need to revive this post next] I started to explore the joys of inlining code. I have the sense that there is a lot more to inlining than I have figured out so far. Nearly every day I program I find a new use for it, now that I'm paying attention. It is an excellent tool for exploring, understanding, and manipulating complicated control flows. I'm not ready to write the Grand Unified Theory of Inlining yet, but responding to non-terminating object programs turned out to be a good application of inlining. Here's how it works.
Spotting infinite recursion is relatively easy in the case of a single procedure:
foo() { foo(); }
Inlining transmutes the mutual recursion case into simple recursion. If I inline bar()
from above I get foo() { foo(); }
. The source of the infinite recursion is clearer. Once the code is working I can always re-extract subprocedures that communicate clearly.
Inlining in the service of debugging applies the "make it run, make it right, make it fast" mantra in reverse. If my code doesn't run, making the design worse through inlining isn't a sin if it helps me fix the defect [ed: @jessitron calls this “un-tidying”]. The additional insight I gain from making the code work in more cases often helps me further improve the design when the time comes to "make it right" again.
Objects make spotting non-terminating computations even more exciting (by which I mean "tedious and frustrating"). The periodicity of the stack can be long and is complicated by the potential presence of many different classes and objects as well as methods:
A.b()
C.d()
E.f()
A.b()
C.d()
E.f()
...
Sometimes the instances of A, C, and E are the same objects every time through the loop, sometimes they are different but somehow equivalent instances (at least as far as loop termination is concerned). The longest cycle I've seen is six frames long, which makes spotting the pattern tough. Fortunately, since the computation is infinite you can easily make as many frames as you need to provide yourself data for identifying the loop :-)
When I've run out of stack space and spotted the cycle, how do I figure out the problem? Since inlining is my trick du jour I suggest inlining. Inlining works across objects just as well as it does across procedures, although you may have to temporarily make fields and methods public to keep everything working. (Always take a snapshot of the state of the code before starting one of these sessions--you'll likely make lots of ugly changes that become irrelevant once you figure out how to fix the problem. I say this because we didn't take a snapshot...)
We picked the entry point of the infinite cycle and started inlining. Eventually we got a method that looked like A.b() {...b();...}
. We could see exactly why the cycle wasn't terminating. What still wasn't clear to us was what to do about it. Just seeing the cycle clearly seemed like progress, though, and worth writing up.
In fixing a non-terminating object program you have a couple of options: break the cycle or don't get into the cycle in the first place. For our code, I suspect not getting into the cycle in the first place was the right idea, but we didn't have time to explore this option once we'd identified the problem logic.
Conclusion
One of the things that struck me while we were working was the connection between inlining and partial evaluation. I wonder if revisiting the literature on partial evaluation would provide further ideas for the applying inlining or tools for making inlining more effective.
Another point that struck me was how infrequently this situation occurs. Is it even worth thinking/writing/learning about non-terminating object programs if they only happen to you once a year? I think the answer is "yes" (obviously, and I suppose if you've gotten this far you likely agree). Multi-object infinite loops are a hard problem to solve. It seems like there is always some ugly hack available to break the cycle, but one that is unlikely to satisfy future conditions. Having a clear plan for understanding infinite cycles gives me a chance to step back and really understand the situation and what should be done about it, to design, not just make an expedient coding change. That is how I aspire to work, and if exploring situations that happen infrequently is the cost of thoughtful development, I'll pay it.
If you try this inlining technique to understand an infinite cycle, please let me know how it goes. Happy debugging. And designing.
When I read "Refactoring" as a junior dev, I thought inlining was strange since I had only thought of extract subroutine. That weirdness made me realize that all refactorings should be symmetric. Sometimes the inverse is too obvious to justify a chapter. Turns out there is a lot of power in asking "Can I reverse this?". The ability to inline can help a lot because it reduces the penalty for premature extraction.
Inlining is something people should do more.
Often when I need to understand a code. For debug, refactor or changing functionality. I open a branch and start inlining the code so it is in single file.