
WCP: How does your software stack up?

Tuesday, October 18, 2011

When we link our programs together, the success of the operation reassures us that all our code and our static and global(?!) variables fit into the available memory. But what about the function parameters, the local variables and various other unknowns which go on the stack?

When someone mentions dynamic memory allocation, the worst-casers amongst us – which should be all of us – immediately start ranting against heap allocation and advocating either the somewhat safer use of fixed-size-block pools, or the ostensibly even safer but often design-compromising measure of banning dynamic allocation altogether. But heaps and pools I’ve covered before on this blog. What we all tend to brush under the carpet is the admission that the implicit use of the stack in all our programming (in C and C++ at least) is also dynamic allocation and deserves much more careful consideration than we tend to give it.

The good news is that a LIFO stack, unlike a heap, is not subject to fragmentation. The bad news is that if it runs out of memory then, also unlike a heap, it typically fails to inform the program and, whether it does so or not, the failure is usually irrecoverable.

Worst case stack usage, on the face of it, seems easy to calculate but it actually turns out to be very difficult, as evidenced by the cost of the very few tools which claim to make a half-decent job of it. I say “half-decent” not because the tools are no good but because, by the candid admission of their vendors, they require quite a lot of help from their users, which, in turn, implies quite a lot of careful and time-consuming work.

I don’t want to go into a lot of detail here but this article by Nigel Jones on Embedded Gurus is a good starting point for those who do. The points I want to make are as follows:

  • Many of us (and I include myself) do not apply our WCP thinking sufficiently to the matter of stack provision. It is not OK to use just “intelligent guesswork” in deciding something which is so vital to the proper functioning of our software.
  • With more help from the compilers and linkers, which in general do not report to us as much as they could about stack usage, it would be much easier to work out stack sizes even for small projects where the expense of proprietary tools cannot easily be justified.
  • My feeling is that, applying the KISS principle, there is much that could be done at the individual function level, as an adjunct to unit testing, where full conditional code coverage could and should be a realistic possibility. Once each function’s individual worst-case stack requirement has been reliably established it is then necessary only to traverse the call tree looking for the overall worst case. Although tedious, this might be easier and cheaper, for a small project, than configuring and deploying a specialised tool to analyse all the branches of the entire system. I am thinking hard about this approach at the moment.
  • Given the residual uncertainty about the worst-case provision, despite all reasonable efforts to establish it, I believe it is essential to design timely stack overflow detection into our systems, mustering any MPU or other hardware assistance available, even if the best response we can arrange is a safe shutdown. (A rough sketch of one such check, in C, follows this list.)
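By way of illustration, here is a minimal sketch of the kind of check I have in mind. It assumes a single, downward-growing stack whose limits the linker script exports as the (hypothetical) symbols __stack_start__ and __stack_end__, and it leans on the GCC builtin __builtin_frame_address; the details will vary from toolchain to toolchain.

    #include <stdint.h>
    #include <stddef.h>

    /* Stack limits exported by the linker script (hypothetical symbol names). */
    extern uint32_t __stack_start__;   /* lowest address of the stack region  */
    extern uint32_t __stack_end__;     /* just past the highest stack address */

    #define STACK_PAINT  0xC5C5C5C5u
    #define GUARD_WORDS  16u           /* canary zone at the bottom of the stack */

    /* Call as early as possible, while the stack is still nearly empty:
       paint everything comfortably below the current frame. */
    void stack_paint(void)
    {
        uint32_t *sp = (uint32_t *)__builtin_frame_address(0);   /* GCC-specific */
        for (uint32_t *p = &__stack_start__; p + 32u < sp; ++p)  /* leave a gap  */
            *p = STACK_PAINT;
    }

    /* High-water mark: how many bytes of stack have ever been used. */
    size_t stack_high_water_bytes(void)
    {
        const uint32_t *p = &__stack_start__;
        while (p < &__stack_end__ && *p == STACK_PAINT)
            ++p;
        return (size_t)((const uint8_t *)&__stack_end__ - (const uint8_t *)p);
    }

    /* Call periodically (watchdog handler, idle loop, ...): a scribbled-on
       canary zone means the stack has overflowed, or is about to. */
    int stack_overflow_detected(void)
    {
        const uint32_t *p = &__stack_start__;
        for (uint32_t i = 0; i < GUARD_WORDS; ++i)
            if (p[i] != STACK_PAINT)
                return 1;
        return 0;
    }

The painted pattern serves double duty: the same scan yields a high-water mark, which is a useful sanity check against whatever worst-case figure the analysis eventually produces.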
Categories: Software Design
  1. Dan
    Tuesday, October 18, 2011 at 18:05

    Good points about stack usage & allocation.

    One thing many engineers overlook is that, as software is refactored, debugged/fixed, enhanced, etc., its RAM usage (stack in particular) changes. Maybe now we’re using more stack in certain places, maybe now we’re using less. **But no one ever goes back & re-evaluates the stack usage.**

    I’ve worked in (with) a lot of different organizations, and I can say confidently that very few of them do very detailed stack usage analysis in the first place… and even those who do tend to do it one time, during the initial development, and then the subject is never re-visited. Typically task stacks (in the case of an RTOS) are sized conservatively, and then some kind of margin is added on top (either a fixed amount or a percentage).

    My point is that the holy grail would be some way to analyze the worst case at build time, and then have some type of stack size configuration file generated which would be guaranteed to be sufficient / correct. This file would be used at initialization (when tasks are created) to ensure stack sizes are adequate.

    This is unattainable in practice, however, because the dynamic nature of most programs makes a complete analysis difficult or impossible. For example, with function pointers, the call flow isn’t known at compile time. And with C++ and dynamic types (not to mention 3rd party libraries, etc.) the problem expands even further. Now adjust your compiler options, inline (or un-inline) some routines, get rid of some globals & pass more parameters, and bang! everything has changed again. Change an iterative algorithm to a recursive one, yep that’s gonna screw things up too…

    I guess my point is that this is indeed a complex problem and I haven’t found the proverbial silver bullet. I just now went over to Nigel’s post & read it – he identified many of the same issues, especially the points about function pointers & also about not measuring once & then forgetting about it.

    I’ll be interested to see what others have to say. Although I’m a big believer in doing the required rigorous work each time, what I’d really like to see is some sort of automated process that (mostly) removes the human from the equation (in other words, a system that always ensures by the time the code is loaded on the target, we’re “guaranteed” to not have stack overflow problems). Not saying it’s attainable, but it’s an interesting thought. Until then, I think we’ll have to continue to use things like stack checking hooks, MPUs, etc. to detect the problem post-facto & recover.
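    For what it’s worth, here’s roughly what one of those stack checking hooks looks like with FreeRTOS (just as an example, assuming configCHECK_FOR_STACK_OVERFLOW is set to 2 in FreeRTOSConfig.h – the exact prototype varies a little between kernel versions); the shutdown routine is a placeholder for whatever “safe state” your system has:

        #include "FreeRTOS.h"
        #include "task.h"

        extern void safe_shutdown(void);   /* application-specific placeholder */

        /* Called by the kernel when it finds a task's stack pointer outside its
           stack, or its end-of-stack fill pattern overwritten. Data may already
           be corrupt by now, so keep it short and don't return. */
        void vApplicationStackOverflowHook(TaskHandle_t xTask, char *pcTaskName)
        {
            (void)xTask;
            (void)pcTaskName;
            taskDISABLE_INTERRUPTS();
            safe_shutdown();
            for (;;) { /* let the watchdog reset us, or reset here */ }
        }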

  2. Peter Bushell
    Friday, October 21, 2011 at 16:19

    You make a very important point, Dan, about the need to evaluate the stack requirement at the last minute. Perhaps it need not be done rigorously for every build, but a fresh evaluation (and adjustment, if necessary) should be done before each release.

    My thoughts about doing some of the spadework at unit testing time are relevant here. If per-function figures can be obtained through an enhanced, automated test harness, then normal regression testing will produce fresh numbers each time. There remains the call-tree analysis, but that could be automated too (with some initial manual input for function pointers, which shouldn’t need to be modified too often). These are my current thoughts and, like you, I welcome other people’s.
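    To give a flavour of what I mean by an enhanced harness, here is a rough host-side sketch (POSIX ucontext on GCC/Linux assumed; the names are my own invention). The function under test is run on its own pre-painted stack, and the harness then scans for the high-water mark:

        #include <stdio.h>
        #include <string.h>
        #include <stdint.h>
        #include <ucontext.h>

        #define TEST_STACK_SIZE  (64 * 1024)
        #define PAINT_BYTE       0xA5

        static uint8_t    test_stack[TEST_STACK_SIZE];
        static ucontext_t harness_ctx, test_ctx;

        /* Stand-in for the real unit under test. */
        static void function_under_test(void)
        {
            volatile char buf[300];
            memset((void *)buf, 0, sizeof buf);
        }

        static void trampoline(void)
        {
            function_under_test();
            /* uc_link returns control to the harness when this function ends. */
        }

        static size_t measure_stack_bytes(void)
        {
            memset(test_stack, PAINT_BYTE, sizeof test_stack);   /* paint it */

            getcontext(&test_ctx);
            test_ctx.uc_stack.ss_sp   = test_stack;
            test_ctx.uc_stack.ss_size = sizeof test_stack;
            test_ctx.uc_link          = &harness_ctx;
            makecontext(&test_ctx, trampoline, 0);
            swapcontext(&harness_ctx, &test_ctx);   /* run on the painted stack */

            /* The stack grows downwards here, so scan up from the bottom for
               the first byte the call dirtied. */
            size_t untouched = 0;
            while (untouched < sizeof test_stack &&
                   test_stack[untouched] == PAINT_BYTE)
                ++untouched;

            return sizeof test_stack - untouched;
        }

        int main(void)
        {
            printf("worst observed stack usage: %zu bytes\n", measure_stack_bytes());
            return 0;
        }

    It only measures the paths the tests actually exercise, of course, which is why the full conditional coverage mentioned in the post matters; and the figure includes a little context-switching overhead, so it errs on the generous side.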
