This is the first post in a new category of mine that I’ll post to when fixing interesting bugs. What is interesting? Well, the ones that make you scratch your head and say, “what the heck would cause that?” I won’t be posting about every bug, and I also won’t be posting any source code. However, hopefully the tidbits shown here will teach everyone about the intricacies of certain areas of each operating system (most likely OS X in my case), and offer some insight into design flaws or good designs, depending on the situation.
Enjoy!
The bug that I fixed this morning that had me scratching my head was vubriomt. The gist of the bug is that if you use a SaveAsDialog and give it more than 4 types, it will crash. Now this is really odd — in all my years at REAL Software, I have never seen anyone use a fixed size buffer without at least an assert (and even those situations are rare!). I had a glimmer of hope that perhaps that was the bug.
Quite a few crashing bugs are really easy to fix: just run them under the debugger, and when it crashes, you see the offending line. However, this one was crashing in NavCreatePutFileDialog, the Carbon call to create the SaveAsDialog. We simply pass in an options structure, and it didn’t like part of it.
I tracked down where we created the options structure, and more specifically the file types list. Nope, no fixed-size buffer. We were simply setting it to the result of CFArrayCreate. The next spot to check was where we assigned the CFStrings. It looked like we could be leaking those (further investigation showed me that we weren’t). Finally, I checked the array creation.
That was it! Someone forgot to multiply the number of elements by sizeof( CFStringRef * ), so with five types we were creating a buffer that was only 5 bytes large instead of 20 bytes. But why did it crash only once there were five types, and not with fewer?
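To make the mistake concrete, here is a minimal sketch of the two size calculations. It is not the original source: StringRef is a stand-in for CFStringRef so the sketch compiles without CoreFoundation, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for CFStringRef (an assumption, so this builds without CoreFoundation). */
typedef const void *StringRef;

/* The bug: asks for `count` BYTES rather than room for `count` pointers. */
static size_t buggy_request_size(size_t count)
{
    return count;
}

/* The fix: multiply the element count by the size of one element. */
static size_t fixed_request_size(size_t count)
{
    assert(count <= SIZE_MAX / sizeof(StringRef)); /* guard against overflow */
    return count * sizeof(StringRef);
}

/* Hypothetical helper: allocates storage for `count` string references. */
static StringRef *alloc_type_list(size_t count)
{
    return malloc(fixed_request_size(count));
}
```

With 4-byte pointers and five types, the buggy version requests 5 bytes while the fixed version requests 20, which matches the numbers above. On a 64-bit build the fixed request would be 40 bytes.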
This is the interesting part: on Mac OS X, malloc aligns allocations to 16 bytes and rounds their sizes up to a multiple of 16 bytes. So, when we ask malloc to allocate a buffer of 4 bytes, the buffer returned will actually be 16 bytes long. However, a buffer overrun is a buffer overrun, and this padding is simply an implementation detail that is subject to change. But it explains why the crash happened only *after* four types: four 4-byte pointers fit into 16 bytes just fine, and it’s the fifth one that gets written past the end of the allocated space.
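The arithmetic can be sketched as a small model. This is only a model of the rounding described above, not a query of the real allocator; the function names are hypothetical, and the 16-byte quantum and 4-byte pointer size are passed in as parameters rather than assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Round n up to the next multiple of quantum. */
static size_t round_up(size_t n, size_t quantum)
{
    assert(quantum > 0);
    return (n + quantum - 1) / quantum * quantum;
}

/*
 * Models the buggy code: `count` bytes are requested, the allocator is
 * assumed to hand back round_up(count, quantum) usable bytes, and the
 * caller then writes count * ptr_size bytes into that space.
 */
static bool buggy_list_overruns(size_t count, size_t ptr_size, size_t quantum)
{
    size_t usable = round_up(count, quantum);
    size_t written = count * ptr_size;
    return written > usable;
}
```

For example, buggy_list_overruns(4, 4, 16) is false because 16 bytes are written into 16 usable bytes, while buggy_list_overruns(5, 4, 16) is true because 20 bytes are written into the same 16.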
On Mac OS X, there is an option to use guard malloc to run your application. This slows down your application, but will cause your application to crash as soon as you overwrite your buffer. This would have been my next step if I hadn’t already spotted the error. To do this, you’re going to want to launch your application from the terminal. Try this:
Work-Intel-iMac:~/Desktop jonjohnson$ export DYLD_INSERT_LIBRARIES="/usr/lib/libgmalloc.dylib"
Work-Intel-iMac:~/Desktop jonjohnson$ ./My\ Application.app/Contents/MacOS/My\ Application
Allocations will be placed on word (4 byte) boundaries.
- Small buffer overruns may not be noticed.
- Applications using AltiVec instructions may fail.
GuardMalloc-11
Bus error
And now, if I look at the crash log (or run it under GDB, since it was the framework crashing), I can see that it crashed right at the source of the problem! This is a good reason to occasionally test your program under guard malloc: it may highlight obscure bugs that would otherwise cause crashes in ways that are hard to believe.
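For the curious, here is a rough sketch of the general guard-page idea behind this kind of tool. It is not libgmalloc's actual implementation: guarded_alloc is a hypothetical helper, it never frees anything, and it assumes a POSIX system with mmap and mprotect. The buffer is placed so that it ends where an inaccessible page begins, so the first write past the end faults immediately.

```c
#define _DEFAULT_SOURCE /* expose MAP_ANONYMOUS on glibc in strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/*
 * Hypothetical helper: returns a buffer of `size` bytes (rounded up to a
 * multiple of `align`) whose last byte sits directly before a page with
 * no access permissions. Returns NULL on failure. Memory is never freed.
 */
static void *guarded_alloc(size_t size, size_t align)
{
    assert(align > 0);
    if (size == 0)
        return NULL;

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t padded = (size + align - 1) / align * align;
    size_t data_pages = (padded + page - 1) / page;
    size_t total = (data_pages + 1) * page; /* data pages plus one guard page */

    unsigned char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Make the final page inaccessible; touching it raises a fault. */
    unsigned char *guard = base + data_pages * page;
    if (mprotect(guard, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }

    /* Place the buffer so it ends exactly where the guard page starts. */
    return guard - padded;
}
```

Note the rounding to `align`: with 4-byte alignment, a 5-byte request is padded to 8 bytes, so an overrun of up to 3 bytes would land in the padding and go undetected. That is consistent with the "Small buffer overruns may not be noticed" warning in the output above.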