Steve's Pad

Crash logs hunt

April 3, 2007

When Path Finder crashes, an automatic crash reporter pops up asking you to send the crash log to us. You might wonder where all those crash logs actually get sent and why it is important to send them. Who looks at them, and what does this weird text output that you send mean anyway?

Well, that's my job. First let me introduce myself: my name is Alexandra. I do testing and forum support and manage the bug tracker at Cocoatech, and part of my job is to look at crash logs every day, sorting them and organizing them in a database. How do I do it? Glad you asked :)

First, a bit of technical voodoo. Understanding crash logs is actually rather simple. At the moment the program crashed, it was executing something, typically in several threads (meaning it was doing several tasks simultaneously). One of those threads was executing what you were actually doing just before the crash occurred, and it ran into problems. This thread is marked as "crashed". That's where I look.

Crashlog.png
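
For the curious, here is a minimal sketch in Python of how the crashed thread could be pulled out of a crash log automatically. It is not the tool I actually use. The sample log is made up and heavily simplified, and it assumes that the crashed thread starts with a line like "Thread 0 Crashed:" and ends at the next blank line.

```python
import re

# Hypothetical, simplified crash log excerpt. The identifiers come from the
# screenshot discussed in this post; addresses and function names are made up.
SAMPLE_LOG = """\
Thread 0 Crashed:
0   com.apple.CoreFoundation        0x00001000 someSystemFunction + 60
1   lemkesoft.GraphicConverterCMI   0x00002000 somePluginFunction + 212
2   com.cocoatech.cocoatechAppKit   0x00003000 someAppFunction + 88

Thread 1:
0   libSystem.B.dylib               0x00004000 someWaitFunction + 8
"""


def crashed_thread_frames(log_text):
    """Return the stack frame lines of the thread marked as crashed."""
    frames = []
    in_crashed_thread = False
    for line in log_text.splitlines():
        # Assumption: the crashed thread header looks like "Thread N Crashed:".
        if re.match(r"Thread \d+ Crashed:", line):
            in_crashed_thread = True
            continue
        if in_crashed_thread:
            # Assumption: a blank line ends the thread's frame list.
            if not line.strip():
                break
            frames.append(line)
    return frames


if __name__ == "__main__":
    for frame in crashed_thread_frames(SAMPLE_LOG):
        print(frame)
```

The frames come back in the same order as in the log, so the last function that was executed is the first item in the list.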


First I try to determine whether I've already seen a similar crash in other crash logs. If I have a feeling that I have, I run a Spotlight search on my database and see how many identical crashing threads I have. If it turns out that there are many, we've clearly got a problem. If it isn't a common crash, I look at the comment you wrote, which gives me an additional clue about what might have happened. If you wrote a comment but the stack remains mysterious, I put the crash log in an appropriate folder: crashes that happened while moving files go in one folder, crashes on renaming in another, and so on. So when I see that people get a lot of crashes while doing one particular task, even if the crashing threads of those crashes are not the same, something's definitely wrong.
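
The "how many identical crashing threads do I have" step could be sketched like this in Python. This is only an illustration, not my Spotlight-based workflow. It assumes each frame line has four columns (index, identifier, address, "function + offset"), and it treats two crashed threads as identical when identifier and function name match frame by frame, ignoring addresses and offsets. All sample frames are made up.

```python
from collections import Counter


def signature(frames):
    """Reduce a crashed thread to a tuple of (identifier, function) pairs."""
    sig = []
    for line in frames:
        # Assumed columns: index, identifier, address, "function + offset".
        parts = line.split(None, 3)
        if len(parts) < 4:
            continue  # skip lines that do not look like stack frames
        _index, identifier, _address, rest = parts
        # Drop the trailing " + offset" so that only the function name remains.
        function = rest.rsplit(" + ", 1)[0]
        sig.append((identifier, function))
    return tuple(sig)


def count_signatures(crashed_threads):
    """Count how often each crashed-thread signature occurs."""
    return Counter(signature(frames) for frames in crashed_threads)


# Hypothetical crashed threads from three different crash logs.
LOG_A = [
    "0   com.example.PluginX   0x0e1a2b3c loadMenuItems + 212",
    "1   com.example.HostApp   0x00a1b2c4 buildContextMenu + 88",
]
LOG_B = [
    "0   com.example.PluginX   0x0f000010 loadMenuItems + 216",
    "1   com.example.HostApp   0x00b00020 buildContextMenu + 92",
]
LOG_C = [
    "0   com.example.HostApp   0x00c00030 renameFile + 40",
]

if __name__ == "__main__":
    counts = count_signatures([LOG_A, LOG_B, LOG_C])
    for sig, count in counts.most_common():
        print(count, sig)
```

A signature that shows up many times is the "clearly we've got a problem" case. Crashes that only share a task (moving files, renaming) would still need the comment you wrote, since their signatures differ.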

Now comes the question: is it crashing in our code or somewhere else? This is also rather easy to see. If you look at the crash log above, you will notice that the left part of the output contains names like com.apple.CoreFoundation, com.cocoatech.cocoatechAppKit, lemkesoft.GraphicConverterCMI and so on. These are unique identifiers of applications and frameworks, but also of input managers, contextual menu plugins and so forth. The execution of a thread, as displayed in a crash log, goes from bottom to top, i.e. the last function that was executed appears on top. Typically the "bad guys" are somewhere in the middle: the program is able to continue for a while, but quickly gets into an undesirable state, cannot find what it expects, and thus crashes. For example, in the crash log above the GraphicConverter contextual menu was very likely the cause of the crash.

This way I can sometimes clearly identify a third-party contextual menu plugin or an input manager / haxie as the problem, and in that case I usually email the person back to tell her about it. If a third-party plugin or an input manager causes Path Finder to crash often, I email the developer of the plugin in question. However, the percentage of those crashes remains minimal. Lots of crashes are mysterious beasts not identifiable by any race or kind, living in an obscure forest of... eh, now my inspiration suddenly ran out just when it was getting thrilling :)
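
Spotting third-party code in a crashed thread can be sketched in a few lines of Python as well. The prefix rule below is an assumption made for illustration: anything starting with "com.apple." counts as system code, anything starting with "com.cocoatech." counts as ours, and everything else is treated as third party. The three identifiers are the ones mentioned above; the addresses and function names are made up.

```python
def classify(identifier):
    """Classify a bundle identifier by its prefix (illustrative rule only)."""
    if identifier.startswith("com.apple."):
        return "system"
    if identifier.startswith("com.cocoatech."):
        return "ours"
    return "third-party"


def third_party_suspects(frames):
    """Return third-party identifiers in the crashed thread, topmost first."""
    suspects = []
    for line in frames:
        parts = line.split()
        if len(parts) < 2:
            continue  # not a stack frame line
        identifier = parts[1]  # assumed second column of a frame line
        if classify(identifier) == "third-party" and identifier not in suspects:
            suspects.append(identifier)
    return suspects


# Hypothetical crashed thread; the most recently executed function is first.
FRAMES = [
    "0   com.apple.CoreFoundation        0x00001000 someSystemFunction + 60",
    "1   lemkesoft.GraphicConverterCMI   0x00002000 somePluginFunction + 212",
    "2   com.cocoatech.cocoatechAppKit   0x00003000 someAppFunction + 88",
]

if __name__ == "__main__":
    print(third_party_suspects(FRAMES))
```

A suspect found this way is only a hint. Having a third-party identifier somewhere in the stack does not prove that the plugin caused the crash.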

Ok, so that's when you won't hear anything from me: a lot of crash logs just don't show any useful info. That means the crashed thread showed only system frameworks, or only a couple of system methods, etc. Those crashes can reveal OS bugs, though; that's why the Apple crash reporter comes up when any program crashes, not only an Apple one. Since all programs rely on Apple's system frameworks, Apple would also like to know whether a lot of third-party programs crash in a particular part of a system framework, for example.

Now, how does it actually help us? Consider the following situation: back in December, just after the release of Path Finder 4.6, I suddenly started getting an abnormal number of crash logs, all containing the same stack, which showed problems loading the StuffIt contextual menu plugin. After some investigation, it quickly turned out that the built-in StuffIt 11.0.1 was incompatible with the contextual menu of previous versions of StuffIt that people happened to have installed on their systems. We were able to fix the problem quickly, releasing a 4.6.1 emergency update just 5 days after the 4.6 release. Without the automatic crash reporter, the problem would have taken a lot more time to identify. Not all common crashes are fixable so easily, though. Why, you ask? If you see that your program crashes in a particular function, you can just go to that part of the code, see the problem and fix it, right? Unfortunately, that's only rarely the case. Most crashes are just like clues at a crime scene: they help, but whether you will be able to find the cause depends on many things, the most valuable being the ability to reproduce the crash.

So, please, do send every crash to the developers *and* to Apple, with a short comment describing what you were doing when the crash happened (and pleeaase no "just woke up" or "having lunch" :) ). You will help make the developer world a happier place!

Posted by grotsasha at April 3, 2007 1:10 AM

Comments

1. Posted by: mocenigo at April 3, 2007 2:21 PM

Some crashes are really strange beasts that lurk around and are difficult to reproduce. Consider the ctask-not-starting bug. I now have a folder with 4096 files on a dual G5, and that bug has not appeared yet. Weird.

Nice entry. We need people doing the hard work, finding bugs or processing the junk produced by the others that find them.

Only, next time some more juicy details, please ;-)

Roberto

2. Posted by: Alexandra at April 3, 2007 2:34 PM

What do you mean by juicy details?

3. Posted by: mocenigo at April 3, 2007 2:38 PM

Well, some tech pr0n. More screenshots. A snippet of a gdb window showing some arcane Objective-C invocations. But I was mostly just being a bit provocative.

Question: what was the most hideous bug you had to hunt?

Roberto

4. Posted by: Alexandra at April 3, 2007 2:49 PM

Well, all the gdb and debugging stuff is generally not done by me. Also, the question about the toughest bug is probably best addressed to Steve, our developer. The truth is that fixing crashes is rarely easy. Also, quite often after long debugging sessions (and sometimes some back and forth with Apple), it turns out to be an OS bug and there's nothing we can do.
