Catching float- and struct-returning messages to nil

posted by Tim on 03.11.07 @ 7:46 pm

Wim came up with a neat trick a while back that we’ve used to find and fix several bugs in our software, and to file a bunch of Radars. There are several message dispatch functions in the Objective-C runtime. Of particular interest here are objc_msgSend_fpret and objc_msgSend_stret. The compiler uses these when calling a method that returns a floating-point value or a struct, respectively.

Depending on your architecture, the result of such a message can be undefined when sent to nil. Messaging nil is very useful most of the time, but you can introduce rarely manifesting bugs in this case.

Looking at the disassembly for these two functions in gdb, though, gives us an easy way to catch them. Under 10.4.8/x86, we see the following:

(gdb) x/50i objc_msgSend_fpret
0x90a573c0:      mov    4(%esp),%eax
0x90a573c4:      test   %eax,%eax
0x90a573c6:      je     0x90a57420
0x90a573c8:      mov    0(%eax),%eax
...

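That is, load the first argument (the receiver), check it for zero, and if it is zero jump to 0x90a57420.

Likewise, in objc_msgSend_stret (where the receiver sits at 8(%esp), because the hidden struct-return pointer comes first):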

(gdb) x/50i objc_msgSend_stret
0x90a57340:      mov    8(%esp),%eax
0x90a57344:      test   %eax,%eax
0x90a57346:      je     0x90a573a0
0x90a57348:      mov    0(%eax),%eax
...
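
Both listings open with the same test/je pair, so the nil-path address can be pulled out of the x/i output mechanically instead of by eye. Here is a minimal Python sketch of that idea. The function name find_nil_branch and the regular expression are my own inventions, not part of gdb or the runtime, and it assumes x86 AT&T-syntax output shaped like the listings above:

```python
import re

# Matches one line of gdb `x/i` output, with or without a <symbol+offset> label.
LINE_RE = re.compile(
    r"^\s*(0x[0-9a-fA-F]+)\s*(?:<[^>]*>)?\s*:\s*(\S+)\s*(.*)$"
)


def find_nil_branch(disassembly):
    """Return the target of the first `je` that directly follows
    `test %eax,%eax`, as an int, or None if no such pair is found."""
    previous = None
    for raw in disassembly.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue  # skip prompts, "..." and anything that is not an instruction
        _, mnemonic, operands = match.groups()
        operands = operands.strip()
        if mnemonic == "je" and previous == ("test", "%eax,%eax"):
            # The operand may carry a trailing <symbol+offset>; keep the address only.
            return int(operands.split()[0], 16)
        previous = (mnemonic, operands)
    return None


sample = """\
0x90a573c0:      mov    4(%esp),%eax
0x90a573c4:      test   %eax,%eax
0x90a573c6:      je     0x90a57420
0x90a573c8:      mov    0(%eax),%eax
"""
print(hex(find_nil_branch(sample)))
```

If the prologue looks different on another OS release or architecture, the function returns None rather than guessing, which is the cue to read the disassembly by hand.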

In our ~/.gdbinit we can have:

# Nil-handling path for objc_msgSend_fpret.
b *0x90a57420
comm
  p (char *)$ecx
end
# Nil-handling path for objc_msgSend_stret.
b *0x90a573a0
comm
  p (char *)$ecx
end

(where the print command shows the selector).
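
Since the addresses are tied to one particular build of the runtime, it can be less error-prone to regenerate this fragment than to hand-edit it. A small Python sketch, assuming you have already looked up the nil-path addresses yourself (gdbinit_fragment is a hypothetical helper name; the stanza it prints mirrors the one above):

```python
def gdbinit_fragment(nil_paths):
    """Render breakpoint stanzas for a {function name: nil-path address} mapping."""
    lines = []
    for function, address in nil_paths.items():
        lines.append(f"# Nil-handling path for {function}.")
        lines.append(f"b *{address:#x}")
        lines.append("comm")
        lines.append("  p (char *)$ecx")  # prints the selector, as in the post
        lines.append("end")
    return "\n".join(lines) + "\n"


fragment = gdbinit_fragment({
    "objc_msgSend_fpret": 0x90a57420,
    "objc_msgSend_stret": 0x90a573a0,
})
print(fragment, end="")
```

The addresses passed in are the 10.4.8/x86 ones from this post; on any other system they need to be looked up again first.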

Cool! I like it.

I don’t _like_ it as it’s fragile, but then again, so are such bugs…. Probably it’s as good an approach as you’ll ever get. What’s the location under 10.5? That’s the risk of putting it in your .gdbinit.

This is pain enough to make it worth Apple’s while to export a symbol for you.

[…] Tim Wood shows what to put in your gdbinit so that you can catch when your code incorrectly sends a message nil. […]

Having a symbol would be nice; it’s pretty trivial to find the address, but you have to remember to update your .gdbinit in the first place.

The symbol under 10.5 is liable to change, but x/i will let you find it quickly.

Sadly, I have only PPC machines (no objc_msgSend_fpret), but I caught a few objc_msgSend_stret cases with this. Thanks!

If you want something less fragile, you can set the _objc_nilReceiver variable to something non-nil, and messages sent to nil will be delivered to that object instead. The difficulty is that the same object is used for all kinds of messages, so the receiver had better have the right method signature for any given selector — you can’t dispatch based on the selector, because it’ll be in the second or third argument register depending on whether it was a non-struct-returning or struct-returning call in the first place. (And id-returning messages to nil are harmless, well-defined, and very common.) I found it easier to just scan the objc_msgSend_stret assembly for the jump and break there.

I found that this was breaking on a lot of methods I couldn’t do anything about, so I looked up my backtrace and found

#0 0x90a573a0 in objc_msgSend_stret ()
#1 0x933c98be in -[NSView rectPreservedDuringLiveResize] ()

which led me to making my breakpoint conditional:

b *0x90a573a0 if *(void **)$esp != 0x933c98be

That’s on Intel. On PPC, your return address is in $lr instead of *$esp.
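
Once there is more than one framework caller to ignore, the condition gets tedious to type. A rough Python sketch that builds the breakpoint line from a list of return addresses follows; conditional_breakpoint is a made-up helper, and the return_expr parameter is my assumption about how to cover the PPC case (pass "$lr" there):

```python
def conditional_breakpoint(address, ignored_returns, return_expr="*(void **)$esp"):
    """Build a gdb breakpoint line that skips hits whose return address
    matches any entry in ignored_returns."""
    condition = " && ".join(
        f"{return_expr} != {ret:#x}" for ret in ignored_returns
    )
    line = f"b *{address:#x}"
    return f"{line} if {condition}" if condition else line


# The single caller found in the backtrace above.
print(conditional_breakpoint(0x90a573a0, [0x933c98be]))
```

With an empty list it falls back to the plain unconditional breakpoint.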