Saturday, March 25

Pimp My Code, Part 9: Beginner code.

Today a fellow blogger asked me to pimp his post. Since it's only two lines, I figure I can take a break from my busy schedule of, uh, drinking and stuff, and help a brother out.
So I did the Challenge problem in Chapter 4 of Cocoa Programming for Mac OS X, Second Edition. I've come up with two different "solutions".

Solution 1: "Screw retain counts"
- (IBAction) reportCharacterCount: (id)sender
{

NSString *inputString = [inputField stringValue];
[outputField setStringValue:[NSString stringWithFormat:@"%@ has %d letters.",
inputString,[inputString length]]];
}

Solution 2: "I'm a good boy"
- (IBAction) reportCharacterCount: (id)sender
{

NSString *inputString = [[NSString alloc] initWithString:[inputField stringValue]];
[outputField setStringValue:[NSString stringWithFormat:@"%@ has %d letters.",
inputString,[inputString length]]];
[inputString release];
}

For pedagogical reasons, could someone tell me what the difference is between the two? And if possible, which one is better?

(As you can probably guess, my very first solution consisted of version 1 with a release, which brought me to the debugger in a hurry.)
The short answer is the difference between the two is that the second one wastes time and memory to no good effect. There are several problems with the second one: for example, if you really wanted a immutable copy of a string, you should just use [[inputField stringValue] copy] and not -initWithString:, because the latter always allocates a new string, whereas the former will just return the same object with an increased retain count if the original string was already immutable. Now that's fast!

But, in fact, there's no point in doing this copy of the inputField's string, for two reasons. First off, when you call -setStringValue: on outputField it's really the field's job to make sure it holds on to a immutable copy of the string you've passed in, so it's going to call -copy or do something similar itself. (It's true there were bugs in early versions of NeXTstep where sometimes mutable strings would be retained or returned instead of immutable ones, but those are mostly ironed out now.)

Secondly, and more importantly, you're not actually passing this string directly to your output NSTextField, you are generating a new, autoreleased string in your +stringWithFormat: call, which has the inputField's stringValue as a sub-string. Now, leaving aside the actual implementation details of +stringWithFormat:, it's a given that it will somehow keep an immutable copy of any strings you pass into it. Otherwise, honestly, nothing in this damn system would work.

Less code is better if it's functionally the same, and the second implementation is absolutely no safer in any way. Even if you were, say, messing with multiple threads at some point, so the value of inputField could change during your action method, both implementations would be failures, so there's really no conceivable situation in which the second implementation is better.

Also, what's up with that blank line at the beginning of your methods? Seriously, that isn't helping anyone.

Finally, I should point out both implementations are really non-optimal in the post-10.3 world: what you should really do is bind the 'value' inputField to a new instance variable in your controller in Interface Builder (say, NSString *inputString;), and then bind outputField's 'value' to your controller with the path of, say, 'outputString', then write the following:

Solution 3: Bindings
+ (void)initialize;
{
[self setKeys:[NSArray arrayWithObject:@"inputString"]
triggerChangeNotificationsForDependentKey:@"outputString"];
}

- (NSString *)outputString;
{
return [NSString stringWithFormat:@"%@ has %d letters.",
inputString, [inputString length]]];
}

Admittedly, this isn't really less code, and in fact it change the semantics of your app: eg, it doesn't require you to send the action to populate the outputField. But bindings are generally the way you should code these days; if you find yourself using target/action, or otherwise manually pushing or pulling values to or from controls, think hard about using bindings instead.

Labels:

Thursday, March 16

This Post is Microsoft Enhanced (TM).

Ok, look, I know this is a family blog (What exactly does that mean, really? That Dad reads the blog aloud to his 2.5 kids as he smokes a pipe in his easy chair with his hair all full of brillcreme?), so I don't want to get too ribald, but when I was a kid there were only two kinds of condoms, regular and with bumps, "enhanced for her pleasure."

This apparently meant embedding grains of sand in the latex. I remember that, growing up, I had seen condoms and had read Where Did I Come From? (hippie parents), but had no idea how these grains of sand were meant to enhance sex, even the sort of smooshy-gooshy idea I had of sex back then.

Honestly, I'm 36 now and still have no idea what the hell they were thinking. I don't know if they sell those condoms any more. If they do, I'd like to caution my younger viewers: I've never met a female whose response wasn't: "OUCH! SON OF CHRIST! WHAT THE FRAK! WHY WOULD SOMEONE PUT GRAINS OF SAND IN THAT FRAKKING THING?" (Only, without all the BSG talk.)

--

I think of the phrase "enhanced for her pleasure" every time I boot up my stupid digital video recorder (DVR), because somehow Microsoft managed to talk my cable monopoly into selling me a box that constantly, on every screen, in the upper-right corner, says, "Microsoft Enhanced."

And, much like tiny grains of sand in your genitals, this recorder has so many idiotic annoyances and glitches that I wonder, "What do you guys think enhanced means?" and "WHY would you WANT your name associated with this?" (To both Comcast _and_ Microsoft, I think that.)

I mean, ok, it can, mostly, record TV programs. I mean, sometimes it does. Other times, it just decides not to. I mean, it'll SAY it's going to record all episodes of LOST at 9PM on Wednesdays on Channel 102, and there will BE Lost at 9PM this Wednesday on Channel 102, but it'll say, "Nothing more scheduled to record," even though it's the one that made up the criteria, and it did manage to record the first half of the season, and I can see the other shows in the schedule. Maybe the second half of the season stinks? I'll never know, because, hey, it decided not to record it. (Only 78% full! I checked!)

Microsoft Enhanced! For her pleasure!

But, hey, I'm fine with having to come home before my favorite shows and unschedule them and then reschedule them so my DVR will suddenly decide it's OK to record them again. I like to think of it as sort of rotating the tires; I keep the show schedule fresh. It's amazing how much less TV you'll watch when you have to reschedule everything every week, too. Besides, it's not like the POINT of having a DVR is you DON'T have to be there to start each recording manually. And this is first-generation technology, right? Right? I mean, there's NEVER been any other brands of DVRs, right? Microsoft invented this shit, no? And enhanced it?

But my actual least-favorite "enhancement" is that this DVR decides, whenever I turn it on, that it should default to showing me some stupid channel, full volume. Because, god, what would I do alone in a room with only silence to comfort me? Mayhap I would die. Now, silly "user interface experts" would argue that the idea behind a DVR is that you're a discriminating TV viewer; that you, in fact, hate the vast wasteland that is TV, and that every moment you are exposed to programs that aren't among the 4 or 5 tolerable ones you have selected is actually quite painful. (Think I'm exaggerating? Watch all of "Elimidate." Just once. Then let's talk about pain.)

So, I turn on this stupid DVR (whisper it with me as you press the power button... "microsoft enhanced"... like the wind at your windows) and it starts BLARING whatever is playing on whatever channel it last recorded, because that's probably the best default option, right?

It wouldn't be reasonable to present me, silently, with a menu of shows I've recorded, right? No. Because the most likely answer to "Why did Wil turn on the TV at 2:15AM?" is not "to watch one of the five shows he recorded this week, as he does every week, which is all he ever does with this stupid device," it's "TO WATCH THE LAST 15 MINUTES OF ELIMIDATE! BONK-BONK-WONK-WONK WONK-WONK WONK-WONK, BEOMP BEOMP BEOMP!""

But even that's not enough. As I work through the menu systems, it continues to give me a feed of some random channel in the upper-left hand corner. I'd like to point out that there are something like 900 channels on this system. It doesn't care which one it's showing me. It's just knows that I need to be seeing TV, dammit. I don't care if it's infomercials for electric knives on the Discovery channel at 4AM, that shit is BETTER THAN SILENCE. SILENCE IS THE SOUL-KILLER.

WHAT THE FUCK IS WRONG WITH THE PEOPLE WHO WORK AT MICROSOFT! SERIOUSLY! Do they really enjoy all this noise? IT'S MICROSOFT MOTHERFUCKING ENHANCED, FOR HER MOTHERFUCKING PLEASURE!

If I watch a show, like the season finale of Battlestar Galactica this year, which left me shaking with anger and pleasure at the same time, and then, after the screen fades silently to black, I go back to the main menu, my little Microsoft Enhanced friend is right there for me. "Show's over! Let me pick a channel and show you crap! Full volume! Right up here in the upper left! Having trouble picking a show because I'm yelling at you? About crap? Crap you don't care about? Just ignore me! Go on, pick a show! C'mon, what's wrong, you freezing up? Just pick one! Hell, I'll do it! I'm Microsoft Enhanced! For her pleasure!"

There's other annoyances. For example, if you fast-forward through commercials, and then press "play" when you're actually, you know, back into the program, an incredibly sophisticated set of things happens inside the DVR, none of which are "pleasing." The first is, the machine is so poorly-designed that ANY keypress takes a second or so to register, so it'll appear that you did NOT press play for a second, and the PVR will blithely zip ahead for a while longer. Then, to make up for this fact, they will REWIND the recording some bizarre and ever-changing amount based on their assumption of how slow they think you might react to the show having starting up again.

I'm guessing they tested this DVR on people with Parkinson's disease, because they typically rewind a full THIRTY seconds to before the spot where I pressed play. Try imagine the kind of calculation you have to do on the fly: "Ok, I saw the show begin, so I need to press play sometime soon so that I can start the pause to when the auto-rewind will take me back to before some amount based on the current fast-forward mode I'm in..." I usually spend thirty or forty seconds oscillating around the beginning of the show. It's like Microsoft doesn't WANT me to skip ads! Whaaa? What about her pleasure? Please tell me you didn't forget about her?

And then let's talk about the pause button. If I press the pause button during live TV, which I do a lot, the machine actually waits a second to pause, then stutters, then replays the last two seconds and adds a bonus second I haven't seen before, then really pauses. It's like a kid who wants to test your limits when you tell him to shut up. "Hey, give me a candy bar! Give me a candy bar! Give.." "SHUT UP!" ".. Give me a candy...?" When I first started pressing pause I'd get fooled by the stutter every time and think it hadn't REALLY paused, so I'd press it again, so in another second it'd resume, which made me think maybe the pause button really was the mislabeled "two-second-respite" button.

The stuttering pause is annoying all the more because the pause button is my only savior, because when I first turn the DVR and TV on, 'pause' is the ONLY button that will stop the noise. Let's think about this. You turn this device on and there is NOTHING you can do, short of finding the one button out of 800 that's labeled 'pause' (by definition, a temporary block) to stop it from blaring crap at you. Crap that you don't want, and never wanted. Now, I'd like to point out, the "stop" button doesn't work in Live TV mode. You can't say, "No, I don't want a channel going, I'd just like silence." Sure, you can pick WHICH channel of crap you to view, and you can ask for a brief break from crap, but you can't pick "No Crap." Because, you know... the enhancements... the ones made by Microsoft.

And if you pause the live feed for that glorious, brief release, and choose and then play a program you like, and then stop that program... the PVR will go back to live. Yes, it unpaused itself while you were watching what you actually wanted to watch. "Hey, uh, we noticed you left us all paused, there, for like an hour, and, uh, we figured, 'that can't be right', so we just restarted the channel of crap there for you in the upper-left. No charge this time, buddy. Just thank Microsoft Enhanced."

The impression you get from this device (admittedly, sold to me by Comcast, my cable company BY LAW MOTHERFUCKER) is that TV is not just like a faucet, it is more like a stream. If you block the flow of TV for more than a few minutes you will JAM UP THE CABLE FLOW AND CAUSE SERIOUS HARM TO SOMETHING. THE PRESSURE WILL BUILD. THE ENHANCEMENTS WILL SUFFER. THOSE BASS FISH AREN'T GOING TO CATCH THEMSELVES. THE WINDBAGS ON CNN HAVE IMPORTANT STORIES ABOUT MURDERS AND KITTENS TO TELL. I feel like I'm in Max Headroom: 20 Minutes in the Future, where all TV sets were permanently stuck "on" by law.

Here's the kicker: pausing the live feed actually starts a timer, because the PVR is still recording its randomly-selected channel in the hopes that you'll later give it a reprieve and watch whatever random crap was on. I mean, maybe you just need to go to the bathroom, and didn't want to miss the part where the electric knife can cut through a frozen pizza AND a slice of bread -- LENGTHWISE! But, here's the best part: after a couple hours of recording, the PVR will run out of disk space from recording this live TV feed. (Which, remember, you cannot turn off.) What would be a reasonable thing to do here. Let's look at this as dispassionate UI designers, OK?

You're making a PVR. Someone pressed pause six hours ago. You've been recording live video with the assumption the person might come back, but now you've run out of disk space. The screen has been frozen for six hours. Might you, the putative UI designer, assume at this point that the person has left? That their TV is now off and they're done with the system? And, thus, when you run out of disk space, you'd say, "Heck, I'll hold onto this live feed stuff until the user comes back, or until some other program comes on that the user actually asks me to record, but until then I'll just go into sleep mode."

NO! IDIOT! The TV must flow! The person probably fell into a stupor. The logical thing to do is RESUME LIVE PLAY. That'll wake 'em up!

And indeed it has. No joking, in the days before I learned this cute little UI decision, I paused my DVR (instead of turning it off), turned off my TV (but not my 500 watt stereo), went to bed, and been woken up THREE TIMES in the last year at 4AM to some ads blaring at the top of their lungs in the next room, because the DVR ran out of air and just HAD TO GET IT ALL OUT! I CANNOT WASTE THE PRECIOUS STREAM OF CRAP! Let me tell you, waking up to this will make a grown man wet his bed. I'm not proud of it, but there it is.

Microsoft Enhanced. For their pleasure.

Labels:

Saturday, March 4

Pimp My Code, Part 8: Mary, Mary, why you buggin?

In today's episode, instead of harshing on some poor fellow's code, we trace the short, miserable life of a real, live bug, from commission to detection to squashing, hopefully illustrating some general techniques for would-be exterminators.

--

About a month ago Mike Lee, my bodyguard and majordomo, was trying to replicate our biggest crashing bug in Delicious Library by running it in gdb and replicating a user's actions. This bug actually wasn't even particularly common, but since we don't have very many crashing bugs in Delicious Library (compared to other projects in which I've been involved), and since I wanted to go drinking and not hang out and watch him screw around with Delicious Library, I took an interest in what he was doing.

Mike was loading certain books from Amazon.com to re-create the library of a user who reported that when he loaded or reloaded some items they would crash Delicious Library every time. Note the emphasis here. Whenever you hear "every time" in a bug report you should breath a sigh of relief, because you'll be able to find the bug. If something is reproducible, it's fixable. Period. It may be a pain, you may have to start cutting out large parts of your program and doing a binary search of your code, but you can fix it if it's reproducible.

In this bug, loading certain items would always crash DL for the user, while loading other items always worked fine. Other users had different items that would cause a crash, and the vast majority of users had no items that would cause a crash. We also couldn't get any items to crash on our machines.

To sum up: we have a crasher bug, that's bad, but it's reproducible for the user, that's good. Different users have different cases that cause crashes, that's bad again, and Mike couldn't reproduce the bug himself, and that's real bad. Also, I want to go drink, which is good in small doses but bad in large doses. What to do?

To his credit, Mike had taken the right first step (try to duplicate the conditions of the crash), but he was starting to spend too long on it (and more time debugging is less time boozing). It was time to move on.

Step two was to examine the exception that was being thrown right before the crash. NSArchiverArchiveInconsistency - *** NSUnarchiver: inconsistency between written and read data for object 0x0 Hmm, doesn't mean anything to me. Except, let's look up "NSArchiverArchiveInconsistency" on Google. A little reading informed me that it can be thrown when reading in bad data from an archive.

An archive? But I don't use NSUnarchivers in Delicious Library, do I? The next part of step two was to search my project in Xcode for NSUnarchiver and its superclass NSCoder. I found that I didn't use them anywhere NEAR the Amazon book-loading code. This made no sense! Sure, NSUnarchiver is also used in NIB loading, but, again, this wasn't anywhere near that kind of code.

Ok, move on. When you're looking for a bug, the best thing to do is keep an open mind. Play with ideas, try things out, then move on. Don't get hung up on one approach. Just try to get a feel for the whole system and the answer will come to you.

Step three was to look at the backtrace:

Backtrace
0x96323f10: _NSExceptionHandlerExceptionRaiser (in ExceptionHandling)
0x92896f3c: +[NSException raise:format:] (in Foundation)
0x92893ff0: __decodeObject_old (in Foundation)
0x928a5060: -[NSURLCache _diskCacheGet:] (in Foundation)
0x928a4b34: -[NSURLCache cachedResponseForRequest:] (in Foundation)
0x928a3f94: -[NSURLConnection(NSURLConnectionInternal) _performLoad] (in Foundation)
0x928a3a00: __resourceLoaderPerform (in Foundation)
0x9074afc4: ___CFRunLoopDoSources0 (in CoreFoundation)
0x9074a43c: ___CFRunLoopRun (in CoreFoundation)
0x90749ebc: _CFRunLoopRunSpecific (in CoreFoundation)
0x928a3760: +[NSURLConnection(NSURLConnectionInternal) _resourceLoadLoop:] (in Foundation)
0x9287c2b4: _forkThreadForFunction (in Foundation)
0x9002c3b4: __pthread_body (in libSystem.B.dylib)


Oooh, that's ugly. This isn't the main thread. This isn't even a thread I created. This is Apple's thread. I never even called any of these methods, this is something that got called as a side-effect from my code.

Now, let me state something unequivocally: 98% of the time when you think you've found a bug that is not your fault, it really is your fault. The other 2% of the time... well, it's probably your fault as well.

However, in this case... well, damn. We're deep in Apple's code. But I *know* Apple's got lots of code that uses urls and the web and stuff. I mean, hey, Safari, right? They've tested this. I don't believe their dang URL cache stuff has crashers this bad in it, unless I'm doing something odd to trigger them. So I can't just say, "Meh, Apple's fault, too bad, bitches," and be done with it.

Let's look closer at the trace. At the top, we're handling an exception. Big surprise, we knew that. Below that, we're raising the exception, which we also expected. Under that, we're decoding an object. Alright, so now we've confirmed where we suspected our "NSArchiverArchiveInconsistency" exception came from -- Apple was trying to decode an object and hit something it did not like. Check. But what was being decoded, from where did it come, and why was it corrupt? There are a lot of questions we need to answer.

The first interesting line for us is the one that starts 0x928a5060, because it says, [NSURLCache _diskCacheGet:]. Now, wait a minute. NSURLCache? That's a class I've heard of. I look that up in Apple's documentation and it does what I might expect; it's for caching stuff that is loaded off the web.

Now it's time to let you in on something that you didn't know, so you couldn't have solved this mystery up to this point: in Delicious Library we load book descriptions from Amazon.com. In August of 2005 we released our 1.5.2 version, which disabled Apple's built-in disk caching of data that is downloaded using the Cocoa frameworks. This was done because the disk cache could conceivably grow quite large (with cover images) and server no purpose: there was never any need to look at the cached data, as whenever we reloaded from Amazon.com we really wanted the latest data, not a cached version.

Back up a second. We're getting a crash in NSURLCache, in a method called _diskCacheGet:, but I just said that Delicious Library does not use a disk cache. Something's wrong here, right? We've got two assumptions that don't go together. You can smell the cognitive dissonance. This is your best friend. When you smell this smell, you know you're close to solving the whole thing.

What's the next step? Check your assumptions. Never say, "Well, I did blank, I know I did blank, blank is done, and that's the blank story!" Look at the actual code that does blank, and make DAMN SURE it really blanks.

How did we disable the disk cache in 1.5.2? We called [[NSURLCache sharedURLCache] setDiskCapacity:0].

What exactly does this do? Check Apple's documentation. "Sets the receiver’s on-disk cache capacity to diskCapacity, measured in bytes... The on-disk cache will truncate its contents to diskCapacity, if necessary."

Now it's time for a leap of logic. We know that we're not having trouble when we're writing to the disk cache, we're having trouble reading from the disk cache (we turned off writing, and the backtrace clearly shows us loading from the disk, not writing to it). And, about the cache limiting call we added in 1.5.2: it doesn't actually prevent us from reading the disk cache, it just keeps us from writing any new data to the disk cache.

So, what happens if someone has been using Delicious Library from 1.5 or before (so they already have a disk cache, because it was enabled back then), and then they upgrade to 1.5.2, and reload one of the books they loaded under 1.5?

You wouldn't expect this to crash, because either the book is still in the disk cache, in which case it'd load from the cache (and at worst just have old data), or the cache has been correctly truncated, so the book would not be in the cache, so the book would load from the net. In neither case would you expect a crash.

But here's where we see a possible case that Apple could have missed, because they probably haven't shipped a program that originally had caching on, but then turned it off in a later version. So what if they never really tested their cache truncation code, and it turns out when you set the disk capacity to 0 Cocoa corrupts that program's existing URL cache, and makes it so any time you get a cache hit your program crashes? New programs that set their caches to 0 would never have a problem, and neither would old programs that don't mess with their cache size. One might reasonably think that Delicious Library was one of the few programs that had an older version with a disk cache and a new version that truncated it, and thus was the first to hit test this part of Apple's code (and find the bug).

This would explain why Mike couldn't replicate the user's problems -- he was starting on a machine that had never run 1.5 or earlier, so it didn't have any truncated (eg, corrupted) items in the URL cache (or any items at all, since we disabled writing them). It also explains why certain random items crashed for certain users upon reload -- whichever items they happened to have sitting in their URL cache when they upgraded from 1.5 to 1.5.2 would become corrupted for all time, and those would forever crash instead of loading. And it explains why a lot of users never saw this bug; they didn't reload old books or they started with version 1.5.2 or greater, so they never hit Apple's corrupted cache.

--

We have a theory, but how do we test it? Two ways: first, Mike asks some users with this problem to send us their URL cache directories (~/Library/Caches/Delicious Library/), and he copies their directories into his home directory and tries loading the items they say to cause crashes. He is able to replicate their crashes, so we've done a positive test. Mike then asks all the users who have reported this problem to blow away that URL cache directory for Delicious Library and try again. 100% report the problem is gone (yay!), and thus we've done a negative test.

We're now pretty darn sure. All that remains it to write a little code so that future users of Delicious Library don't encounter this problem. Sure, it's Apple's bug, but users don't know that and wouldn't much care even if they did; if my program crashes it's on my head, so I'm going to work around this bug. Here's the code, for your edification:

Workaround code
// Turn off NSURLCache (we sort of keep our own cache...)
and remove the existing cache directories so we don't get strange crashers from NSURLCache
(NSArchiverArchiveInconsistency) in Apple's code. (File RADAR!)
NSArray *applicationCacheDirectories = NSSearchPathForDirectoriesInDomains(NSCachesDirectory,
NSUserDomainMask, YES);
unsigned int applicationCacheDirectoryIndex = [applicationCacheDirectories count];
while (applicationCacheDirectoryIndex--)
[[NSFileManager defaultManager] removeFileAtPath:[[applicationCacheDirectories
objectAtIndex:applicationCacheDirectoryIndex] stringByAppendingPathComponent:[[NSProcessInfo processInfo]
processName]] handler:NULL];
[[NSURLCache sharedURLCache] setDiskCapacity:0];


From the time I noticed Mike's activity to my writing this workaround code took about a half hour. Mike spent the next day verifying the fix and nursing a hangover.

--

What are the lessons here? First off, when you're tracing down a bug, keep your mind open. Don't get married to a theory or an approach. This isn't like programming, you've got to defocus your mind, read a bunch, look at your code. Don't think "I know I did this," think, "I *thought* I did this, but maybe I'm crazy." Question EVERY LINE around the crash. Does "if" really mean what I think it does? Does "*" bind tighter than "++"? Did I really write the correct variable name there, or did I write in some different variable and I keep reading it as the correct variable because that's what I meant to write?

Second, when you've got a possible theory on the cause of a bug, don't stop until you can prove to yourself how such a bug could exist and go undetected up until now, whether it's your fault or someone else's. It's not good enough to say, for example, "Oh, must be Bob's fault." If you can't explain how and why Bob made such a mistake, and then devise a workaround or fix, you don't really know what's going on, and thus you don't really know if you've fixed the bug. And, worse, you didn't really learn anything, so you're not going to be any better at it next time.

I never, ever fix a bug, even from another user, unless I can say with some surety that I can see the circumstances that led to that person writing that bug. It may be as simple as "oh, copy and paste error" or "this person was tired and overlooked this edge case, and that didn't come up until now." But if you see a bug and think, "How in the hell did this code ever work?" then you've got a lot more work to do (assuming the code did ever work), because you aren't allowed to move on until you can answer that question.

Understanding your code, and the underlying code of the environment you're in, is more important than fixing any one bug.

Many mojitos died to bring you this information.

Labels: