Halloween problem revisited

This morning I was dragged out of my bed with a phone call from a friend who’s also a web developer. Apparently he got a serious bug in his code and April 1st is the launch date. So I got myself over to his office and had him show me the problem. The problem was that for some reason, occasionally when a user tried to remove a record from DataGrid, he’ll either get a Null Exception or have wrong records being deleted.

My poor friend literally flipped every stones in his code to uncover this bug, but he couldn’t find out what the cause was. I suddenly had a nostalgia for the infamous “Halloween Problem” that I ran into in the past. After an hour of debugging and running some test code, we came to conclusion that it was indeed the Halloween Problem!

Halloween problem revisited

The Halloween problem often shows up when iterate through a collection and delete item(s) at the same time. In this case, the act of deleting an item causes the iteration to misbehave. In my friend’s case, when user removes a record directly from DataGrid, the pointer used for iteration got messed up. The result was items that should be deleted were not, and instead, the good records got deleted.

The other case that Halloween problem usually occurs is when you work with XML nodes. Many people use DOM API to loads entire XML document to memory, then start enumerating and removing element at the same time. If you enumerate through the collections using foreach statement, it is likely that the internal pointer will get corrupted and throw the operation off. Well..actually this was the case that I ran into Halloween problem first time a few years ago :p. Since then I’ve been more careful when enumeration over actual data records is required.

And what’s with the name “Halloween”?

AFAIK, the problem was first discovered on Halloween by a group of database engineers. They found that by running a test query that alter certain value of records will cause those records to re-appear again further down the enumeration. Apparently that test query alters the column that’s used for indexing, altering that column value causes the record to be re-indexed and picked up again down the list. Ultimately it causes an infinite loop.

HalloweenProblem.EXE

I ended up writing some test code to show my friend the effect of this problem. I found that Microsoft is indeed aware of this problem (I’m not sure which version of .NET that it started). .NET framework will throw an exception when Halloween problem is detected.

HalloweenProblem.exe throws exception
HAPPY HALLOWEEN!

When I first saw this I was surprised. First, I didn’t know that they fixed it. And the name, HalloweenProblem.exe? Nice one Microsoft :).

Fixing Halloween Problem

“Cache” seems to be the best practice to get over this problem. You can avoid the problem by cache the records and enumerate through the cache instead. Then, after you identify which records to be deleted, you can specifically delete them after enumeration is done. The key here is not to mess up the internal pointer.

Depending on the behavior of your enumeration code, you might be able to get away with using C# ‘yield’ keyword as well.

Teera on March 29th 2008 in Software Development, .NET

Trackback URI | Comments RSS

Leave a Reply