ArrayList and Synchronized()...the misleading method
I recently found myself wondering about synchronization when using an ArrayList in managed code. I was building up an ArrayList and iterating through it, with both insertion into and iteration over the list being done by many threads simultaneously. Naturally, I wanted to ensure my code was thread safe.
I started out by looking at the ArrayList class definition and immediately noticed the Synchronized() method. Great! This must be what I’m looking for.
So, I carried on with my programming…
ArrayList foo = new ArrayList();
ArrayList fooSync = ArrayList.Synchronized(foo);
Then, in other threads, I am inserting items into fooSync.
And finally, there is another set of threads taking something off of the front of the list as illustrated below:
while (fooSync.Count > 0) {
    Object bar = fooSync[0];   // Do something with bar
    fooSync.RemoveAt(0);
}
(This is of course a trivial version of the algorithm, but it illustrates the point)
And voilà! We have misused Synchronized().
A Synchronized() version of your ArrayList does not guarantee that your code is thread safe. Rather, it only guarantees that the list's internal data structures are. This is of course logical, but the code above assumes the stronger guarantee. Said differently, a Synchronized ArrayList guarantees that individual calls like Add() and Remove() do not corrupt the list. However, any sequence of calls that you need to behave as a single unit is not protected.
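To make the narrower guarantee concrete, here is a minimal sketch of what the Synchronized() wrapper does cover: many threads calling Add() on the same wrapper at once. The class name SynchronizedAddDemo and the helper AddFromManyThreads are hypothetical names invented for this illustration, not part of the original code.

```csharp
using System;
using System.Collections;
using System.Threading;

public static class SynchronizedAddDemo
{
    // Hypothetical helper: starts several threads that all call Add()
    // on the same synchronized wrapper, then returns the final Count.
    public static int AddFromManyThreads(int threadCount, int addsPerThread)
    {
        ArrayList fooSync = ArrayList.Synchronized(new ArrayList());
        Thread[] workers = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            workers[i] = new Thread(() =>
            {
                for (int j = 0; j < addsPerThread; j++)
                {
                    // Each individual call is serialized by the wrapper.
                    fooSync.Add(j);
                }
            });
            workers[i].Start();
        }

        // Wait for every worker to finish before reading Count.
        foreach (Thread worker in workers)
        {
            worker.Join();
        }

        return fooSync.Count;
    }
}
```

Calling SynchronizedAddDemo.AddFromManyThreads(4, 1000) should return 4000, because no single Add() can interleave with another. That is the whole guarantee: it says nothing about two or more calls made back to back.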
In this example, we have a block of code that checks whether there is anything in the list and, if so, does something with the first element and then removes it. This is not thread safe. Imagine there is one element in the list and two threads. Thread1 goes in and sees that Count > 0, so it enters the loop. Before it gets to RemoveAt(0), it gets preempted, and Thread2 starts. Thread2 also sees that something is in the list…namely the element that Thread1 is already operating on! It too starts doing something with object bar. And well, who knows what happens from here. Maybe this alone is bad enough in your app (doing the action on bar twice). Then both threads call RemoveAt(0), and the second call finds an empty list and throws an ArgumentOutOfRangeException.
(It might sound bad that this could throw, but just think of an even worse scenario. Imagine instead that, while they are doing their thing, another thread has inserted a new element into the list, and that one now gets removed before it is ever operated on! Now the app will continue but silently lose an object that was not operated on at all.)
Of course, this makes perfect sense. The ArrayList developers could write a version of their class which protects their data structures, but how could they possibly make this scenario work? Short of some sort of language-level contract around where atomicity might be preserved (I’m not a language guy, but I can’t see how that would work, though I don’t deny it might be possible), they can’t possibly solve this problem. How do they know where atomicity is assumed in my code?
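Since only the caller knows where atomicity is assumed, the caller has to say so. One common way, sketched below, is to take the list's own lock (its SyncRoot) around the whole check-read-remove sequence. The class FrontOfListConsumer and the helper TryTakeFirst are hypothetical names made up for this sketch, and removing the element before working on it is a design choice that differs from the original loop.

```csharp
using System;
using System.Collections;

public static class FrontOfListConsumer
{
    // Hypothetical helper: removes and returns the first element as one
    // atomic step. Returns false if the list was empty.
    public static bool TryTakeFirst(ArrayList fooSync, out object bar)
    {
        // Lock the same object the synchronized wrapper locks on, so
        // Add() and friends on other threads wait until we are done.
        lock (fooSync.SyncRoot)
        {
            if (fooSync.Count > 0)
            {
                bar = fooSync[0];
                fooSync.RemoveAt(0);
                return true;
            }
        }

        bar = null;
        return false;
    }
}
```

A consumer thread could then loop with while (FrontOfListConsumer.TryTakeFirst(fooSync, out object bar)) and do its work on bar inside the loop body. Because the element is already off the list when the lock is released, no second thread can pick up the same element, and an element inserted in the meantime is not removed by mistake.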
Moral of the story: Always make sure you understand your class & method contracts.