Thursday, May 28, 2009

Multiple Write and Single Read Mechanism


Just imagine: you are a successful financial company with a large team of traders slogging it out throughout the day. They make money for you, and you want to build a system that works as hard as they do to help them. Sheer muscle may get you somewhere, but sooner or later you will run into the big problems that come with building a distributed system that handles a huge volume of data. It can be a hard-to-tame, resource-hungry beast.

Let's say you want it to process data supplied in a known format and send alerts and emails to all the concerned traders (rule-based events). You also want it to be accessible over the web to your team sitting in different corners of the country. On top of that, speed is of paramount importance.

My team has solved a similar problem, and it has been one hell of an experience. I don't know where I should start collating that experience, so in each article I will summarize a problem we faced and the solution we opted for.

We were facing a nasty problem with our messaging system. Our system had three queues that messages were posted to and read from. Each of the queues had multiple consumers, and posting was done through a pooled connection factory (ActiveMQ). The problem was that after some time the consumers would refuse to pick up any fresh messages from the queue, as if they were stuck forever. We suspected an ActiveMQ bug, but in the end it turned out to be a problem with our own threading and caching. Our threads were getting deadlocked, blocking the ports from which a consumer could read anything. We were also trying to cache the whole method call instead of the results. We got both rectified and learnt a few more things about threaded systems and caching. In the process I learned quite a few things from my manager and colleagues. Thanks to Andrew and Ben.

First, we wanted a ReadWriteLock-style mechanism in which a number of threads write to a list for a set amount of time, after which another thread processes all the data written. While the data is being read, no thread can write to the common pool, and the reader must let any thread that is already writing finish before it starts to read. We opted to use nuts-and-bolts multi-threading instead of delving deeper into Java's lock mechanisms. Here is a generic class that can do that for you:

MultiWriteSingleRead.java
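
The mechanism described above can be sketched roughly as follows. This is a minimal illustration and not necessarily what the original file contains: the method names `write` and `drain`, the fields, and the choice of a `ConcurrentLinkedQueue` as the common pool are all assumptions made for the example. It sticks to `synchronized`, `wait` and `notifyAll` rather than the `java.util.concurrent.locks` classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: many writer threads, one exclusive reader,
// coordinated with synchronized/wait/notifyAll only.
class MultiWriteSingleRead<T> {
    private final Object monitor = new Object();
    // Assumed pool type; it is thread-safe, so writers can add concurrently.
    private final Queue<T> pool = new ConcurrentLinkedQueue<>();
    private int activeWriters = 0;   // writers currently adding to the pool
    private boolean reading = false; // true while a read is pending or running

    /** Adds one non-null item; blocks while a read is pending or running. */
    public void write(T item) {
        synchronized (monitor) {
            while (reading) {
                awaitMonitor();
            }
            activeWriters++;
        }
        try {
            pool.add(item); // several writers may execute this concurrently
        } finally {
            synchronized (monitor) {
                activeWriters--;
                monitor.notifyAll(); // wake a reader waiting for writers to finish
            }
        }
    }

    /** Returns everything written so far and empties the pool. */
    public List<T> drain() {
        synchronized (monitor) {
            while (reading) {
                awaitMonitor(); // allow only one reader at a time
            }
            reading = true; // from here on, new writers block
            try {
                while (activeWriters > 0) {
                    awaitMonitor(); // let writers already in progress complete
                }
            } catch (RuntimeException e) {
                reading = false;
                monitor.notifyAll();
                throw e;
            }
        }
        List<T> batch = new ArrayList<>();
        try {
            T item;
            while ((item = pool.poll()) != null) {
                batch.add(item);
            }
        } finally {
            synchronized (monitor) {
                reading = false;
                monitor.notifyAll(); // release the blocked writers
            }
        }
        return batch;
    }

    // Must be called while holding the monitor.
    private void awaitMonitor() {
        try {
            monitor.wait();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }
}
```

In this sketch the writer threads would simply call `write(...)` as data arrives, while a single reader thread calls `drain()` on whatever schedule suits it (for example, once per time window) and processes the returned batch. The counter lets writers proceed in parallel, and the `reading` flag is what makes the reader exclusive.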

In the next post we will explore using a cache on the same class so that we process information selectively.
