04 May 2010

A message queue is often used to publish "events" to other services in the enterprise. The publish/subscribe architecture decouples clients from senders and relieves the publisher of any specific knowledge of the consumers of its messages. This, plus the asynchronous nature of a message queue - the publisher does not block while clients consume the events - makes it ideal for publishing events that keep other systems aware of the state of a given system.

Now, let's establish the use case: we want to consume events in a very busy system. In our case, it's possible to receive the same event multiple times. Or, perhaps in your system you've positioned the message queue as a way to deliver commands - "pings" - using the "command bus" pattern. It may be - and quite often is - acceptable to ignore duplicate requests in architectures like these. For example, a "command" message notifying a system that it can begin processing a batch of data for the day only needs to be handled once per day, not 10 times, even if 10 different events are published. It'd be ghastly and inefficient to process the load 10 times a day. What's required is some way to make message submission idempotent for certain messages - to make them indifferent to duplicate submission.

Backstory: I've been playing with JBoss' HornetQ a lot recently. It's a very fast message queue: it recently bested ActiveMQ in the SPECjms2007 benchmark by more than 300%! It is able to perform these feats because it uses a native, asynchronous IO layer on Linux centered around the kernel's libaio functionality. On all other platforms it doesn't benefit from the native code acceleration, but it's just gosh darned fast regardless.

So, imagine my surprise when I found out that HornetQ supports something it calls a Last-Value Header - a well-known message header that, when its value is duplicated by a later message, causes the newly submitted message to override the existing one: the latest message with a duplicate header wins.
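To make the "latest message wins" rule concrete, here is a toy in-memory model of it. This is only a sketch of the semantics described above, not HornetQ code: the class `LastValueQueueSketch` and its methods are hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of last-value semantics; hypothetical, not part of HornetQ.
class LastValueQueueSketch {
    // Undelivered messages, keyed by the value of the last-value header.
    private final Map<String, String> pending = new LinkedHashMap<>();

    // A message whose key matches a pending message replaces that message.
    void send(String lastValueKey, String body) {
        pending.put(lastValueKey, body);
    }

    // Hands over every pending message and empties the queue.
    List<String> drain() {
        List<String> delivered = new ArrayList<>(pending.values());
        pending.clear();
        return delivered;
    }

    int size() {
        return pending.size();
    }

    public static void main(String[] args) {
        LastValueQueueSketch queue = new LastValueQueueSketch();
        queue.send("daily-batch", "start #1");
        queue.send("daily-batch", "start #2"); // replaces "start #1"
        System.out.println(queue.drain());     // prints [start #2]
    }
}
```

Note that once the pending message has been drained, a new message with the same key is accepted again: the model only collapses duplicates that are still waiting in the queue.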

Here's how submitting a message to the queue looks using Spring's JmsTemplate:

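this.jmsTemplate.send(this.destination, new MessageCreator() {
    public Message createMessage(final Session session) throws JMSException {
        TextMessage textMessage = session.createTextMessage( ... );
        // "_HQ_LVQ_NAME" is HornetQ's Last-Value property; uniqueKey is a placeholder for your derived key
        textMessage.setStringProperty("_HQ_LVQ_NAME", uniqueKey);
        return textMessage;
    }
});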

It's often easy to find a business value that can be used to derive a semantically correct, unique key to identify duplicate events. Processing a customer's order with 3 items in the shopping cart at 12:30:30 PM? Build a key combining the 30 second window, the customer ID, the count of items, and the order ID. This provides a service-level mechanism to prevent nasty double-submit issues, for example.
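As a rough sketch of how such a key might be assembled, the snippet below rounds the submission time down to the start of its 30 second window and joins it with the other fields. The class name, method name, separator, and key layout are all assumptions chosen for illustration; any stable, unambiguous encoding would do.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; the key layout is an illustrative assumption.
class DuplicateKeySketch {
    private static final DateTimeFormatter TO_THE_MINUTE =
            DateTimeFormatter.ofPattern("yyyyMMdd'T'HHmm");

    // Builds a key that is identical for the same order submitted
    // within the same 30 second window.
    static String duplicateKey(LocalDateTime submittedAt, String customerId,
                               int itemCount, String orderId) {
        // Round the seconds down to the start of the window: 0 or 30.
        int windowStart = (submittedAt.getSecond() / 30) * 30;
        return String.format("%s%02d|%s|%d|%s",
                submittedAt.format(TO_THE_MINUTE), windowStart,
                customerId, itemCount, orderId);
    }

    public static void main(String[] args) {
        LocalDateTime at = LocalDateTime.of(2010, 5, 4, 12, 30, 30);
        System.out.println(duplicateKey(at, "c42", 3, "o1001"));
    }
}
```

One caveat with fixed windows: two submissions that straddle a boundary (say 12:30:29 and 12:30:31) land in different windows and so get different keys. Whether that matters depends on how strict the duplicate detection needs to be.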

You need to enable this characteristic on the queue itself in the configuration files.

HornetQ has a few configuration files under the $HORNETQ/config/ folder that you need to be aware of: hornetq-jms.xml, hornetq-configuration.xml, and hornetq-users.xml. In this scenario, we only need to modify hornetq-configuration.xml.

For a queue configured in hornetq-jms.xml:

   <queue name="dupesQueue">
        <entry name="/queue/dupesQueue"/>
   </queue>

... you'll need to make the following changes to hornetq-configuration.xml:

  <address-setting match="jms.queue.dupesQueue">
      <last-value-queue>true</last-value-queue>
  </address-setting>

Simple, right? So, go ahead, send all the messages you want - only one will remain (unless, of course, that message is consumed). This only guards against duplicate submissions while the earlier message hasn't been delivered yet. Enjoy!