Tuesday, 28 June 2011

Dissecting the Disruptor: How do I read from the ring buffer?

The next in the series of understanding the Disruptor pattern developed at LMAX.

After the last post we all understand ring buffers and how awesome they are.  Unfortunately for you, I have not said anything about how to actually populate them or read from them when you're using the Disruptor.

ConsumerBarriers and Consumers
I'm going to approach this slightly backwards, because it's probably easier to understand in the long run.  Assuming that some magic has populated it: how do you read something from the ring buffer?


(OK, I'm starting to regret using Paint/Gimp.  Although it's an excellent excuse to purchase a graphics tablet if I do continue down this road.  Also UML gurus are probably cursing my name right now.)

Your Consumer is the thread that wants to get something off the buffer.  It has access to a ConsumerBarrier, which is created by the RingBuffer and interacts with it on behalf of the Consumer.  Just as the ring buffer needs a sequence number to figure out the next available slot, each consumer needs to keep track of the sequence number it's up to, so it knows which entry it's expecting to see next.  In the case above, the consumer has dealt with everything in the ring buffer up to and including 8, so it's expecting to see 9 next.

The consumer calls waitFor on the ConsumerBarrier with the sequence number it wants next:

    final long availableSeq = consumerBarrier.waitFor(nextSequence);

and the ConsumerBarrier returns the highest sequence number available in the ring buffer - in the example above, 12.  The ConsumerBarrier has a WaitStrategy which it uses to decide how to wait for this sequence number - I won't go into the details of that right now; the code has comments outlining the advantages and disadvantages of each.
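To make the waitFor contract concrete, here's a toy model (my own sketch, not the Disruptor's implementation): a volatile cursor stands in for the ring buffer's published sequence, and waitFor busy-spins - the simplest possible WaitStrategy - until the requested sequence is available.

```java
// Toy sketch only - not Disruptor code.
// The volatile cursor stands in for the ring buffer's published sequence.
public class WaitForSketch {
    private volatile long cursor = -1;   // highest sequence published so far

    void publish(long sequence) {        // producer side: publish up to this sequence
        cursor = sequence;
    }

    long waitFor(long sequence) {        // busy-spin until the sequence is available
        while (cursor < sequence) {
            Thread.onSpinWait();
        }
        return cursor;                   // may be higher than the sequence asked for
    }

    public static void main(String[] args) {
        WaitForSketch barrier = new WaitForSketch();
        barrier.publish(12);                     // the producer has written up to 12
        long availableSeq = barrier.waitFor(9);  // the consumer asks for 9...
        System.out.println(availableSeq);        // ...and is told 12 is available
    }
}
```

Note that waitFor returns whatever is available, not just the sequence asked for - that's where the batching in the next section comes from.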

Now what?
So the consumer has been hanging around waiting for more stuff to get written to the ring buffer, and it's been told what has been written - entries 9, 10, 11 and 12.  Now that they're there, the consumer can ask the ConsumerBarrier to fetch them.


As it's fetching them, the Consumer is updating its own cursor.

You should start to get a feel for how this helps to smooth latency spikes - instead of asking "Can I have the next one yet?  How about now?  Now?" for every individual item, the Consumer simply says "Let me know when you've got more than this number", and is told in return how many more entries it can grab.  Because these new entries have definitely been written (the ring buffer's sequence has been updated), and because the only things trying to get to these entries can only read them and not write to them, this can be done without locks.  Which is nice.  Not only is it safer and easier to code against, it's much faster not to use a lock.
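The batch grab itself is just a loop from the consumer's own sequence up to the one waitFor returned. Here's a toy sketch of that loop (my own illustration, not the real API - a String array stands in for the ring buffer):

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of the batch read: wait once, then drain every entry up to
// the returned sequence without any further coordination.
public class BatchReadSketch {
    static final int SIZE = 16;                   // power of two, so we can use a mask
    static final String[] ring = new String[SIZE];

    static List<String> drain(long nextSequence, long availableSeq) {
        List<String> batch = new ArrayList<>();
        while (nextSequence <= availableSeq) {
            batch.add(ring[(int) (nextSequence & (SIZE - 1))]); // read-only access: no lock needed
            nextSequence++;                                     // the consumer's own cursor moves on
        }
        return batch;
    }

    public static void main(String[] args) {
        for (int i = 9; i <= 12; i++) {
            ring[i & (SIZE - 1)] = "entry-" + i;  // pretend the producer wrote entries 9-12
        }
        System.out.println(drain(9, 12));         // [entry-9, entry-10, entry-11, entry-12]
    }
}
```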

And the added bonus - you can have multiple Consumers reading off the same RingBuffer, with no need for locks and no need for additional queues to coordinate between the different threads.  So you can really run your processing in parallel with the Disruptor coordinating the effort.

The BatchConsumer is an example of consumer code, and if you implement the BatchHandler you can get the BatchConsumer to do the heavy lifting I've outlined above.  Then it's easy to deal with a whole batch of entries (e.g. 9-12 above) in one go, without having to fetch each one individually.
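To show the shape of that division of labour, here's a hypothetical handler interface in the spirit of BatchHandler - the interface and method names below are my own invention, not the real Disruptor API, so check the interface in the version you're using.  The framework calls you back once per entry, then once at the end of the batch, so expensive work (flushing, say) happens once per batch rather than once per entry.

```java
// Hypothetical sketch - this imitates the idea of a batch handler,
// but is NOT the real Disruptor API.
public class HandlerSketch {
    interface Handler<T> {
        void onAvailable(T entry);  // called once per entry
        void onEndOfBatch();        // called once after the last entry in the batch
    }

    static <T> void processBatch(Iterable<T> entries, Handler<T> handler) {
        for (T entry : entries) {
            handler.onAvailable(entry);
        }
        handler.onEndOfBatch();     // e.g. flush to disk once for the whole batch
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        processBatch(java.util.List.of(9, 10, 11, 12), new Handler<Integer>() {
            public void onAvailable(Integer entry) { log.append(entry).append(' '); }
            public void onEndOfBatch()             { log.append("flush"); }
        });
        System.out.println(log);    // 9 10 11 12 flush
    }
}
```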

EDIT: Note that version 2.0 of the Disruptor uses different names to the ones in this article.  Please see my summary of the changes if you are confused about class names.

8 comments:

  1. Nice, nice post - love the diagrams! After reading every single scrap of documentation on the Disruptor pattern, this post was the one that gave me my final "Aha!" moment. Thanks!

  2. Trisha, you should just draw the diagrams on the board and take photos :-)
    I am still trying to digest the idea of the design here: is its goal to have just one writer, so that the code is thread-safe without locks?

    Thanks,
    Doug

  3. You should only have a single thread writing to a single variable at any time, to avoid the need for locks (see the section on modifying entries in the wiring post).

    Yes, this is to prevent the use of locks as locks are terrible for performance. But the design around multiple Consumers/EventHandlers is to also allow you to run things in parallel without contention.

  4. Hi Trisha, just read this post. It's great; but I'm wondering about visibility guarantees. If there's no locking, how would a consumer see changes made to a message earlier by a producer?

    Replies
    1. The short version is - because of the way the sequence number is updated, the Java Memory Model guarantees that all the writes which happened before the sequence number was updated are visible. Therefore when the consumer sees a sequence number (let's say 14), it can be certain that all the values in the event at slot 14 are correct.
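To illustrate this guarantee with a minimal example of my own (not Disruptor code): in Java, a volatile write publishes every plain write made before it, so a reader that observes the volatile value is guaranteed to see those earlier writes too.

```java
// Minimal illustration of the happens-before edge: the volatile write to
// sequence publishes the earlier plain write to value.
public class VisibilitySketch {
    static long value;                    // plain field: stands in for the event's contents
    static volatile long sequence = -1;   // volatile field: stands in for the published sequence

    public static void main(String[] args) throws InterruptedException {
        Thread producer = new Thread(() -> {
            value = 42;      // 1. write the event's contents
            sequence = 14;   // 2. then publish the sequence (volatile write)
        });
        producer.start();
        while (sequence < 14) {           // volatile read: spin until published
            Thread.onSpinWait();
        }
        // The Java Memory Model guarantees the earlier plain write is now visible.
        System.out.println(value);        // prints 42
        producer.join();
    }
}
```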

  5. Hi Trisha, we have been trying to figure out how the different WaitStrategy implementations work; where can we find material on this?

    Replies
    1. Hi Ravi,

      I see you've posted a message to the google group, which is definitely the best place to get help. The only material I personally have is the following:

      Blocking: Uses a lock. This strategy can be used when throughput and low latency are not as important as CPU resource.
      BusySpin: Hard on the CPU. Fast, reduces jitter. Best to tie to a specific core.
      Sleeping: Spins, yields, then parks. The best compromise, but has spikes.
      Yielding: Spins, then yields. A compromise, with fewer spikes.

      Hope that helps!
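The Yielding idea above can be sketched like this (a toy illustration of mine, not the real YieldingWaitStrategy): spin a bounded number of times for the lowest latency, then fall back to Thread.yield() to stop hogging the core.

```java
// Toy sketch of the "spins then yields" idea - not the real YieldingWaitStrategy.
public class YieldingWaitSketch {
    private volatile long cursor = -1;    // highest published sequence

    void publish(long sequence) {
        cursor = sequence;
    }

    long waitFor(long sequence) {
        int spins = 100;                  // arbitrary spin budget for illustration
        while (cursor < sequence) {
            if (spins > 0) {
                spins--;                  // burn a few cycles first: lowest latency
            } else {
                Thread.yield();           // then give up the core: fewer wasted cycles
            }
        }
        return cursor;
    }

    public static void main(String[] args) {
        YieldingWaitSketch barrier = new YieldingWaitSketch();
        new Thread(() -> barrier.publish(5)).start();  // producer publishes from another thread
        System.out.println(barrier.waitFor(5));        // prints 5 once published
    }
}
```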
