olivier deheurles

Just another programming weblog

Archive for the 'Hardware' Category

High performance Journaler

As you may know, I’ve spent some time porting the Disruptor to .NET and I’m now quite happy with the current version. If you want to have a look, you can find the source code on GitHub or play with the assemblies available on NuGet.

Of course, feel free to fork and send pull requests; I will do my best to merge your new goodies into the main repo.

I’m now planning to implement other core components described in Martin’s/Mike’s presentation. The next one on the list is the journaler.

Here is a quick overview of this component:

  • it is responsible for persisting to disk every single message (byte array) received from the network adapter
  • it consumes messages from a ring buffer (disruptor) and should store them to disk as fast as possible (minimizing latency and maximizing throughput)
  • it is a single-threaded component, writing to disk serially
  • since the disruptor can deliver messages in batches, the journaler should take advantage of this mechanism: if multiple messages are available in the ring buffer, the journaler thread should pick up as many messages as possible before flushing to disk (a nice way to absorb message bursts)
  • the journaler should be designed with mechanical sympathy in mind: data is sent to disk in blocks, so filling each block as much as possible before flushing to disk sounds like a good idea
  • the journaler should minimize allocations (GC pressure) and ideally be allocation-free
  • of course, messages should be persisted to disk in a binary format that allows them to be read back later, and the journaler API should offer both read and write operations
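
To make the points above a bit more concrete, here is a minimal sketch of the batching and binary-format ideas. It is written in Python purely for brevity and is not the planned .NET implementation. Everything in it is an assumption made for illustration: the `Journaler` class, the `write_batch` and `read_journal` names, the 4-byte little-endian length prefix and the 4096-byte default block size. The batch is passed in as a plain list so the sketch does not depend on a ring buffer.

```python
import os
import struct

# Assumed record format: 4-byte little-endian length prefix, then the payload.
_LEN = struct.Struct("<I")


class Journaler:
    """Hypothetical single-threaded journaler writing length-prefixed records."""

    def __init__(self, path, block_size=4096):
        self._file = open(path, "ab")       # append-only journal file
        self._buf = bytearray(block_size)   # reused for every batch to limit allocations
        self._used = 0                      # number of bytes currently staged in the block

    def write_batch(self, messages):
        """Stage every message of the batch, then force the data to disk once."""
        for msg in messages:
            self._append(msg)
        self._drain()
        self._file.flush()
        os.fsync(self._file.fileno())       # one sync per batch, not one per message

    def _append(self, msg):
        needed = _LEN.size + len(msg)
        if needed > len(self._buf) - self._used:
            self._drain()                   # block is full: write it out first
        if needed > len(self._buf):
            # Message larger than a whole block: write it directly.
            self._file.write(_LEN.pack(len(msg)))
            self._file.write(msg)
            return
        _LEN.pack_into(self._buf, self._used, len(msg))
        start = self._used + _LEN.size
        self._buf[start:start + len(msg)] = msg
        self._used += needed

    def _drain(self):
        """Write the staged part of the block to the file and reset the block."""
        if self._used:
            self._file.write(memoryview(self._buf)[:self._used])
            self._used = 0

    def close(self):
        self._file.close()


def read_journal(path):
    """Yield each persisted message in write order; stop at a truncated tail."""
    with open(path, "rb") as f:
        while True:
            header = f.read(_LEN.size)
            if len(header) < _LEN.size:
                return
            (length,) = _LEN.unpack(header)
            payload = f.read(length)
            if len(payload) < length:
                return
            yield payload
```

The point of the design is that the cost of the sync is paid once per batch: the larger the burst delivered by the ring buffer, the more messages share a single flush. A real implementation would also have to deal with file rolling, block alignment and recovery, which this sketch ignores.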

You can find a more in-depth overview of the Journaler and other LMAX components in Martin Fowler’s review of the LMAX architecture.

I’m going to create a new GitHub project to host this component. My plan is to first define the journaler API, then build a performance test project, and then play around with different implementations. I would like this component to be stand-alone and have no dependency on third-party components (disruptor included).

If you are interested and want to contribute let me know: mail at odeheurles dot com


Hardware, Parallel computing and stuff…

About one month ago I watched a video describing the architecture of LMAX (Financial Exchange) and realized that I did not know that much about hardware. I did some research and found really good documents, blogs and videos, and I thought it might be a good idea to share my findings…

First, I would recommend having a look at LMAX - How to Do 100K TPS at Less than 1ms Latency, presented by Martin Thompson and Michael Barker. In about one hour it gives a pretty good overview of the challenges you face when building an HPC system with a high level of contended concurrency. The comments below the video are also worth a look for more details on their architecture.

Then there is an excellent paper from Ulrich Drepper (Red Hat): What Every Programmer Should Know About Memory. It presents current commodity hardware architectures, focusing on:

  • RAM - don’t be put off by this first part, you can skip most of it,
  • CPU Caches, cache coherency protocols, etc. – very interesting and important to understand too
  • Virtual memory
  • the second half of the document focuses on “what programmers can do” (must read) and “Memory performance tools” (relevant if you are working on Linux systems).

Paul E. McKenney, who works on the Linux kernel, is writing a book on parallel programming: Is Parallel Programming Hard, And, If So, What Can You Do About It? I won’t give any feedback yet because I have not finished it, but I can already say that it’s really worth reading.

I have also read several white papers from Herb Sutter (one of the big names in C++); you can find them on his web site, in the books and articles section. If you prefer videos, there is the “Machine Architecture: Things Your Programming Language Never Told You” video on YouTube, with the corresponding PDF slides.

Interested in parallel programming and want to learn more about wait-free, lock-free, obstruction-free algorithms, etc.? If so, you should really go to Dmitry Vyukov’s website 1024cores and read the introduction and articles in order; they are pretty quick to read but very informative. You will also find lots of algorithms and data structures for parallel programming there: MUST READ. You should subscribe to his blog as well.

All documents and videos above are presented from the perspective of Linux/C++/Java but are very relevant even for a Windows/.NET developer.

Now if you want to learn more about Windows and .NET I would recommend:

  • Joe Duffy’s blog and book are must-reads for parallel programming on Windows,
  • Interested in the internals of the .NET garbage collector? Maoni Stephens, who works on the GC, presents the latest evolutions of the .NET 4 GC in a Channel9 video; read her blog for more details.
  • Patrick Dussud, one of the Microsoft Technical Fellows, one of the CLR founders and chief architect of the .NET garbage collector, has some videos on the .NET GC internals here and here. You will notice that I’m not the only French guy with a horrible English accent ;)
  • I’ve also downloaded the open source code of the .NET platform (Rotor). You should really have a look. For instance, you can find the C++ source code of the execution engine (\sscli20\clr\src\vm) and the corresponding BCL code in .NET (\sscli20\clr\src\bcl\system). Did you ever wonder how .NET objects are stored internally? Where the implementation of “extern” methods for the BCL classes lives? It’s there! The document Object Internals is a good companion when you start browsing this large code base.
  • CLR: Vance Morrison’s blog, and Jeffrey Richter’s blog and book CLR via CSharp
  • Windows internals: Mark Russinovich, Technical Fellow and Windows kernel guru, has a good video, “Inside Windows 7”, on Channel9; his book Windows Internals is the absolute reference for the core of the Windows OS.

Other interesting stuff to read:

  • Ring Buffer implementations (CPU Cache friendly + optimisations): MCRingBuffer and LibertyQueue white papers,
  • False sharing: here and here (Herb Sutter strikes back)

One last thing: I’ve been reading quite a lot lately, and it’s pretty hard to keep track of what you have read, what you would like to read, etc.

  • I’ve found the Read It Later service extremely useful: it integrates with your browser (Chrome for me), so with a single click you add a document to your reading list, and your mobile devices (iPhone and iPad for me) get synchronized automatically. It works quite well for HTML pages.
  • For PDFs on iPad/iPhone, GoodReader is a must. Note that it can connect to your Dropbox if you have one, which is very useful for sharing documents between devices.

Enjoy…
