This post is an answer to these tweets.
Context: Martin has been describing a technique to implement arrays of structs in Java and Kelly is looking for something similar in .NET.
As Martin pointed out, an array of structures in .NET offers the same characteristics:
- reduced memory footprint compared to an array of objects
- GC friendly
An array of objects (reference types) is allocated on the GC heap and each cell contains a reference to an object (or a null reference), allocated on the GC heap as well.
The reference itself is 32 or 64 bits, depending on the architecture. Each object also has a header of 64 or 128 bits, depending, again, on the architecture.
An array of structures is also allocated on the GC heap, but in that case each structure value is stored directly in the array and the structures do not have headers.
The overhead of an array of objects compared to an array of structures is:
Overhead = PTR_SIZE * 3 * ELEMENT_COUNT
where PTR_SIZE is the size of a pointer (i.e. 32 or 64 bits) and ELEMENT_COUNT the number of elements in the array. The factor 3 accounts for the reference stored in the array cell plus the two-pointer object header.
For small classes (ie. with few fields) the difference is significant.
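To put numbers on the formula, here is a small sketch (assuming a 64-bit process, i.e. 8-byte references and a 16-byte object header; the class and figures are illustrative):

```csharp
using System;

class OverheadDemo
{
    // Per-element overhead of an array of objects compared to an array of
    // structs: one reference in the array cell plus the two-pointer object
    // header, i.e. 3 * PTR_SIZE per element.
    static long Overhead(long ptrSizeInBytes, long elementCount)
    {
        return ptrSizeInBytes * 3 * elementCount;
    }

    static void Main()
    {
        // 1 million elements in a 64-bit process (8-byte pointers):
        Console.WriteLine(Overhead(8, 1000000)); // 24000000 bytes, ~23 MB
    }
}
```

For a class holding a single int field, that is 24 bytes of pure overhead per element against 4 bytes of payload.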
Impact on GC
When you allocate an array of structures of size N you actually allocate enough memory to store N instances of the structure on the GC heap.
var arrayOfStruct = new DateTime[100];
creates an array of DateTime (DateTime is a struct) of size 100, i.e. 100 contiguous blocks of size sizeof(DateTime).
Writing to the array
var aStruct = new DateTime(2013, 12, 12);
arrayOfStruct[0] = aStruct;
The first line creates a new DateTime; since this is a struct it is not allocated on the heap but lives on the stack.
The second line assigns the date to element 0 of the array, effectively copying the value into element 0 of the array.
Those 2 lines do not allocate anything on the GC heap.
Mutating a structure within the array
In this example we have completely replaced the element at index 0, but if the structure is mutable it is also possible to mutate any element in place:
var arrayOfMutableStruct = new MutableStruct[100];
arrayOfMutableStruct[0].Field1 = 5;
Normally any method returning a value type returns a copy, but arrays are very special: in the previous example arrayOfMutableStruct[0] is effectively the instance stored within the array and not a copy of it, thus the assignment will work as expected and Field1 will have value 5 (within the array).
Another way of mutating a struct contained within an array is to use the ref keyword in a parameter of a function:
public struct MutableStruct
{
    public int Field1;

    public override string ToString()
    {
        return Field1.ToString();
    }
}

static void Main()
{
    var array = new MutableStruct[100];
    Mutate(ref array[0]);
    Console.WriteLine(array[0]); // prints 32
}

static void Mutate(ref MutableStruct mutable)
{
    mutable.Field1 = 32;
}
Note: it is generally advised to design only immutable structures in .NET (just google it to find out why).
Back to the GC now: we have seen several ways of modifying the data contained in an array of structs, and none of them allocates anything on the heap (i.e. none of them triggers any GC).
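A quick way to convince yourself is a sketch like the following (the Point struct is my own example), which fills and mutates a million elements while watching the gen0 collection count:

```csharp
using System;

class AllocationFreeDemo
{
    struct Point { public int X; public int Y; }

    static void Main()
    {
        var points = new Point[1000000]; // single up-front allocation
        int gen0Before = GC.CollectionCount(0);

        // Writing and mutating a million elements is pure value-type work:
        // nothing new lands on the GC heap.
        for (int i = 0; i < points.Length; i++)
        {
            points[i] = new Point { X = i, Y = -i };
            points[i].X += 1;
        }

        Console.WriteLine(GC.CollectionCount(0) - gen0Before); // typically 0
    }
}
```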
GC impact, part 2
OK, we have seen that it is possible to write algorithms which read and write to an array of structs without triggering any GC. This is interesting, but what if another part of the system triggers a GC: is our array of structs an expensive data structure for the GC to analyse or not?
The answer is that if your structure does not contain any reference to a reference type, the cost is extremely low: when the GC looks at the array it checks its type. If the element type of the array is a value type (structure) which does not reference any reference type, the GC has finished its job for the full array and does not have to visit its elements. Maoni Stephens (.NET GC dev) explains this behavior of the GC in a Channel9 video. I just had a quick look and could not find it anymore… sorry :(
For an array of objects this is a different story: the GC has to go through each element.
NOTE: we often see algorithms which are “allocation free/GC free” on some specific data structures and think “That’s great, if I stick that in my codebase it will cost nothing to the GC”, but that’s wrong. Maybe the algo won’t trigger GCs, but another part of our system can trigger a GC, and during that GC, traversing the data structure in question might be expensive (i.e. long GC pauses).
Arrays of structures are probably the most GC-friendly multi-element data structures in .NET: it is possible to store a large amount of data, mutate it and read from it without triggering any GC (i.e. allocation free). And on top of that, even if another part of the system triggers a GC, the array of structs is very quick to analyse.
As you may know I’ve spent some time porting the disruptor to .NET and I’m now quite happy with the current version. If you want to have a look you can find the source code on GitHub or play with the assemblies available on NuGet.
Of course feel free to fork and send pull requests, I will do my best to merge your new goodies into the main repo.
I’m now planning to implement other core components described in Martin’s/Mike’s presentation. The next one on the list is the journaler.
Here is a quick overview of this component:
- it is responsible for persisting to disk every single message (byte array) received from the network adapter
- it is consuming messages from a ring buffer (disruptor) and should store messages to disk as fast as possible (minimize latency and maximize throughput)
- it is a single-threaded component, writing to disk serially
- since the disruptor can deliver messages in batches, the journaler should take advantage of this mechanism: if multiple messages are available in the ring buffer the journaler thread should pick as many messages as possible before flushing to disk (a nice mechanism to absorb message bursts)
- the journaler should be designed with mechanical sympathy in mind: data is sent to disk by block, so filling blocks as much as possible before flushing to disk sounds like a good idea
- the journaler should minimize allocations (GC pressure) and ideally be alloc free
- of course messages should be persisted to disk in a binary format that allows them to be read later on, and the journaler API should offer both read and write functionality
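To make the requirements above concrete, here is a hypothetical sketch of what the API could look like; the interface and member names are my own assumptions, not the project's actual design:

```csharp
using System;

// Hypothetical shape for the journaler described above.
public interface IJournaler : IDisposable
{
    // Called once per message consumed from the ring buffer; endOfBatch
    // signals that no more messages are immediately available, which is
    // the natural point to fill the current block and flush to disk.
    void Write(byte[] message, bool endOfBatch);
}

public interface IJournalReader : IDisposable
{
    // Sequentially replays persisted messages, e.g. for recovery.
    bool TryReadNext(out byte[] message);
}
```

The endOfBatch flag mirrors the batching hint the disruptor already provides to its event handlers, which is what lets the journaler absorb bursts without flushing on every message.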
You can find a more in depth overview of the Journaler and other LMAX components in Martin Fowler’s review of the LMAX architecture.
I’m going to create a new GitHub project to host this component. My plan is to first define the journaler API, then build a performance test project and then I will play around to test different implementations. I would like this component to be stand-alone and have no dependency on third party components (disruptor included).
If you are interested and want to contribute let me know: mail at odeheurles dot com
It is not always easy to find time to learn new things, and the best time I have found is on London transport: like lots of people I spend between 1 hour and 1 hour and a half per day on the tube, and I’m trying to spend this time intelligently, i.e. not playing Angry Birds.
If you think about it, that’s 6-8 hours per week you can spend learning new stuff! That’s a lot…
Computer science books are generally massive beasts and I do not want to carry them on the tube, so I generally use my iPad instead. GoodReader and DropBox are my preferred combo: I just organize my books within the DropBox folder on my PC and then synchronize GoodReader with the book folder; it works very well over Wi-Fi or even 3G.
I watch lots of videos, my preferred sources are:
- Channel9: videos are available in mp4 directly so you do not need to do any conversion to put them on iPhone or iPad
- MIT OpenCourseWare (OCW): you can directly access lots of MIT undergraduate and graduate courses for free, and some of them are available in video format. Example: Introduction to Algorithms contains 25 lectures, all available in mp4 format.
- iTunes U: you have probably already used iTunes, but maybe not iTunes U. Instead of browsing music or videos you can access courses from the best universities. Example: Programming Massively Parallel Processors with CUDA, 16 lectures on NVidia CUDA. iTunes U is really frictionless: subscribe to a course, synchronize with iPhone/iPad, done.
When I find videos which are not in mp4 format I use the MediaCoder video encoder: it’s free, easy to use and configure, it has templates for iPhone and iPad screen resolutions, can use your GPU (fast) and supports batch encoding: you set up several videos to encode and MediaCoder will process them one after the other automatically.
I’ve heard several colleagues of mine talking about CQRS and I spent some time today looking into it.
Let’s have a look at the architecture diagram (I will use Udi’s):
Figure 1: Udi’s CQRS architecture diagram
We are currently using this type of architecture in several areas of the project I’m working on, but the flow is actually different: the user does not poll for data, it is pushed to him (or to be more accurate the client pulls the data once and is then notified of changes to stay in sync).
Let’s look at a real use case: the user is using a UI to trade in real time. When the application starts up the UI retrieves the list of trades for this user and, at the same time, notifies the server of his interest in trade events (i.e. the server stores a subscription for this user).
Figure 2: Application start-up, trade subscription
You might say it does not look like Udi’s diagram… Wait! We will get there.
Ok, now let’s look at what happens if the user decides to trade:
a trade request (command) is sent to an execution engine which will do lots of fancy checks (does this client have enough credit? Is the price valid? etc.). There are 2 possible outcomes to this workflow: all checks pass and the trade is done, or one of the checks fails and the trade is rejected. We will focus here on the happy path.
As soon as the decision has been taken several things need to happen:
- the user needs to be notified of the success or the failure of the transaction
- the back office system needs to be notified (this is where the trades will actually be booked, or stored in a big database if you prefer)
- other users need to be notified of this trade: for instance the sales person responsible for this client and other users belonging to the same client (client = company) want to be aware of this trade.
Figure 3: execution flow
Now the picture looks like the CQRS one, but there is one major difference: the guy on the picture is looking the other way around. No, not this one… The trade notifications have been pushed to all users affected by the state change, whereas if you look at the CQRS architecture the client is polling for the data (the query).
Let’s review a typical user experience for a classical data driven app (or CQRS oriented):
Bob opens order view with order 1
Sara opens order view with order 1
Bob modifies order 1
Sara has stale data and she does not know yet
Sara continues doing some stuff on order 1, has a coffee and a cigarette and finally commits her work 20 minutes later
At this stage 2 things can happen:
- developers did not think their system could be used by more than 1 user at the same time, and Bob just lost all his work: it has been overridden by Sara and he does not even know that it happened (he may figure it out at the end of the month when they review all the orders)
- developers did a better job and implemented optimistic locking, and Sara is happy to learn that she cannot save her work and has to do it again. She smashes her keyboard (which surprises Bob, sitting right next to her) and goes outside for another cigarette.
In both scenarios you could improve the user experience by polling the server to see if the data has changed and notifying the user if it has, but this kind of architecture does not scale at all.
Pushing the data to the client when it changes is, from my perspective, a far better way to handle the problem:
- users are notified in real time when a particular piece of information has changed: there is still a time window where the data is stale for some users, but this window is on the order of a second (or a millisecond if you add a few million $), not an undetermined amount of time. Optimistic locking is still required but conflicts are less likely to happen,
- this significantly improves user collaboration. And instead of being notified only when the data has changed, users can be told who is currently working on the same piece of data with the same mechanisms: Sara sees in the status bar that “Bob is modifying this order” and asks him to tell her when he is done with it. Sara is not stressed and does not need a cigarette,
- this architecture is a lot more scalable than a pull model and network traffic is significantly reduced
I don’t understand why CQRS, which uses an asynchronous notification mechanism server side, does not apply the same techniques on the client.
It is not that hard to have the UI layer notified asynchronously: there are now lots of enterprise-grade technologies out there accessible to everybody: 0MQ, RabbitMQ, and lots of HTTP push (or COMET) servers can be used to implement what I just described here, on a LAN or over the web.
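To illustrate the idea in its simplest in-process form (a real system would push over one of the transports above; OrderStore and its event are my own illustration, not a framework API):

```csharp
using System;
using System.Collections.Generic;

// Minimal sketch: clients pull the order once, then receive pushed updates.
class OrderStore
{
    private readonly Dictionary<int, string> _orders = new Dictionary<int, string>();
    public event Action<int, string> OrderChanged; // the push channel

    public string Get(int orderId)
    {
        string value;
        return _orders.TryGetValue(orderId, out value) ? value : null;
    }

    public void Update(int orderId, string newValue)
    {
        _orders[orderId] = newValue;
        OrderChanged?.Invoke(orderId, newValue); // notify every subscriber
    }
}

class Demo
{
    static void Main()
    {
        var store = new OrderStore();
        // Sara pulls once, then stays in sync via push notifications.
        string sarasView = store.Get(1);
        store.OrderChanged += (id, value) => { if (id == 1) sarasView = value; };

        store.Update(1, "amended by Bob");
        Console.WriteLine(sarasView); // "amended by Bob" - no stale data
    }
}
```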
A push system would probably delay Sara’s cancer by a good couple of years…
LMAX team released the concurrency framework at the heart of their system about one month ago and I decided to port it to .NET.
The source code, unit tests and performance tests have all been ported to .NET and the current trunk is in sync with the latest Java version. Performance tests show equivalent levels of performance for the Java and .NET versions.
My-channels is currently working on version 6 of Nirvana, which should be released end of Q1/beginning of Q2. I’m going to quickly run through the new features:
Direct Data Delivery
Direct Data Delivery (DDD) is the main new addition to the product, it is a new message delivery paradigm.
Before starting, a bit of background: as you may know, Nirvana is used by several banks within their SDPs (Single Dealer Platforms). What is an SDP, you may ask? Most financial institutions offer their clients trading applications available on the web, generally via a Rich Internet Application (Silverlight, Flex, HTML, etc.) or a rich client (Java, .NET). Clients can see the bank’s prices and trade in real time. In these environments you have pretty serious messaging requirements: you need to:
- push prices over internet (HTTP generally mandatory to cross firewalls and proxies) to clients,
- have guaranteed message delivery for trade execution and order management.
Nirvana covers these requirements, but it is still challenging to design and implement the price distribution:
- because prices are generally distributed by tier levels associated with client groups, or even with a specific price per client. Mapping this distribution topology to JMS Topics (called Channels in Nirvana) is not always easy: if a client can only see one level of tiering you need to make sure he can’t see other levels, by properly defining channel granularity and ACLs.
- you may have to dynamically create channels or queues at runtime; this requires the Nirvana admin API, adding a bit of complexity to the system.
- to summarize, the JMS model fits pretty well when you need to communicate between server-side applications on a LAN or WAN: you just have to define a static channel/queue schema with static ACLs. But when you are dealing with external users it’s different: you have to grant them ACLs dynamically when they authenticate.
Direct Data Delivery (DDD) is going to be very useful for these kinds of scenarios, so let me explain how it works: when the client API connects to the realm it can choose to subscribe to a Data Stream; clients can have only one Data Stream per active Nirvana session. On the publisher side you can create, with the required ACLs, another type of resource called a Data Group and associate different clients’ data streams with these data groups. You can then publish messages to a data group and Nirvana will automatically deliver them to the clients contained within the group.
How does it help in the SDP scenario?
- no need to dynamically create channels with the Admin API: data groups can be created dynamically with the Nirvana client API,
- client-side implementation is simplified: you no longer have to subscribe to the different channels used for price distribution; everything can be managed server-side, securely,
- it’s very easy to move a client from one tiering group to another: remove its data stream from one data group and add it to another one,
- DDD will support all the useful features required to distribute prices efficiently: last tick cache with delta delivery enabled (snapshot + partial updates) and conflation
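The data group concept can be modelled generically in a few lines (this is only a sketch of the semantics, not the Nirvana API; the class and member names are my own):

```csharp
using System;
using System.Collections.Generic;

// Generic model of the data-group idea: one data stream per client,
// data groups as server-side sets of streams.
class DataGroupModel
{
    private readonly Dictionary<string, HashSet<Action<string>>> _groups
        = new Dictionary<string, HashSet<Action<string>>>();

    public void AddToGroup(string group, Action<string> clientStream)
    {
        HashSet<Action<string>> members;
        if (!_groups.TryGetValue(group, out members))
            _groups[group] = members = new HashSet<Action<string>>();
        members.Add(clientStream);
    }

    public void RemoveFromGroup(string group, Action<string> clientStream)
    {
        _groups[group].Remove(clientStream);
    }

    // Publishing to the group delivers to every member's data stream.
    public void Publish(string group, string message)
    {
        foreach (var stream in _groups[group])
            stream(message);
    }
}
```

Moving a client between tiering groups is then just RemoveFromGroup + AddToGroup, which is exactly the simplification described above.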
The client API now provides batching for finds and find + subscribes on channels. Very useful again in client scenarios (latency is higher over the internet than on a LAN, so batching is always helpful).
A new .NET API with RX Extensions
Nirvana is a Java product and the .NET API was designed based on the Java API, so it looks a bit “Java” to the eyes of a .NET developer ;-)
The new .NET API has been redesigned to provide a better experience for .NET developers and also provides access to the DDD and batching functionalities.
On top of that you can plug in Rx (Microsoft Reactive Extensions) to get Nirvana event streams exposed as IObservable. I’m not going to explain here what Rx is; you can find lots of information on the DevLabs web site, on the forum, or for instance on the blogs of some of my mates here and here.
Stay tuned…
About one month ago I watched a video describing the architecture of LMAX (a financial exchange) and realized that I did not know that much about hardware. I did some research and found some really good documents, blogs and videos, and I thought it might be a good idea to share my findings…
First I would recommend having a look at LMAX – How to Do 100K TPS at Less than 1ms Latency, presented by Martin Thompson and Michael Barker; in about one hour it gives a pretty good overview of the challenges you have to face when building an HPC system with a high level of contended concurrency. The comments below the video are also worth a look for more details on their architecture.
Then there is an excellent paper from Ulrich Drepper (Red Hat): What Every Programmer Should Know About Memory. It presents current commodity hardware architectures, focusing on:
- RAM – don’t be afraid of this first part, you can skip most of it,
- CPU caches, cache coherency protocols, etc. – very interesting and important to understand,
- virtual memory,
- the second half of the document focuses on “what programmers can do” (a must read) and “memory performance tools” (relevant if you are working on Linux systems).
Paul E. McKenney, who works on the Linux kernel, is writing a book on parallel programming: Is Parallel Programming Hard, And, If So, What Can You Do About It? I won’t give feedback yet because I have not finished it, but I can already say that it’s really worth reading.
I have also read several white papers from Herb Sutter (one of the C++ big names); you can find them on his web site, in his books and articles section. If you prefer videos there is the “Machine Architecture: Things Your Programming Language Never Told You” video on YouTube, with the corresponding pdf slides.
Interested in parallel programming and want to learn more about wait-free, lock-free, obstruction-free, etc.? If you are, you should really go to Dmitry Vyukov’s website 1024cores and read the introduction and articles in order; they are pretty quick to read but very informative. You will also find lots of algorithms and data structures for parallel programming – a MUST READ. You should subscribe to his blog as well.
All documents and videos above are presented from the perspective of Linux/C++/Java but are very relevant even for a Windows/.NET developer.
Now if you want to learn more about Windows and .NET I would recommend:
- Joe Duffy’s blog and book are must-reads for parallel programming on Windows,
- Interested in the internals of the .NET garbage collector? Maoni Stephens, who works on the GC, presents the latest evolutions of the .NET 4 GC in a Channel9 video; read her blog for more details.
- Patrick Dussud, one of the Microsoft Technical Fellows, one of the CLR founders and chief architect of the .NET garbage collector, has some videos on the .NET GC internals here and here. You will notice that I’m not the only French guy with a horrible English accent ;)
- I’ve also downloaded the open source code of the .NET platform (Rotor). You should really have a look. For instance you can find the C++ source code of the execution engine (\sscli20\clr\src\vm) and the corresponding BCL code in .NET (\sscli20\clr\src\bcl\system). Have you ever wondered how .NET objects are stored internally? Where the implementation of “extern” methods for the BCL classes lives? It’s there! The document Object Internals is a good companion to start browsing this large code base.
- CLR: Vance Morrison’s blog and Jeffrey Richter’s blog and book CLR via C#
- Windows internals: Mark Russinovich, Technical Fellow and Windows kernel guru, has a good video “Inside Windows 7” on Channel9, and his book Windows Internals is the absolute reference for the core of the Windows OS.
Other interesting stuff to read:
- ring buffer implementations (CPU-cache friendly + optimisations): the MCRingBuffer and LibertyQueue white papers,
- false sharing: here and here (Herb Sutter strikes back)
One last thing: I’ve been reading quite a lot lately and it’s pretty hard to keep track of what you have read, what you would like to read, etc.
- I’ve found the Read It Later service extremely useful: it integrates with your browser (Chrome for me); a single click adds a document to your list of stuff to read, and your mobile devices (iPhone and iPad for me) get synchronized automatically. It works quite well for HTML pages.
- For PDFs on iPad/iPhone, GoodReader is a must. Note that it can connect to your DropBox if you have one; very useful to share documents between devices.
In the previous post we saw how we can use a channel key to keep the last value of a sequence of events. If you have a channel key defined on a channel you can use an optimization called delta delivery to reduce the bandwidth used inbound and outbound of Nirvana. Instead of publishing a new event for each update, you publish only the changes, and Nirvana will merge these updates into the current value and dispatch the changes to the different subscribers.
There are some restrictions to be able to use the merge engine:
- as I said previously, you need a channel key defined,
- only the properties of the event can be merged; this does not work if you’re using the byte array payload,
- you need a flat data structure; the merge engine does not work with nested properties
You just need to create an nRegisteredEvent on the channel to be able to publish partial updates.
You need to publish a full update (snapshot) on the first update to properly initiate the last value cache.
You can continue using the same API when using delta delivery: the client API will receive a snapshot on subscription and will merge the updates providing you with a reconstructed nConsumeEvent on each update.
Additionally, if you want the partial update itself, you can use the nRegisteredEventListener’s update method to get an nConsumeEvent containing the updated fields only.
We reviewed in Part I how Nirvana stores events within a channel and how you can control event expiration using TTL (time to live) and capacity. Another way to purge events is to use channel keys.
Let’s say we have a channel containing blog posts. Each time a new post is added we publish a message to this channel. If we modify a blog post, we don’t care about the previous version; we just want to keep the last value of the blog post in the channel. To achieve that we just have to define a channel key named “blogID” on the channel, and when we publish a blog post we define a property (nEventProperty) blogID and set the value to the identifier of the post.
Let’s say we publish a blog post message for blogID=1. If we modify the blog post and then publish a new message for blogID=1, Nirvana will automatically remove the previous version from the channel. So now, if somebody subscribes to this channel they will automatically get the latest version only (and not 2 or more versions of the same blog post).
When creating a channel key you can specify how many “versions” of each element you want to keep: there is a depth property associated with each channel key.
In the previous example, if we specify a depth of 5 for our blogID channel key, Nirvana will keep the last 5 versions for each blogID.
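The retention semantics can be sketched in plain C# (a model of the behaviour only, not Nirvana’s implementation; the class and method names are my own):

```csharp
using System;
using System.Collections.Generic;

// Models channel-key retention: at most `depth` versions are kept per key;
// older versions are purged on publish.
class ChannelKeyCache<TKey, TEvent>
{
    private readonly int _depth;
    private readonly Dictionary<TKey, Queue<TEvent>> _versions
        = new Dictionary<TKey, Queue<TEvent>>();

    public ChannelKeyCache(int depth) { _depth = depth; }

    public void Publish(TKey key, TEvent evt)
    {
        Queue<TEvent> versions;
        if (!_versions.TryGetValue(key, out versions))
            _versions[key] = versions = new Queue<TEvent>();
        versions.Enqueue(evt);
        while (versions.Count > _depth)
            versions.Dequeue(); // purge the oldest versions
    }

    public IReadOnlyCollection<TEvent> Get(TKey key)
    {
        return _versions[key];
    }
}

class Demo
{
    static void Main()
    {
        var channel = new ChannelKeyCache<int, string>(depth: 1);
        channel.Publish(1, "post v1");
        channel.Publish(1, "post v2"); // v1 is purged automatically
        Console.WriteLine(channel.Get(1).Count); // 1
    }
}
```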
Since channels can store state, you want to make sure your channels are not growing indefinitely: using a channel key with depth 1 you are guaranteed that only the latest value is kept on the channel.
Using a channel as a distributed cache
A channel with a channel key is nothing less than a distributed cache where your data is indexed by the channel key. If you use a clustered channel, your data is reliably replicated over the different nodes of the cluster. And since this is a channel, your consumers get notified each time an element changes in the cache.
Depending on the size of the cached data, you may want to store your messages in memory (reliable or simple channel types) or on disk (persistent channel type).
Building a local cache on top of a channel
Depending on the use case you may want to build an in-memory copy of the channel data; this way you can do quick lookups in your cache instead of going back to Nirvana. If your instance fails you just have to subscribe to the channel again to restore the current state of your cache.
Detecting stale data
If the data stored in your channel is not static (i.e. can change) you probably want to make sure it is always up to date. Imagine that the publisher of the information fails: the data would no longer be published to the channel and your consumers would be using stale data. To detect this failure you can, for instance, decide to publish heartbeats on a predefined value of the channel key: if blogID is your channel key, you publish on a regular basis an event with a property blogID=heartbeat. On the subscriber side, you keep track of the heartbeats and make sure they are properly received. If you miss several heartbeats you can decide how to react (this completely depends on your use case…)
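A minimal sketch of the subscriber-side watchdog (the interval and miss threshold are illustrative values, and the class is my own, not a Nirvana API):

```csharp
using System;

// Tracks heartbeats received on the channel; if more than `missesAllowed`
// intervals elapse without one, the cached data is considered stale.
class HeartbeatMonitor
{
    private readonly TimeSpan _interval;
    private readonly int _missesAllowed;
    private DateTime _lastHeartbeat;

    public HeartbeatMonitor(TimeSpan interval, int missesAllowed)
    {
        _interval = interval;
        _missesAllowed = missesAllowed;
        _lastHeartbeat = DateTime.UtcNow;
    }

    // Call this whenever an event with blogID == "heartbeat" arrives.
    public void OnHeartbeat()
    {
        _lastHeartbeat = DateTime.UtcNow;
    }

    // Poll this periodically; true means the publisher is probably down
    // and the cached data should be treated as stale.
    public bool IsStale(DateTime now)
    {
        return now - _lastHeartbeat
            > TimeSpan.FromTicks(_interval.Ticks * _missesAllowed);
    }
}
```

How to react when IsStale returns true (disable the UI, fail over, alert) is, as said above, entirely use-case dependent.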
Subject filtering has been implemented quite recently in Nirvana and has been enhanced in the latest version (5.5).
Subject filtering allows the publisher to decide to which consumer(s) the message will be delivered to.
Before looking at subject filtering we need to define what a subscriber name is. Each client needs to provide a username when connecting to Nirvana.
If you don’t specify the username yourself when creating the session, the client API will specify a default one for you. The username is for instance used with ACLs: you can entitle resources (queues and channels) based on username and host.
You can specify the username when creating the session via the nSessionFactory class.
To use subject filtering you just need to use the corresponding methods of the nConsumeEvent class: one method if you want to distribute the message to a single user, and another if you need to specify a list of receivers.
Subject filtering is very useful to implement request/response scenarios. Let’s see what we need:
- a request queue or channel (depending on whether we need load balancing on requests or not); this will be used by our requester to publish its request message
- a response channel to distribute the response
- the requester tags the request message with a correlation ID (a GUID will do the job perfectly) – you can use a property of the nConsumeEvent to store this ID
- the responder extracts this correlation ID and applies it to the response, so the requester can match the request and the response together
- the responder extracts the username of the requester (using the getPublisherUser() method on the nConsumeEvent)
- the responder uses subject filtering to dispatch the message to the requester only – multiple consumers could be subscribed at the same time to the response channel and you probably don’t want all of them to receive the response!
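The requester-side correlation logic can be sketched independently of the transport (the Requester class below is my own illustration; Nirvana channels and subject filtering would sit behind the publish callback):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Matches each response back to its request via the correlation ID.
class Requester
{
    private readonly ConcurrentDictionary<Guid, TaskCompletionSource<string>> _pending
        = new ConcurrentDictionary<Guid, TaskCompletionSource<string>>();

    // Publish a request tagged with a fresh correlation ID; the task
    // completes when the matching response arrives.
    public Task<string> Send(Action<Guid, string> publish, string request)
    {
        var correlationId = Guid.NewGuid();
        var tcs = new TaskCompletionSource<string>();
        _pending[correlationId] = tcs;
        publish(correlationId, request);
        return tcs.Task;
    }

    // Called when a response event arrives on the response channel.
    public void OnResponse(Guid correlationId, string response)
    {
        TaskCompletionSource<string> tcs;
        if (_pending.TryRemove(correlationId, out tcs))
            tcs.TrySetResult(response); // match response to its request
    }
}

class Demo
{
    static void Main()
    {
        var requester = new Requester();
        // Loopback "transport": the responder echoes with the same ID.
        var task = requester.Send(
            (id, req) => requester.OnResponse(id, req + " -> done"), "trade");
        Console.WriteLine(task.Result); // trade -> done
    }
}
```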
To keep in mind…
- since the publisher can decide to which consumers the messages will be delivered, you don’t have to worry about subscribers’ ACLs
- a side effect of subject filtering affects the Nirvana Enterprise Manager: if you try to snoop a channel where messages are dispatched using subject filtering you won’t see anything. Snooping a channel just subscribes to it, so if the messages are dispatched with subscriber names not matching your username, they are not dispatched to you.