olivier deheurles

Just another programming weblog

Archive for the 'Middleware' Category

A first look at CQRS or How push can help you quit smoking

I’ve heard several colleagues of mine talking about CQRS, and I spent some time today looking into it.

If you want to understand what it is about please have a look here:
- Axon Framework (Java)
- NCQRS (.NET)
- Udi Dahan has a detailed post, “Clarified CQRS”

Let’s have a look at the architecture diagram (I will use Udi’s):

Figure 1: Udi’s CQRS architecture diagram

We are currently using this type of architecture in several areas of the project I’m working on, but the flow is actually different: the user does not pull the data, it is pushed to him (or, to be more accurate, the client pulls the data once and is then notified of changes to stay in sync).

Let’s look at a real use case: the user is using a UI to trade in real time. When the application starts up, the UI retrieves the list of trades for this user and, at the same time, notifies the server of its interest in trade events (i.e. the server stores a subscription for this user).

Figure 2: Application start-up, trade subscription

You might say it does not look like Udi’s diagram… Wait! We will get there.

Ok, now let’s look at what happens if the user decides to trade:

  • a trade request (command) is sent to an execution engine, which will do lots of fancy checks (does this client have enough credit? Is the price valid? etc.). There are 2 possible outputs to this workflow: all checks pass and the trade is done, or one of the checks fails and the trade is rejected. We will focus here on the happy path
  • as soon as the decision has been taken, several things need to happen:
    • the user needs to be notified of the success or failure of the transaction
    • the back office system needs to be notified (this is where the trades will actually be booked, or stored in a big database if you prefer)
    • other users need to be notified of this trade: for instance the salesperson responsible for this client and other users belonging to the same client (client=company) want to be aware of this trade.

Figure 3: execution flow

Now the picture looks like the CQRS one, but there is one major difference: the guy in the picture is looking the other way. No, not this one… The trade notifications have been pushed to all users affected by the state change. If you look at the CQRS architecture, the client is pulling the data (the query).

Let’s review a typical user experience for a classic data-driven (or CQRS-oriented) app:

  • Bob opens order view with order 1
  • Sara opens order view with order 1
  • Bob modifies order 1
  • Sara has stale data and she does not know it yet
  • Sara continues doing some stuff on order 1, has a coffee and a cigarette, and finally commits her work 20 minutes later
  • At this stage 2 things can happen:
    • the developers did not think their system could be used by more than 1 user at the same time and Bob just lost all his work: it has been overwritten by Sara and he does not even know that it happened (he may figure it out at the end of the month, when all the orders are reviewed)
    • the developers did a better job and implemented optimistic locking, and Sara is happy to learn that she cannot save her work and has to do it again. She smashes her keyboard (which surprises Bob, sitting right next to her) and goes outside for another cigarette.
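
To make the second outcome concrete, here is a minimal Python sketch of optimistic locking with a version number. It is only an illustration: `OrderStore`, `StaleOrderError` and the field names are made up for this example and do not come from any particular framework.

```python
class StaleOrderError(Exception):
    """Raised when a save is based on an outdated version of the order."""


class OrderStore:
    """In-memory store; every order carries a version number."""

    def __init__(self):
        self._orders = {}  # order_id -> (version, data)

    def create(self, order_id, data):
        self._orders[order_id] = (1, dict(data))

    def load(self, order_id):
        version, data = self._orders[order_id]
        return version, dict(data)

    def save(self, order_id, expected_version, data):
        current_version, _ = self._orders[order_id]
        if current_version != expected_version:
            # Someone else saved in the meantime: reject instead of overwriting.
            raise StaleOrderError(
                f"order {order_id} is at version {current_version}, "
                f"not {expected_version}"
            )
        self._orders[order_id] = (current_version + 1, dict(data))
        return current_version + 1


store = OrderStore()
store.create(1, {"quantity": 10})

bob_version, bob_copy = store.load(1)
sara_version, sara_copy = store.load(1)

bob_copy["quantity"] = 20
store.save(1, bob_version, bob_copy)        # Bob commits first

sara_copy["quantity"] = 5
try:
    store.save(1, sara_version, sara_copy)  # Sara's copy is stale
    sara_saved = True
except StaleOrderError:
    sara_saved = False
```

Bob's change survives, and Sara finds out about the conflict only when she tries to save, which is exactly the frustrating part.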

In both scenarios you could improve the user experience by polling the server to see if the data has changed and notifying the user if that is the case, but this kind of architecture is not scalable, at all.

Pushing the data to the client when it changes is, from my perspective, a far better way to handle the problem:

  • users are notified in real time when a particular piece of information has changed: there is still a time window where the data is stale for some users, but this window is on the order of a second (or a millisecond if you add a few million $), not an undetermined amount of time. Optimistic locking is still required but conflicts are less likely to happen,
  • this significantly improves collaboration between users. And instead of notifying only when the data has changed, users can be told who is currently working on the same piece of data with the same mechanism: Sara sees in the status bar that “Bob is modifying this order” and she asks him to tell her when he is done with this order. Sara is not stressed and does not need a cigarette,
  • this architecture is a lot more scalable than a pull model and network traffic is significantly reduced.
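
The "pull once, then get pushed" flow can be sketched in a few lines of Python. This is a single-process toy, not a real messaging setup: `TradeServer` and its methods are hypothetical names, and the callbacks stand in for whatever push transport you would actually use.

```python
from collections import defaultdict


class TradeServer:
    """Keeps trades per client and pushes every new trade to subscribers."""

    def __init__(self):
        self._trades = defaultdict(list)       # client -> list of trades
        self._subscribers = defaultdict(list)  # client -> callbacks

    def snapshot_and_subscribe(self, client, callback):
        # The UI pulls once (snapshot) and registers its interest in changes.
        self._subscribers[client].append(callback)
        return list(self._trades[client])

    def book_trade(self, client, trade):
        self._trades[client].append(trade)
        # Push: every interested user is notified, nobody has to poll.
        for callback in self._subscribers[client]:
            callback(trade)


server = TradeServer()
server.book_trade("acme", "trade-1")

bob_view, sara_view = [], []
bob_view.extend(server.snapshot_and_subscribe("acme", bob_view.append))
sara_view.extend(server.snapshot_and_subscribe("acme", sara_view.append))

server.book_trade("acme", "trade-2")   # both views are updated immediately
```

After the second trade both views contain the same two trades without either user asking for them.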

I don’t understand why CQRS, which uses asynchronous notification mechanisms server side, does not apply the same techniques to the client.

It is not harder to have the UI layer notified asynchronously: there are now lots of enterprise-grade technologies out there, accessible to everybody: 0MQ, RabbitMQ and lots of HTTP push (or Comet) servers can be used to implement what I just described here, on a LAN or over the web.

A push system would probably delay Sara’s cancer by a good couple of years…

Nirvana 6 – What’s new?

My-channels is currently working on version 6 of Nirvana, which should be released at the end of Q1 or the beginning of Q2. I’m going to quickly run through the new features:

Direct Data Delivery

Direct Data Delivery (DDD) is the main new addition to the product: it is a new message delivery paradigm.

Before starting, a bit of background: as you may know, Nirvana is used by several banks within their SDP (Single Dealer Platform). What is an SDP, you may ask? Most financial institutions offer their clients trading applications available on the Web, generally via a Rich Internet Application (Silverlight, Flex, HTML, etc.) or a Rich Client (Java, .NET). Clients can see the bank’s prices and trade in real time. In these environments you have pretty serious requirements for messaging. You need to:

  • push prices over internet (HTTP generally mandatory to cross firewalls and proxies) to clients,
  • have guaranteed message delivery for trade execution and order management.

Nirvana covers these requirements but it is still challenging to design and implement the price distribution.

Why?

  • because prices are generally distributed by tier levels associated with client groups, or even with a specific price per client. Mapping this distribution topology to JMS Topics (called Channels in Nirvana) is not always easy: if a client can only see one tier level, you need to make sure he can’t see the other levels by properly defining Channel granularity and ACLs.
  • you may have to create channels or queues dynamically at runtime; this requires the Nirvana admin API, adding a bit of complexity to the system.
  • to summarize, the JMS model fits pretty well when you need to communicate between server-side applications over a LAN or WAN: you just have to define a static Channel/Queue schema with static ACLs. But when you are dealing with external users it’s different: you have to grant them ACLs dynamically when they authenticate.

Direct Data Delivery (DDD) is going to be very useful for this kind of scenario. Let me explain how it works: when the client API connects to the realm it can choose to subscribe to a Data Stream; a client can have only one Data Stream per active Nirvana session. On the publisher side you can create, with the required ACLs, another type of resource called a Data Group and associate the data streams of different clients with these data groups. You can then publish messages to a data group and Nirvana will automatically deliver them to the clients contained within the group.

How does it help in the SDP scenario?

  • no need to create channels dynamically with the Admin API: data groups can be created dynamically with the Nirvana client API,
  • client-side implementation is simplified: you no longer have to subscribe to the different channels used for price distribution, everything can be managed securely server-side,
  • it’s very easy to move a client from one tiering group to another: remove its data stream from one data group and add it to another one,
  • DDD will support all the useful features required to distribute prices efficiently: last tick cache with delta delivery enabled (snapshot + partial updates) and conflation.
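
The data group / data stream relationship described above can be pictured with a small in-memory Python model. To be clear, this is not the Nirvana 6 API (which I am not reproducing here): `DataStream`, `DataGroup` and the group names are hypothetical, and the sketch only shows who receives what.

```python
class DataStream:
    """Stands in for the single data stream of one client session."""

    def __init__(self, client):
        self.client = client
        self.received = []


class DataGroup:
    """Publisher-side group; publishing fans out to the member streams."""

    def __init__(self, name):
        self.name = name
        self._members = set()

    def add(self, stream):
        self._members.add(stream)

    def remove(self, stream):
        self._members.discard(stream)

    def publish(self, message):
        for stream in self._members:
            stream.received.append(message)


tier1 = DataGroup("prices/tier1")
tier2 = DataGroup("prices/tier2")
alice = DataStream("alice")
bob = DataStream("bob")

tier1.add(alice)
tier2.add(bob)
tier1.publish("EURUSD 1.3001")
tier2.publish("EURUSD 1.3005")

# Moving bob to another tier is a publisher-side change only:
# bob's client code does not subscribe to anything new.
tier2.remove(bob)
tier1.add(bob)
tier1.publish("EURUSD 1.3002")
```

The point of the design is visible in the last three lines: the client never learns which tier it belongs to, it just keeps reading its own stream.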

API Batching

The client API now provides batching for channel find and find + subscribe operations. Very useful again in client scenarios (latency is higher over the internet than on a LAN, so batching is always helpful).

A new .NET API with RX Extensions

Nirvana is a Java product and the .NET API has been designed based on the Java API, so it looks a bit “Java” in the eyes of a .NET developer ;-)

The new .NET API has been redesigned to provide a better experience for .NET developers and also provides access to the DDD and batching functionalities.

On top of that you can plug in Rx (Microsoft Reactive Extensions) to get Nirvana event streams exposed as IObservable. I’m not going to explain what Rx is here: you can find lots of information on the DevLabs web site, on the forum or, for instance, on the blogs of some of my mates here and here.

Stay tuned…

Nirvana Part V: Merge Engine and Delta Delivery

In the previous post we saw how to use a channel key to keep the last value of a sequence of events. If you have a channel key defined on a channel, you can use an optimization called delta delivery to reduce the bandwidth used in and out of Nirvana. Instead of publishing a new event for each update, you publish only the changes; Nirvana merges these updates into the current value and dispatches the changes to the different subscribers.

There are some restrictions to be able to use the merge engine:

  • as I said previously you need a channel key defined,
  • only the properties of the event can be merged, this does not work if you’re using the byte array payload,
  • you need a flat data structure, the merge engine does not work with nested properties

Publisher

You just need to create an nRegisteredEvent on the channel to be able to publish partial updates. Here is a sample:

[Image not preserved: delta]

You need to publish a full update (snapshot) as the first update to properly initialize the last value cache.

Subscriber

You can continue using the same API when using delta delivery: the client API will receive a snapshot on subscription and will merge the updates, providing you with a reconstructed nConsumeEvent on each update.

Additionally, if you want to get the partial update, you can use the nRegisteredEventListener’s update method to get an nConsumeEvent containing the updated fields only.
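
Since the original code samples were screenshots, here is a rough Python model of what the merge engine does conceptually. It deliberately does not use the Nirvana classes: `MergingChannel` and its callbacks are invented for illustration, and events are plain flat dictionaries.

```python
class MergingChannel:
    """Keeps one merged value per key and forwards updates to subscribers."""

    def __init__(self, key_name):
        self._key_name = key_name
        self._last_values = {}   # key value -> merged flat dict
        self._listeners = []

    def subscribe(self, on_full, on_delta):
        self._listeners.append((on_full, on_delta))
        # A new subscriber first receives a snapshot of each current value.
        for value in self._last_values.values():
            on_full(dict(value))

    def publish(self, properties):
        key = properties[self._key_name]
        merged = self._last_values.setdefault(key, {})
        merged.update(properties)        # flat merge, no nested structures
        for on_full, on_delta in self._listeners:
            on_full(dict(merged))        # reconstructed event
            on_delta(dict(properties))   # changed fields only


full_events, delta_events = [], []
channel = MergingChannel("symbol")
channel.publish({"symbol": "EURUSD", "bid": 1.3000, "ask": 1.3002})  # snapshot
channel.subscribe(full_events.append, delta_events.append)
channel.publish({"symbol": "EURUSD", "bid": 1.3001})                 # delta
```

The subscriber ends up with the full reconstructed price in `full_events` and only the changed field (plus the key) in `delta_events`, which mirrors the two ways of consuming updates described above.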

[Image not preserved: update]

Nirvana Part IV: Channel Keys

We reviewed in Part I how Nirvana stores events within a channel and how you can control event expiration using TTL (time to live) and capacity. Another way to purge events is to use channel keys.

Let’s say we have a channel containing blog posts. Each time a new post is added we publish a message to this channel. If we modify a blog post, we don’t care about the previous version: we just want to keep the last value of the blog post in the channel. To achieve that we just have to define a channel key named “blogID” on the channel, and when we publish a blog post we define a property (nEventProperty) blogID and set its value to the identifier of the post.

Let’s say we publish a blog post message for blogID=1. If we modify the blog post and then publish a new message for blogID=1, Nirvana will automatically remove the previous version from the channel. So now, if somebody subscribes to this channel, they will automatically get the latest version only (and not 2 or more versions of the same blog post).

When creating a channel key you can specify how many “versions” of each element you want to keep: there is a depth property associated with each channel key.

In the previous example, if we specify a depth of 5 for our blogID channel key, Nirvana will keep the last 5 versions for each blogID.

[Image not preserved: channelKey]

Since channels can store state, you want to make sure your channels are not growing indefinitely: using a channel key with depth 1 you are guaranteed that only the latest value is kept on the channel.
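
The purge rule is easy to model. The Python sketch below is an assumption-laden simplification (the `KeyedChannel` class is made up, and a real channel would replay events in event ID order rather than grouped by key), but it shows the effect of the depth setting.

```python
from collections import defaultdict, deque


class KeyedChannel:
    """Stores at most `depth` events per value of the channel key."""

    def __init__(self, key_name, depth=1):
        self._key_name = key_name
        # A bounded deque drops the oldest version when a new one arrives.
        self._events = defaultdict(lambda: deque(maxlen=depth))

    def publish(self, event):
        self._events[event[self._key_name]].append(event)

    def replay(self):
        """What a new subscriber would receive (grouped by key here)."""
        return [e for versions in self._events.values() for e in versions]


channel = KeyedChannel("blogID", depth=1)
channel.publish({"blogID": 1, "title": "Draft"})
channel.publish({"blogID": 2, "title": "Other post"})
channel.publish({"blogID": 1, "title": "Final"})
latest = channel.replay()

history = KeyedChannel("blogID", depth=5)
for revision in range(7):
    history.publish({"blogID": 1, "revision": revision})
kept_revisions = [e["revision"] for e in history.replay()]
```

With depth 1 only the "Final" version of post 1 remains; with depth 5 the two oldest of the seven revisions are gone.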

Using a channel as a distributed cache

A channel with a channel key is nothing less than a distributed cache where your data is indexed by the channel key. If you have a clustered channel at your disposal, your data is reliably replicated over the different nodes of the cluster. And since this is a channel, your consumers get notified each time an element changes in the cache.

Depending on the size of the cached data, you may want to store your messages in memory (reliable or simple channel types) or on disk (persistent channel type).

Building a local cache on top of a channel

Depending on the use case you may want to build an in-memory copy of the channel data: this way you can do quick lookups in your cache instead of going back to Nirvana. If your instance fails, you just have to subscribe to the channel again to restore the current state of your cache.

Detecting stale data

If the data stored in your channel is not static (i.e. it can change) you probably want to make sure it is always up to date. Imagine that the publisher of the information fails: the data would no longer be published to the channel and your consumer would be using stale data. To detect this failure you can, for instance, decide to publish heartbeats on a predefined value of the channel key. For instance, if blogID is your channel key, you publish on a regular basis an event with a property blogID=heartbeat. On the subscriber side, you keep track of the heartbeats and make sure they are properly received. If you miss several heartbeats you can decide how to react (this completely depends on your use case…)
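
A subscriber-side heartbeat check could look like the Python sketch below. The `HeartbeatMonitor` class, the interval and the "3 missed heartbeats" threshold are all assumptions chosen for the example; time is passed in explicitly so the behaviour is easy to follow.

```python
class HeartbeatMonitor:
    """Flags the feed as stale when too many heartbeats are missed."""

    def __init__(self, interval_seconds, max_missed=3):
        self._interval = interval_seconds
        self._max_missed = max_missed
        self._last_seen = None

    def on_event(self, event, now):
        # Heartbeats are published on a reserved value of the channel key.
        if event.get("blogID") == "heartbeat":
            self._last_seen = now

    def is_stale(self, now):
        if self._last_seen is None:
            return True  # nothing received yet: assume the worst
        return now - self._last_seen > self._interval * self._max_missed


monitor = HeartbeatMonitor(interval_seconds=1.0, max_missed=3)
monitor.on_event({"blogID": "heartbeat"}, now=100.0)
stale_after_2s = monitor.is_stale(now=102.0)    # 2 s of silence: still fine
stale_after_4s = monitor.is_stale(now=104.5)    # more than 3 intervals missed
```

What you do once `is_stale` returns true (grey out the screen, reconnect, raise an alert) is the use-case-specific part.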

Nirvana Part III: Subject Filtering

Subject filtering has been implemented quite recently in Nirvana and has been enhanced in the latest version (5.5).

Subject filtering allows the publisher to decide which consumer(s) the message will be delivered to.

Before looking at subject filtering we need to define what a subscriber name is. Each client needs to provide a username when connecting to Nirvana.

If you don’t specify the username yourself when creating the session, the client API will specify a default one for you. The username is used, for instance, with ACLs: you can entitle resources (queues and channels) based on username and host:

[Image not preserved: sampleACL]

You can specify the username with the following method of the nSessionFactory class:

[Image not preserved: nSessionFactory]

To use subject filtering you just need to use one of the following methods of the nConsumeEvent class:

[Image not preserved: subscriberName]

Use the first method if you want to distribute the message to a single user, the second one if you need to specify a list of receivers.

Request/Response

Subject filtering is very useful to implement request/response scenarios. Let’s see what we need:

  • a request queue or channel (depending on whether we need load balancing on requests or not); this will be used by our requester to publish its request message
  • a response channel to distribute the response
  • the requester tags the request message with a correlation ID (a GUID will do the job perfectly) – you can use a property of the nConsumeEvent to store this ID
  • the responder extracts this correlation ID and applies it to the response, so the requester can match the request and the response
  • the responder extracts the username of the requester (using the getPublisherUser() method on the nConsumeEvent)
  • the responder uses subject filtering to dispatch the message to the requester only - multiple consumers could be subscribed at the same time on the response channel and you probably don’t want all of them to receive the response!
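
The whole request/response recipe fits in a short Python simulation. It does not call the Nirvana API: `ResponseChannel`, `respond` and the dictionary keys (`correlationId`, `publisherUser`) are placeholders I chose to stand in for the event property and for what getPublisherUser() would return.

```python
import uuid


class ResponseChannel:
    """Delivers an event only to the subscriber name it is addressed to."""

    def __init__(self):
        self._subscribers = {}   # username -> list of received events

    def subscribe(self, username):
        self._subscribers[username] = []
        return self._subscribers[username]

    def publish(self, event, subscriber_name):
        # Subject filtering: consumers with another username see nothing.
        if subscriber_name in self._subscribers:
            self._subscribers[subscriber_name].append(event)


def respond(request, responses):
    """Responder: copy the correlation ID and address the requester only."""
    reply = {"correlationId": request["correlationId"],
             "body": request["body"].upper()}
    responses.publish(reply, subscriber_name=request["publisherUser"])


responses = ResponseChannel()
alice_inbox = responses.subscribe("alice")
bob_inbox = responses.subscribe("bob")

request = {"correlationId": str(uuid.uuid4()),
           "publisherUser": "alice",   # what the realm reports as publisher
           "body": "ping"}
respond(request, responses)

matched = [r for r in alice_inbox
           if r["correlationId"] == request["correlationId"]]
```

Alice finds exactly one response carrying her correlation ID, and Bob, although subscribed to the same response channel, receives nothing.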

[Image not preserved: request-response]

To keep in mind…

  • Since the publisher can decide which consumers the messages will be delivered to, you don’t have to worry about subscribers’ ACLs
  • A side effect of subscriber names affects Nirvana Enterprise Manager: if you try to Snoop a channel where messages are dispatched using subject filtering, you won’t see anything. Snooping a channel just subscribes to the channel, so if the messages are dispatched with subscriber names that do not match your username, they are not dispatched to you.

Nirvana Part II : Queues

The queue is the second main construct available in Nirvana to distribute messages between producer and consumer applications.

Unlike channels, queues dispatch each event to one and only one consumer, and reads are destructive: when an event is sent to a subscriber, it is automatically removed (popped) from the queue. It is also possible for entitled consumers to browse the queue content using the peek operation.

Implementing load balancing using a queue

Queues are particularly useful to implement load balancing between applications: if multiple consumers are subscribed to the queue, events will be load balanced between subscribers using a round-robin algorithm. You will likely want to use a queue to load balance events between multiple active instances of an application.

Example:

This example describes a load balancing scenario between one producer and 2 consumers (C1 and C2):

  • C1 is started first and subscribes to the queue
  • the producer is then started and publishes event1 to the queue
  • since there is only one subscriber on the queue the event is directly dispatched to C1 and the event is removed from the queue
  • C2 is now started and subscribes to the queue
  • the producer publishes event2
  • this time there are two subscribers on the queue: the event is dispatched to only one of them, chosen by the round-robin algorithm
  • C2 is stopped (or C2 process fails)
  • the producer publishes event3
  • Nirvana has detected that the connection with C2 has been lost: C2 is no longer subscribed to the queue so the event is dispatched to C1
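
Here is the same scenario as a small Python simulation, with one extra event published while both consumers are connected so that each of them receives something. `RoundRobinQueue` is a made-up class, and the exact rotation order inside a real realm is an assumption on my side.

```python
from collections import deque


class RoundRobinQueue:
    """Each event goes to exactly one subscriber, then leaves the queue."""

    def __init__(self):
        self._pending = deque()
        self._consumers = deque()   # rotation order

    def subscribe(self, name, inbox):
        self._consumers.append((name, inbox))
        self._drain()

    def unsubscribe(self, name):
        self._consumers = deque(c for c in self._consumers if c[0] != name)

    def publish(self, event):
        self._pending.append(event)
        self._drain()

    def _drain(self):
        while self._pending and self._consumers:
            _, inbox = self._consumers[0]
            self._consumers.rotate(-1)               # next consumer goes first
            inbox.append(self._pending.popleft())    # destructive read


c1, c2 = [], []
queue = RoundRobinQueue()
queue.subscribe("C1", c1)
queue.publish("event1")     # only C1 is subscribed
queue.subscribe("C2", c2)
queue.publish("event2")     # two subscribers: one of them gets it
queue.publish("event3")     # ...and the other one gets the next event
queue.unsubscribe("C2")     # C2 stops or fails
queue.publish("event4")     # only C1 is left
```

Every event is delivered exactly once, and nothing is lost when C2 disappears because the remaining subscriber simply takes over.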

Queue attributes

You can refer to the previous post about channels: the attributes for queues and channels are the same.

Note about queues

It is good to keep in mind that queues are more expensive for Nirvana than channels: destructive reads require additional synchronization, especially in a clustered environment.

Nirvana Part I: Channels

This first post provides a basic understanding of the Channel, which is one of the two main types of resources used to distribute messages (or events) between producer and consumer applications.

A channel provides the same basic behavior as a JMS topic and adds some features on top of it. You basically use a channel when you want to distribute events to all your consumer applications.

Nirvana does not use multicast to distribute the event to the different consumers: the realm (Nirvana server) has a TCP connection with each consumer and the fan-out occurs at realm level.

A channel can be created manually using Nirvana Enterprise Manager or using the administration API. Actually Nirvana Enterprise Manager is “just” a GUI on top of the admin API, so keep in mind that everything you can do via the admin GUI can be done programmatically in Java, .NET or C++ using the Admin API.

Channel Attributes

When creating a channel you need to consider the attributes of the channel; here is a screen capture of the Create Channel screen in the enterprise manager:

[Image not preserved: Create Channel screen]

Here is a quick summary of the different attributes:

  • Channel name: each channel and queue needs to be identified by a unique name. You can organize your channels and queues hierarchically in a set of folders by separating folder name and channel name with a /. Example: folderLevel1/folderLevel2/channelName.
  • Channel TTL: defines the time to live of the events in milliseconds. When an event is published on a channel it will be automatically deleted from the channel after the TTL expires. TTL=0 defines an infinite time to live.
  • Channel Capacity: defines the maximum number of events that can be stored on the channel. Capacity=0 defines a channel without capacity limit.
  • Channel type: controls how events are stored by the realm.
  • Dead Event Store: if an event expires and has not been processed, you can use this attribute to redirect the event to another channel or queue.
  • Channel Keys: will be described in a separate post
  • Use Merge Engine: will be described in a separate post

Type | Events stored in memory | Events stored on disk | Comment
Persistent | No | Yes | Events are stored on disk (one file per channel). For performance reasons events are not deleted automatically, they are just marked as deleted. To delete them physically, the maintenance feature needs to be run on the channel (Admin GUI or Admin API).
Reliable | Yes | Event ID only | The event is stored on the heap of the realm. Event IDs are guaranteed to be incremental on the channel even if the realm is restarted: a file per channel is used to store the ID.
Simple | Yes | No | Same as reliable, but the event ID is not stored on disk: restarting the realm resets event IDs to 0.
Mixed | Yes/No | Yes/No | The publisher can choose on a per-event basis whether the event should be persisted to disk or kept in memory. The TTL of the event can be overridden as well.
Transient | No | No | This is the most lightweight channel type. In this mode the events are never stored at realm level: if you publish an event it will be distributed only to currently subscribed consumers.

More information can be found online on my-channels web site: http://www.my-channels.com/developers/nirvana/common/channelattributes.html

Understanding event lifetime

TTL, Capacity and Channel Type allow you to control how events are stored on the channel and for how long. To get a better understanding, here is an example.

We create a channel of type Simple with a TTL of 10 seconds.

  1. A first consumer subscribes on the channel
  2. A producer publishes an event to the channel
  3. T=0 - The message is sent to the realm and directly dispatched to the first consumer,
  4. Since the channel is configured to store events (simple type stores them in memory) the event stays in the channel for the configured time to live (10 seconds)
  5. T=5 - After 5 seconds another consumer subscribes to the channel: the event is automatically dispatched to this consumer*
  6. T=10 – The TTL has expired, the event is removed from the channel and available for garbage collection on the realm.
  7. T=12 – Another consumer subscribes; the event is not dispatched since it has been removed from the channel.

Note*: I have intentionally simplified step 5: when subscribing to a channel you can specify from which event you want to subscribe. Each event currently stored on the channel with an Event ID greater than the specified ID will automatically be dispatched to the subscriber. Subscribing with an Event ID of 0 will always dispatch all the events available on the channel at subscription time.
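
The timeline above can be replayed with a tiny Python model of a Simple channel. It is only a sketch: `SimpleChannel` is a made-up class, time is passed in by hand, and I assume the start ID is inclusive so that subscribing from 0 replays everything, as in the note.

```python
class SimpleChannel:
    """Stores events in memory with a TTL and replays them to late subscribers."""

    def __init__(self, ttl_seconds):
        self._ttl = ttl_seconds
        self._events = []        # (event_id, published_at, payload)
        self._next_id = 0

    def publish(self, payload, now):
        self._events.append((self._next_id, now, payload))
        self._next_id += 1

    def subscribe(self, now, start_id=0):
        """Return the stored events that are still alive at time `now`."""
        # TTL = 0 means the events never expire.
        alive = [e for e in self._events
                 if self._ttl == 0 or now - e[1] < self._ttl]
        self._events = alive     # expired events leave the channel
        # Assumption: start_id is inclusive, so 0 replays every stored event.
        return [payload for event_id, _, payload in alive
                if event_id >= start_id]


channel = SimpleChannel(ttl_seconds=10)
channel.publish("event", now=0)          # T=0
seen_at_5 = channel.subscribe(now=5)     # T=5: still within the TTL
seen_at_12 = channel.subscribe(now=12)   # T=12: the event has expired
```

The consumer arriving at T=5 still gets the event, while the one arriving at T=12 gets nothing, matching steps 5 and 7.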

Glossary

  • Realm: this is the name given to a Nirvana server. Multiple realms can be configured to form a Nirvana cluster.
  • Event ID: when publishing an event to a channel or a queue an Event ID is assigned automatically to the event by the realm. This is an incremental integer.