olivier deheurles

Just another programming weblog

Archive for October, 2010

Nirvana Part V: Merge Engine and Delta Delivery

In previous post we have seen how we can use a channel key to keep the last value of a sequence of events. If you have a channel key defined on a channel you can use an optimization to reduce the bandwidth used inbound and outbound of Nirvana called delta delivery. Instead of publishing a new event for each update, you can publish only the changes and Nirvana will merge these updates within the current value and dispatch the changes to the different subscribers.

There are some restrictions to be able to use the merge engine:

  • as I said previously you need a channel key defined,
  • only the properties of the event can be merged, this does not work if you’re using the byte array payload,
  • you need a flat data structure, the merge engine does not work with nested properties

Publisher

You just need to create a nRegisteredEvent on the channel to be able to publish partial updates, here is a sample:

delta

You need to publish a full update (snapshot) on the first update to properly initiate the last value cache.

Subscriber

You can continue using the same API when using delta delivery: the client API will receive a snapshot on subscription and will merge the updates providing you with a reconstructed nConsumeEvent on each update.

Additionally if you want to get the partial update, you can use the nRegisteredEventListener’s update method to get a nConsumeEvent containing updated fields only.

update

Comments are off for this post

Nirvana Part IV: Channel Keys

We have reviewed in Part I how Nirvana stores events within a channel and how you can control events expiration using TTL (time to live) and capacity. An other way to purge events is to use channel keys.

Let say we have a channel containing blog posts. Each time a new post is add we publish a message to this channel. If we modify a blog post, we don’t care about the previous version, we just want to keep the last value of the blog post in the channel. To achieve that we just have to define a channel key named “blogID” on the channel and when we publish a blog post we define a property (nEventProperty) blogID and set the value to the identifier of the post.

Let say we publish a blog post message for the blogID=1. If we modify the blog post and then publish a new message for blogID=1, Nirvana will automatically remove the previous version from the channel. So now, if somebody subscribes on this channel it will automatically get the latest version only (and not 2 or more versions of the same blog post).

When creating a channel key you can specify how many “versions” of each element you want to keep: there is a depth property associated with each channel key.

In the previous example, if we specify a depth of 5 for our blogID channel key, Nirvana will keep the last 5 versions for each blogID.

channelKey

Since channels can store state, you want to make sure you channels are not growing indefinitely: using a channel key with depth 1 you are guaranteed that only the latest value is kept on the channel.

Using a channel as a distributed cache

A channel with a channel key is nothing less than a distributed cache where your data is indexed by the channel key. If you dispose of a clustered channel, your data is reliably replicated over the different nodes of the cluster. And since this is a channel, your consumers get notified each time an element changes in the cache.

Depending on the size of the cached data, you may want to store your messages in memory (reliable or simple channel types) or on disk (persistent channel type).

Building a local cache on top of a channel

Depending on the use case you may want to build an in-memory copy of the channel data, this way you can do quick lookups in your cache instead of going back to Nirvana. If your instance fails you just have to subscribe again on the channel to restore the current state of your cache.

Detecting stale data

If the data stored in your channel is not static (ie. can change) you probably want to make sure it is always up to date. Imagine that the publisher of the information fails: the data would no longer be published to the channel and your consumer would be using stale data. To detect this failure you can for instance decide to publish heartbeats on a predefined value of the channel key. For instance if blogID is your channel key you publish on a regular basis an event with a property blogID=heartbeat. On the subscriber side, you keep track of the heartbeats and make sure it is properly received. If you miss several heartbeats you can decide how to react (this completely depends of your use case…)

Comments are off for this post

Nirvana Part III: Subject Filtering

Subject filtering has been implemented quite recently in Nirvana and has been enhanced in the latest version (5.5).

Subject filtering allows the publisher to decide to which consumer(s) the message will be delivered to.

Before looking at subject filtering we need to define what subscriber name is. Each client needs to provide a username when connecting to Nirvana.

If you don’t specify yourself the username when creating the session the client API will specify a default one for you. Username is for instance used with ACLs: you can entitle resources (queues and channels) based on username and host:

sampleACL

You can specify the username with the following method of the nSessionFactory class:

nSessionFqctory

To use subject filtering you just need to use one of the following methods of the nConsumeEvent class:

subscriberName

Use the first method if you want to distribute the message to a single user, the second one if you need to specify a list of receivers.

Request/Response

Subject filtering is very useful to implement request/response scenarios. Let see what we need:

  • a request queue or channel (depends if we need load balancing on request or not), this will be used by our requester to publish its request message
  • a response channel to distribute the response
  • the requester tags the request message with a correlation ID (a GUID will perfectly do the job) – you can use a property of the nConsumeEvent to store this ID.
  • The responder extracts this correlation ID and applies it on the response, so the requester can match together the request and the response.
  • The responder extracts the username of the requester (using the getPublisherUser() method on the nConsumeEvent)
  • The responder uses subject filtering to dispatch the message to the requester only - multiple consumers could be subscribed at the same time on the response channel and you probably don’t want all of them to receive the response!

request-response

To keep in mind…

  • Since the publisher can decide to which consumers the messages will be delivered you don’t have to worry about subscriber’s ACLs
  • A side effect of subscriber name affects Nirvana Enterprise Manager: if you try to Snoop a channel where messages are dispatched using subject filtering you won’t see anything. Using snoop over a channel just subscribes on the channel so if the messages are dispatched with subscriber names not matching your username, messages are not dispatched to you.
Comments are off for this post

Nirvana Part II : Queues

Queue is the second main construct available in Nirvana to distribute messages between producer and consumer applications.

Unlike channels, queues dispatch events to one and only one consumer and reads are destructive: when an event in send to a subscriber, the event is automatically removed (popped) from the queue. It is as well possible for entitled consumers to browse the queue content using peek operation.

Implementing load balancing using a queue

Queues are particularly useful to implement load balancing between applications: if multiple consumer are subscribed on the queue, events will be load balanced between subscriber using a round-robin algorithm. You will likely want to use a queue to load balance events between multiple active instances of an application.

Example:

This example describes a load balancing scenario between one producer and 2 consumers (C1 and C2):

  • C1 is started first and subscribes to the queue
  • the producer is then started and publishes event1 to the queue
  • since there is only one subscriber on the queue the event is directly dispatched to C1 and the event is removed from the queue
  • C2 is now started and subscribed to the queue
  • the producer publishes event2
  • this time there are two subscribers on the queue: the event is randomly dispatched to one of the consumers
  • C2 is stopped (or C2 process fails)
  • the producer publishes event3
  • Nirvana has detected that the connection with C2 has been lost: C2 is no longer subscribed to the queue so the event is dispatched to C1

Queues attributes

You can refer to the previous post about channels, attributes for queues and channels are the same.

Note about queues

It is good to keep in mind that queues are more expensive for Nirvana than channels: destructive reads requires additional synchronization, especially in a clustered environment.

Comments are off for this post

Nirvana Part I: Channels

This first post provides a basic understanding of the Channel, which is one of the two main types of resources used to distribute messages (or events) between producer and consumer applications.

A channel provides the same basic behaviors than a JMS topic and adds some features on top of it. You basically use a channel when you want to distribute events to all your consumer applications.

image

Nirvana does not use multicast to distribute the event to the different consumers: the realm (Nirvana server) has a TCP connection with each consumer and the fan-out occurs at realm level.

A channel can be created manually using Nirvana Enterprise Manager or using the administration API. Actually Nirvana Enterprise Manager is “just” a GUI on top of the admin API, so keep in mind that everything you can do via the admin GUI can be done programmatically in Java, .NET or C++ using the Admin API.

Channel Attributes

When creating a channel you need to consider the attributes of the channel; here is a screen capture of the Create Channel screen in the enterprise manager:

image

Here is a quick summary of the different attributes:

  • Channel name: each channel and queue needs to be identified by a unique Name. You can store hierarchically your channels and queues in a set of folders by separating folder name and channel name by a /. Example: folderLevel1/folderLevel2/channelName.
  • Channel TTL: defines the time to live of the event in milliseconds. When an event is published on a channel it will be automatically deleted of the channel after the TTL expires. TTL=0 defines an infinite time to live.
  • Channel Capacity: defines the maximum number of events that can be stored on the channel. Capacity=0 defines a channel without capacity limit.
  • Channel type: controls how events are stored by the realm.
  • Dead Event Store: If an event expires and has not been processed you can use this attribute to redirect the event on an other channel or queue.
  • Channel Keys: will be described in a separate post
  • Use Merge Engine: will be described in a separate post
Type Events stored in memory Events stored on disk Comment
Persistent No Yes Events are stored on disk (one file per channel). For performance reason events are not deleted automatically, they are just marked as deleted. To delete them physically the maintenance feature need to be used on the channel (Admin GUI or Admin API)
Reliable Yes Event ID only The event is stored on the heap of the realm. Event IDs are guaranteed to be incremental on the channel even if the realm is restarted: a file per channel is used to store the ID.
Simple Yes No Same as reliable but the event ID is not stored on disk, restarting the realm resets event IDs to 0.
Mixed Yes/No Yes/No The publisher can choose on a per event basis if the event should be persisted to disk or in Memory. The TTL of the event can be overridden as well.
Transient No No This is the most lightweight channel type. In this mode the events are never stored at realm level: if you publish an event it will be distributed only to currently subscribed consumers.

More information can be found online on my-channels web site: http://www.my-channels.com/developers/nirvana/common/channelattributes.html

Understanding events lifetime

TTL, Capacity and Channel Type allow you to control how events are stored on the channel and for how long. To get a better understanding, here is an example.

We create a channel of type Simple with a TTL of 10sec.

  1. A first consumer subscribes on the channel
  2. A producer publishes an event to the channel
  3. T=0 - The message is send to the realm and directly dispatched to the first consumer,
  4. Since the channel is configured to store events (simple type stores them in memory) the event stays in the channel for the configured time to live (10 seconds)
  5. T=5 - After 5 seconds an other consumer subscribes on the channel: the event is automatically dispatched to this consumer*
  6. T=10 – The TTL has expired, the event is removed from the channel and available for garbage collection on the realm.
  7. T=12 – An other consumer subscribes, the event is not dispatched since it has been removed from the channel.

Note*: I have intentionally simplified step 5: when subscribing to a channel you can specify from which event you want to subscribe. Each event currently stored on the channel with an Event ID greater than the specified ID will automatically be dispatched to the subscriber. Subscribing with an Event ID of 0 will always dispatch all the events available on the channel at subscription time.

Glossary

  • Realm: this is the name given to a Nirvana server. Multiple realm can be configured to form a Nirvana cluster.
  • Event ID: when publishing an event to a channel or a queue an Event ID is assigned automatically to the event by the realm. This is an incremental integer.
1 comment