Hazelcast: Keep your cluster close, but cache closer

Hazelcast has a neat feature called Near Cache. Whenever clients talk to Hazelcast servers, each get/put is a network call, and depending on how far away the cluster is, these calls may get pretty costly.

The idea of Near Cache is to bring data closer to the caller and keep it in sync with the source, which is why it is highly recommended for data structures that are mostly read.
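To make the idea concrete, here is a toy model in plain Clojure. It is only a conceptual sketch, not how Hazelcast implements Near Cache: remote, near and all the function names are made up for this illustration, and the "network call" is just a counter.

```clojure
;; A conceptual sketch only: NOT Hazelcast's implementation.
;; `remote` stands in for the cluster, `near` for the client-side cache.

(def remote (atom {}))        ;; pretend this lives across the network
(def near   (atom {}))        ;; local, in the caller's memory
(def remote-calls (atom 0))   ;; counts simulated network round trips

(defn remote-get [k]
  (swap! remote-calls inc)    ;; every remote read costs a round trip
  (get @remote k))

(defn near-get
  "Reads from the local cache; on a miss fetches remotely and remembers the value."
  [k]
  (if (contains? @near k)
    (get @near k)
    (let [v (remote-get k)]
      (swap! near assoc k v)
      v)))

(defn remote-put!
  "Writes to the source and invalidates the locally cached entry."
  [k v]
  (swap! remote assoc k v)
  (swap! near dissoc k))

(remote-put! 41 41)
(near-get 41)          ;; miss: one simulated round trip
(near-get 41)          ;; hit: served from local memory
(remote-put! 41 42)    ;; the write invalidates the cached entry
(near-get 41)          ;; miss again: the fresh value is fetched
```

Reads that hit the local map never touch the counter, which is the whole point: only misses and invalidated entries pay for a trip to the source.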

Hazelcast Near Cache

Near Cache can be created / configured on both the server and the client side.

Optionally, Near Cache keys can be stored on the file system and then preloaded when the client restarts.

The examples below are run from a Clojure REPL and use chazel, which is a Clojure library for Hazelcast. To follow along you can:

$ git clone https://github.com/tolitius/chazel
$ cd chazel
$ boot dev
boot.user=> ;; ready for examples

In case you don’t have boot installed, it is a one-liner install.

Server Side Setup

We’ll use two different servers, not too far from each other, so that the network latency is enough to show clearly how Near Cache can help.

On the server side we’ll create an "events" map (which will start the server if it was not yet started), and will add 100,000 pseudo events to it:

;; these are done on the server:
 
(def m (hz-map "events"))
 
(dotimes [n 100000] (put! m n n))

We can visualize all these puts with hface:

hface putting 100,000 entries

Client Side Without Near Cache

On the client side we’ll create a function to walk over the first n keys in the "events" map:

(defn walk-over [m n]
  (dotimes [k n] (get m k)))

Create a new Hazelcast client instance (without Near Cache configured), and walk over the first 100,000 events (twice):

(def hz-client (client-instance {:hosts ["10.x.y.z"]}))
 
(def m (hz-map "events" hz-client))
 
(time (walk-over m 100000))
=> "Elapsed time: 30534.997599 msecs"
 
(time (walk-over m 100000))
=> "Elapsed time: 30547.810322 msecs"

Each iteration took roughly 30.5 seconds, and monitoring the server’s network shows that it was sending packets back and forth for every get:

Hazelcast with no Near Cache

We can see that all these packets came from, and correlate well with, the "events" map:

hface putting 100,000 entries

Client Side With Near Cache

Now let’s create a different client and configure it with Near Cache for the "events" map:

(def client-with-nc (client-instance {:hosts ["10.x.y.z"]
                                      :near-cache {:name "events"}}))

Let’s repeat the exercise:

(def m (hz-map "events" client-with-nc))
 
(time (walk-over m 100000))
=> "Elapsed time: 30474.719965 msecs"
 
(time (walk-over m 100000))
=> "Elapsed time: 102.141527 msecs"

The first iteration took 30.5 seconds as expected, but the second, and all the subsequent ones, took about 100 milliseconds. That’s because Near Cache kicked in, and all these events are now close to the client: they are in the client’s memory.
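To put these timings in perspective, here is a bit of back-of-the-envelope arithmetic on the numbers above. The per-get-ms helper is made up for this illustration; the elapsed times are the ones printed by the REPL.

```clojure
(def gets 100000)

(defn per-get-ms
  "Average cost of a single get, in milliseconds."
  [elapsed-ms]
  (/ elapsed-ms gets))

(per-get-ms 30474.719965)     ;; roughly 0.3 ms per get over the network
(per-get-ms 102.141527)       ;; roughly 0.001 ms per get from Near Cache

(/ 30474.719965 102.141527)   ;; speedup: roughly 300x
```

So on this particular setup a get that goes over the network costs about a third of a millisecond, while a get served from Near Cache costs about a microsecond.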

As expected all subsequent calls do not use the server:

Hazelcast with Near Cache

Keeping Near Cache in Sync

The first logical question is: OK, I brought these events into memory, but wouldn’t they become stale if they change on the server?

Let’s check:

;; checking on the client side
(get m 41)
=> 41
;; on the server: changing the value of key 41 to 42
(put! m 41 42)
;; checking again on the client side
(get m 41)
=> 42

Pretty neat. Hazelcast invalidates “nearly cached” entries by broadcasting invalidation events from the cluster members. These events are fire and forget, but Hazelcast is very good at figuring out if and when these events are lost.

There are a couple of system properties that could be configured to control this behaviour:

  • hazelcast.invalidation.max.tolerated.miss.count: Default value is 10. If missed invalidation count is bigger than this value, relevant cached data will be made unreachable, and the new value will be populated from the source.

  • hazelcast.invalidation.reconciliation.interval.seconds: Default value is 60 seconds. A periodic task scans cluster members at this interval and compares the invalidation events they generated with the ones the Near Cache received.
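One way to set them from the REPL, as a minimal sketch: it assumes Hazelcast picks these up as plain JVM system properties, and that they are set before the client (or server) instance is created. The values below are just the defaults quoted above.

```clojure
;; assumption: run this BEFORE creating the Hazelcast instance in this JVM
(System/setProperty "hazelcast.invalidation.max.tolerated.miss.count" "10")
(System/setProperty "hazelcast.invalidation.reconciliation.interval.seconds" "60")

;; double check what the JVM now holds
(System/getProperty "hazelcast.invalidation.max.tolerated.miss.count")
```

The same properties could also be passed on the command line as -D flags when the JVM starts.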

Near Cache Preloader

If clients are restarted, all their Near Caches are lost and need to be repopulated naturally by application / client requests.

Near Cache can be configured with a preloader that persists the Near Cache keys to disk and, after a restart, repopulates the cache using the keys from that file.

Let’s create a client instance with such a preloader:

(def client-with-nc (client-instance {:hosts ["10.x.y.z"] 
                                      :near-cache {:name "events"
                                                   :preloader {:enabled true
                                                               :store-initial-delay-seconds 60}}}))

And walk over the map:

(def m (hz-map "events" client-with-nc))
 
(walk-over m 100000)

As per the store-initial-delay-seconds config property, 60 seconds after we created a reference to this map, the preloader will persist the keys into the nearCache-events.store file (the filename is configurable):

INFO: Stored 100000 keys of Near Cache events in 306 ms (1953 kB)

Now let’s restart the client and try to iterate over the map again:

(shutdown-client client-with-nc)
(def client-with-nc (client-instance {:hosts ["10.x.y.z"]
                                      :near-cache {:name "events"
                                                   :preloader {:enabled true}}}))
 
(def m (hz-map "events" client-with-nc))
 
(time (walk-over m 100000))
INFO: Loaded 100000 keys of Near Cache events in 3230 ms
=> "Elapsed time: 2920.688369 msecs"
 
(time (walk-over m 100000))
=> "Elapsed time: 103.878848 msecs"

The first iteration took 3 seconds (and not 30): once the preloader had loaded all the keys, the rest of the reads (27 seconds’ worth) were served from the client’s memory.

This 3 second spike can be observed in the network usage:

Hazelcast with Near Cache

And all the subsequent calls again take about 100 ms.

Near Cache Full Config

There are a lot more Near Cache knobs beyond the map name and preloader. All are well documented in the Hazelcast docs and available as edn config with chazel.

Here is an example:

{:in-memory-format :BINARY,
 :invalidate-on-change true,
 :time-to-live-seconds 300,
 :max-idle-seconds 30,
 :cache-local-entries true,
 :local-update-policy :CACHE_ON_UPDATE,
 :preloader {:enabled true,
             :directory "nearcache-example",
             :store-initial-delay-seconds 15,
             :store-interval-seconds 60},
 :eviction  {:eviction-policy :LRU,
             :max-size-policy :ENTRY_COUNT,
             :size 800000}}

Any config options that are not provided will be set to Hazelcast defaults.
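As a rough sketch of how such a config could be plugged into a client: the snippet below only builds the options map, so it runs without a cluster. It assumes the extra options sit under :near-cache next to :name, the same way :preloader did in the earlier examples; near-cache-config and client-opts are names made up for this illustration.

```clojure
;; a subset of the options from the example above
(def near-cache-config
  {:in-memory-format :BINARY
   :time-to-live-seconds 300
   :max-idle-seconds 30
   :eviction {:eviction-policy :LRU
              :max-size-policy :ENTRY_COUNT
              :size 800000}})

;; assumption: the options go under :near-cache, together with the map :name
(def client-opts
  {:hosts ["10.x.y.z"]
   :near-cache (assoc near-cache-config :name "events")})

;; with chazel loaded and the cluster reachable, this map would then be passed as:
;; (def client-with-full-nc (client-instance client-opts))
```

Keeping the Near Cache options in their own map makes it easy to reuse the same settings for several maps by changing only the :name.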