"; */ ?>


22
Dec 15

The Story of Booting Mount

Feeling The Code


I don’t agree with the opinion that “cool kids now use boot”. People who say that are just missing out on the power of “feeling the code” rather than being abstracted from the code by a “better XML”. Same deal with people 10 years ago who said “cool kids are using functional languages”.

Don’t get me wrong, I like lein a lot. It is simple to start with, well documented, very googlable, mature, and it is a sharing platform (e.g. templates). But boot is very different: it does not aim to do what lein does, it aims to do “what you want”. There is a difference.

Mounting a Bootable Partition


Since the late 90s, when I got into Linux, I have found bootable partitions most exciting: they actually bootstrap everything. They were these wizards waving their magic wands, and systems appeared. Granted, the wave could take minutes, but we are humans, we always wait for the magic, even if it takes the whole life.

The first thing that needs to happen for the magic to work is that this bootable partition needs to be mounted.

I had wanted to do it for some time. Then, when I could not figure out why ClojureScript brought in as a dependency with :classifier “aot” caused compilation problems with lein/cljsbuild, David Nolen suggested that this was rather due to lein environment issues. So I put 2 and 2 together: it was the right time to “boot” myself up.

And since the partition was already mounted it was ready to boot.

Grokking the New Simple


Rather than tell you how great boot is, I’ll share non obvious (to me) things that I stumbled upon converting mount from lein to boot. Let’s rock & roll:

REPL is just REPL

Since I needed support for both Clojure and ClojureScript, I looked at many examples and noticed a pattern: usually in a dev mode one task groups several others, and most of the examples have a (watch) task in that group.

I just wanted to start out, so I decided that at a minimum I need a REPL and (I guess) this watcher to be able to mimic the lein repl behavior, so I did:

(deftask dev [] 
  (comp
    (watch)
    (repl)))

And it worked! I ran boot dev and I got a REPL which would see all the updates from vim (via the updated vim-fireplace).

But then I decided to stop the REPL, and it just froze.. I ran jstack on the PID and saw lots of watcher threads locking and derefing futures. Ok, so that’s not a good combination.

The answer is simpler than I expected: it’s just boot repl. Nothing else is needed to get to the lein repl functionality.

“Bring on Your Own Data Readers” Party

The Clojure mount example app uses in memory Datomic, so when I tried to start the app, boot told me:

no reader to handle the #db/id tag

This was easily googlable, and revealed that boot has a (load-data-readers!) function that “refreshes *data-readers* with readers from newly acquired dependencies”.

An interesting bit here is that (load-data-readers!) can’t be a part of a “top level” task that is executed with boot since:

java.lang.IllegalStateException: Can't set!: *data-readers* from non-binding thread

So calling boot dev is not an option when (load-data-readers!) is part of the task. But getting into a REPL with boot repl and then calling (dev) works beautifully.
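The “non-binding thread” error itself can be reproduced in plain Clojure. This is only a sketch of the mechanism: *readers* is a hypothetical stand-in for *data-readers*, and the idea that boot runs a top level task on a thread other than the one that bound the var is my assumption, not something taken from boot’s source.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-in for *data-readers*: any dynamic var behaves the same.
(def ^:dynamic *readers* {})

(defn set-from-other-thread
  "Binds *readers* on the calling thread, then tries to set! it from a
  future. Returns the error message, or nil if set! succeeded."
  []
  (binding [*readers* {}]
    (try
      ;; the future sees the binding, but does not own it
      @(future (set! *readers* {'db/id identity}))
      nil
      (catch java.util.concurrent.ExecutionException e
        (.getMessage (.getCause e))))))

;; The message complains about a non-binding thread.
(def message (set-from-other-thread))
```

The binding is visible inside the future, but only the thread that established it is allowed to set! it, which would explain why the same call is fine when made directly from the REPL thread.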

REPL Logging

At this point I could get into the boot REPL and start the mount example app. A slight problem was that I did not see any logging from the app within the REPL.

That’s when I found boot-logservice that brought the logging back to the REPL:

(def log4b
  [:configuration
   [:appender {:name "STDOUT" :class "ch.qos.logback.core.ConsoleAppender"}
    [:encoder [:pattern "%-5level %logger{36} - %msg%n"]]]
   [:root {:level "TRACE"}
    [:appender-ref {:ref "STDOUT"}]]])
;; ...
 
(deftask dev []
 
  ;; ...
 
  (alter-var-root #'log/*logger-factory* 
                  (constantly (log-service/make-factory log4b)))
  ;; ... 
)

Shaking up tools.namespace

While it is not a requirement, and most of the time unnecessary, the example app uses tools.namespace to make it easier for people who rely on it heavily to get into mount.

By default “tools.namespace” won’t find anything to refresh, since boot uses its own “secret” temp directories for sources, and “tools.namespace” simply does not know about them.

This was an easy one, since it is well documented by boot. Hence having (apply set-refresh-dirs (get-env :directories)) in the “dev” task pointed “tools.namespace” to the right directories to refresh.

The Joy of Deploy: Build and Publish

At this point having the Clojure part figured out, before moving to the ClojureScript support, I decided to deploy mount to Clojars, to understand how it’s done with boot.

I found bootlaces and just plugged it in. It was very straightforward:

(def +version+ "0.1.7-SNAPSHOT")
 
(bootlaces! +version+)
 
;; other things.. and
 
(task-options!
  pom {:project     'mount
       :version     +version+
       :description "managing Clojure and ClojureScript app state since (reset)"
       :url         "https://github.com/tolitius/mount"
       :scm         {:url "https://github.com/tolitius/mount"}
       :license     {"Eclipse Public License"
                     "http://www.eclipse.org/legal/epl-v10.html"}})

Then I did:

boot build-jar push-snapshot

and everything was going smoothly, it asked for my Clojars username, then password.. but then:

clojure.lang.ExceptionInfo: java.lang.AssertionError: 
Assert failed: current git branch is 0.1.7 but must be master
               (or (not ensure-branch) (= b ensure-branch))

Boot told me that it prefers publishing snapshots from the “master”. I don’t disagree, but for some projects I like snapshots from version branches. I don’t really like “git flow”, I like “git freedom”.

Looking at the bootlaces code, it seems that “master” is hardcoded. By this time I had already started to feel the concept of a “boot task”, and I noticed that it is hardcoded under the “push” internal task, which means that this task’s options can potentially be overridden:

;; ...
 
(task-options!
 
  push {:ensure-branch nil}       ;; <<<<<<<<<<
 
  pom {:project     'mount
       :version     +version+
       ;; ... 
       })

And what d’you know, it worked! This was most likely the first “aha moment” which wired some of my neurons in boot ways.

Shall Not Pass!

Mount’s “test” root has both cljc tests and clj/cljs test apps that these tests use. The structure looks similar to:

|~test/
| |~clj/...
| | `+tapp/
| |~cljs/...
| | `+tapp/
| |~mount/
| | |+test/...
| | `-test.cljc

In lein, I can give “test” + “test/clj” for Clojure tests, and “test” + “test/cljs” for ClojureScript tests as the source paths.

In boot I can’t do that, boot says:

java.lang.AssertionError: Assert failed: 
The :source-paths, :resource-paths, and :asset-paths must not overlap.
    (empty? (set/intersection paths parents))

Since boot already read everything under “test”, it does not want to merge things from “test/clj”. Fair enough, so I had to change the structure a bit to make it work:

|~test/
| |~clj/
| | `+tapp/
| |~cljs/
| | `+tapp/
| |~core/
| | `~mount/
| |   |+test/
| |   `-test.cljc

Now I can give “test/core” + “test/clj” and “test/core” + “test/cljs” respectively.
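To see why the first layout trips the assertion and the second does not, here is a minimal sketch of an overlap check in plain Clojure. It only mirrors the shape of the assertion quoted above; the parents and overlapping functions are hypothetical names and this is not boot’s actual implementation.

```clojure
(require '[clojure.set :as set]
         '[clojure.string :as str])

(defn parents
  "All proper ancestor directories of a relative path:
  \"test/clj\" has the single ancestor \"test\"."
  [path]
  (let [parts (str/split path #"/")]
    (set (map #(str/join "/" (take % parts))
              (range 1 (count parts))))))

(defn overlapping
  "Paths from `paths` that are also an ancestor of another path in `paths`."
  [paths]
  (set/intersection (set paths)
                    (apply set/union #{} (map parents paths))))

;; "test" is an ancestor of "test/clj", so the old layout overlaps
(def old-layout (overlapping ["test" "test/clj"]))

;; siblings under "test" do not overlap
(def new-layout (overlapping ["test/core" "test/clj"]))
```

With the old layout the intersection contains “test”; with the new one it is empty, which is what the assertion wants.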

ClojureScript is Clojure, but.. not Always

ClojureScript took some time to get right. Many examples helped a lot, especially these three: boot-cljs-example, tenzing and boot-cljs-multiple-builds.

The concept of dividing “cljs” options between “xyz.cljs.edn” and “task options” did not sink in immediately, and required some code digging to figure out where to put what and how to make sure it is being used.

It ends up being quite simple. Options that are provided via “xyz.cljs.edn” can be referenced from task options via the ids option:

(cljs :optimizations :advanced :ids #{"mount"})

would mean that it would look for a mount.cljs.edn file within the classpath. That file should point to the entry point of the ClojureScript app. In the case of the mount example app it would just be:

{:require  [app.example]}

where init-fns and compiler-options can also be added.
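As a sketch, a fuller mount.cljs.edn could look like the one below. The app.example/main function is hypothetical (the example app may not have one), and :pretty-print is just one ClojureScript compiler option picked for illustration.

```clojure
;; mount.cljs.edn (a sketch, not the file from the mount repo)
{:require          [app.example]
 :init-fns         [app.example/main]       ;; hypothetical entry function
 :compiler-options {:pretty-print true}}    ;; any ClojureScript compiler option
```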

Testing ClojureScript

“mount does doo” for ClojureScript testing, and boot-cljs-test does it as well.

I would expect it to pick up “xyz.cljs.edn” files in the same way as “boot-cljs”, but it does it a bit differently. It is not all that obvious at first, but looking at the code I saw that it has a different name for ids: it calls it out-file. It also does not just take an “id”, it takes an “id” + “.js”.

So getting it to work is quite simple:

(tcs/test-cljs :out-file "mount.js")

which would look for the same mount.cljs.edn file within the classpath.

Power it Up


There were other discoveries, like

* tasks are functions, but not really: they take arguments in a particular format and they better return a fileset

* tasks: “comp us please”. They like to be (comp ..)ed. Otherwise no go.

* there were others, but I liked Pods the most.
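The first two points can be modeled in plain Clojure. In this sketch a task returns a piece of middleware that wraps the next handler, and a plain vector stands in for boot’s fileset. The add-file task is hypothetical, and the real thing is built with deftask and a real fileset, so treat this as a mental model only.

```clojure
;; A plain vector stands in for boot's fileset in this model.
(defn add-file
  "Hypothetical task: returns middleware that adds `name` to the fileset
  and hands the result to the next handler."
  [name]
  (fn [next-handler]
    (fn [fileset]
      (next-handler (conj fileset name)))))

;; Tasks like to be comp-ed: the result is again a piece of middleware.
(def pipeline
  (comp (add-file "a.clj")
        (add-file "b.clj")))

;; `identity` plays the role of the final handler.
(def result ((pipeline identity) []))
;; result is ["a.clj" "b.clj"]: the fileset flows through the tasks left to right
```

This is also why a task that forgets to pass the fileset to the next handler stops the whole pipeline.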

At this point I got it all up and pumping: deployed to CircleCI using boot to build and run tests, published to Clojars as snapshot and release, etc.

One of the greatest things that I loved while debugging dependencies is boot show -p, it’s amazing!

Get up! Boot yourself up! Enjoy the runtime!


21
Dec 15

Functional Programming for Humans

A couple of years ago, maybe even three, we had our Chariot Day. It’s a conference within our small company where we, developers, talk to us developers.

Not everyone at that point did functional programming, and it was fun to go over the ultimate FP power and some of the “why”s.

So Mr. Dan and I sat down and created a talk to lure “non functional” people in. We talked about assignment, concurrency, equality, thinking in sequences, took over the enterprise by writing Yelp and Trading Forecast in both imperative and functional styles, took on some design patterns, etc..

We had great discussions during the talk, but years went by and it remained internal. This is now fixed:


11
Dec 15

Super Powers and Their Mutable Friends

After releasing my bullet proof time series database most of the world’s high frequency companies started converting to it. In less than a day major Fortune 7.3 billion players adopted their solutions and embraced the simplicity and greatness of what my Clojure time series database delivered.

So what now? When all the money is made and the adoption rate is higher than I could have ever predicted.. What now? Well, now it’s time to fix it, because it’s, um, broken.

Keys to Time


Here is the data example for the current broken solution:

(def events
  {1449088877092 {:GOOG {:bid 762.74 :offer 762.79}}
   1449088876590 {:AAPL {:bid 116.60 :offer 116.70}}
   1449088877601 {:MSFT {:bid 55.22 :offer 55.27}}
   1449088877203 {:TSLA {:bid 232.57 :offer 232.72}}
   1449088875914 {:NFLX {:bid 128.95 :offer 129.05}}
   1449088870005 {:FB {:bid 105.96 :offer 106.6}}})

It is a map keyed by timestamp. Now say we have a couple of events coming in at the exact same millisecond:

(def events [
  {:ts 1449088877203 :ticker :GOOG :event-id 1}    ;; <<
  {:ts 1449088876590 :ticker :AAPL :event-id 2}
  {:ts 1449088877601 :ticker :MSFT :event-id 3}
  {:ts 1449088877203 :ticker :TSLA :event-id 4}    ;; <<
  {:ts 1449088875914 :ticker :NFLX :event-id 5}
  {:ts 1449088870005 :ticker :FB   :event-id 6}])

Notice that Tesla and Google have the same timestamp. So (sorted-map-by) would not work here, as it would re-assoc them: the later event would replace the earlier one. Of course a custom comparator can be used that will not treat “the same keys as the same”, but then there is a problem with key collisions.
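A quick REPL sketch of the collision, using a trimmed copy of the events above (sample-events and by-ts are names made up for this illustration):

```clojure
(def sample-events
  [{:ts 1449088877203 :ticker :GOOG :event-id 1}
   {:ts 1449088876590 :ticker :AAPL :event-id 2}
   {:ts 1449088877203 :ticker :TSLA :event-id 4}])

;; Key each event by its timestamp, the way the map based database does.
(def by-ts
  (into (sorted-map) (map (juxt :ts identity) sample-events)))

(count by-ts)                        ;; 2, although 3 events went in
(:ticker (get by-ts 1449088877203))  ;; :TSLA, the GOOG event is gone
```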

Natural Numbers


So here I present to you a massively refactored solution with its codebase experiencing a two fold increase. The one and only: “The Time Series Database in One Line of Clojure 2.0”, or simply “The Time Series Database in 2.0 Lines of Clojure”.

I’ll format the first line for better readability:

(defn ts [{t1 :ts} {t2 :ts}] 
  (if-not (= t1 t2) 
    (compare t1 t2)
    1))

This is a simple comparator with a twist: when it sees two timestamps that are the same, it lies.

Now on to the second line, a “database codebase conclusion”, as I call it:

(def db (sorted-set-by ts))

And.. done.

Action!


Some tools and queries from a previous 1.0 product:

;; database with data
(defn with [db data] (reduce conj db data))
 
;; find data before a timestamp
(defn before [db ts] (subseq db <= {:ts ts}))
 
;; find data after a timestamp
(defn after [db ts] (subseq db >= {:ts ts}))

Let’s look at the database with data:

=> (with db events)
 
#{{:ts 1449088870005, :ticker :FB, :event-id 6}
  {:ts 1449088875914, :ticker :NFLX, :event-id 5}
  {:ts 1449088876590, :ticker :AAPL, :event-id 2}
  {:ts 1449088877203, :ticker :GOOG, :event-id 1}  ;; << same
  {:ts 1449088877203, :ticker :TSLA, :event-id 4}  ;; << timestamp
  {:ts 1449088877601, :ticker :MSFT, :event-id 3}}

slicing and dicing:

(before (with db events) 1449088876592)
 
({:ts 1449088870005, :ticker :FB, :event-id 6} 
 {:ts 1449088875914, :ticker :NFLX, :event-id 5} 
 {:ts 1449088876590, :ticker :AAPL, :event-id 2})

(after (with db events) 1449088876592)
 
({:ts 1449088877203, :ticker :GOOG, :event-id 1} 
 {:ts 1449088877203, :ticker :TSLA, :event-id 4} 
 {:ts 1449088877601, :ticker :MSFT, :event-id 3})

Super Hero Friends


While it is nice to be able to slice a sorted set with a lying comparator, at times, it may not be desirable to do so.
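Here is a small sketch of what the lie costs. Since the comparator never admits that two events are equal, the set cannot find, deduplicate or remove an element it already holds (event and twice are names made up for this illustration):

```clojure
(defn ts [{t1 :ts} {t2 :ts}]
  (if-not (= t1 t2)
    (compare t1 t2)
    1))

(def event {:ts 1449088877203 :ticker :GOOG :event-id 1})

;; the very same event, added twice
(def twice (conj (sorted-set-by ts) event event))

(count twice)               ;; 2: the duplicate is stored as well
(contains? twice event)     ;; false: a lookup never finds a match
(count (disj twice event))  ;; 2: so nothing can be removed either
```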

But every super hero has a true friend. Spiderman, for instance, has many. So does “The Time Series Database in 2.0 Lines of Clojure”. The friend’s name is multim and it’s also a Super.


02
Dec 15

Time Series Database in One Line of Clojure

If you have ever worked in the financial sector, specifically in high frequency trading, you know a time series database as a well known tool that orders up all those quotes, orders, and trades for financial pleasure.

There are many of these databases available. Wall Street being Wall Street would of course primarily use proprietary ones, since, well, it’s proprietary :), but to give them credit: they do outperform open source ones by a lot, at least presently (talking about millions per second).

Disrupting Time Series Business


So I decided to write an open source time series database that will outperform them all not necessarily by performance, but definitely by clarity and size. Get ready for this one line.

If you read this far that means you are ready, so let’s begin by creating a database:

(def db (sorted-map-by >))

Oh, by the way we are done. It’s the one and only: The Time Series Database.

Map is King of Data


Let’s use it. First we’ll need some data:

(def data
  {1449088877092 {:GOOG {:bid 762.74 :offer 762.79}}
   1449088876590 {:AAPL {:bid 116.60 :offer 116.70}}
   1449088877601 {:MSFT {:bid 55.22 :offer 55.27}}
   1449088877203 {:TSLA {:bid 232.57 :offer 232.72}}
   1449088875914 {:NFLX {:bid 128.95 :offer 129.05}}
   1449088870005 {:FB {:bid 105.96 :offer 106.6}}})

The format is simple {timestamp data}.

Now a query to have a database as a value with this data:

(defn with [db data] (merge db data))

And finally some time based queries, like before and after:

(defn before [database ts] (into {} (subseq database > ts)))
(defn after [database ts] (into {} (subseq database < ts)))

done.

Action!


(before (with db data) 1449088877091)
 
{1449088876590 {:AAPL {:bid 116.6, :offer 116.7}},
 1449088875914 {:NFLX {:bid 128.95, :offer 129.05}},
 1449088870005 {:FB {:bid 105.96, :offer 106.6}}}

(after (with db data) 1449088877091)
 
{1449088877601 {:MSFT {:bid 55.22, :offer 55.27}},
 1449088877203 {:TSLA {:bid 232.57, :offer 232.72}},
 1449088877092 {:GOOG {:bid 762.74, :offer 762.79}}}
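The same subseq approach extends to a time window. Below is a sketch of a hypothetical between query, which is not part of the original one liner. Because the map is sorted with >, the upper bound comes first. It also pours the result into (empty database) rather than {}, since a plain map is not guaranteed to keep the order once the result grows past a handful of entries.

```clojure
(def db (sorted-map-by >))

(def data
  {1449088877092 {:GOOG {:bid 762.74 :offer 762.79}}
   1449088876590 {:AAPL {:bid 116.60 :offer 116.70}}
   1449088877601 {:MSFT {:bid 55.22 :offer 55.27}}
   1449088875914 {:NFLX {:bid 128.95 :offer 129.05}}})

(defn between
  "Hypothetical helper: entries with from <= timestamp <= to, newest first.
  The result keeps the comparator of `database`."
  [database from to]
  (into (empty database)
        (subseq database >= to <= from)))

(def window
  (between (merge db data) 1449088876000 1449088877100))

(keys window)  ;; (1449088877092 1449088876590)
```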

Beware, you, other time series databases!

P.S. Of course there is a possibility of events that came in at the exact same millisecond, so here is another line that solves it


24
Nov 15

Clojure Libraries in The Matrix

The Clojure universe is mostly built on top of libraries rather than “frameworks” or “platforms”, which makes it really flexible and lots of fun to work with. Any library can be swapped, contributed to, or even created from scratch.

There are several things that make libraries great. The quality of a library’s solution is of course the main focus, and it delivers the most value, but there are others. The one I’d like to mention is not how much a library does, but how little it should.

I like apples, you like me, you like apples


Dependencies are often overlooked when developing libraries. There are quite a few libraries that suffer from depending on something for either convenience, or for its built in example, or just in case, etc.

This results in downloading the whole maven repository when working on a project that depends on just a few such libraries.

This could also create conflicts between the dependencies these libraries bring and the dependencies the project itself requires.

We can do better, and we should.

Those people don’t know what they are doing


The reason I bring it up is not because I am tired of these libraries, or it is time for a rant, but it is simply because I do it myself. And usually by the time I notice I did it, it requires significant rework to make sure developers that use/depend on my libraries do not bring “apples” that I like and they might not.

Useful vs. The Core


A great example of this is me including the excellent clojure/tools.logging as a top level dependency of mount. Mount manages application state lifecycle, and it seemed to only make sense that every time a state is started or stopped, mount would log it:

dev=> (mount/start)
14:34:10.813 [nREPL-worker-0] INFO  mount.core - >> starting..  app-config
14:34:10.814 [nREPL-worker-0] INFO  mount.core - >> starting..  conn
14:34:10.838 [nREPL-worker-0] INFO  mount.core - >> starting..  nyse-app
14:34:10.844 [nREPL-worker-0] INFO  mount.core - >> starting..  nrepl
:started
 
dev=> (mount/stop-except #'app.www/nyse-app)
14:34:47.766 [nREPL-worker-0] INFO  mount.core - << stopping..  nrepl
14:34:47.766 [nREPL-worker-0] INFO  mount.core - << stopping..  conn
14:34:47.766 [nREPL-worker-0] INFO  mount.core - << stopping..  app-config
:stopped

It’s useful, right? Of course it is. As a developer that depends on mount, you don’t have to do it, it is already there for you, very informative and clean.

But here is the catch:

* what if you don’t like the way it logs it?
* what if you don’t want it to log at all?
* what if you use a different library for logging?
* etc..

In other words: “what if you don’t like or need apples and you eat bananas instead?”.

It turns out that “useful” is most of the time a red flag. Stop and think whether this “useful” feature is really the core piece of functionality, or is a bolted on “nice to have”.

Novelty Freshness of Refactoring


While it is not desired to have extra dependencies, and the above idea to include logging was not great, what was great are new thoughts during refactoring:

“Ok, I’ll remove logging, but now mount users won’t know anything about states..”

“Maybe they can use something like (states-with-deps) that would give them the current state of the application”:

dev=> (states-with-deps)
 
({:name app-config, :order 1, 
                    :started? true
                    :suspended? false
                    :ns #object[clojure.lang.Namespace 0x6e126efc "app.config"], 
                    :deps ()} 
 
 {:name conn, :order 2, 
              :started? true
              :suspended? false
              :ns #object[clojure.lang.Namespace 0xf1a66a6 "app.nyse"], 
              :deps ([app-config #'app.config/app-config])} 
 
 {:name nrepl, :order 3, 
               :started? true
               :suspended? false
               :ns #object[clojure.lang.Namespace 0x2c134117 "app"], 
               :deps ([app-config #'app.config/app-config])})

“That’s not bad, but what if they start/stop states selectively, or they suspended/resumed some states.. no visibility”

“Well, it’s simple, why not just return all the states that were affected by a lifecycle method?”

And that’s what I did. But I did not go through this thought process when I had logging in, since logging created an illusion of visibility and control, while in reality it gave “an ok” visibility and no control.

The solution just returns the states that were affected, as a vector keyed by the lifecycle action:

dev=> (mount/start)
{:started [#'app.config/app-config 
           #'app.nyse/conn 
           #'app/nrepl
           #'check.suspend-resume-test/web-server
           #'check.suspend-resume-test/q-listener]}

The cool additional thing, and the reason it is a vector and not a set, is these states are in the vector in the order they were touched, in this case “started”.
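Since the result is plain data, a caller can bring logging back on their own terms. Here is a minimal sketch with a hypothetical report helper. The strings stand in for the vars that mount returns, so that the snippet runs without mount on the classpath.

```clojure
(defn report
  "Hypothetical helper: turns a lifecycle result such as {:started [...]}
  into one line per state, in the order the states were touched."
  [result]
  (vec (for [[action states] result
             state states]
         (str (name action) " " state))))

(def lines
  (report {:started ["#'app.config/app-config"
                     "#'app.nyse/conn"
                     "#'app/nrepl"]}))
;; lines holds "started #'app.config/app-config" first, then conn, then nrepl
```

Whether these lines go to println, tools.logging or nowhere at all is now the decision of the application and not of the library.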

Rules of The Matrix


While I made a mistake, I am glad I did. It gave me lots of food for thought as well as made me do some other cool tricks with robert hooke to demonstrate how to bring the same logging back if needed.

It does feel great to only depend on Clojure itself and a tiny tools.macro, which I use a single function from and could potentially just grab from there, cutting my dependencies to The One.