Scala: Fun with CanBuildFrom

As I found out by trying, explaining Scala’s CanBuildFrom is not an easy task.

Before I dive into a quick gist, it is worth mentioning that the best explanation of what happens behind CanBuildFrom’s scenes can be found on Stack Overflow in this answer.

The gist is, Scala has multiple layers of collection traits, each adding different capabilities. Let’s look at one such layer: TraversableLike, which most of the collections implement, since, let’s be honest, a collection is not very useful if it cannot be traversed. One of the most “famous” methods from TraversableLike is “map”:

def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That = {
  val b = bf(repr)
  b.sizeHint(this) 
  for (x <- this) b += f(x)
  b.result
}

Despite the fact that it is “Scala looking”, it is actually quite simple => it takes each element of the collection it is called on, applies the provided function “f” to it, adds the result to a builder, and returns another collection “That” as the result.
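To see the same three steps without any implicits involved, here is a minimal sketch. “SimpleBuild”, “mapInto” and “intsToVector” are made up names for illustration, not part of the Scala library, and the builder factory is passed explicitly instead of being found by the compiler:

```scala
import scala.collection.mutable

object MapSketch {
  // Hypothetical, simplified stand-in for CanBuildFrom: all it knows is how
  // to create a builder that accepts Elem values and produces a To.
  trait SimpleBuild[Elem, To] {
    def newBuilder(): mutable.Builder[Elem, To]
  }

  // Same shape as the "map" above: ask for a builder, feed it f(x) for
  // every element, and return whatever the builder has built.
  def mapInto[A, B, To](xs: Iterable[A])(f: A => B)(bf: SimpleBuild[B, To]): To = {
    val b = bf.newBuilder()
    for (x <- xs) b += f(x)
    b.result()
  }

  // One concrete factory: Ints go in, a Vector[Int] comes out.
  val intsToVector: SimpleBuild[Int, Vector[Int]] =
    new SimpleBuild[Int, Vector[Int]] {
      def newBuilder(): mutable.Builder[Int, Vector[Int]] = Vector.newBuilder[Int]
    }
}
```

Calling MapSketch.mapInto(List("a", "bb", "ccc"))(_.length)(MapSketch.intsToVector) goes over a List but gives back a Vector[Int], because the result type is decided by the builder factory and not by the collection that is being mapped.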

The interesting bit here is:

... (implicit bf: CanBuildFrom[Repr, B, That]) ...

which seems a bit awkward (aren’t all Scala implicits..). “implicit” just means that the caller does not have to pass this argument: the Scala compiler will search for a value of type “CanBuildFrom[Repr, B, That]” itself. It first looks at the implicits that are defined or imported in the current scope, and then in the companion objects of the types involved: the companion of the collection that “map” is invoked on, then the companions of its super types, etc.. until it finds one.

Once it finds one, it uses it to create a “builder” for the resulting collection “That”. The search is not just “is there any “CanBuildFrom” around?”, but “is there a “CanBuildFrom” that is parametrized with the given ‘Repr’ (the collection type) and ‘B’ (the new element type)?”. Whichever instance matches also decides what “That” is.

Here is a quick example. Let’s say we have a BitSet:

scala> import scala.collection.immutable.BitSet
import scala.collection.immutable.BitSet
 
scala> val bits = BitSet( 42, 84, 126 )
bits: scala.collection.immutable.BitSet = BitSet(42, 84, 126)

Once we map over this BitSet with a function (“/ 2L”) that produces something other than “Int”s as elements, a BitSet can no longer hold the result (a BitSet can only have Ints as its elements). BitSet’s own “CanBuildFrom” does not match anymore, hence the Scala compiler falls back to a super type of BitSet, which is a Set, and uses the “CanBuildFrom” from its companion object, since it is a bit more generic:

implicit def canBuildFrom[A]: CanBuildFrom[Coll, A, Set[A]] = setCanBuildFrom[A]

Here “A” matches a “Long” that is now (after a map was applied) a type of resulting elements:

scala> val aintBits = bits.map( _ / 2L )
aintBits: scala.collection.immutable.Set[Long] = Set(21, 42, 63)
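The “specific instance first, generic one as a fallback” selection can be imitated in a few lines. This is only a sketch under assumptions: “Build”, “LowPriorityBuild” and “mapVia” are hypothetical names, and the priority is modelled with a low priority parent trait, while the real library gets it from the companion objects of BitSet and Set:

```scala
import scala.collection.immutable.BitSet
import scala.collection.mutable

object BuildDemo {
  // Hypothetical stand-in for CanBuildFrom[From, Elem, To].
  trait Build[From, Elem, To] {
    def newBuilder(): mutable.Builder[Elem, To]
  }

  trait LowPriorityBuild {
    // Generic fallback: works for any element type A, but only promises a Set[A].
    implicit def toSet[A]: Build[BitSet, A, Set[A]] =
      new Build[BitSet, A, Set[A]] {
        def newBuilder(): mutable.Builder[A, Set[A]] = Set.newBuilder[A]
      }
  }

  object Build extends LowPriorityBuild {
    // Specific instance: only matches when the new elements are Ints,
    // and wins over toSet because it is defined in the derived object.
    implicit val toBitSet: Build[BitSet, Int, BitSet] =
      new Build[BitSet, Int, BitSet] {
        def newBuilder(): mutable.Builder[Int, BitSet] = BitSet.newBuilder
      }
  }

  // A "map" for BitSet only: the compiler picks bf (and with it That)
  // based on the element type B that f produces.
  def mapVia[B, That](xs: BitSet)(f: Int => B)(implicit bf: Build[BitSet, B, That]): That = {
    val b = bf.newBuilder()
    xs.foreach(x => b += f(x))
    b.result()
  }
}
```

With this in place, BuildDemo.mapVia(BitSet(42, 84, 126))(_ / 2L) is typed as a Set[Long], since only the generic instance accepts Longs, while BuildDemo.mapVia(BitSet(42, 84, 126))(_ + 1) stays a BitSet.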

But we want our BitSet back.. Give me my BitSet back I say:

scala> val bitsAgain = aintBits.map( _.toInt )
bitsAgain: scala.collection.immutable.Set[Int] = Set(21, 42, 63)

But no, it does not.. And how would it know I need a BitSet? The only “CanBuildFrom” it can find for a Set gives a Set back. Hmm.. Give me my BitSet I urge you:

scala> val bitsAgain = aintBits.map( _.toInt ).asInstanceOf[BitSet]
java.lang.ClassCastException: scala.collection.immutable.Set$Set3 cannot be cast to scala.collection.immutable.BitSet
                              ... ...

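But no, it does not.. A cast does not convert anything: the object is still a “Set3” at runtime, no matter what type we claim it has.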

Logically if “CanBuildFrom” is what got us a Set from a BitSet in the first place, can it be used to get a BitSet back?

Well, let’s see. We know that we have a Set of Longs (Set[Long]), where each element after applying a map function “toInt” is of type “Int”, and we need a BitSet back. Let’s create our own “CanBuildFrom” that does just that:

scala> import scala.collection.generic.CanBuildFrom
import scala.collection.generic.CanBuildFrom
 
scala> val setToBitSetBuilder = new CanBuildFrom[Set[Long], Int, BitSet] { def apply(from: Set[Long]) = this.apply(); def apply() = BitSet.newBuilder }
setToBitSetBuilder: java.lang.Object with scala.collection.generic.CanBuildFrom[Set[Long],Int,scala.collection.immutable.BitSet] = $anon$1@60bc1caa

Now let’s use it:

scala> val bitsAgain = aintBits.map( _.toInt )( setToBitSetBuilder )
bitsAgain: scala.collection.immutable.BitSet = BitSet(21, 42, 63)

And woo hoo, the “bitsAgain” is truly a BitSet again. What really happened: “map” needs a “CanBuildFrom” for the collection “Set[Long]” and the (resulting) element type “Int”. Normally the Scala compiler looks for an implicit one, but this time we handed such a thing (“setToBitSetBuilder”) to it explicitly, in the second parameter list, which also fixes “That” to be a BitSet. “setToBitSetBuilder” just returns a “builder” that is used to build the resulting collection. In this case we use Scala’s own “BitSet.newBuilder”.

To make it more readable, a “pimp my library” pattern can later be used to hide the builder behind a method => aintBits.to[BitSet].
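One possible shape of such an enrichment is sketched below. It assumes Scala 2.10 or later (implicit classes), and “BitSetSyntax”, “LongSetOps” and “asBitSet” are hypothetical names. It is a simpler, non generic cousin of the to[BitSet] call above, since it only covers the Set[Long] case:

```scala
import scala.collection.immutable.BitSet

object BitSetSyntax {
  // Hypothetical enrichment: adds an "asBitSet" method to any Set[Long].
  implicit class LongSetOps(val self: Set[Long]) extends AnyVal {
    // Assumes every element is non-negative and fits into an Int:
    // toInt silently truncates larger values, and a BitSet rejects negatives.
    def asBitSet: BitSet = {
      val b = BitSet.newBuilder
      self.foreach(x => b += x.toInt)
      b.result()
    }
  }
}
```

After import BitSetSyntax._ the conversion reads as aintBits.asBitSet, and the builder plumbing stays out of sight.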

This is a rather quick overview of what “CanBuildFrom” is. It does not really discuss the currying (multiple parameter lists) that is used by “map(A)(B):C”, skims over implicits, etc.. But it gives a little insight into where and how “CanBuildFrom” can be used.