Scala serialization pickle

Let us begin our story with some structure.

case class Pupper(name: String)

case class Doggies(oldPuppers: Seq[Pupper])

Next we obtain some data.

val input = Doggies(Seq(Pupper("Adolf"), Pupper("Ben")))

And now we want to serialize it. I firmly believed that Scala’s case classes and collections were serializable, so I tried “classic” Java serialization:

import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Serialize `input` into an in-memory byte array.
val serialized = {
  val bos = new ByteArrayOutputStream()
  val out = new ObjectOutputStream(bos)
  out.writeObject(input)
  out.close() // flushes any buffered data and closes bos as well
  bos.toByteArray
}

// Read the object back from those bytes.
val unserialized = {
  val bis = new ByteArrayInputStream(serialized)
  val in = new ObjectInputStream(bis)
  val r = in.readObject()
  in.close() // also closes bis
  r
}

I expected the previous code to run smoothly and the following to hold:

input == unserialized

Oh boy, I was so wrong. Deserialization crashed with a rather mysterious exception:

java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field Doggies.oldPuppers of type scala.collection.Seq in instance of Doggies

Honestly, I am stumped by it. I just didn’t expect such a basic feature as serialization to be a problem in Scala.
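For what it is worth, a ClassCastException that mentions List$SerializationProxy is often reported when ObjectInputStream resolves classes through a different class loader than the one that loaded the case classes, for example when the code runs inside sbt without forking or in the REPL. Whether that is what happened here is an assumption on my part. The following is a minimal sketch of the usual workaround under that assumption; `LoaderAwareObjectInputStream` and `ClassicRoundTrip` are hypothetical names, not part of the original code.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream,
  ObjectInputStream, ObjectOutputStream, ObjectStreamClass}

// Hypothetical helper: an ObjectInputStream that looks classes up through
// an explicitly supplied class loader instead of the default one.
class LoaderAwareObjectInputStream(in: InputStream, loader: ClassLoader)
    extends ObjectInputStream(in) {
  override protected def resolveClass(desc: ObjectStreamClass): Class[_] =
    try Class.forName(desc.getName, false, loader)
    catch {
      // Primitive types such as "int" cannot be found by name,
      // so fall back to the default resolution for them.
      case _: ClassNotFoundException => super.resolveClass(desc)
    }
}

object ClassicRoundTrip {
  // Serialize a value into an in-memory byte array.
  def serialize(value: AnyRef): Array[Byte] = {
    val bos = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bos)
    try out.writeObject(value) finally out.close()
    bos.toByteArray
  }

  // Deserialize, resolving classes through the given class loader.
  def deserialize(bytes: Array[Byte], loader: ClassLoader): AnyRef = {
    val in = new LoaderAwareObjectInputStream(new ByteArrayInputStream(bytes), loader)
    try in.readObject() finally in.close()
  }
}
```

With the case classes above, the call to try would be ClassicRoundTrip.deserialize(ClassicRoundTrip.serialize(input), input.getClass.getClassLoader). It only helps if the class loader really is the culprit.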

Luckily there are other ways, one might even say better ways, for example a library called scala/pickling. Necessary build modifications:

resolvers += Resolver.sonatypeRepo("snapshots")

libraryDependencies += "org.scala-lang.modules" %% "scala-pickling" % "0.10.2-SNAPSHOT"

I must say it is very straightforward and concise to use. I like it a lot. Observe:

import scala.pickling.Defaults._, scala.pickling.binary._

val pickled: Array[Byte] = input.pickle.value
val unserialized = pickled.unpickle[AnyRef]

A minor issue is that I was unable to make it work in the REPL, but that’s nothing serious. The following assert actually passes, yay!

assert(input == unserialized)

The pickling way is more concise, produces smaller binary blobs and (according to their page) is faster than “classic” serialization.

Addendum

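Working code is available on GitHub, just over the horizon there. The code was tested against the current Scala version, 2.11.8. A short log follows. Classic serialization evidently has some issue deserializing data that contains instances of my own classes: List(Some(1)) round-trips fine, while List(Pupper(Collin)) fails with a ClassNotFoundException for Pupper.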

-- Running tests with data List(1).

Using classic on List(1)
List(1) ==> 210 B ==> List(1)
Test succeeded.

Using pickling on List(1)
List(1) ==> 95 B ==> List(1)
Test succeeded.


-- Running tests with data IntSeqHolder(List(1)).

Using classic on IntSeqHolder(List(1))
IntSeqHolder(List(1)) ==> 270 B ==> IntSeqHolder(List(1))
Test succeeded.

Using pickling on IntSeqHolder(List(1))
IntSeqHolder(List(1)) ==> 111 B ==> IntSeqHolder(List(1))
Test succeeded.


-- Running tests with data List(Some(1)).

Using classic on List(Some(1))
List(Some(1)) ==> 289 B ==> List(Some(1))
Test succeeded.

Using pickling on List(Some(1))
List(Some(1)) ==> 109 B ==> List(Some(1))
Test succeeded.


-- Running tests with data List(Pupper(Collin)).

Using classic on List(Pupper(Collin))
Test failed: ClassNotFoundException - Pupper

Using pickling on List(Pupper(Collin))
List(Pupper(Collin)) ==> 98 B ==> List(Pupper(Collin))
Test succeeded.


-- Running tests with data Doggies(List(Pupper(Adolf), Pupper(Ben))).

Using classic on Doggies(List(Pupper(Adolf), Pupper(Ben)))
Test failed: ClassCastException - cannot assign instance of scala.collection.immutable.List$SerializationProxy to field Doggies.oldPuppers of type scala.collection.Seq in instance of Doggies

Using pickling on Doggies(List(Pupper(Adolf), Pupper(Ben)))
Doggies(List(Pupper(Adolf), Pupper(Ben))) ==> 168 B ==> Doggies(List(Pupper(Adolf), Pupper(Ben)))
Test succeeded.
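
The log lines above have the shape "value ==> size ==> value". A harness producing such lines for the classic approach might look roughly like the sketch below. The actual test code lives in the GitHub repository; `ClassicHarness`, `roundTrip` and `report` are hypothetical names made up for this illustration, and the exact wording of the messages is only approximated.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import scala.util.{Failure, Success, Try}

// Hypothetical harness, not the code from the repository.
object ClassicHarness {
  // Serialize and deserialize a value; return the blob size and the result.
  def roundTrip(value: AnyRef): (Int, AnyRef) = {
    val bos = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bos)
    try out.writeObject(value) finally out.close()
    val bytes = bos.toByteArray
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes))
    try {
      (bytes.length, in.readObject())
    } finally in.close()
  }

  // Build a log entry similar in shape to the ones shown above.
  def report(value: AnyRef): String =
    Try(roundTrip(value)) match {
      case Success((size, back)) if back == value =>
        s"$value ==> $size B ==> $back\nTest succeeded."
      case Success((size, back)) =>
        s"$value ==> $size B ==> $back\nTest failed: values differ"
      case Failure(e) =>
        s"Test failed: ${e.getClass.getSimpleName} - ${e.getMessage}"
    }
}
```

Calling println(ClassicHarness.report(List(1))) prints one such entry. The byte count it reports may differ between Scala and JVM versions, so do not expect the exact sizes from the log.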