Scala: Should I use a method or a function?
This question comes up often on forums, and in the context of functional programming there is a great deal of confusion about what the difference is and which should be used where. Scala has two function-like entities: methods and functions.
Both of them can be used in a functional context:
class Test {
  // Method
  def m(x: Int) = x + 1
  // Function
  val f = (x: Int) => x + 1
  // Using function
  def x = println(List(1, 2, 3).map(f))
  // Using method
  def y = println(List(1, 2, 3).map(m))
}
Similar, isn’t it? Let’s look closer.
Here m is a method, the very same kind of class method we are familiar with from Java. It’s not an object, can’t be passed anywhere, and has no value. In fact, it belongs to the class Test and only makes sense when called as testInstance.m(...). And indeed, the decompiled bytecode shows it as public int m(int); in class Test.
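To make the "not a value" point concrete, here is a minimal sketch of the conversion the compiler performs when a method is used where a function is expected (eta-expansion). The names Calc and EtaExpansionDemo are hypothetical and exist only for this illustration.

```scala
// Hypothetical class for illustration; it mirrors the method m from Test.
class Calc {
  // A method: a member of Calc, not a value in its own right.
  def m(x: Int): Int = x + 1
}

object EtaExpansionDemo {
  val calc = new Calc

  // The expected type Int => Int makes the compiler wrap the method
  // in a Function1 object (eta-expansion).
  val asFunction: Int => Int = calc.m

  // The wrapper is an ordinary object, so it can be stored and passed around.
  val viaWrapper: List[Int] = List(1, 2, 3).map(asFunction)

  // Passing the method directly triggers the same conversion behind the scenes.
  val viaMethod: List[Int] = List(1, 2, 3).map(calc.m)

  def main(args: Array[String]): Unit = {
    println(viaWrapper)
    println(viaMethod)
  }
}
```

Both calls to map receive a Function1 object; the only question is who creates it and when.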
But we pass it to map, don’t we? First, let’s comment out def y and look at the class files:
Test$$anonfun$1.class Test.class
Nothing too interesting: our Test class and the anonymous function f in Test$$anonfun$1. Now let’s put def y back:
Test$$anonfun$1.class Test$$anonfun$y$1.class Test.class
Wait, what is Test$$anonfun$y$1? When Scala sees a method in a functional context, it defines an anonymous function class (a class derived from Function1 in this particular example) whose apply method simply calls testInstance.m(...) (testInstance is passed as an argument to the anonymous function’s constructor so it can be used later). Almost the same thing happens with f: an anonymous function class is defined with the corresponding apply method. There is one big difference, though, if we look at the bytecode for x and y:
x:
24: invokevirtual #37 // Method scala/Predef$.wrapIntArray:([I)Lscala/collection/mutable/WrappedArray;
27: invokevirtual #41 // Method scala/collection/immutable/List$.apply:(Lscala/collection/Seq;)Lscala ...
30: aload_0
31: invokevirtual #43 // Method f:()Lscala/Function1;
34: getstatic #33 // Field scala/collection/immutable/List$.MODULE$:Lscala/collection/immutable/List$;
37: invokevirtual #47 // Method scala/collection/immutable/List$.canBuildFrom:()Lscala/collection/generic...
40: invokevirtual #53 // Method scala/collection/immutable/List.map:(Lscala/Function1;Lscala/collection...
43: invokevirtual #57 // Method scala/Predef$.println:(Ljava/lang/Object;)V
46: return
y:
24: invokevirtual #36 // Method scala/Predef$.wrapIntArray:([I)Lscala/collection/mutable/WrappedArray;
27: invokevirtual #40 // Method scala/collection/immutable/List$.apply:(Lscala/collection/Seq;)Lscala ...
30: new #59 // class Test$$anonfun$y$1
33: dup
34: aload_0
35: invokespecial #63 // Method Test$$anonfun$y$1."<init>":(LTest;)V
38: getstatic #32 // Field scala/collection/immutable/List$.MODULE$:Lscala/collection/immutable/List$;
41: invokevirtual #46 // Method scala/collection/immutable/List$.canBuildFrom:()Lscala/collection/generic ...
44: invokevirtual #52 // Method scala/collection/immutable/List.map:(Lscala/Function1;Lscala/collection ...
47: invokevirtual #56 // Method scala/Predef$.println:(Ljava/lang/Object;)V
Every time I call y, a new instance of class Test$$anonfun$y$1 is created and initialized! In x nothing like this happens, because the anonymous function was instantiated when I declared f, so the code simply calls this.f to retrieve the pre-initialized value.
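The same difference can be reproduced at the source level. The sketch below writes the generated wrapper class out by hand; Holder, MWrapper and AllocationDemo are hypothetical names, and MWrapper is only an approximation of what the compiler emits, not its actual output.

```scala
// Hypothetical stand-in for the Test class from the article.
class Holder {
  def m(x: Int): Int = x + 1
  val f: Int => Int = (x: Int) => x + 1

  // Like x: hands back the Function1 already stored in the field.
  def viaFunction: Int => Int = f

  // Like y, spelled out by hand: a fresh wrapper object on every call.
  def viaMethod: Int => Int = new MWrapper(this)
}

// Approximation of the generated anonymous function class: it keeps a
// reference to the enclosing instance and forwards apply to the method m.
final class MWrapper(outer: Holder) extends Function1[Int, Int] {
  def apply(x: Int): Int = outer.m(x)
}

object AllocationDemo {
  val h = new Holder

  // Reference equality shows whether the same object is reused.
  val sameFunction: Boolean = h.viaFunction eq h.viaFunction
  val sameWrapper: Boolean = h.viaMethod eq h.viaMethod

  def main(args: Array[String]): Unit = {
    println(s"function reused: $sameFunction")
    println(s"wrapper reused: $sameWrapper")
  }
}
```

Both variants compute the same results; they differ only in how many wrapper objects are allocated along the way.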
Given this, my suggestion would be to use explicit functions whenever higher-order functions/methods are involved rather than relying on the compiler’s automatic method-to-function conversion (eta-expansion), and to use methods in all other cases.
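When a method has to stay a method (for example because other code already calls it), one possible way to follow this advice is to convert it to a function once and reuse that value. This is only a sketch: Pricing, addOne, addOneFn and bumpAll are hypothetical names invented for the example.

```scala
// Hypothetical class for illustration only.
class Pricing {
  // Assumed to be a method that must remain a method.
  def addOne(x: Int): Int = x + 1

  // Converted to a function once, when the instance is constructed.
  val addOneFn: Int => Int = addOne

  // Higher-order calls reuse the stored function instead of creating
  // a new wrapper on every invocation.
  def bumpAll(xs: List[Int]): List[Int] = xs.map(addOneFn)
}

object PricingDemo {
  def main(args: Array[String]): Unit = {
    val p = new Pricing
    println(p.bumpAll(List(1, 2, 3)))
  }
}
```

The trade-off is one extra field per instance in exchange for avoiding an allocation per call.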
Update: Scala 2.12-M4 and Java 8 lambdas
Does it “no longer matter” with the introduction of Scala 2.12, which compiles to Java 8 lambdas? Let’s look at the Java 8 bytecode. The good part is that Scala 2.12 doesn’t create class definitions for anonymous functions; instead, it uses factories to create dynamic classes on the fly. At the very least, the JAR becomes smaller and loading is potentially faster. Scala instead creates methods in class Test which are called from a Function1 object:
// f
public static final int Test$$$anonfun$1(int);
0: iload_0
1: iconst_1
2: iadd
3: ireturn
// wrapper for m
public final int Test$$$anonfun$2(int);
0: aload_0
1: iload_1
2: invokevirtual #86 // Method m:(I)I
5: ireturn
The compiler is smart enough to optimize out the instance of Test in the first case (i.e. the method is static), which saves it from passing an object reference in this particular case. Unfortunately, in the second case this is required, since we are going to call m. Let’s look at how these functions are called:
public Test();
0: aload_0
1: invokespecial #89 // Method java/lang/Object."<init>":()V
4: aload_0
5: invokedynamic #95, 0 // InvokeDynamic #1:apply$mcII$sp:()Lscala/runtime/java8/JFunction1$mcII$sp;
10: putfield #25 // Field f:Lscala/Function1;
13: return
public void x();
...
24: invokevirtual #41 // Method scala/Predef$.wrapIntArray:([I)Lscala/collection/mutable/WrappedArray;
27: invokevirtual #45 // Method scala/collection/immutable/List$.apply:(Lscala/collection/Seq;)...
30: aload_0
31: invokevirtual #47 // Method f:()Lscala/Function1;
34: getstatic #37 // Field scala/collection/immutable/List$.MODULE$:Lscala/collection/...
vs
public void y();
24: invokevirtual #41 // Method scala/Predef$.wrapIntArray:([I)Lscala/collection/mutable/WrappedArray;
27: invokevirtual #45 // Method scala/collection/immutable/List$.apply:(Lscala/collection/Seq;)...
30: aload_0
31: invokedynamic #83, 0 // InvokeDynamic #0:apply$mcII$sp:(LTest;)Lscala/runtime/java8/JFunction1$mcII$sp;
36: getstatic #37 // Field scala/collection/immutable/List$.MODULE$:Lscala/collection/immutable/List$;
First of all, what is this invokedynamic call? The JVM is using a factory to create a lightweight object with an apply method. The factory arguments look like this (for f):
#70 (I)I
#92 invokestatic Test.Test$$$anonfun$1:(I)I
#70 (I)I
#75 3
#76 1
#78 scala/Serializable
#79 0
Basically, it tells the JVM how to initialize an object. But invokedynamic does not just initialize the object: it creates a code snippet based on the factory data and replaces the invokedynamic call with it, so it doesn’t need to touch the factory again. That snippet is still code rather than an object, so every time invokedynamic is called it initializes a new lightweight object with an apply method. So the mechanism is the same as in 2.11, only lambdas are much lighter and don’t require full classes.
What about those cycles? The difference is still there. x simply uses the f that was initialized when the Test object was constructed. y initializes a lightweight anonymous function every time it is called, and also wraps the call to m in another method, namely Test$$$anonfun$2(int).
Does it all really matter? Well, I wouldn’t be eager to replace methods used in higher-order methods/functions in existing code (unless it’s performance-critical code like Spark, and even then benchmarks should probably be done first). But whenever I have a choice between the two options and there is no reason other than a historical one, I would use functions.
“Entities must not be multiplied beyond necessity” (c) Occam
Update: Performance
Let’s do some simple benchmarking (2.12.0-M4):
class Test {
  val f = (x: Int) => x + 1
  def m(x: Int) = x + 1
  def x = (1 to 1000).map(f)
  def y = (1 to 1000).map(m)
}
object Test {
  val test = new Test
  val samples = 10000
  def sqr(x: Double) = x * x
  def main(args: Array[String]) = {
    val results = Array.fill[Long](samples)(0)
    var i = 0
    while (i < samples) {
      var i2 = 0
      val t0 = System.nanoTime()
      while (i2 < 1000) {
        test.x // replace with test.y to measure the method variant
        i2 += 1
      }
      results(i) = System.nanoTime() - t0
      i += 1
    }
    println(s"min=${results.min}")
  }
}
Taking the minimum should factor out warm-up and occasional GC. Let’s even run it 10 times and take the minimum again. Here’s the result (nanoseconds per 1000 calls):
x: 6669954
y: 6962329 (+4.4%)
Not much and it’s a special case, but it’s noticeable.
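For anyone who wants to repeat the measurement, the take-the-minimum approach can be packaged as a small helper. This is a rough sketch, not the code used for the numbers above: MinTiming, minNanos and percentSlower are hypothetical names, and the by-name parameter adds a little call overhead of its own.

```scala
object MinTiming {
  // Runs `body` `iterations` times per sample and returns the fastest
  // sample in nanoseconds, so slow warm-up and GC samples are ignored.
  def minNanos(samples: Int, iterations: Int)(body: => Unit): Long = {
    var best = Long.MaxValue
    var s = 0
    while (s < samples) {
      var i = 0
      val t0 = System.nanoTime()
      while (i < iterations) {
        body
        i += 1
      }
      val elapsed = System.nanoTime() - t0
      if (elapsed < best) best = elapsed
      s += 1
    }
    best
  }

  // Relative slowdown of `other` compared to `baseline`, in percent.
  def percentSlower(baseline: Long, other: Long): Double =
    (other - baseline) * 100.0 / baseline

  def main(args: Array[String]): Unit = {
    val f = (x: Int) => x + 1
    val t = minNanos(samples = 100, iterations = 1000) { (1 to 1000).map(f) }
    println(s"min=$t")
  }
}
```

Applied to the two figures above, percentSlower(6669954L, 6962329L) gives the relative difference quoted for y. For anything beyond a quick check, a dedicated benchmarking harness is probably the safer choice.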