Performance with Scala Arrays and Lists
As I continue to tinker with Scala, I was wondering about the performance differences between an Array and List. This post will detail what I've found, but as always YMMV and I could be doing it all wrong. If there's a better (in this case, better == faster) way to do this in Scala, please let me know.
My application performs a lot of collection iteration as it combines the values of two collections into a new collection by addition. For instance, I need to combine
Intuition tells me that the Array will perform better, but this is Scala, and Lists reign supreme. So we'll go head to head.
For each test, I wanted to write a function that combined the two collections using tail recursion.
First up, I am adding two lists together while forming a third. One problem here is, due to the way the algorithm is structured, the resulting list is built backwards. So there's a call to
To call it:
Next up, I add two Arrays into a List. The guess here is that accessing the arrays by index will help speed it up.
To call it:
This should be the fastest.
To call it:
I ran each function 1 million times and captured the times with
The results are in, and sure enough, on average, the third option (pure Arrays) is the fastest.
* Test 1 - 1172 ms
* Test 2 - 781 ms
* Test 3 - 687 ms
So, for my purposes, using Arrays results in faster execution. However, if you are looking to do traditional functional programming, you should write your methods to create zero side effects. Using Arrays like this seems anti-functional programming.
My application performs a lot of collection iteration as it combines the values of two collections into a new collection by addition. For instance, I need to combine
[1,2]
and [3,4]
into [4,6]
. I wanted to find out if the collections should be an Array or List.Intuition tells me that the Array will perform better, but this is Scala, and Lists reign supreme. So we'll go head to head.
For each test, I wanted to write a function that combined the two collections using tail recursion.
Test One - Two Lists Into a Third
First up, I am adding two lists together while forming a third. One problem here is, due to the way the algorithm is structured, the resulting list is built backwards. So there's a call to
reverse
at the end. (Question: How to rewrite this using normal List methods such as ::
without having to call reverse
at the end?)
def add(x: List[Long], y: List[Long],
agg: List[Long]): List[Long] = x match {
case Nil => agg.reverse
case x1 :: xs => y match {
case y1 :: ys => add(xs, ys, x1 + y1 :: agg)
}
}
To call it:
add(List(1,2), List(3,4), Nil)
Test Two - Two Arrays Into a List
Next up, I add two Arrays into a List. The guess here is that accessing the arrays by index will help speed it up.
def add2(x: Array[Long], y: Array[Long], agg: List[Long],
counter: Int): List[Long] = {
if (counter == 0) agg
else add2(x, y, x(counter-1) + y(counter-1) :: agg, counter-1)
}
To call it:
add2(Array(1,2), Array(3,4), Nil, 2)
Test Three - Two Arrays Into a Third Array
This should be the fastest.
def add3(x: Array[Long], y: Array[Long], agg: Array[Long],
i: Int): Array[Long] = {
if (i == x.length) agg
else {
agg(i) = x(i) + y(i)
add3(x, y, agg, i+1)
}
}
To call it:
add3(Array(1,2), Array(3,4), new Array(2), 0)
Methodology
I ran each function 1 million times and captured the times with
System.currentTimeMillis
. I ran the entire test suite five times to generate an average. I am running Scala 2.5.1 on Java 1.6 on Windows XP. I have a Pentium 4 2.8GHz with 2GB RAM.Results
The results are in, and sure enough, on average, the third option (pure Arrays) is the fastest.
* Test 1 - 1172 ms
* Test 2 - 781 ms
* Test 3 - 687 ms
So, for my purposes, using Arrays results in faster execution. However, if you are looking to do traditional functional programming, you should write your methods to create zero side effects. Using Arrays like this seems anti-functional programming.