I Second That Emotion

So Tim Bray finds out that Erlang IO is slow. I can attest to this: my recent work on reading large files in Erlang has shown that IO and string manipulation are much slower than I would have wanted.

Yes, like Bray's, my file reading is single-threaded (although what I do with each line is heavily multi-threaded), so I suppose reading on a single thread isn't very Erlang-like in the first place.

In the meantime, I'm porting my OLAP cube generator to Scala. The assumption (and, I hope, soon the proof) is that the JVM handles file IO much better than Erlang does, while Scala's Actors still let me keep my concurrency.
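To make the shape of that design concrete, here is a minimal sketch of a single reader thread handing lines to concurrent workers. It is not the cube generator's code: it swaps in a plain `java.util.concurrent` thread pool instead of Scala Actors, and the names `ReaderWithWorkers`, `processFile`, `workers` and `handle` are made up for illustration.

```scala
import java.io.{BufferedReader, FileReader}
import java.util.concurrent.{Executors, TimeUnit}

object ReaderWithWorkers {
  // Reads `filename` on the calling thread and hands each line to a pool of
  // `workers` threads. Returns after the pool has drained (or after the
  // one-hour wait below expires). A thread pool stands in for actors here.
  def processFile(filename: String, workers: Int)(handle: String => Unit): Unit = {
    val pool = Executors.newFixedThreadPool(workers)
    val in = new BufferedReader(new FileReader(filename))
    try {
      var line = in.readLine()
      while (line != null) {
        val current = line // capture this line for the task
        pool.execute(new Runnable { def run(): Unit = handle(current) })
        line = in.readLine()
      }
    } finally {
      in.close()
      pool.shutdown()                          // stop accepting new tasks
      pool.awaitTermination(1, TimeUnit.HOURS) // wait for queued lines
    }
  }
}
```

The pool's work queue is unbounded, so on a file with millions of lines a reader that outpaces the workers will pile lines up in memory; a bounded queue would be the next thing to add.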

Update: OK, some numbers and code. The benchmark reads a file line by line in both Erlang and Scala.

First, the Erlang code:


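%% Open the file for reading and walk through it line by line.
process_file2(Filename) ->
    {ok, File} = file:open(Filename, [read]),
    process_lines2(File).

%% Read and discard one line at a time until end of file.
process_lines2(File) ->
    case io:get_line(File, '') of
        eof -> file:close(File);
        _   -> process_lines2(File)
    end.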


Now the Scala code:


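import java.io.{BufferedReader, FileReader}

object LineReader {

  // Applies f to every line read from in, stopping at end of file.
  def foreachline(in: BufferedReader, f: String => Unit): Unit = {
    val line = in.readLine()
    if (line != null) {
      f(line)
      foreachline(in, f)
    }
  }

  // Opens filename, applies f to each line, then closes the reader.
  def forLines(filename: String, f: String => Unit): Unit = {
    val in = new BufferedReader(new FileReader(filename))
    try foreachline(in, f)
    finally in.close()
  }

}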


OK, so these aren't exactly the same. The Scala version dispatches every line to a function, so if anything Scala is at a disadvantage.

The timings are three runs each on my MacBook Pro with a 2.2 GHz Intel Core 2 Duo. Erlang is the BEAM emulator 5.5.5, and Scala is 2.6 running on JDK 1.5 on Mac OS X. The Erlang code was compiled with HiPE.
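For anyone who wants to repeat the JVM side, a timing harness could look roughly like the sketch below. It is not the exact code behind the numbers here; `LineBench`, `countLines` and `timed` are illustrative names, and it takes the file path as the first command-line argument.

```scala
import java.io.{BufferedReader, FileReader}

object LineBench {
  // Reads every line of `filename` and returns how many lines were seen.
  def countLines(filename: String): Long = {
    val in = new BufferedReader(new FileReader(filename))
    try {
      var count = 0L
      while (in.readLine() != null) count += 1
      count
    } finally in.close()
  }

  // Runs `body` once and returns its result with the elapsed wall-clock seconds.
  def timed[A](body: => A): (A, Double) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1e9)
  }

  def main(args: Array[String]): Unit = {
    // Three runs, matching the number of runs reported in this post.
    for (run <- 1 to 3) {
      val (lines, secs) = timed(countLines(args(0)))
      println(f"Run $run: $lines lines in $secs%.3f sec")
    }
  }
}
```

Keep in mind that later runs may benefit from the operating system's file cache and from JIT warm-up, so the first run is usually the most pessimistic.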

The input is a 1,028,071,833-byte file (roughly 1 GB) with 10,037,355 lines.

Code      Run 1         Run 2         Run 3
Erlang    205.830 sec   208.999 sec   207.454 sec
Java      36.094 sec    39.917 sec    34.337 sec
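
To put those timings in perspective, the sketch below averages the three runs and turns them into throughput. It uses only the figures reported above; the object and method names are illustrative, and I am assuming the row labelled Java is the Scala code running on the JVM. It works out to a bit under 5 MB/s for Erlang against about 28 MB/s on the JVM, a gap of roughly 5.6x.

```scala
object Throughput {
  val fileBytes = 1028071833L // size of the benchmark file, in bytes

  // Timings in seconds, copied from the results table.
  val erlangRuns = Seq(205.830, 208.999, 207.454)
  val jvmRuns    = Seq(36.094, 39.917, 34.337)

  def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Throughput in megabytes (10^6 bytes) per second for a given run time.
  def mbPerSec(secs: Double): Double = fileBytes / 1e6 / secs

  def main(args: Array[String]): Unit = {
    val e = mean(erlangRuns)
    val j = mean(jvmRuns)
    println(f"Erlang: $e%.3f sec mean, ${mbPerSec(e)}%.2f MB/s")
    println(f"JVM:    $j%.3f sec mean, ${mbPerSec(j)}%.2f MB/s")
    println(f"Erlang takes ${e / j}%.2f times as long")
  }
}
```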
