I Second That Emotion
So Tim Bray finds out that Erlang IO is slow. I can attest to this: my recent work on reading large files in Erlang has shown that IO and string manipulation are much slower than I would have wanted.
Yes, like Bray, my file reading is single-threaded (although what I do with each line is very multi-threaded), so I suppose using a single thread isn't very Erlang-like in the first place.
In the meantime, I'm porting my OLAP cube generator to Scala. The assumption (and shortly, hopefully, proof) is that the JVM can do file IO much faster than Erlang, while I can still use Scala's Actors to retain my concurrency.
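To make the "single-threaded reader, multi-threaded processing" shape concrete, here is a minimal sketch. It is not the cube generator's code: a plain `java.util.concurrent` thread pool stands in for actors, and the names `ParallelLines`, `processFile`, `workers` and `work` are made up for illustration.

```scala
import java.io.{BufferedReader, FileReader}
import java.util.concurrent.{Executors, TimeUnit}

// Sketch only: a fixed thread pool is swapped in for actors here.
object ParallelLines {
  // Reads lines on the calling thread and hands each one to a worker pool.
  // `work` is a hypothetical per-line task and must be thread-safe.
  def processFile(filename: String, workers: Int)(work: String => Unit): Unit = {
    val pool = Executors.newFixedThreadPool(workers)
    val in = new BufferedReader(new FileReader(filename))
    try {
      var line = in.readLine()
      while (line != null) {
        val current = line // immutable copy, so the task does not see later reads
        pool.execute(new Runnable { def run(): Unit = work(current) })
        line = in.readLine()
      }
    } finally {
      in.close()
      pool.shutdown()                          // accept no new tasks
      pool.awaitTermination(1, TimeUnit.HOURS) // wait for queued lines to finish
    }
  }
}
```

One caveat with this sketch: the pool's queue is unbounded, so if the reader outruns the workers on a very large file, pending lines pile up in memory. A bounded queue would be the usual fix.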
Update: OK, some numbers and code. This benchmark reads a file line by line in Erlang and in Scala.
First, the Erlang code:
    process_file2(Filename) ->
        {ok, File} = file:open(Filename, read),
        process_lines2(File).

    process_lines2(File) ->
        case io:get_line(File, '') of
            eof -> file:close(File);
            _ -> process_lines2(File)
        end.
Now the Scala code:
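    import java.io.{BufferedReader, FileReader}

    object LineReader {
      // Calls f on each line until readLine() returns null (end of file).
      def foreachline(in: BufferedReader, f: String => Unit): Unit = {
        val line = in.readLine()
        if (line == null) return
        else f(line)
        foreachline(in, f)
      }

      // Opens the file, feeds every line to f, then closes the reader.
      def forLines(filename: String, f: String => Unit) = {
        val in = new BufferedReader(new FileReader(filename))
        foreachline(in, f)
        in.close()
      }
    }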
OK, so these aren't exactly the same. The Scala version also dispatches each line to a function, so if anything Scala is at a disadvantage.
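For reference, here is roughly how a caller could pass a function and time one pass over the file. This is a sketch, not the harness used for the numbers in this post: `TimedLineCount` and `countLines` are made-up names, the loop is iterative rather than recursive, and it assumes a modern Scala (string interpolation did not exist in 2.6).

```scala
import java.io.{BufferedReader, FileReader}

// Hypothetical timing harness; not the code that produced the timings in this post.
object TimedLineCount {
  // Same idea as forLines: call f once per line, then close the reader.
  def forLines(filename: String, f: String => Unit): Unit = {
    val in = new BufferedReader(new FileReader(filename))
    try {
      var line = in.readLine()
      while (line != null) {
        f(line)
        line = in.readLine()
      }
    } finally in.close()
  }

  // Returns (number of lines, elapsed seconds) for one pass over the file.
  def countLines(filename: String): (Long, Double) = {
    var count = 0L
    val start = System.nanoTime()
    forLines(filename, _ => { count += 1 })
    val elapsed = (System.nanoTime() - start) / 1e9
    (count, elapsed)
  }

  def main(args: Array[String]): Unit = {
    val (lines, seconds) = countLines(args(0))
    println(s"$lines lines in $seconds sec")
  }
}
```

The function passed in does almost nothing, so the measured time is dominated by reading and by the per-line call itself.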
The timings, three runs each, on my MacBook Pro with a 2.2 GHz Intel Core 2 Duo. Erlang is the BEAM emulator 5.5.5 and Scala is 2.6 running on JDK 1.5 on Mac OS X. The Erlang code was compiled with HiPE.
I am reading in a 1,028,071,833-byte file with 10,037,355 lines.
Code | Run 1 | Run 2 | Run 3 |
---|---|---|---|
Erlang | 205.830 sec | 208.999 sec | 207.454 sec |
Scala (JVM) | 36.094 sec | 39.917 sec | 34.337 sec |
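Putting the table into perspective: averaged over the three runs, the JVM row is roughly 5.6 times faster, which works out to about 5 MB/s for Erlang versus about 28 MB/s on the JVM. The arithmetic can be sketched as below. The timings and file size are copied from this post; the names are made up, "MB" is assumed to mean 10^6 bytes, and throughput is computed from the mean time.

```scala
// Arithmetic on the timings above; all names here are illustrative.
object BenchmarkSummary {
  val fileBytes = 1028071833L                       // file size from the post
  val erlangRuns = List(205.830, 208.999, 207.454)  // seconds, Erlang row
  val jvmRuns = List(36.094, 39.917, 34.337)        // seconds, JVM row

  def mean(xs: List[Double]): Double = xs.sum / xs.length

  // Throughput in MB/s (1 MB = 10^6 bytes) for a given run time.
  def mbPerSec(seconds: Double): Double = fileBytes / seconds / 1e6

  // How many times faster the JVM runs are, comparing mean times.
  def speedup: Double = mean(erlangRuns) / mean(jvmRuns)

  def main(args: Array[String]): Unit = {
    val e = mean(erlangRuns)
    val j = mean(jvmRuns)
    println(f"Erlang: $e%.1f sec mean, ${mbPerSec(e)}%.1f MB/s")
    println(f"JVM:    $j%.1f sec mean, ${mbPerSec(j)}%.1f MB/s")
    println(f"Speedup: $speedup%.1fx")
  }
}
```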