Erlang Fun Results
As I write more and more Erlang, I'm increasingly interested in how to get the best possible performance out of it. I'm very new to the language, so I'm still unsure what the best practices are concerning performance. Tonight I tested the effect of using a fun() in an algorithm.
My test reads a tab-delimited text file, tokenizes each line, and converts any numbers into integers. I've split the program into two conceptual parts: 1) the file IO and line reading, and 2) the handling of each line. I wanted to test the performance difference between using a fun() for the line handling vs just including the line handling code directly.
My test file is 10,037,355 lines long and 1,028,071,833 bytes (about 1 GB). I compiled my code with HiPE.
The quick answer: using a fun() is slower than not using one (which is to be expected). For my particular test, the fun() version took approximately 20% longer.
test | Run 1 (sec) | Run 2 (sec) |
---|---|---|
fun() | 484.631 | 485.380 |
without fun() | 404.017 | 403.632 |
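To make the "approximately 20%" figure concrete, here is a small sketch that recomputes the slowdown from the four timings in the table. The module and function names are made up for illustration; only the numbers come from the post.

```erlang
-module(slowdown).
-export([percent_slower/2, from_table/0]).

%% Percentage by which Slow exceeds Fast.
percent_slower(Slow, Fast) ->
    (Slow - Fast) / Fast * 100.

%% The two runs from the table, in seconds: fun() vs without fun().
from_table() ->
    [percent_slower(484.631, 404.017),
     percent_slower(485.380, 403.632)].
```

Both runs come out within about half a percentage point of 20%, so the two measurements agree with each other.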
Here's the code. I stole the Erlang timing functions from David King (thanks David!).
Using a fun()
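    %% Run Mod:Fun(Args), print the elapsed time in seconds, and return the result.
    %% Note: erlang:now/0 is deprecated in newer OTP releases.
    time_takes(Mod, Fun, Args) ->
        Start = erlang:now(),
        Result = apply(Mod, Fun, Args),
        Stop = erlang:now(),
        io:format("~p~n", [time_diff(Start, Stop)]),
        Result.

    %% Difference between two {MegaSecs, Secs, MicroSecs} timestamps, in seconds.
    time_diff({A1, A2, A3}, {B1, B2, B3}) ->
        (B1 - A1) * 1000000 + (B2 - A2) + (B3 - A3) / 1000000.0.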
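
    %% Tokenize one line on tabs (the input is tab-delimited), skip the first
    %% SplitOn fields, and convert the remaining fields to integers.
    handle_line(Line, SplitOn) ->
        L = string:tokens(string:strip(Line, both, $\n), "\t"),
        {_Dimensions, Measures} = lists:split(SplitOn, L),
        lists:map(fun(X) -> {I, _} = string:to_integer(X), I end, Measures).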

    %% Open the file and hand every line to Proc.
    process_file(Filename, Proc) ->
        {ok, File} = file:open(Filename, [read]),
        process_lines(File, Proc, 0).

    process_lines(File, Proc, LineNum) ->
        case io:get_line(File, '') of
            eof ->
                file:close(File);
            Line ->
                Proc(Line),
                process_lines(File, Proc, LineNum + 1)
        end.
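The listing doesn't show how process_file/2 gets called, so here is one plausible way to wire it up: bind the split position in a closure and pass that closure as Proc. This is a sketch under assumptions, not necessarily what I ran. The module name fun_demo and the run/1 entry point are hypothetical, the split position 10 is taken from the inlined version, and the separator is assumed to be a tab.

```erlang
-module(fun_demo).
-export([run/1, handle_line/2, process_file/2]).

%% Same logic as the line handler in the post; tab separator assumed.
handle_line(Line, SplitOn) ->
    L = string:tokens(string:strip(Line, both, $\n), "\t"),
    {_Dimensions, Measures} = lists:split(SplitOn, L),
    lists:map(fun(X) -> {I, _} = string:to_integer(X), I end, Measures).

process_file(Filename, Proc) ->
    {ok, File} = file:open(Filename, [read]),
    process_lines(File, Proc, 0).

process_lines(File, Proc, LineNum) ->
    case io:get_line(File, '') of
        eof ->
            file:close(File);
        Line ->
            Proc(Line),
            process_lines(File, Proc, LineNum + 1)
    end.

%% Hypothetical entry point: the closure captures SplitOn = 10,
%% so process_lines/3 only ever sees a one-argument fun.
run(Filename) ->
    Proc = fun(Line) -> handle_line(Line, 10) end,
    process_file(Filename, Proc).
```

Note that lists:split/2 raises badarg on any line with fewer than 10 fields, so this only works on input where every line has at least that many columns.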
Including Code Directly (no fun())
(I'm just showing the difference.)
    %% Same loop as before, but with the line handling inlined.
    %% Proc is still passed along but no longer called.
    process_lines(File, Proc, LineNum) ->
        case io:get_line(File, '') of
            eof ->
                file:close(File);
            Line ->
                L = string:tokens(string:strip(Line, both, $\n), "\t"),
                {_Dimensions, Measures} = lists:split(10, L),
                lists:map(fun(X) -> {I, _} = string:to_integer(X), I end, Measures),
                process_lines(File, Proc, LineNum + 1)
        end.
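If you want to try the comparison without a gigabyte test file, a micro-benchmark along these lines might be a starting point. It is a simplified sketch, not the test I ran: it works on in-memory lines (so disk IO is out of the picture), the "direct" variant calls a local function rather than inlining the code, and it uses timer:tc/3 from the standard library instead of the timing functions above. The module name fun_bench and all of its functions are hypothetical.

```erlang
-module(fun_bench).
-export([run/1, via_fun/2, direct/1, parse/1]).

%% Simplified line handler: convert every tab-separated field to an integer.
parse(Line) ->
    [begin {I, _} = string:to_integer(T), I end
     || T <- string:tokens(string:strip(Line, both, $\n), "\t")].

%% Variant 1: the handler arrives as a fun and is called once per line.
via_fun([], _Proc) -> ok;
via_fun([Line | Rest], Proc) ->
    Proc(Line),
    via_fun(Rest, Proc).

%% Variant 2: the handler is called directly as a local function.
direct([]) -> ok;
direct([Line | Rest]) ->
    parse(Line),
    direct(Rest).

%% Time both variants over N copies of a sample line.
%% Returns the elapsed time of each variant in microseconds.
run(N) ->
    Lines = lists:duplicate(N, "1\t2\t3\t4\t5\n"),
    {FunUs, ok} = timer:tc(?MODULE, via_fun, [Lines, fun parse/1]),
    {DirectUs, ok} = timer:tc(?MODULE, direct, [Lines]),
    [{via_fun, FunUs}, {direct, DirectUs}].
```

Calling fun_bench:run(1000000) returns the two timings as a list of tagged tuples. Expect the numbers to vary between runs and between Erlang versions, so run it several times before drawing conclusions.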
Of course, it's easy to argue that Erlang isn't the best language for string manipulation. But this part of the application is hardly the bottleneck, so I'm willing to take the bloat in order to take advantage of the concurrency later on.
Next up, I'll do timing experiments testing if tail recursion speeds anything up.