OLAP Cube Construction with Erlang
I've managed to build the parallel OLAP cube constructor in Erlang. The program parallelizes the work by creating a process for every dimension in the OLAP cube. Each process manages the file that holds its dimension's data. Messages are passed from the first dimension all the way down to the last dimension, which stores the measures themselves. To parallelize further, you can partition any dimension using a modulus, which creates another file and another process. This also helps get around the 2 GB limit for dets tables.
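To make the message flow concrete, here is a minimal sketch of the idea. It is not the actual code: the module name cube_build, the message shapes, and the cells/2 helper are all made up for illustration, each process keeps its data in an in-memory map where the real thing would own a dets file, and the modulus is taken over a hash of the key via erlang:phash2/2, which is only one possible choice.

```erlang
-module(cube_build).
-export([start/1, insert/3, cells/2]).

%% Dims is a list of {Name, PartitionCount}, first dimension first.
%% Every partition gets its own process. Returns the partitions of the
%% first dimension as a tuple of pids.
start(Dims) ->
    lists:foldr(fun({Name, N}, Next) ->
                    list_to_tuple([spawn(fun() -> loop(Name, #{}, Next) end)
                                   || _ <- lists:seq(1, N)])
                end, none, Dims).

%% Pick a partition: hash of the key modulo the number of partitions.
route(Parts, Key) ->
    element(erlang:phash2(Key, tuple_size(Parts)) + 1, Parts).

%% Send one fact row (one key per dimension, plus a measure) into the cube.
insert(Cube, [Key | _] = Keys, Measure) ->
    route(Cube, Key) ! {insert, [], Keys, Measure},
    ok.

%% Ask the last-dimension partition at the end of a path how many cells
%% it holds. The request follows the same route as an insert for that
%% path, so it is handled after inserts this caller sent along that route.
cells(Cube, [Key | _] = Keys) ->
    Ref = make_ref(),
    route(Cube, Key) ! {cells, {self(), Ref}, Keys},
    receive {Ref, Count} -> Count after 1000 -> timeout end.

loop(Name, State, Next) ->
    receive
        {insert, Seen, [Key], Measure} when Next =:= none ->
            %% Last dimension: store the measure under the full path.
            Path = lists:reverse([Key | Seen]),
            loop(Name, State#{Path => Measure}, Next);
        {insert, Seen, [Key, NextKey | Rest], Measure} when is_tuple(Next) ->
            %% Remember the member, then forward the rest of the row to
            %% the matching partition of the next dimension.
            route(Next, NextKey) ! {insert, [Key | Seen], [NextKey | Rest], Measure},
            loop(Name, State#{Key => true}, Next);
        {cells, {From, Ref}, [_]} when Next =:= none ->
            From ! {Ref, maps:size(State)},
            loop(Name, State, Next);
        {cells, Reply, [_, NextKey | Rest]} when is_tuple(Next) ->
            route(Next, NextKey) ! {cells, Reply, [NextKey | Rest]},
            loop(Name, State, Next)
    end.
```

With this sketch, cube_build:start([{year, 1}, {region, 2}]) starts three processes: one for the year dimension and two partitions for the region dimension. A row is then added with cube_build:insert(Cube, [2008, us], 10). Because every partition is its own process with its own storage, swapping the map for a dets table would give each partition its own file, and so its own 2 GB budget.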
I also have basic path-based querying working. It is parallelized the same way, by sending the query message through each dimension in turn. While a query isn't parallel for a particular client, the design should theoretically scale to handle many clients.
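A path query can be pictured as one message that hops from dimension to dimension, with only the last hop replying to the client. The sketch below is an illustration under assumptions, not the real implementation: the module name cube_query, the {query, ...} message shape, and loading the rows up front in start/1 are all invented here, and there is one process per dimension with no partitioning.

```erlang
-module(cube_query).
-export([start/1, query/2]).

%% Rows is a list of {Path, Measure}; every Path has one key per dimension.
%% Returns the pid of the first dimension process.
start([{Path, _} | _] = Rows) ->
    Depth = length(Path),
    %% The last dimension holds the measures, keyed by the full path.
    Last = spawn(fun() -> last_loop(maps:from_list(Rows)) end),
    %% Each earlier dimension only knows which members it has seen.
    lists:foldr(fun(I, Next) ->
                    Members = [lists:nth(I, P) || {P, _} <- Rows],
                    spawn(fun() -> dim_loop(Members, Next) end)
                end, Last, lists:seq(1, Depth - 1)).

%% Send a path down the chain and wait for the answer.
query(First, Path) ->
    Ref = make_ref(),
    First ! {query, {self(), Ref}, Path, Path},
    receive {Ref, Result} -> Result after 1000 -> timeout end.

dim_loop(Members, Next) ->
    receive
        {query, {From, Ref} = Reply, Path, [Key | Rest]} ->
            case lists:member(Key, Members) of
                true  -> Next ! {query, Reply, Path, Rest};  % keep walking
                false -> From ! {Ref, not_found}             % stop early
            end,
            dim_loop(Members, Next)
    end.

last_loop(Measures) ->
    receive
        {query, {From, Ref}, Path, [_Key]} ->
            Result = case maps:find(Path, Measures) of
                         {ok, Measure} -> {ok, Measure};
                         error         -> not_found
                     end,
            From ! {Ref, Result},
            last_loop(Measures)
    end.
```

A single call to cube_query:query/2 is strictly sequential, since the message visits one dimension after another. A dimension process is free again as soon as it forwards a query, though, so queries from different clients can be in flight at different dimensions at the same time, which is where the scaling to many clients would come from.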
When I move to more traditional querying that generates a tabular result set, I will be able to parallelize the query for a single client.
I'll post the working code once I can choose a suitable license. I'm very interested to hear feedback, as I'm very much still an Erlang n00b.
Next up I'll generate some performance numbers to see if this thing will actually perform in the real world.
I have to say, functional programming is great when your solution is algorithmic. Previous implementations of mine were done in Java or Ruby, which are object oriented. The classes and objects obscured the algorithm, which in the case of OLAP cubes is the primary focus.