Showing posts from August, 2007

links for 2007-09-01

The Structure of Information Networks (Jon Kleinberg)
Note to self, quit job, read all of these.
(tags: informationnetworks)

Using Gmail to Relay Email

I'm preparing to move everything I host in-house to save on monthly fees. My bandwidth needs are very low, but I need a lot of storage for home movies. Running a small Linux server off my cable modem makes the most sense now.

Sending email from this server is a bit tricky, but with the help of some excellent tutorials, I've managed to tell Postfix to relay all email through Gmail. Google Apps For Your Domain is hosting my email, so this works out perfectly. (Note: I'm only using regular Gmail for my relay so far. I will try to use my Google Apps account soon.)

If you want to use Gmail to relay email, check out the Gmail Relay Emails for Postfix on Redhat tutorial or the Gmail Relay Emails for Postfix on Ubuntu tutorial. Note that if you are running Ubuntu, you need to download the Thawte root SSL certificates, as outlined in a comment at that tutorial.

As a side note, I'm using DNS Park for my DNS hosting. DNS Park will host two free domains for you, and supports …

Push or Pull? Stateless or Stateful?

So Bill says XMPP matters and is "pushing" a push model for message delivery.

I'm curious, though, whether this debate can be reframed as Stateless or Stateful. A stateless messaging system would map to a Pull strategy, placing the burden on the client to actively poll or pull its messages. A stateful system would map to Push, where the server maintains a connection for every subscribed client.

Stateful systems are hard to scale over the internet. One reason is that there's a limit to the number of TCP connections I can keep open at any one time. What's the limit? Not sure, probably OS and configuration specific, but is that a limit that I'll easily hit? If I'm not actively maintaining an open connection, can the system still be called Push?
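To make the Stateless/Stateful framing concrete, here is a minimal in-memory Ruby sketch. It is only an illustration under my own assumptions: `PullServer` and `PushServer` are hypothetical names, there is no real networking, and a stored callback stands in for an open TCP connection. Note that the pull server still stores messages; what it avoids is per-client connection state.

```ruby
# Pull: the server only stores messages. It holds no per-client
# connection state, so the client must come and ask.
class PullServer
  def initialize
    @mailboxes = Hash.new { |hash, client| hash[client] = [] }
  end

  def publish(client, message)
    @mailboxes[client] << message
  end

  # Hand over whatever is waiting for this client, then forget it.
  def poll(client)
    @mailboxes.delete(client) || []
  end
end

# Push: the server keeps one live handle per subscriber. The stored
# callback is a stand-in for an open connection.
class PushServer
  def initialize
    @subscribers = {}
  end

  def subscribe(client, &on_message)
    @subscribers[client] = on_message
  end

  def publish(client, message)
    handler = @subscribers[client]
    handler.call(message) if handler
  end

  # This number is what grows with every subscribed client.
  def open_connections
    @subscribers.size
  end
end

pull = PullServer.new
pull.publish(:alice, "hello")
pulled = pull.poll(:alice) # the client decides when to ask

push = PushServer.new
pushed = []
push.subscribe(:alice) { |message| pushed << message }
push.publish(:alice, "hello") # delivered immediately, no polling
```

In this toy version the cost of Push shows up as `open_connections`: it is state the server has to hold for as long as a client stays subscribed.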

An aside on Push vs Pull. Push might make for faster reacting systems, but I know that Pull is usually the way I want to process information. The more Push events I have, the less I get done, the less I can focus, …

The Gettysburg Powerpoint Presentation

What if Lincoln had used PowerPoint? (file under How Not To Do It)

Of course, if you're looking for an example of how to do a PowerPoint presentation right, don't miss Dick Hardt giving the OSCON 2005 Keynote on Identity 2.0. That's how you do it.

e c30ac536947f7330943f8de9c33f70ef2d5994e7

e is a stack for the data web. Not only is it all in Ruby and built on RDF, but it's some of the most bare code I've seen in a while.

You had me at "data web".

And +10 for using the file system as a data store instead of a database.

Too Much Data

Bill de hÓra writes that Phat Data is the challenge of the future. Couldn't agree more. My recent work with data warehouses has certainly shown me that managing and accessing terabytes of data is nontrivial.

We've learned a few things, the most important being "Denormalize and aggregate." Avoiding I/O is the most important step to take. And we've achieved some pretty decent performance numbers with a traditional relational database. However, as Bill points out, we're using it as a big indexed file system.
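As a rough illustration of "aggregate", here is a small Ruby sketch. The `SALES` rows, the `build_daily_totals` helper, and the column names are all hypothetical, made up for this example. The idea is to pay for one pass over the detail rows at load time so that report queries never touch them again.

```ruby
# Hypothetical detail rows. In a real warehouse these would be
# billions of rows on disk, not four hashes in memory.
SALES = [
  { day: "2007-08-01", region: "east", amount: 120 },
  { day: "2007-08-01", region: "west", amount: 80 },
  { day: "2007-08-02", region: "east", amount: 200 },
  { day: "2007-08-02", region: "east", amount: 50 }
]

# Build the aggregate once, at load time: a single pass over the details.
def build_daily_totals(rows)
  rows.each_with_object(Hash.new(0)) do |row, totals|
    totals[[row[:day], row[:region]]] += row[:amount]
  end
end

DAILY_TOTALS = build_daily_totals(SALES)

# A report query reads one pre-computed entry instead of scanning details.
def total_for(day, region)
  DAILY_TOTALS[[day, region]]
end

puts total_for("2007-08-02", "east")
```

The trade-off is the usual one: the aggregate has to be rebuilt or updated whenever new detail rows arrive.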

But having SQL and the numerous tools that support SQL has been critical to our success. I can't imagine solving these problems with proprietary tools. Sure, it's possible. Google did it, but they have more PhDs than you can shake a stick at. Plus some mega clusters.

While multi-core CPUs are a welcome upgrade, what I really want is multi-spindle hard drives. Call me when I can emulate a Google cluster on my desktop. What's lacking is a c…

Yes, database normalization is good

So InfoQ has collected a few blog posts which ask, "Data normalization, is it really that good?"

Of course it's good, as long as you have requirements which dictate this optimization. If your application requires extremely fast writes, and this can happen in a heavily loaded OLTP system, then data normalization is your savior. If your application requires extremely fast reads, like OLAP systems, then of course data normalization is a killer.

These competing requirements are exactly why you have database systems optimized for either read or write. This is why large systems will maintain an operational system conforming to OLTP principles, and reporting systems conforming to OLAP principles.
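A toy Ruby sketch of that trade-off, using made-up `customers` and `orders` data (nothing here comes from a real schema): the normalized form makes the write cheap and the read expensive, and the denormalized form does the opposite.

```ruby
# Normalized (OLTP-style): each fact is stored once, linked by id.
customers = { 1 => { name: "Ann", city: "Boston" } }
orders = [
  { id: 100, customer_id: 1, total: 25 },
  { id: 101, customer_id: 1, total: 40 }
]

# A write touches exactly one row, however many orders reference it.
customers[1][:city] = "Chicago"

# A read has to join: look up the customer for every order.
report_from_normalized = orders.map do |order|
  customer = customers[order[:customer_id]]
  { order_id: order[:id], city: customer[:city], total: order[:total] }
end

# Denormalized (OLAP-style): the city is copied into every order row.
# Both rows belong to the same hypothetical customer.
order_facts = [
  { order_id: 100, city: "Boston", total: 25 },
  { order_id: 101, city: "Boston", total: 40 }
]

# Reads need no join, but the same change must now touch every copy.
rows_rewritten = 0
order_facts.each do |fact|
  fact[:city] = "Chicago"
  rows_rewritten += 1
end
```

With two orders the difference is one write versus two. With millions of rows per customer it is the difference between an OLTP system that keeps up and one that doesn't.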

Remember, traditional database systems are row oriented. This architecture is itself an optimization for OLTP and normalized data. Read-mostly (or read-only) systems can be column oriented, which organizes the data on disk to optimize reads. For instance, Google's BigTable is an implementation …

links for 2007-08-09

Floatutorial: Step by step CSS float tutorial
In depth tutorial on using CSS floats.
(tags: css html)

Quiz on Friday

links for 2007-08-08

DHTML calendar, built with Ruby on Rails in mind.
(tags: rubyonrails calendar)

Calculating Combinations In Ruby From Erlang

Well, thanks to the many people (here and here) that provided their versions of an Erlang way to calculate combinations, I've really begun to open my mind to how to think functionally.

To help me understand what is going on, I've converted the basic idea into a Ruby version of calculating combinations. This uses recursion like the Erlang versions do.

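class Array
  # Erlang-style split: [first element, everything after it]
  def head_tail
    [self.first, self.tail]
  end

  # Everything after the first element
  def tail
    self[1..-1]
  end
end

def combos(list)
  # The exit in our recursion: an empty list has one combination, the empty one
  return [[]] if list.empty?
  h, t = list.head_tail
  t_combos = combos(t)
  # Every combination of the tail, once with the head prepended and once without
  t_combos.inject([]) {|memo, obj| memo << [h] + obj} + t_combos
end

c = combos([1,2,3,4])
require 'pp'
pp c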

As you can see, I added a bit of Erlangism to the Array class, by adding a method to get the head and tail of an array.

Let's run through this.

On the first call to combos([1,2,3,4]) we jump over the first line (the exit in our recursion). We generate the head and tail, which in this case is 1 and [2,3,4] respectively. We immed…

Calculating Combinations the Erlang Way

If you recall, I wrote some Ruby code to calculate combinations of values in lists. I needed to create a list of all combinations of values, where each combination had between 0 and N values, where N is the length of the source list. (I'm not sure I'm explaining that correctly, but refer to my previous post for examples.)

Here's my first shot at how to do this in Erlang. It took longer to find math:pow and how to convert a float to an integer in Erlang than to write the actual code.
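To make that requirement concrete, here is a minimal Ruby sketch of one way to read it. This is not the code from the earlier post, and `combos_by_mask` is a name made up for this illustration. It assumes that every combination of 0 to N values corresponds to exactly one N-bit mask, where bit i decides whether element i is included.

```ruby
# Bit i of the mask set means "include list[i]".
# Masks run from 0 (nothing selected) to 2**N - 1 (everything selected).
def combos_by_mask(list)
  (0...(1 << list.length)).map do |mask|
    list.each_index.select { |i| mask[i] == 1 }.map { |i| list[i] }
  end
end

result = combos_by_mask([1, 2, 3])
```

For a three-element list this yields 2**3 = 8 combinations, from the empty list up to the full list.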


combos(L) -> combos(L, bit_masks(length(L))).

combos(L, [BH|BT]) ->
[mask_list(L, BH)|combos(L, BT)];
combos(_, []) -> [].

mask_list([H|T], [BH|BT]) ->
    case (BH) of
        1 -> [H|mask_list(T, BT)] ;
        0 -> mask_list(T, BT)
    end;
mask_list([], []) -> [].

bit_masks(NumColumns) ->
bit_masks(0, round(math:pow(2, NumColumns))-1, NumColumns).

bit_masks(Max, Max, NumColumns) ->
[padl(NumColumns, bl(Max))];

bit_masks(X, Max, NumColumn…