Minification is not enough, you need tree shaking

In which the virtues of automated mechanical arboreal pruning are extolled over quaint manual labor, as applied to web development build processes.

The setup

Ever notice how the primary bit of marketing for many traditional web programming libraries is their download size? Why is that?

Check this out:

jQuery claims it is only 32kb minified.

Zepto claims it is less than a quarter the size of jQuery.

Dojo claims its nano core is 3.8kb.

Why does size matter so much for these libraries? Your first instinct is probably, "because the more bytes you shuttle across the wire, the slower the app starts up." Yes, this is true. I'd also say you're wrong. The primary reason that size matters for these libraries is that traditional web development has no intelligent or automated way to prune unused code, so that only the code that is actually used gets shipped over the wire.

The web is full of links, yet web dev has no linker

The web development workflow is missing a linking step. A linker's job is to combine distinct project files into a single executable. A smart linker will only include the symbols and code that are actually used by the application, thus pruning unused code. The traditional web developer does not have an intelligent linker.

It's 2013, and the job of micro-managing web development libraries is still being done by humans. Humans: the same people that brought you this little gem. Web developers need machines and tools to take care of linking and minifying so they can get back to comparing traditional web development libraries based on actual feature sets instead of how femto they are.

I want my web programming language to offer enough structure and intelligent tools to take care of pruning, minification, and more. This is why I dig Dart, because it has the structure (classes, libraries, packages, type annotations, metadata, etc) and the tools (dart2js) for a modern development workflow.

Don't just prune unused code, shake it off

Dart tools support tree shaking, a technique to "shake" off unused code, thus shrinking the size of the deployed application. I can import rich libraries chock full of useful goodness into my application, but only the functions I actually use will be included in my generated output. Awesome!

The source application goes through a tree-shaking compiler and its output is smaller.

Real code example, shaken not stirred

Consider this simple Dart library, ironically named embiggen. There are two top-level functions in this library, embiggen and unembiggen.

 library embiggen;  
   
 String embiggen(String msg) {  
  if (msg == null) {  
   throw new ArgumentError("must not be null");  
  }  
    
  return msg.toUpperCase();  
 }  
   
 String unembiggen(String msg) {  
  if (msg == null) {  
   throw new ArgumentError("must not be null");  
  }  
    
  return msg.toLowerCase();  
 }  

Here is the main program, which uses only embiggen:

 import 'package:embiggen/embiggen.dart';  
   
 main() {  
  var args = new Options().arguments;  
  if (args.length == 0) {  
   print("Usage: dart embiggen.dart phrase");  
   return;  
  }  
    
  var phrase = args[0];  
    
  print(embiggen(phrase));  
 }  
   

I love embiggen, but I'm less enthralled with unembiggen and will never use it. Do I have to search for nano-embiggen?! Nay! Let the linker do its tree-shaking magic.

Run the main application through the dart2js tool, which supports tree-shaking for both JavaScript and Dart outputs. Note there is no command-line option for tree shaking, because dart2js is always tree shaking. For simplicity's sake, let's generate Dart.

 dart2js --output-type=dart embiggen.dart  

Gaze into the tree-shook generated output (reformatted to make it easy to read):

main() {
  var args=new Options().arguments;
  if (args.length == 0) {
    print("Usage: dart embiggen.dart phrase");
    return;
  }
  var phrase=args[0];print(embiggen(phrase));
}

String embiggen(String msg) {
  if (msg == null) {
    throw new ArgumentError("must not be null");
  }
  return msg.toUpperCase();
}

(Note: the actual output is all on one line, with whitespace removed. I reformatted the code above to make it easy to read.)

Notice how the embiggen function is included, but unembiggen is nowhere to be seen, even though I imported the library. The tree, it is shaken!
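One way to picture why unembiggen vanished: tree shaking can be thought of as a reachability walk that starts at main and keeps only what it can reach. The sketch below is a toy model of that idea, not how dart2js is actually implemented. The callGraph map and the reachable function are hypothetical names made up for illustration, and the graph is hand-written to mirror the embiggen example.

```dart
// Hypothetical call graph: each function name maps to the functions it calls.
// Hand-written for illustration; a real compiler derives this from the source.
const callGraph = <String, List<String>>{
  'main': ['print', 'embiggen'],
  'embiggen': ['toUpperCase'],
  'unembiggen': ['toLowerCase'],
  'print': <String>[],
  'toUpperCase': <String>[],
  'toLowerCase': <String>[],
};

/// Returns every function reachable from [entry] by following calls.
Set<String> reachable(Map<String, List<String>> graph, String entry) {
  final kept = <String>{};
  final worklist = <String>[entry];
  while (worklist.isNotEmpty) {
    final fn = worklist.removeLast();
    if (!kept.add(fn)) continue; // Already visited, nothing new to follow.
    worklist.addAll(graph[fn] ?? const <String>[]);
  }
  return kept;
}

void main() {
  final kept = reachable(callGraph, 'main');
  // Anything never reached from main is a candidate for removal.
  final dropped = callGraph.keys.toSet().difference(kept);
  print('kept: $kept');
  print('dropped: $dropped');
}
```

In this toy graph nothing reachable from main ever calls unembiggen, so it lands in the dropped set along with toLowerCase, which only unembiggen used.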

But is this the best we can do? The dart2js tool also supports minification with the --minify flag.

 dart2js --minify --output-type=dart embiggen.dart  

The minified, single-line, tree-shook generated output:

 main(){var A=new Options().arguments;if(A.length==0){print("Usage: dart embiggen.dart phrase");return;}var C=A[0];print(B(C));}
 B( A){if(A==null){throw new ArgumentError("must not be null");}return A.toUpperCase();}  

Both outputs have unused code eliminated, and the minified version also replaces variable names. This is exactly the kind of help I want from my tools.

Why this works

The structure of Dart programs cannot change after compilation. In other words, Dart does not support altering class structure at runtime. Dart also does not have extreme dynamism like eval(). Dart compilers and linkers can therefore assume more about the structure of the program, and thus can be more aggressive about tree shaking and minification.
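To make the contrast concrete, here is a rough sketch of the difference between a call the compiler can resolve at build time and a lookup that depends on a runtime string. Dart has no eval(), so a lookup table stands in for "dynamic" behavior here. All of the names (shout, whisper, reverse, byName, staticCall, dynamicCall) are hypothetical and exist only for this illustration.

```dart
// Hypothetical helpers for illustration only.
String shout(String msg) => msg.toUpperCase();
String whisper(String msg) => msg.toLowerCase();
String reverse(String msg) => msg.split('').reversed.join();

// Static call: the target is visible in the source, so a tree shaker
// can see at build time that this code path needs only `shout`.
String staticCall(String msg) => shout(msg);

// Lookup by name: the key is only known at run time, so a tree shaker
// generally has to keep every function stored in the table.
final Map<String, String Function(String)> byName = {
  'whisper': whisper,
  'reverse': reverse,
};

String dynamicCall(String name, String msg) {
  final fn = byName[name];
  if (fn == null) {
    throw ArgumentError('unknown function: $name');
  }
  return fn(msg);
}

void main(List<String> args) {
  print(staticCall('hello'));
  // The function that runs depends on user input, not on the source text.
  final name = args.isNotEmpty ? args[0] : 'reverse';
  print(dynamicCall(name, 'hello'));
}
```

The more of a program that looks like staticCall, the more a compiler can safely throw away. A language with eval() pushes everything toward the dynamicCall case, because any string might name any function.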

Moral of the story

I believe that web developers need a better workflow that automates tree shaking, dead code elimination, minification, and more. Stop caring how big a library is, and instead let a tool or build step produce the smallest output possible for you, ideally by tree shaking the application.

One option to consider is Dart, with its structured language and intelligent tools, like a tree-shaking and minifying compiler. With dart2js, you can import entire libraries, regardless of size, and generate only the code that is required to run the program.

Regardless of what language you use, demand more from your tools.

 

Acknowledgements

Thanks to Bob Nystrom's OSCON presentation from 2012, from which I humbly embraced-and-extended the setup of this post.
  • Angular and Polymer, sitting in a DOM tree, B-i-n-d-i-n-g. First comes components, Then comes elements, Then comes the interop with the node dot bind. Angular , a super heroic MVC framework, and Polymer , polyfills and enhancements for custom elements built on top of Web Components, can live harmoniously in the same app. This post shows you how to connect Angular-controlled components to Polymer-controlled elements via data binding. And we do it all in Dart . Angular and Polymer I get asked "Should I use Angular or Polymer?" a lot. My answer is, "Yes". That is, both libraries have distinct strengths, and you can use both in the same app. Polymer excels at creating encapsulated custom elements. You can use those custom elements in any web app or web page, regardless if that app is built with Angular, Ember, etc. Angular excels at application engineering, with dependency injection, end-to-end testability, routing, and services. Here are som