The Architecture of Open Source Applications (Volume 2)

Mike Kamermans

Originally developed by Ben Fry and Casey Reas, Processing started as an open source programming language (based on Java) to help the electronic arts and visual design communities learn the basics of computer programming in a visual context. Offering a highly simplified model for 2D and 3D graphics compared to most programming languages, it quickly proved well-suited to a wide range of activities, from teaching programming and writing small visualisations to creating multi-wall art installations. It also grew to handle a wide variety of tasks, from simply reading in a sequence of strings to acting as the de facto IDE for programming and operating the popular "Arduino" open source hardware prototyping boards. Continuing to gain popularity, Processing has firmly taken its place as an easy to learn, widely used programming language for all things visual, and so much more.

The basic Processing program, called a "sketch", consists of two functions: setup and draw. The first is the main program entry point, and can contain any number of initialization instructions. After finishing setup, Processing programs can do one of two things: 1) call draw, and schedule another call to draw at a fixed interval upon completion; or 2) call draw, and wait for input events from the user. By default, Processing does the former; calling noLoop results in the latter. This allows for two modes of presenting sketches: a fixed framerate graphical environment, and an interactive, event-based updating graphical environment. In both cases, user events are monitored and can be handled either in their own event handlers or, for certain events that set persistent global values, directly in the draw function.

Processing.js is a sister project of Processing, designed to bring it to the web without the need for Java or plugins. It started as an attempt by John Resig to see if the Processing language could be ported to the web by using the HTML5 <canvas> element, brand new at the time, as a graphical context, with a proof of concept library released to the public in 2008. Written with the idea in mind that "your code should just work", Processing.js has been refined over the years to make data visualisations, digital art, interactive animations, educational graphs, video games, etc. work using web standards and without any plugins. You write code using the Processing language, either in the Processing IDE or your favourite editor, include it on a web page using a <canvas> element, and Processing.js does the rest, rendering everything in the <canvas> element and letting users interact with the graphics in the same way they would with a normal standalone Processing program.

17.1. How Does It Work?

Processing.js is a bit unusual as an open source project, in that the code base is a single file called processing.js which contains the code for Processing, the single object that makes up the entire library. In terms of how the code is structured, we constantly shuffle things around inside this object as we try to clean it up a little bit with every release. Its design is relatively straightforward, and its function can be described in a single sentence: it rewrites Processing source code into pure JavaScript source code, and every Processing API function call is mapped to a corresponding function in the JavaScript Processing object, which effects the same thing on a <canvas> element as the Processing call would effect on a Java Applet canvas.

For speed, we have two separate code paths for 2D and 3D functions, and when a sketch is loaded, either one or the other is used for resolving function wrappers so that we don't add bloat to running instances. However, in terms of data structures and code flow, knowing JavaScript means you can read processing.js, with the possible exception of the syntax parser.

Unifying Java and JavaScript

Rewriting Processing source code into JavaScript source code means that you can simply tell the browser to execute the rewritten source, and if you rewrote it correctly, things just work. But making sure the rewrite is correct has taken, and still occasionally takes, quite a bit of effort. Processing syntax is based on Java, which means that Processing.js has to essentially transform Java source code into JavaScript source code. Initially, this was achieved by treating the Java source code as a string, and iteratively replacing substrings of Java with their JavaScript equivalents. (In an early incarnation of the library, this parser ran from line 37 to line 266.) For a small syntax set, this is fine, but as time went on and complexity was piled on complexity, this approach started to break down. Consequently, the parser was completely rewritten to build an Abstract Syntax Tree (AST) instead, first breaking down the Java source code into functional blocks, and then mapping each of those blocks to their corresponding JavaScript syntax. The result is that, at the cost of readability, Processing.js now effectively contains an on-the-fly Java-to-JavaScript transcompiler. (Readers are welcome to peruse this code, up to line 19217.)
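To make the early string-replacement idea concrete, here is a deliberately tiny sketch of it in JavaScript. It is not the Processing.js parser: the function name naiveRewrite and its three rules are hypothetical, chosen only to show how iterated substring replacement can turn a few Java constructs into JavaScript.

```javascript
// Toy illustration only: NOT the actual Processing.js parser.
// naiveRewrite and its three rules are hypothetical.
function naiveRewrite(source) {
  return source
    // "void name(" becomes "function name("
    .replace(/\bvoid\s+(\w+)\s*\(/g, "function $1(")
    // typed declarations such as "float d =" become "var d ="
    .replace(/\b(?:int|float|double|boolean|String)\s+(\w+)\s*=/g, "var $1 =")
    // a few known API calls are routed to the instance object $p
    // ("$$" in a replacement string produces a literal "$")
    .replace(/\b(ellipse|sin|abs)\s*\(/g, "$$p.$1(");
}

const rewritten = naiveRewrite(
  "void draw() { float d = 10+abs(60*sin(f)); ellipse(x, y, d, d); }"
);
console.log(rewritten);
```

The weakness is easy to see even at this size: the rules know nothing about context, so a string literal that happens to contain "void f(" would be rewritten as well. An AST-based approach avoids this class of problem because it knows which part of the source it is looking at.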

Here is the code for a Processing sketch:

    void setup() {
      size(200, 200);
      smooth(); }

    void draw() {
      fill(255, 10);
      rect(-1, -1, width+1, height+1);
      float f = frameCount*PI/frameRate;
      float d = 10+abs(60*sin(f));
      fill(0, 100, 0, 50);
      ellipse(mouseX, mouseY, d,d); }

And here is its Processing.js conversion:

    function($p) {
        function setup() {
            $p.size(200, 200);
            $p.smooth(); }
        $p.setup = setup;

        function draw() {
            $p.fill(255, 10);
            $p.rect(-1, -1, $p.width + 1, $p.height + 1);
            var f = $p.frameCount * $p.PI / $p.__frameRate;
            var d = 10 + $p.abs(60 * $p.sin(f));
            $p.fill(0, 100, 0, 50);
            $p.ellipse($p.mouseX, $p.mouseY, d, d); }
        $p.draw = draw; }

This sounds like a great thing, but there are a few problems when converting Java syntax to JavaScript syntax:

  1. Java programs are isolated entities. JavaScript programs share the world with a web page.
  2. Java is strongly typed. JavaScript is not.
  3. Java is a class/instance based object-oriented language. JavaScript is not.
  4. Java keeps variables and methods in separate namespaces. JavaScript does not.
  5. Java allows method overloading. JavaScript does not.
  6. Java allows importing compiled code. JavaScript has no idea what that even means.

Dealing with these problems has been a tradeoff between what users need, and what we can do given web technologies. The following sections will discuss each of these issues in greater detail.

17.2. Significant Differences

Java programs have their own threads; JavaScript can lock up your browser.

Java programs are isolated entities, running in their own thread in the greater pool of applications on your system. JavaScript programs, on the other hand, live inside a browser, and compete with each other in a way that desktop applications don't. When a Java program loads a file, the program waits until the resource is done loading, and operation resumes as intended. In a setting where the program is an isolated entity on its own, this is fine. The operating system stays responsive because it's responsible for thread scheduling, and even if the program takes an hour to load all its data, you can still use your computer. On a web page, this is not how things work. If you have a JavaScript "program" waiting for a resource to be done loading, it will lock its process until that resource is available. If you're using a browser that uses one process per tab, it will lock up your tab, and the rest of the browser is still usable. If you're using a browser that doesn't, your entire browser will seem frozen. So, regardless of what the process represents, the page the script runs on won't be usable until the resource is done loading, and it's entirely possible that your JavaScript will lock up the entire browser.

This is unacceptable on the modern web, where resources are transferred asynchronously, and the page is expected to function normally while resources are loaded in the background. While this is great for traditional web pages, for web applications this is a real brain twister: how do you make JavaScript idle, waiting for a resource to load, when there is no explicit mechanism to make JavaScript idle? While there is no explicit threading in JavaScript, there is an event model, and there is an XMLHttpRequest object for requesting arbitrary (not just XML or HTML) data from arbitrary URLs. This object comes with several different status events, and we can use it to asynchronously get data while the browser stays responsive. This is great in programs where you control the source code: you make the program simply stop after scheduling the data request, and make it pick up execution when the data is available. However, this is near impossible for code that was written based on the idea of synchronous resource loading. Injecting "idling" in programs that are supposed to run at a fixed framerate is not an option, so we have to come up with alternative approaches.

For some things, we decided to force synchronous waiting anyway. Loading a file with strings, for instance, uses a synchronous XMLHttpRequest, and will halt execution of the page until the data is available. For other things, we had to get creative. Loading images, for instance, uses the browser's built-in mechanism for loading images; we build a new Image in JavaScript, set its src attribute to the image URL, and the browser does the rest, notifying us that the image is ready through the onload event. This doesn't even rely on an XMLHttpRequest; it simply exploits the browser's capabilities.

To make matters easier when you already know which images you are loading, we added preload directives so that the sketch does not start execution until preloading is complete. A user can indicate any number of images to preload via a comment block at the start of the sketch; Processing.js then tracks outstanding image loading. The onload event for an image tells us that it is done transferring and is considered ready to be rendered (rather than simply having been downloaded but not decoded to a pixel array in memory yet), after which we can populate the corresponding Processing PImage object with the correct values (width, height, pixel data, etc.) and clear the image from the list. Once the list is empty, the sketch gets executed, and images used during its lifetime will not require waiting.

Here is an example of preload directives:

    /* @pjs preload="./worldmap.jpg"; */

    PImage img;

    void setup() {
      img = loadImage("worldmap.jpg"); }

    void draw() {
      image(img,0,0); }
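The preload bookkeeping described above can be sketched in a few lines of JavaScript. This is an illustration under assumptions, not the Processing.js internals: the name preloadImages, its callback, and the injectable createImage parameter are all hypothetical. In a page, createImage falls back to the browser's built-in Image constructor; injecting it lets the logic run outside a browser.

```javascript
// Hypothetical sketch of preload tracking; names are illustrative only.
// createImage is injectable so the logic can be exercised without a browser.
function preloadImages(urls, onAllLoaded, createImage = () => new Image()) {
  const pending = new Set(urls);   // images that have not finished loading yet
  const loaded = {};               // url -> loaded image object
  if (pending.size === 0) {
    onAllLoaded(loaded);           // nothing to wait for
    return;
  }
  for (const url of [...pending]) {
    const img = createImage();
    img.onload = () => {
      loaded[url] = img;
      pending.delete(url);         // clear the image from the list
      if (pending.size === 0) onAllLoaded(loaded); // list empty: start the sketch
    };
    img.src = url;                 // setting src makes the browser start loading
  }
}
```

A launcher built this way would pass a function that starts the sketch as onAllLoaded, so that setup only runs once every listed image has reported in through onload.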

For other things, we've had to build more complicated "wait for me" systems. Fonts, unlike images, do not have built-in browser loading (or at least not a system as functional as image loading). While it is possible to load a font using a CSS @font-face rule and rely on the browser to make it all happen, there are no JavaScript events that can be used to determine that a font finished loading. We are slowly seeing events getting added to browsers to generate JavaScript events for font download completion, but these events come "too early", as the browser may need anywhere from a few to a few hundred more milliseconds to actually parse the font for use on the page after download. Thus, acting on these events will still lead to either no font being applied, or the wrong font being applied if there is a known fallback font. Rather than relying on these events, we embed a tiny TrueType font that only contains the letter "A" with impossibly small metrics, and instruct the browser to load this font via an @font-face rule with a data URI that contains the font's bytecode as a BASE64 string. This font is so small that we can rely on it being immediately available. For any other font load instruction we compare text metrics between the desired font and this tiny font. A hidden <div> is set up with text styled using the desired font, with our tiny font as fallback. As long as the text in that <div> is impossibly small, we know the desired font is not available yet, and we simply poll at set intervals until the text has sensible metrics.
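The polling half of this technique can be sketched as follows. Everything named here is an assumption made for illustration: waitForFont, its options, and the threshold and interval values are hypothetical, and measureWidth stands in for reading the width of the hidden <div> whose text is styled with the desired font and the tiny fallback font.

```javascript
// Hypothetical sketch of metric-based font polling; not the Processing.js code.
// measureWidth: returns the current rendered width of the hidden test text.
// options.schedule is injectable for testing and defaults to setTimeout.
function waitForFont(measureWidth, onReady, options = {}) {
  const threshold = options.threshold ?? 1;  // at or below this, the tiny font is still in use
  const interval = options.interval ?? 50;   // polling period in milliseconds (assumed value)
  const schedule = options.schedule ?? setTimeout;
  function poll() {
    if (measureWidth() > threshold) {
      onReady();                   // sensible metrics: the desired font is now applied
    } else {
      schedule(poll, interval);    // still impossibly small: check again later
    }
  }
  poll();
}
```

In a browser, measureWidth could be as simple as a function returning the offsetWidth of the hidden element.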

Java is strongly typed; JavaScript is not.

In Java, the number 2 and the number 2.0 are different values, and they will do different things during mathematical operations. For instance, the code i = 1/2 will result in i being 0, because the numbers are treated as integers, whereas i = 1/2.0, i = 1.0/2, and even i = 1./2. will all result in i being 0.5, because the numbers are considered decimal fractions with a non-zero integer part, and a zero fractional part. Even if the intended data type is a floating point number, if the arithmetic uses only integers, the result will be an integer. This lets you write fairly creative math statements in Java, and consequently in Processing, but these will generate potentially wildly different results when ported to Processing.js, as JavaScript only knows "numbers". As far as JavaScript is concerned, 2 and 2.0 are the same number, and this can give rise to very interesting bugs when running a sketch using Processing.js.

This might sound like a big issue, and at first we were convinced it would be, but you can't argue with real world feedback: it turns out this is almost never an issue for people who put their sketches online using Processing.js. Rather than solving this in some cool and creative way, the resolution of this problem was actually remarkably straightforward; we didn't solve it, and as a design choice, we don't intend to ever revisit that decision. Short of adding a symbol table with strong typing so that we can fake types in JavaScript and switch functionality based on type, this incompatibility cannot properly be solved without leaving much harder to find edge case bugs, and so rather than adding bulk to the code and slowdown to execution, we left this quirk in. It is a well-documented quirk, and "good code" won't try to take advantage of Java's implicit number type casting. That said, sometimes you will forget, and the result can be quite interesting.
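A short illustration of the quirk in plain JavaScript. The helper intDiv is a hypothetical name, and it is something a sketch author could write by hand when integer semantics matter, not something Processing.js inserts.

```javascript
// In JavaScript, 2 and 2.0 are the same value of the same "number" type.
const sameNumber = (2 === 2.0);      // true

// Java's int division would give 0 here; JavaScript gives 0.5.
const half = 1 / 2;

// Hypothetical helper: truncate toward zero, as Java int division does.
function intDiv(a, b) {
  return Math.trunc(a / b);
}

console.log(sameNumber, half, intDiv(1, 2), intDiv(-7, 2));
```

Note that this only mimics the rounding behaviour. Dividing by zero still yields Infinity or NaN in JavaScript, where Java's integer division would throw an exception.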

Java is a class/instance-based object-oriented language, with separate variable and method spaces; JavaScript is not.

JavaScript uses prototype objects, and the inheritance model that comes with it. This means all objects are essentially key/value pairs where each key is a string, and values are either primitives, arrays, objects, or functions. On the inheritance side, prototypes can extend other prototypes, but there is no real concept of "superclass" and "subclass". In order to make "proper" Java-style object-oriented code work, we had to implement classical inheritance for JavaScript in Processing.js, without making it super slow (we think we succeeded in that respect). We also had to come up with a way to prevent variable names and function names from stepping on each other. Because of the key/value nature of JavaScript objects, defining a variable called line, followed by a function like line(x1,y1,x2,y2) will leave you with an object that uses whatever was declared last for a key. JavaScript first sets object.line = "some value" for you, and then sets object.line = function(x1,y1,x2,y2){…}, overriding what you thought your variable line was.
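The name clash is easy to reproduce in plain JavaScript. The object name sketchScope below is hypothetical; it simply stands in for whatever object ends up holding a sketch's variables and functions.

```javascript
// sketchScope is a hypothetical stand-in for the object holding sketch members.
const sketchScope = {};

// A "variable" called line...
sketchScope.line = "some value";

// ...followed by a "method" called line. Same key, so the value is replaced.
sketchScope.line = function (x1, y1, x2, y2) {
  return [x1, y1, x2, y2];
};

console.log(typeof sketchScope.line); // the string value is gone
```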

It would have slowed down the library a lot to create separate administration for variables and methods/functions, so again the documentation explains that it's a bad idea to use variables and functions with the same name. If everyone wrote "proper" code, this wouldn't be much of a problem, as you want to name variables and functions based on what they're for, or what they do, but the real world does things differently. Sometimes your code won't work, and it's because we decided that having your code break due to a naming conflict is preferable to your code always working, but always being slow. A second reason for not implementing variable and function separation was that this could break JavaScript code used inside Processing sketches. Closures and the scope chain for JavaScript rely on the key/value nature of objects, so driving a wedge in that by writing our own administration would have also severely impacted performance in terms of Just-In-Time compilation and compression based on functional closures.

Java allows method overloading; JavaScript does not.

One of Java's more powerful features is that you can define a function, let's say add(int,int), and then define another function with the same name, but a different number of arguments, e.g. add(int,int,int), or with different argument types, e.g. add(ComplexNumber,ComplexNumber). Calling add with two or three integer arguments will automatically call the appropriate function, and calling add with floats or Car objects will generate an error. JavaScript, on the other hand, does not support this. In JavaScript, a function is a property, and you can either reference it (in which case you get the function object itself, which coerces to true in a boolean context, whereas a missing property yields undefined, which coerces to false), or call it as a function using the call operator (which you will know as parentheses with zero or more arguments between them). If you define a function as add(x,y) and then call it as add(1,2,3,4,5,6), JavaScript is okay with that. It will set x to 1 and y to 2 and simply ignore the rest of the arguments. In order to make overloading work, we rewrite functions with the same name but different argument count to a numbered function, so that function(a,b,c) in the source becomes function$3(a,b,c) in the rewritten code, and function(a,b,c,d) becomes function$4(a,b,c,d), ensuring the correct code paths.
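A minimal sketch of count-based overloading follows. The numbered names add$2 and add$3 follow the naming scheme described above, but the dispatcher function and the error it throws are assumptions for illustration; the text does not say how the rewritten code selects the numbered variant.

```javascript
// Plain JavaScript silently ignores surplus arguments.
function plainAdd(x, y) {
  return x + y;
}

// Numbered variants, following the function$N naming scheme described above.
function add$2(a, b) { return a + b; }
function add$3(a, b, c) { return a + b + c; }

// Hypothetical dispatcher: pick the variant matching the argument count.
function add(...args) {
  switch (args.length) {
    case 2: return add$2(...args);
    case 3: return add$3(...args);
    default:
      throw new Error("no add() overload takes " + args.length + " arguments");
  }
}

console.log(plainAdd(1, 2, 3, 4, 5, 6), add(1, 2), add(1, 2, 3));
```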

We also mostly solved overloading of functions with the same number but differently typed arguments, as long as the argument types can be seen as different by JavaScript. JavaScript can tell the functional type of properties using the typeof operator, which returns a string such as number, string, object or function depending on what a property represents. Declaring var x = 3 followed by x = '6' will cause typeof x to report number after the initial declaration, and string after reassignment. As long as functions with the same argument count differ in argument type, we rename them and switch based on the result of the typeof operation. This does not work when the functions take arguments of type object, so for these functions we have an additional check involving the instanceof operator (which tests whether an object was created by a given constructor function, or by one that inherits from it) to make function overloading work. In fact, the only place where we cannot successfully transcompile overloaded functions is where the argument count is the same between functions, and the argument types are different numerical types. As JavaScript only has one numerical type, declaring functions such as add(int x, int y), add(float x, float y) and add(double x, double y) will clash. Everything else, however, will work just fine.
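Type-based dispatch can be sketched in the same spirit. The variant names (add$number and so on) and the dispatcher are hypothetical; only the use of typeof and instanceof comes from the description above.

```javascript
// A constructor function standing in for a Java class.
function ComplexNumber(re, im) {
  this.re = re;
  this.im = im;
}

// Hypothetical renamed variants, one per distinguishable argument type.
function add$number(a, b) { return a + b; }
function add$string(a, b) { return a + " " + b; }
function add$ComplexNumber(a, b) {
  return new ComplexNumber(a.re + b.re, a.im + b.im);
}

// Hypothetical dispatcher: typeof separates primitives, instanceof separates objects.
function add(a, b) {
  if (typeof a === "number" && typeof b === "number") return add$number(a, b);
  if (typeof a === "string" && typeof b === "string") return add$string(a, b);
  if (a instanceof ComplexNumber && b instanceof ComplexNumber) {
    return add$ComplexNumber(a, b);
  }
  throw new Error("no matching add() overload");
}
```

The remaining limitation is visible here as well: typeof 1 and typeof 1.5 are both "number", so variants that differ only in int, float or double arguments cannot be told apart.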

Java allows importing compiled code.

Sometimes, plain Processing is not enough, and additional functionality is introduced in the form of a Processing library. These take the form of a .jar archive with compiled Java code, and offer things like networking, audio, video, hardware interfacing and other exotic functions not covered by Processing itself.

This is a problem, because compiled Java code is Java byte code. This has given us many headaches: how do we support library imports without writing a Java byte code decompiler? After about a year of discussions, we settled on what may seem the simplest solution. Rather than trying to also cover Processing libraries, we decided to support the import keyword in sketches, and create a Processing.js Library API, so that library developers can write a JavaScript version of their library (where feasible, given the web's nature). If they write a package that is used via import, native Processing will pick the .jar archive, and Processing.js will instead pick the JavaScript version, thus ensuring that things "just work". This functionality is slated for Processing.js 1.4. Library imports are the last major feature still missing from Processing.js (we currently support the import keyword only in the sense that it is removed from the source code before conversion), and adding them will be the last major step towards parity.

Why Pick JavaScript if It Can't Do Java?

This is not an unreasonable question, and it has multiple answers. The most obvious one is that JavaScript comes with the browser. You don't "install" JavaScript yourself, there's no plugin to download first; it's just there. If you want to port something to the web, you're stuck with JavaScript. Although, given the flexibility of JavaScript, "stuck with" is really not doing justice to how powerful the language is. So, one reason to pick JavaScript is "because it's already there". Pretty much every device that is of interest comes with a JavaScript-capable browser these days. The same cannot be said for Java, which is being offered less and less as a preinstalled technology, if it is available at all.

However, the proper answer is that it's not really true that JavaScript "can't do" the things that Java does; it can, it would just be slower. Even though out of the box JavaScript can't do some of the things Java does, it's still a Turing-complete programming language and it can be made to emulate any other programming language, at the cost of speed. We could, technically, write a full Java interpreter, with a String heap, separate variable and method models, class/instance object-orientation with rigid class hierarchies, and everything else under the Sun (or, these days, Oracle), but that's not what we're in it for: Processing.js is about offering a Processing-to-the-web conversion, in as little code as is necessary for that. This means that even though we decided not to make it do certain Java things, our library has one huge benefit: it can cope with embedded JavaScript really, really well.

In fact, during a meeting between the Processing.js and Processing people at Bocoup in Boston, in 2010, Ben Fry asked John Resig why he used regular expression replacement and only partial conversion instead of doing a proper parser and compiler. John's response was that it was important to him that people be able to mix Processing syntax (Java) and JavaScript without having to choose between them. That initial choice has been crucial in shaping the philosophy of Processing.js ever since. We've worked hard to keep it true in our code, and we can see a clear payoff when we look at all the "purely web" users of Processing.js, who never used Processing, and will happily mix Processing and JavaScript syntax without a problem.

The following example shows how JavaScript and Processing work together.

    // JavaScript (would throw an error in native Processing)
    var cs = { x: 50,
               y: 0,
               label: "my label",
               rotate: function(theta) {
                         var nx = this.x*cos(theta) - this.y*sin(theta);
                         var ny = this.x*sin(theta) + this.y*cos(theta);
                         this.x = nx; this.y = ny; }};

    // Processing
    float angle = 0;

    void setup() {
      strokeWeight(15); }

    void draw() {
      angle += PI/frameRate;
      while(angle>2*PI) { angle-=2*PI; }
      jQuery('#log').text(angle); // JavaScript (error in native Processing)
      cs.rotate(angle);           // legal JavaScript as well as Processing
      point(cs.x, cs.y); }

A lot of things in Java are promises: strong typing is a content promise to the compiler, visibility is a promise on who will call methods and reference variables, interfaces are promises that instances contain the methods the interface describes, etc. Break those promises and the compiler complains. But if you don't (and this is one of the most important thoughts for Processing.js), then you don't need the additional code for those promises in order for a program to work. If you stick a number in a variable, and your code treats that variable as if it has a number in it, then at the end of the day var varname is just as good as int varname. Do you need typing? In Java, you do; in JavaScript, you don't, so why force it in? The same goes for other code promises. If the Processing compiler doesn't complain about your code, then we can strip all the explicit syntax for your promises and it'll still work the same.

This has made Processing.js a ridiculously useful library for data visualisation, media presentation and even entertainment. Sketches in native Processing work, but sketches that mix Java and JavaScript also work just fine, as do sketches that use pure JavaScript by treating Processing.js as a glorified canvas drawing framework. In an effort to reach parity with native Processing, without forcing Java-only syntax, the project has been taken in by an audience as wide as the web itself. We've seen activity all over the web using Processing.js. Everyone from IBM to Google has built visualisations, presentations and even games with Processing.js—Processing.js is making a difference.

Another great thing about converting Java syntax to JavaScript while leaving JavaScript untouched is that we've enabled something we hadn't even thought about ourselves: Processing.js will work with anything that will work with JavaScript. One of the really interesting things that we're now seeing, for instance, is that people are using CoffeeScript (a wonderfully simple, Ruby-like programming language that transcompiles to JavaScript) in combination with Processing.js, with really cool results. Even though we set out to build "Processing for the web" based on parsing Processing syntax, people took what we did and used it with brand new syntaxes. They could never have done that if we had made Processing.js simply be a Java interpreter. By sticking with code conversion rather than writing a code interpreter, Processing.js has given Processing a reach on the web far beyond what it would have had if it had stayed Java-only, or even if it had kept a Java-only syntax, with execution on the web taken care of by JavaScript. The uptake of our code not just by end users, but also by people who try to integrate it with their own technologies, has been both amazing and inspiring. Clearly we're doing something right, and the web seems happy with what we're doing.

The Result

As we are coming up to Processing.js 1.4.0, our work has resulted in a library that will run any sketch you give it, provided it does not rely on compiled Java library imports. If you can write it in Processing, and it runs, you can put it on a webpage and it will just run. Due to the differences in hardware access and low level implementations of different parts of the rendering pipeline there will be timing differences, but in general a sketch that runs at 60 frames per second in the Processing IDE will run at 60 frames per second on a modern computer, with a modern browser. We have reached a point where bug reports have started to die down, and most work is no longer about adding feature support, but more about bug fixing and code optimization.

Thanks to the efforts of many developers working to resolve over 1800 bug reports, Processing sketches run using Processing.js "just work". Even sketches that rely on library imports can be made to work, provided that the library code is at hand. Under favourable circumstances, the library is written in a way that lets you rewrite it to pure Processing code with a few search-replace operations. In this case the code can be made to work online virtually immediately. When the library does things that cannot be implemented in pure Processing, but can be implemented using plain JavaScript, more work is required to effectively emulate the library using JavaScript code, but porting is still possible. The only instances of Processing code that cannot be ported are those that rely on functionality that is inherently unavailable to browsers, such as interfacing directly with hardware devices (such as webcams or Arduino boards) or performing unattended disk writes, though even this is changing. Browsers are constantly adding functionality to allow for more elaborate applications, and limiting factors today may disappear a year from now, so that hopefully in the not too distant future, even sketches that are currently impossible to run online will become portable.

17.3. The Code Components

Processing.js is presented and developed as a large, single file, but architecturally it represents three different components: 1) the launcher, responsible for converting Processing source to Processing.js flavoured JavaScript and executing it, 2) static functionality that can be used by all sketches, and 3) sketch functionality that has to be tied to individual instances.

The Launcher

The launcher component takes care of three things: code preprocessing, code conversion, and sketch execution.


In the preprocessing step, Processing.js directives are split off from the code, and acted upon. These directives come in two flavours: settings and load instructions. There is a small number of directives, keeping with the "it should just work" philosophy, and the only settings that sketch authors can change are related to page interaction. By default a sketch will keep running if the page is not in focus, but the pauseOnBlur = true directive sets up a sketch in such a way that it will halt execution when the page the sketch is running on is not in focus, resuming execution when the page is in focus again. Also by default, keyboard input is only routed to a sketch when it is focussed. This is especially important when people run multiple sketches on the same page, as keyboard input intended for one sketch should not be processed by another. However, this functionality can be disabled, routing keyboard events to every sketch that is running on a page, using the globalKeyEvents = true directive.

Load instructions take the form of the aforementioned image preloading and font preloading. Because images and fonts can be used by multiple sketches, they are loaded and tracked globally, so that different sketches don't attempt multiple loads for the same resource.
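As a rough sketch of the preprocessing step, the following toy function splits an @pjs comment block off from the source and collects its settings. The function name extractDirectives, the regular expressions, and the shape of the returned object are assumptions for illustration, not the Processing.js implementation; values are kept as plain strings.

```javascript
// Toy directive extraction; names and regular expressions are hypothetical.
function extractDirectives(source) {
  const directives = {};
  // Find each "/* @pjs ... */" block, record its settings, and remove it.
  const code = source.replace(/\/\*\s*@pjs\s+([\s\S]*?)\*\//g, (whole, body) => {
    for (const part of body.split(";")) {
      // Accept both key="value" and key = value forms.
      const match = /^\s*(\w+)\s*=\s*"?([^"]*?)"?\s*$/.exec(part);
      if (match) directives[match[1]] = match[2];
    }
    return ""; // directives are split off from the code
  });
  return { directives, code };
}

const parsed = extractDirectives(
  '/* @pjs preload="./worldmap.jpg"; pauseOnBlur = true; */\nvoid setup() {}'
);
console.log(parsed.directives, parsed.code);
```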

Code Conversion

The code conversion component decomposes the source code into AST nodes, such as statements and expressions, methods, variables, classes, etc. This AST is then expanded to JavaScript source code that builds a sketch-equivalent program when executed. This converted source code makes heavy use of the Processing.js instance framework for setting up class relations, where classes in the Processing source code become JavaScript prototypes with special functions for determining superclasses and bindings for superclass functions and variables.

Sketch Execution

The final step in the launch process is sketch execution, which consists of determining whether or not all preloading has finished, and if it has, adding the sketch to the list of running instances and triggering its JavaScript onLoad event so that any sketch listeners can take the appropriate action. After this the Processing chain is run through: setup, then draw, and if the sketch is a looping sketch, setting up an interval call to draw with an interval length that gets closest to the desired framerate for the sketch.
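The execution sequence can be summarised in a short sketch. The field names (looping, frameRate, onLoad, timer), the runningInstances array, and the injectable timers parameter are all assumptions made so that the flow can be followed and run outside a browser; they are not the Processing.js internals.

```javascript
// Hypothetical launcher sketch; all names are illustrative.
const runningInstances = [];

function launch(sketch, preloadsPending, timers = { setInterval }) {
  if (preloadsPending()) return false;   // not ready yet: the caller retries later
  runningInstances.push(sketch);         // add to the list of running instances
  if (sketch.onLoad) sketch.onLoad();    // let listeners react to the load
  sketch.setup();
  sketch.draw();
  if (sketch.looping) {
    // Closest whole-millisecond interval to the desired framerate.
    const ms = Math.round(1000 / sketch.frameRate);
    sketch.timer = timers.setInterval(() => sketch.draw(), ms);
  }
  return true;
}
```

Because timer intervals are whole milliseconds in this sketch, a requested 60 frames per second becomes an interval of 17 ms, which is why the achieved rate is only the closest available one.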

Static Library

Much of Processing.js falls under the "static library" heading, representing constants, universal functions, and universal data types. A lot of these actually do double duty, being defined as global properties, but also getting aliased by instances for quicker code paths. Global constants such as key codes and color mappings are housed in the Processing object itself, set up once, and then referenced when instances are built via the Processing constructor. The same applies to self-contained helper functions, which lets us keep the code as close to "write once, run anywhere" as we can without sacrificing performance.

Processing.js has to support a large number of complex data types, not just in order to support the data types used in Processing, but also for its internal workings. These, too, are defined in the Processing constructor.


The final piece of functionality in the static code library is the instance list of all sketches that are currently running on the page. This instance list stores sketches based on the canvas they have been loaded in, so that users can call Processing.getInstanceById('canvasid') and get a reference to their sketch for page interaction purposes.
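A registry of this kind needs very little code. The following is a minimal sketch, assuming a plain object keyed by canvas id; addInstance and removeInstance are hypothetical names, and only getInstanceById is taken from the text above.

```javascript
// Hypothetical registry mapping a canvas id to the sketch running in that canvas.
const instances = {};

function addInstance(canvasId, sketch) {
  instances[canvasId] = sketch;
}

function removeInstance(canvasId) {
  delete instances[canvasId];
}

// Returns the sketch for the given canvas id, or undefined if none is running.
function getInstanceById(canvasId) {
  return instances[canvasId];
}
```

Page code can then look up a running sketch by the id of its canvas and call functions on it, without having kept a reference from the moment the sketch was created.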

Instance Code

Instance code takes the form of p.functor = function(arg, …) definitions for the Processing API, and p.constant = … for sketch state variables (where p is our reference to the sketch being set up). Neither of these is located in a dedicated code block. Rather, the code is organized based on function, so that instance code relating to PShape operations is defined near the PShape object, and instance code for graphics functions is defined near, or in, the Drawing2D and Drawing3D objects.

In order to keep things fast, a lot of code that could be written as static code with an instance wrapper is actually implemented as purely instance code. For instance, the lerpColor(c1, c2, ratio) function, which determines the color corresponding to the linear interpolation of two colors, is defined as an instance function. Rather than having p.lerpColor(c1, c2, ratio) acting as a wrapper for some static function Processing.lerpColor(c1, c2, ratio), the fact that nothing else in Processing.js relies on lerpColor means that code execution is faster if we write it as a pure instance function. While this does "bloat" the instance object, most functions for which we insist on an instance function rather than a wrapper to the static library are small. Thus, at the expense of memory we create really fast code paths. While the full Processing object will take up a one-time memory slice worth around 5 MB when initially set up, the prerequisite code for individual sketches only takes up about 500 KB.
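The difference between the two approaches can be sketched as follows. This is an illustration only: colors are represented as [r, g, b] arrays purely for readability, and Lib, buildWrapped and buildInstance are hypothetical names.

```javascript
// Static version: defined once and shared by every sketch on the page.
const Lib = {
  lerpColor: function (c1, c2, ratio) {
    return c1.map(function (channel, i) {
      return Math.round(channel + (c2[i] - channel) * ratio);
    });
  }
};

// Wrapper approach: each call goes through an extra lookup and function call.
function buildWrapped(p) {
  p.lerpColor = function (c1, c2, ratio) {
    return Lib.lerpColor(c1, c2, ratio);
  };
  return p;
}

// Pure instance approach: the body lives directly on the sketch object.
function buildInstance(p) {
  p.lerpColor = function (c1, c2, ratio) {
    return [
      Math.round(c1[0] + (c2[0] - c1[0]) * ratio),
      Math.round(c1[1] + (c2[1] - c1[1]) * ratio),
      Math.round(c1[2] + (c2[2] - c1[2]) * ratio)
    ];
  };
  return p;
}
```

Both sketches compute the same colors. The second one duplicates the function body in every instance, which is the memory-for-speed trade described above.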

17.4. Developing Processing.js

Processing.js is worked on intensively, which we can only do because our development approach sticks to a few basic rules. As these rules influence the architecture of Processing.js, it's worth having a brief look at them before closing this chapter.

Make It Work

Writing code that works sounds like a tautological premise; you write code, and by the time you're done your code either works, because that's what you set out to do, or it doesn't, and you're not done yet. However, "make it work" comes with a corollary: Make it work, and when you're done, prove it.

If there is one thing above all others that has allowed Processing.js to grow at the pace it has, it is the presence of tests. Any ticket that requires touching the code, whether by writing new code or rewriting old code, cannot be marked as resolved until there is a unit or reference test that allows others to verify not only that the code works the way it should, but also that it breaks when it should. For most code, this typically involves a unit test: a short bit of code that calls a function and simply tests whether the function returns the correct values, for both legal and illegal function calls. Not only does this allow us to test code contributions, it also lets us perform regression tests.

Before any code is accepted and merged into our stable development branch, the modified Processing.js library is validated against an ever-growing battery of unit tests. Big fixes and performance improvements in particular are prone to passing their own unit tests while breaking parts that worked fine before the rewrite. Having tests for every function in the API, as well as internal functions, means that as Processing.js grows, we don't accidentally break compatibility with previous versions. Barring destructive API changes, if none of the tests failed before a code contribution or modification, none of the tests are allowed to fail with the new code in.

The following is an example of a unit test verifying inline object creation.

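    // An interface with an accessor and a method that changes state.
    interface I {
      int getX();
      void test(); }

    // Inline (anonymous) implementation of the interface.
    I i = new I() {
      int x = 5;
      public int getX() {
        return x; }
      public void test() {
        x++; }};

    // Increment x once, so that getX() is expected to return 6.
    i.test();

    _checkEqual(i.getX(), 6);
    _checkEqual(i instanceof I, true);
    _checkEqual(i instanceof Object, true);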

In addition to regular code unit tests, we also have visual reference (or "ref") tests. As Processing.js is a port of a visual programming language, some tests cannot be performed using just unit tests. Whether an ellipse gets drawn on the correct pixels, or whether a single-pixel-wide vertical line is drawn crisp or smoothed, cannot be determined without a visual reference. Because all mainstream browsers implement the <canvas> element and Canvas2D API with subtle differences, these things can only be tested by running code in a browser and verifying that the resulting sketch looks the same as what native Processing generates. To make life easier for developers, we use an automated test suite for this, where new test cases are run through Processing, generating "what it should look like" data to be used for pixel comparison. This data is then stored as a comment inside the sketch that generated it, forming a test, and these tests are then run by Processing.js on a visual reference test page which executes each test and performs pixel comparisons between "what it should look like" and "what it looks like". If the pixels are off, the test fails, and the developer is presented with three images: what it should look like, how Processing.js rendered it, and the difference between the two, marking problem areas as red pixels and correct areas as white. Much like unit tests, these tests must pass before any code contribution can be accepted.
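The pixel comparison step can be sketched in a few lines. This is not the Processing.js ref test code: diffPixels is a hypothetical function, and it assumes both images are given as flat RGBA arrays of equal size, the layout used by canvas image data.

```javascript
// Compare two flat RGBA pixel arrays and build a diff image:
// white where the pixels match, red where they differ.
function diffPixels(expected, actual) {
  if (expected.length !== actual.length) {
    throw new Error("images differ in size");
  }
  const diff = new Uint8ClampedArray(expected.length);
  let mismatches = 0;
  for (let i = 0; i < expected.length; i += 4) {
    const same = expected[i] === actual[i] &&
                 expected[i + 1] === actual[i + 1] &&
                 expected[i + 2] === actual[i + 2] &&
                 expected[i + 3] === actual[i + 3];
    if (!same) {
      mismatches++;
    }
    diff[i] = 255;                  // red channel is set for both colors
    diff[i + 1] = same ? 255 : 0;   // green and blue only for matching pixels
    diff[i + 2] = same ? 255 : 0;
    diff[i + 3] = 255;              // fully opaque
  }
  return { passed: mismatches === 0, mismatches: mismatches, diff: diff };
}
```

An exact comparison like this one fails on a single differing channel value. Whether a real test page allows some tolerance is a separate design decision that this sketch does not try to answer.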

Make It Fast

In an open source project, making things work is only the first step in the life of a function. Once things work, you want to make sure things work fast. Based on the "if you can't measure it, you can't improve it" principle, most functions in Processing.js don't just come with unit or ref tests, but also with performance (or "perf") tests. Small bits of code that simply call a function, without testing the correctness of the function, are run several hundred times in a row, and their run time is recorded on a special performance test web page. This lets us quantify how well (or not!) Processing.js performs in browsers that support HTML5's <canvas> element. Every time an optimization patch passes unit and ref testing, it is run through our performance test page. JavaScript is a curious beast, and beautiful code can, in fact, run several orders of magnitude slower than code that contains the same lines several times over, with inline code rather than function calls. This makes performance testing crucial. We have been able to speed up certain parts of the library by three orders of magnitude simply by discovering hot loops during perf testing, reducing the number of function calls by inlining code, and by making functions return the moment they know what their return value should be, rather than having only a single return at the very end of the function.
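A perf test in this spirit needs little more than a loop and a clock. The following is a minimal sketch with hypothetical names (timeRuns, sumSquaresWithCalls, sumSquaresInlined), comparing a version that calls a helper per element with one that inlines the same arithmetic.

```javascript
// Run `fn` a fixed number of times and return the elapsed wall-clock time in ms.
function timeRuns(fn, runs) {
  const start = Date.now();
  for (let i = 0; i < runs; i++) {
    fn();
  }
  return Date.now() - start;
}

function square(x) {
  return x * x;
}

// Variant A: one helper function call per element.
function sumSquaresWithCalls(values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total += square(values[i]);
  }
  return total;
}

// Variant B: the same arithmetic written inline.
function sumSquaresInlined(values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total += values[i] * values[i];
  }
  return total;
}
```

Timing both variants with the same input and the same number of runs gives two numbers to compare. The outcome depends on the JavaScript engine, since some engines inline small functions on their own, which is exactly why the numbers have to be measured rather than assumed.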

Another way in which we try to make Processing.js fast is by looking at what runs it. As Processing.js is highly dependent on the efficiency of JavaScript engines, it makes sense to also look at which features various engines offer to speed things up. Especially now that browsers are starting to support hardware accelerated graphics, instant speed boosts are possible when engines offer new and more efficient data types and functions to perform the low level operations that Processing.js depends on. For instance, JavaScript technically has no static typing, but graphics hardware programming environments do. By exposing the data structures used to talk to the hardware directly to JavaScript, it is possible to significantly speed up sections of code if we know that they will only use specific values.
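The paragraph above does not name the data structures it has in mind. Typed arrays such as Float32Array, which entered browsers alongside WebGL, fit the description and are used here as an assumed example; the vertex data and the translate function are made up for this sketch.

```javascript
// Plain array: every slot may hold a value of any type.
const plainVertices = [0, 0, 0, 1, 0, 0, 0, 1, 0];

// Typed array: a fixed-length block in which every element is a 32-bit float.
// The constructor copies the values, so the plain array is left untouched.
const typedVertices = new Float32Array(plainVertices);

// Shift every (x, y, z) triple in place; works on either kind of array.
function translate(vertices, dx, dy, dz) {
  for (let i = 0; i < vertices.length; i += 3) {
    vertices[i] += dx;
    vertices[i + 1] += dy;
    vertices[i + 2] += dz;
  }
  return vertices;
}
```

Because a typed array can only ever contain numbers of one kind, an engine has no need to check element types inside a loop like the one in translate, and the data is already in a layout that graphics APIs can consume.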

Make It Small

There are two ways to make code small. First, write compact code. If you're manipulating a variable multiple times, compact it to a single manipulation (if possible). If you access an object variable multiple times, cache it. If you call a function multiple times, cache the result. Return once you have all the information you need, and generally apply all the tricks a code optimiser would apply yourself. JavaScript is a particularly nice language for this, since it comes with an incredible amount of flexibility. For example, rather than using:

if ((result = functionResult()) !== null) {
  value = result;
} else {
  value = defaultValue;
}

in JavaScript this becomes:

value = functionResult() || defaultValue;
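One caveat is worth spelling out: the short form is not an exact replacement for a null check, because the || operator falls back to the default for every falsy value, including 0, the empty string, and false. The two hypothetical helpers below make the difference visible.

```javascript
// Short form: falls back whenever `value` is falsy (null, undefined, 0, "", false, NaN).
function withOr(value, fallback) {
  return value || fallback;
}

// Explicit form: falls back only when `value` is exactly null.
function withNullCheck(value, fallback) {
  return value !== null ? value : fallback;
}
```

For a function that can legitimately return 0 or an empty string, the explicit check is the safer choice; where every valid result is truthy, the short form behaves the same and is more compact.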

There is also another form of small code, and that's in terms of runtime code. Because JavaScript lets you change function bindings on the fly, running code becomes much smaller if you can say "bind the function for line2D to the function call for line" once you know that a program runs in 2D rather than 3D mode, so that you don't have to perform:

if (mode === "2D") { line2D(); } else { line3D(); }

for every function call that might be either in 2D or 3D mode.
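Rebinding can be sketched as follows. The createSketch function, the size method taking a mode string, and the log array are all hypothetical and exist only to make the effect observable; the point is that the mode is examined once, not on every call to line.

```javascript
function createSketch() {
  const p = { log: [] };

  function line2D(x1, y1, x2, y2) {
    p.log.push("2D line");
  }

  function line3D(x1, y1, z1, x2, y2, z2) {
    p.log.push("3D line");
  }

  // Default binding: sketches start out in 2D mode.
  p.line = line2D;

  // Switching modes rebinds `line` once; later calls need no mode test.
  p.size = function (mode) {
    p.line = (mode === "3D") ? line3D : line2D;
  };

  return p;
}
```

After p.size("3D"), every call to p.line goes straight to the 3D implementation without a conditional in the call path.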

Finally, there is the process of minification. There are a number of good systems that let you compress your JavaScript code by renaming variables, stripping whitespace, and applying certain code optimisations that are hard to do by hand while still keeping the code readable. Examples of these are the YUI Compressor and Google's Closure Compiler. We use these technologies in Processing.js to offer end users bandwidth convenience: minification after stripping comments can shrink the library by as much as 50%, and by taking advantage of modern browser/server interaction for gzipped content, we can offer the entire Processing.js library in gzipped form in 65 KB.

If All Else Fails, Tell People

Not everything that can currently be done in Processing can be done in the browser. Security models prevent certain things like saving files to the hard disk and performing USB or serial port I/O, and a lack of typing in JavaScript can have unexpected consequences (such as all math being floating point math). Sometimes we're faced with the choice between adding an incredible amount of code to enable an edge case, or marking the ticket as a "wontfix" issue. In such cases, a new ticket gets filed, typically titled "Add documentation that explains why…".

In order to make sure these things aren't lost, we have documentation for people who start using Processing.js with a Processing background, and for people who start using Processing.js with a JavaScript background, covering the differences between what is expected, and what actually happens. Certain things just deserve special mention, because no matter how much work we put into Processing.js, there are certain things we cannot add without sacrificing usability. A good architecture doesn't just cover the way things are, it also covers why; without that, you'll just end up having the same discussions about what the code looks like and whether it should be different every time the team changes.

17.5. Lessons Learned

The most important lesson we learned while writing Processing.js is that when porting a language, what matters is that the result is correct, not whether or not the code used in your port is similar to the original. Even though Java and JavaScript syntax are fairly similar, and modifying Java code to legal JavaScript code is fairly easy, it often pays to look at what JavaScript can natively do and exploit that to get the same functional result. Taking advantage of the lack of typing by recycling variables, using certain built-in functions that are fast in JavaScript but slow in Java, or avoiding patterns that are fast in Java but slow in JavaScript means your code may look radically different, but has the exact same effect. You often hear people say not to reinvent the wheel, but that only applies to working with a single programming language. When you're porting, reinvent as many wheels as you need to obtain the performance you require.

Another important lesson is to return early, return often, and branch as little as possible. An if/then statement followed by a return can be made (sometimes drastically) faster by using an if-return/return construction instead, using the return statement as a conditional shortcut. While it's conceptually pretty to aggregate your entire function state before calling the ultimate return statement for that function, it also means your code path may traverse code that is entirely unrelated to what you will be returning. Don't waste cycles; return when you have all the information you need.
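The two styles can be compared on a small clamping function. Both functions are hypothetical examples written for this comparison, and no timings are claimed for them.

```javascript
// Single-exit style: the state is aggregated and returned at the very end,
// so the upper-bound test runs even when the lower bound already decided the result.
function clampSingleExit(value, min, max) {
  let result = value;
  if (value < min) {
    result = min;
  }
  if (result > max) {
    result = max;
  }
  return result;
}

// Early-return style: leave the function as soon as the answer is known.
function clampEarly(value, min, max) {
  if (value < min) {
    return min;
  }
  if (value > max) {
    return max;
  }
  return value;
}
```

For any range with min no larger than max the two functions return the same values; the second one simply stops evaluating conditions once its result is determined.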

A third lesson concerns testing your code. In Processing.js we had the benefit of starting with very good documentation outlining how Processing was "supposed" to work, and a large set of test cases, most of which started out as "known fail". This allowed us to do two things: 1) write code against tests, and 2) create tests before writing code. The usual process, in which code is written and then test cases are written for that code, actually creates biased tests. Rather than testing whether or not your code does what it should do, according to the specification, you are only testing whether your code is bug-free. In Processing.js, we instead start by creating test cases based on what the functional requirements for some function or set of functions are, according to its documentation. With these unbiased tests, we can then write code that is functionally complete, rather than simply bug-free but possibly deficient.

The last lesson is also the most general one: apply the rules of agile development to individual fixes as well. No one benefits from you retreating into dev mode and not being heard from for three days straight while you write the perfect solution. Rather, get your solutions to the point where they work, and not even necessarily for all test cases, then ask for feedback. Working alone, with a test suite for catching errors, is no guarantee of good or complete code. No amount of automated testing is going to point out that you forgot to write tests for certain edge cases, or that there is a better algorithm than the one you picked, or that you could have reordered your statements to make the code better suited for JIT compilation. Treat fixes like releases: present fixes early, update often, and work feedback into your improvements.