Asynchronous JavaScript with Callbacks

This is a brief introduction to asynchronous JavaScript using callbacks and Async.js. Async.js is a widely used library that provides helper functions for coordinating asynchronous operations in JavaScript.

Functions in JavaScript

In JavaScript, as in other languages, a function is a reusable block of code that accepts arguments, does something, and returns a value. In JavaScript, functions are also objects (as are arrays, whereas numbers and strings are primitive values). Like all objects, they can have properties and methods, and can be assigned to variables and passed around the program.
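To make that concrete, here is a small sketch of a function being treated as a value. The names `greet`, `sayHello`, and `callTwice` are made up for illustration.

```javascript
// A function is a value: it can be stored in another variable...
function greet(name) {
  return 'Hello, ' + name;
}
var sayHello = greet;

// ...carry its own properties...
greet.language = 'en';

// ...and be passed to another function as an argument.
function callTwice(fn, arg) {
  return [fn(arg), fn(arg)];
}

console.log(typeof greet);            // "function"
console.log(greet instanceof Object); // true
console.log(greet.language);          // "en"
console.log(sayHello('Ada'));         // "Hello, Ada"
console.log(callTwice(greet, 'Bob')); // [ 'Hello, Bob', 'Hello, Bob' ]
```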

In JavaScript, functions can be anonymous or named. Here are examples of each, using the conventional function keyword syntax:

function foobar() {
  return "bar";
}

(function () {
  return "hello world";
});

The second function is anonymous. A bare anonymous function is a syntax error when written as a statement on its own, so it is wrapped in parentheses here to turn it into an expression. It wasn’t assigned to a variable, so there is no way to call it. Anonymous functions are typically used in 3 ways: as immediately invoked function expressions (IIFEs), assigned to a variable, or passed as an argument to another function. Let’s look at an example of each of those:

// IIFE
(function() {
  console.log("I will be executed immediately");
})();

// Assigning a function to a variable
var foobar = function() {
  return "bar";
};

// Passing a function to another function as a callback
setTimeout(function() {
  console.log("I will be executed in 2 seconds.");
}, 2000);

IIFEs are a very common pattern for isolating your code: variables and functions defined within an IIFE are private and do not pollute the global namespace. Since a function expression evaluates to a function object, you’re essentially just invoking that object immediately. The function does need to be wrapped in parentheses so that it is parsed as an expression rather than a declaration.

The second example is also very common. Here, we’ve assigned an unnamed function to the variable “foobar”. We can then call it just like a named function, using “foobar()”. This way of defining functions has two important gotchas to remember. First, only the variable declaration is “hoisted”, not the assignment, so the function can only be called AFTER the line that assigns it (which isn’t true for function declarations). Second, the function may show up as anonymous in call stacks (e.g. from an exception), particularly in older JavaScript engines, which can make debugging more difficult.
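The hoisting difference is easiest to see side by side. The following is a minimal sketch; `declared`, `expressed`, and `hoistingError` are illustrative names.

```javascript
// Function declarations are hoisted: calling before the definition works.
console.log(declared()); // "declared"

function declared() {
  return 'declared';
}

// With `var`, only the variable is hoisted; its value is still undefined here.
console.log(typeof expressed); // "undefined"

var hoistingError = null;
try {
  expressed();
} catch (e) {
  hoistingError = e; // calling undefined throws a TypeError
}
console.log(hoistingError instanceof TypeError); // true

var expressed = function() {
  return 'expressed';
};

// After the assignment has run, the call works as expected.
console.log(expressed()); // "expressed"
```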

The final example is essentially the same as the second example. We have a function, setTimeout, which takes two arguments. We are simply passing a function as the first argument. This function will be called by setTimeout once the delay given by the second argument (in milliseconds) has passed.

Since a function expression always evaluates to a function object, you can also give the function a name in each of the above examples. This can make debugging easier, and it lets the function call itself recursively:

var foobar = function bar(i) {
  console.log(i);

  if (i < 5) {
    bar(i + 1);
  }
};

foobar(0);

The above example will log 0 through 5.
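One detail worth knowing about this pattern is that the inner name is only visible inside the function itself. A short sketch, using the made-up names `countdown` and `tick`:

```javascript
var countdown = function tick(n) {
  // `tick` is visible here, inside the function body, so it can recurse.
  return n <= 0 ? [0] : [n].concat(tick(n - 1));
};

console.log(countdown(3));   // [ 3, 2, 1, 0 ]
console.log(countdown.name); // "tick"
console.log(typeof tick);    // "undefined" (the inner name is not visible out here)
```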

So what’s a callback?

A callback function is simply a function that is passed to another function as a parameter, to be executed by the other function at some point. This callback pattern is extremely common in JavaScript, and you’re unlikely to get much done without using it.

Here is a very simple example of a callback using jQuery:

$('#my_button').on('click', function(e) {
  e.preventDefault();
  console.log('You clicked the button!');
});

In the above example, the ‘on’ method takes two arguments: an event to listen for (the ‘click’ event), and a callback to execute when the event is fired. The callback will not be executed immediately; instead, it will be executed whenever the button is clicked. It may be called more than once, or not at all.

Callbacks are used so frequently in JavaScript because many operations are asynchronous.

Essentially, you’re just passing a function around as a named variable when you use the callback pattern. The callback may or may not be called by the function you pass it to. It could be called immediately, or at some point in the future.
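Here is a small sketch of those three possibilities: a callback that is called immediately, one that is called later, and one that is never called. The helper names (`runNow`, `runLater`, `runIf`, `record`) are hypothetical.

```javascript
// Calls the callback right away, before returning.
function runNow(callback) {
  callback('now');
}

// Schedules the callback to run later, after the current code has finished.
function runLater(callback) {
  setTimeout(function() {
    callback('later');
  }, 0);
}

// Only calls the callback when the condition holds; otherwise never.
function runIf(condition, callback) {
  if (condition) {
    callback('conditionally');
  }
}

var calls = [];
function record(label) {
  calls.push(label);
}

runLater(record);
runNow(record);
runIf(false, record);

// 'later' has not run yet, and runIf never calls back.
console.log(calls); // [ 'now' ]
```

The caller cannot tell from the call site which of these behaviours it will get, so it is worth checking the documentation of any function you hand a callback to.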

JavaScript Event Loop

Ok, now we know what functions and callbacks are, let’s see how they get used. Let’s look at a very simple asynchronous callback example:

console.log('Before timeout');
setTimeout(function() {
  console.log('Timeout callback');
}, 0);
console.log('After timeout');

// Before timeout
// After timeout
// Timeout callback

If you’re used to procedural programs, this may seem a bit strange. JavaScript is based around something called the event loop. The simple version is that JavaScript runs a loop, and on each iteration (or tick) of this loop, one event will be processed. This event could be a timeout completing, an IO operation returning, an incoming HTTP request, etc. To help with this, JavaScript utilizes a message queue. Certain operations will insert a message into the queue (e.g. setTimeout), and if there is a handler registered for the message, it will be executed.

In our above example, on tick 1, we do the following:

  1. Log “before timeout”
  2. Add a message handler (the message will be added to the message queue sometime after your timeout)
  3. Log “after timeout”

Now at some point in the future, JavaScript will insert a message into the queue to call your timeout handler. This will happen on a future “tick”.
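One consequence is that a timeout callback can never interrupt code that is already running, even if the timer is long overdue. The sketch below (Node.js or a browser; the `order` array is just for illustration) keeps the current tick busy for about 50ms to show this.

```javascript
var order = [];

order.push('before');

setTimeout(function() {
  order.push('timeout');
  console.log(order); // [ 'before', 'after', 'timeout' ]
}, 0);

// Busy-wait for roughly 50ms. The timer above is already due, but its
// callback cannot run until this synchronous code has finished.
var start = Date.now();
while (Date.now() - start < 50) {
  // spin
}

order.push('after');
console.log(order); // [ 'before', 'after' ]
```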

Let’s look at another example using Node.js’ nextTick function:

console.log('A');

process.nextTick(function() {
  console.log('B');
});

process.nextTick(function() {
  console.log('C');
});

console.log('D');

In this example, we’ll see A, then D, then B, then C printed. The two console.log calls run synchronously as part of the current operation, while nextTick adds each callback to a queue that Node.js processes as soon as the current operation completes, before the event loop moves on to any other events.

A single tick can run an arbitrary number of functions. In this example, we call 4. The two console.log calls print their message immediately, and once they have printed they are 100% complete. The two nextTick calls only queue our callbacks and then return. Since we’re calling nextTick twice, two callbacks are queued. Once the current operation is done, Node runs the queued callbacks in the order they were added, printing B and then C. With nothing left to do, the process exits.
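As a rough illustration of where nextTick callbacks fall relative to other queued work, the sketch below (Node.js only) also schedules a zero-delay timeout. The `sequence` array is an illustrative addition; the timeout is registered first but still runs last, because nextTick callbacks are processed before the event loop gets to timers.

```javascript
var sequence = [];

setTimeout(function() {
  sequence.push('timeout');
  console.log(sequence.join(' ')); // A D B C timeout
}, 0);

sequence.push('A');

process.nextTick(function() {
  sequence.push('B');
});

process.nextTick(function() {
  sequence.push('C');
});

sequence.push('D');
```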

Asynchronous operations

Since JavaScript is single-threaded and uses an event loop, we wouldn’t want to do blocking operations (e.g. reading a file) in the main loop. This would stop the loop until the blocking operation finished, meaning none of our other code could run. While this is how many languages work in their default configuration (e.g. Ruby), JavaScript & Node.js were designed to be run as a single process using non-blocking operations only.

Let’s look at two ways of reading a file in Node.js, and the time it takes to do each.

Synchronous operation:

const fs = require('fs');

var f1 = fs.readFileSync('foo.txt');
console.log('read foo.txt');

var f2 = fs.readFileSync('bar.txt');
console.log('read bar.txt');

var f3 = fs.readFileSync('acme.txt');
console.log('read acme.txt');

// read foo.txt
// read bar.txt
// read acme.txt
// Completed in 4.67ms

Asynchronous operation:

const fs = require('fs');

var f1, f2, f3;

fs.readFile('foo.txt', function(err, data) {
  f1 = data;
  console.log('read foo.txt');
});

fs.readFile('bar.txt', function(err, data) {
  f2 = data;
  console.log('read bar.txt');
});

fs.readFile('acme.txt', function(err, data) {
  f3 = data;
  console.log('read acme.txt');
});

// read acme.txt
// read foo.txt
// read bar.txt
// Completed in 1.83ms

As you can see, the asynchronous version was faster, since all 3 files were read in parallel. Note also that the callbacks ran in a different order than the calls were made: the order in which asynchronous operations complete is not guaranteed. If this were a GUI application or a web server, our process would have been free to handle user input or new requests while the reads were happening. With the synchronous version, however, the GUI would have been frozen or the web server would have hung until the synchronous operations completed, and only then would the new input be processed.

The way this works under the hood is that the readFile function registers a message handler (our callback) to run when we receive a message indicating the file has been read. Then it kicks off an asynchronous, non-blocking file read operation. Once the file has been fully read, a message is added to the message queue. Node sees that a message handler was registered, and adds the handler to the event queue. The next time a tick happens, a callback is popped off the event queue and run. Eventually our callback for the file will be run, and we can manipulate the data.
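The callbacks above all receive `err` as their first argument. By Node.js convention that argument is null on success and an Error object on failure, so it should be checked before `data` is touched. A minimal sketch, assuming a hypothetical helper named `readOrDefault` and a deliberately missing file in the temp directory:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read a file as text, or fall back to a default
// value if the read fails (for example, because the file is missing).
function readOrDefault(filename, fallback, callback) {
  fs.readFile(filename, 'utf8', function(err, data) {
    if (err) {
      // The read failed, so `data` is not usable; hand back the fallback.
      return callback(null, fallback);
    }
    callback(null, data);
  });
}

// A path that should not exist, so the error branch is exercised.
var missing = path.join(os.tmpdir(), 'no-such-file-' + Date.now() + '.txt');

readOrDefault(missing, '(empty)', function(err, text) {
  console.log(text); // "(empty)"
});
```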

What if I need things in a specific order?

What do you do if you need things to run in a specific order? You could use synchronous operations, but we’ve explained why that’s a bad idea. Instead, we can use a simple asynchronous queue. In our example below, we want to rename a set of files, but in a specific order (e.g. like logrotate). If we execute in the wrong order, by calling each asynchronous function at the same time, we could rename log.4 to log.5 before we’ve renamed log.5, overwriting it! That would be bad.

const fs = require('fs');
var queue = ['log.1', 'log.2', 'log.3', 'log.4', 'log.5'];

(function rotateFiles() {
  // Take the highest-numbered file first, so nothing gets overwritten
  var filename = queue.pop();
  var fileparts = filename.split('.');
  var newname = fileparts[0] + '.' + (parseInt(fileparts[1], 10) + 1);

  fs.rename(filename, newname, function(err) {
    if (err) {
      return console.log('There was an error rotating ' + filename);
    }

    // Only start the next rename once this one has completed
    if (queue.length > 0) {
      rotateFiles();
    } else {
      console.log('All files rotated!');
    }
  });
})();

Here, we create an array of each of the files we want to rotate. We want to rename ‘log.5’ to ‘log.6’, ‘log.4’ to ‘log.5’, etc. We then have an IIFE that pops an item off the end of the queue, renames it, then when the asynchronous rename is complete, calls itself again if there are still files remaining.
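The same idea can be pulled out into a reusable helper. The sketch below is one possible way to do it; `runInOrder` is a hypothetical name, and timeouts stand in for real asynchronous work such as renaming files.

```javascript
// Hypothetical helper: call worker(item, next) for each item, one at a time.
// The next item only starts once the previous worker has called next().
function runInOrder(items, worker, done) {
  var index = 0;

  (function step(err) {
    if (err || index >= items.length) {
      // Stop on the first error, or when every item has been processed.
      return done(err || null);
    }
    worker(items[index++], step);
  })();
}

var finished = [];

runInOrder([30, 10, 20], function(delay, next) {
  // Simulate asynchronous work that takes `delay` milliseconds.
  setTimeout(function() {
    finished.push(delay);
    next();
  }, delay);
}, function(err) {
  console.log(finished); // [ 30, 10, 20 ] (input order, not fastest-first)
});
```

Even though the 10ms job would finish first if all three were started together, it does not start until the 30ms job has called `next()`.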

Doing this is tedious though. You could write a library of functions to help you, but luckily someone has already done that! Async.js is a ubiquitous library with a huge number of functions to help you control the flow of your asynchronous operations. Check out their readme to see what functions are available!

Managing control flow using Async.js

The async library has too many functions to cover, so we’ll cover the four most common ones. If you’re unsure which one to use, here’s a handy reference table depending on what you’re trying to do:

            Same Action         Different Actions
Any Order   Async.each          Async.parallel
In Order    Async.eachSeries    Async.waterfall

To elaborate, there are typically four different scenarios you’ll want to work with. You’ll either want to:

  1. Execute the same function on multiple items
  2. Execute different functions

Then for each of those scenarios, you’ll either want to do it with a specific order (serial operation), or in any order (parallel). A couple examples:

  • Running multiple unrelated database queries (any order, different functions, Async.parallel)
  • Running multiple successive API calls, each one depending on the last (specific order, different functions, Async.waterfall)
  • Rotating log files by renaming files in a specific order (specific order, one function, Async.eachSeries)
  • Read multiple files (any order, same function, Async.each)

Async.each & Async.eachSeries

Async.each and Async.eachSeries take the same arguments. The first argument is the array of items to operate on. The second argument is a function to call on each item. The third argument is a function to call when all of those calls are complete, or as soon as one of them reports an error.

Each time your second argument is called, it will be passed one item from your array, as well as a callback. When you are done doing whatever it is that you’re doing, you must call the callback to tell Async.js that you’re done. Call it with no arguments on success, or with an error if something went wrong. Note that each and eachSeries do not collect results; the final callback only receives the error, if there was one. If you need the results gathered into an array, Async.js provides async.map, which takes the same arguments.

async.each(fileList, function(item, cb) {
  fs.readFile(item, function(err, data) {
    if (err) {
      return cb(err);
    }
    // Do something with data
    cb();
  });
}, function(err) {
  // err is set if any of the reads failed
});

The reason you must call the callback is because your operations are presumed to be asynchronous, so the library has no way to know when you’re done processing an item, unless you explicitly tell it.
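To see why the explicit callback matters, here is a simplified stand-in for an each-style helper. This is not Async.js’s actual implementation, and `eachParallel` is a made-up name; it only sketches the bookkeeping idea: count how many workers have reported in, and fire the final callback when the count reaches zero.

```javascript
// Hypothetical, simplified stand-in for an "each"-style helper.
function eachParallel(items, worker, done) {
  var remaining = items.length;
  var failed = false;

  if (remaining === 0) {
    return done(null);
  }

  items.forEach(function(item) {
    worker(item, function(err) {
      if (failed) {
        return; // an error was already reported; ignore later callbacks
      }
      if (err) {
        failed = true;
        return done(err);
      }
      remaining -= 1;
      if (remaining === 0) {
        done(null); // every worker has reported in
      }
    });
  });
}

eachParallel([30, 10, 20], function(delay, cb) {
  // Simulate asynchronous work that takes `delay` milliseconds.
  setTimeout(function() {
    console.log('finished ' + delay);
    cb();
  }, delay);
}, function(err) {
  console.log('all done'); // printed last, after 10, 20 and 30 have finished
});
```

If a worker forgets to call its callback, `remaining` never reaches zero and the final callback never fires, which is exactly the failure you will see with the real library.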

Async.parallel & Async.waterfall

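Async.parallel & Async.waterfall also take the same arguments as each other; the difference is how the functions are run. parallel starts all of the functions without waiting for any of them to finish, while waterfall runs them one at a time, in order. The first argument is an array of functions to execute, and the second argument is a function to call when everything else is done. Much like each/eachSeries, each function will be passed a callback. In the case of parallel, the results passed to those callbacks are collected into an array for the final callback.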

In the case of waterfall, whatever extra arguments you pass to the callback will be sent to the next function. If any callback sets the first parameter (the error parameter), the rest of the functions will not be called, and the final callback is called immediately with that error.

async.waterfall([
  function(callback) {
    callback(null, 'one', 'two');
  },
  function(arg1, arg2, callback) {
    // arg1 now equals 'one' and arg2 now equals 'two'
    callback(null, 'three');
  },
  function(arg1, callback) {
    // arg1 now equals 'three'
    callback(null, 'done');
  }
], function (err, result) {
  // result now equals 'done'
});
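The error short-circuit is easier to follow with a stripped-down version of the pattern. The sketch below is a hypothetical `miniWaterfall`, not Async.js’s real implementation; it passes each step’s results to the next step and jumps straight to the final callback on the first error.

```javascript
// Hypothetical, simplified waterfall for illustration only.
function miniWaterfall(tasks, done) {
  var index = 0;

  function next(err) {
    // Everything after the error argument is a result for the next task.
    var results = Array.prototype.slice.call(arguments, 1);
    if (err || index >= tasks.length) {
      // Stop on the first error, or finish after the last task.
      return done.apply(null, [err || null].concat(results));
    }
    var task = tasks[index++];
    // Pass the previous results, followed by the callback.
    task.apply(null, results.concat(next));
  }

  next(null);
}

miniWaterfall([
  function(callback) {
    callback(null, 2);
  },
  function(n, callback) {
    callback(new Error('step two failed'));
  },
  function(callback) {
    // Never reached, because the previous step reported an error.
    callback(null, 'unreachable');
  }
], function(err, result) {
  console.log(err.message); // "step two failed"
  console.log(result);      // undefined
});
```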

For more information on Async.js, please check out the README!
