Saturday 23 January 2010

Introduction to Map Reduce using jQuery

(Solved: How to avoid the "A script on this page is causing Internet Explorer to run slowly." message)

When writing the spellayt spell checking plug-in, I came across an interesting problem with long running JavaScript code – the browser would timeout and display a message box to the user with the text "A script on this page is causing Internet Explorer to run slowly." Even worse, the dialog gives the user the option to cancel the script, potentially breaking your page.

The solution was to use a timer to simulate a background thread, but the solution wasn’t generic and felt far from elegant. This week, with the approval of Google’s Distributed MapReduce patent, I realised that a simpler non-distributed map reduce algorithm would enable the running of code both asynchronously and over long periods of time in JavaScript without complex state management and stack issues.

Map reduce works by taking key/value pairs and running them through a function to create intermediate key/value pairs until all the data has been processed (map). The data is then filtered (reduced) to obtain a result. Joel Spolsky (Joel on Software) has a nice article that explains the fundamentals behind map reduce.

Sieve of Eratosthenes

The Sieve of Eratosthenes is an algorithm for finding prime numbers, this Wikipedia article has a  visual explanation of how it works. Its a great example of something that is processor intensive but can be broken up into smaller steps.

A standard JavaScript implementation (using Euler’s optimisation) would look something like this:

function eratosthenesSieve(upperBound) {

   var upperBoundSquareRoot = Math.floor(Math.sqrt(upperBound), 0);
   var isComposite = [];

   for (var m = 2; m <= upperBoundSquareRoot; m++) {

      if (!isComposite[m]) {

         _primes.push(m);

         for(var k = m * m; k <= upperBound; k += m) {
            isComposite[k] = true;
         }
      }
   }

   for(m = upperBoundSquareRoot; m <= upperBound; m++) {
      if (!isComposite[m]) _primes.push(m);
   }
}

The isComposite variable contains an array of boolean indicating whether is element is prime (false) or not (true). The _primes array also contains the prime numbers.

Redefining the problem as a set of Map / Reduce functions

The great thing about map/reduce is that once the problem is defined as a set of functions, it doesn’t matter how the map/reduce is implemented, you should always get the same results. Usually, each for loop will result in a matching $.map or $.reduce function. Reorganising the eratosthenesSieve function to use map reduce requires the following steps:

  • Map an array containing the initial values – in this case the index of the array is the number we are testing, and a boolean will indicate whether it is a prime or composite number.
  • Map each prime up to the square root of the upper bound. For each Map that is a prime, strike out it’s square multiplied by the remaining numbers until the upper bound is reached.
  • Reduce the array by totalling the number of primes (false values) left.
  • Execute the functions to return the result.

The code now looks like this:

function eratosthenesSieve2(upperBound) {

   //Create an array of booleans representing prime = false
   var composites = new Array(upperBound);

   //Set all to prime (false) to start
   $.map(composites, function(index, value) {
      return false;
   });

   //Calculate the sqr root of the upper bound
   var upperBoundSquareRoot = Math.floor(Math.sqrt(upperBound), 0);

   //For each item, calculate the non primes
   $.map(composites, function(index, value) {

      if (index < 2 || index > upperBoundSquareRoot) return value;

      //If the array item is a prime then map all non primes
      if (value == false) {

         var k = index * index;

         $.map(composites, function(index2, value2) {

            if (index2 == k) {

               k += index;
               return true;
            }
            return value2;
         });
      }
      return value;
   });
                
   //Count only the primes
   $.reduce(composites, 0, function(index, value, result) {

      if (index >= 2 && value == false) result++;
      return result;
   });

   return $.execute();
}

The first $.map function initialises the composites array. The second and third map $.map strikes out multiples of primes. The $.reduce function counts the remaining primes by accumulating the result variable. Note the extra work we have to do to simulate the functionality of the for loop and when we haven’t performed any processing we return the same value that was passed into the function.

Finally, we $.execute the functions. If you attach a debugger to the script, you’ll notice the functions only start executing once this method is called.

Download the plug-in code from the Map Reduce plug-in page.

Next time, I’ll dive into the details of implementing the Map Reduce plug-in and the pitfalls encountered along the way.

No comments: