HTML5 Threading with Web Workers and Data Storage with IndexedDB

Wallace B. McClure | August 23, 2013


In a previous article, I looked at the HTML5 <video> tag and History object, two of the many features that developers have started to implement to give users of their apps new input controls, jQuery Mobile features, location and mapping, and much more. In this article, I continue exploring HTML5 features with a look at Web Workers, which let you speed up your client-side applications, and IndexedDB, a client-side data storage mechanism that is the preferred storage approach going forward in HTML5. (Before IndexedDB, which is also known as the Indexed Database API, work was focused on a standard named WebSQL. That work was discontinued in the fall of 2010. IndexedDB is the result of follow-on work to create a standard for data storage in Web browsers.)

Note: The APIs for HTML5 vary slightly across different versions of browsers and different implementations. Because HTML5 is still in the recommendation phase at the W3C, you should think of it as a draft at this point. Because these APIs may change before they become a final standard, I cover only the version of HTML5 implemented in Internet Explorer 10. Every effort will be made to test the code across other browsers to verify that it works.

HTML5 Web Workers

Back in the day, Web sites were just containers of HTML content. Making changes to the UI required a trip to the server, and the server would return a new page of content that the browser presented to the user. This was clumsy and not very bandwidth friendly. Then AJAX (Asynchronous JavaScript and XML) showed up, and that enabled scripts to call out asynchronously to services across the Web. AJAX showed how Web applications could make use of asynchronous operations, and asynchronous operations, like threading, improve the user experience by allowing multiple operations to be completed at the same time. Various computer operating systems have supported the use of threads since before I started programming, so this is a fairly well-known feature among developers. Until HTML5 and the Web Workers standard, however, JavaScript did not have the ability to work with background threads, only the asynchronous operations provided by the XMLHttpRequest object that AJAX uses. Web Workers are a great step forward. With the inclusion of support for Web Workers in browsers, Web applications can provide more of the responsive feel that users expect in their applications.

Note: There are two types of Web Workers: Dedicated Workers and Shared Workers. For the purposes of this article, I use Web Workers (or just Workers) to refer to Dedicated Workers.

HTML5 Web Workers is a specification that allows long-running scripts to run in the background. These scripts run independently of the user interface and any scripts running there. This means that long-running tasks aren’t accidentally interrupted. Much like threads in the .NET Framework, Web Workers take a while to create and are fairly heavyweight. As a result, you should not create too many of them. In general, Workers are expected to be used for an extended period of time.

Web Workers are useful for many of the same operations that threads in a desktop application perform. These include:

  • Prefetching data for later use.
  • Spell checking or similar checking of data in forms.
  • Background processing (which I illustrate in this article).
  • Processing large amounts of data, such as a large array or a large JSON object.

Let’s start by looking at Web Worker–related objects that developers need to work with, along with the APIs that developers have to use:

  • Worker(file.js) The Worker object runs a JavaScript file on a separate and isolated thread. Because the script runs on its own thread, the JavaScript needs to be loaded from a separate file.
  • postMessage(object) The postMessage method sends an object to the Web Worker. In the Web Worker JavaScript file, postMessage returns a value to the calling HTML page. The postMessage method can be used to pass single values as well as more complicated objects. The example in this article sends the “start” command to the Web Worker.
  • onmessage The Web Worker exposes this event so that the calling page can receive messages from the Worker and pass the results back to the DOM elements.
  • onerror The Web Worker exposes this event to allow the calling page to process errors raised in a Web Worker.

Here’s an example of some HTML code that calls a Web Worker in an operation that calculates prime numbers (see Figure 1). When the page loads, a check occurs to ensure that the browser supports Web Workers. If the browser supports them, a Worker is created, events are wired up, and the execution begins with the postMessage method call.

<p>The highest prime number discovered so far is: 
    <output id="result"></output></p>
<button id="stop"></button>
<script>
// Modernizr reports whether the browser supports Web Workers.
var ww = Modernizr.webworkers;
var worker;
var stop = "Stop Processing";
var stopBtn = document.getElementById("stop");
stopBtn.innerHTML = stop;
if (ww) {
  worker = new Worker('@Href("~/Scripts/calc.js")');
  // Show each prime reported by the Worker.
  worker.onmessage = function (event) {
    document.getElementById('result').innerHTML = event.data;
  };
  // Show any error raised inside the Worker.
  worker.onerror = function (error) {
    document.getElementById("result").innerHTML = error.message;
  };
  // Tell the Worker to begin searching.
  worker.postMessage({"cmd":"start"});
  stopBtn.addEventListener("click", quitProcessing, false);
  stopBtn.disabled = false;
}
else {
  document.getElementById('result').innerHTML = 
    "This browser does not support Web Workers.";
}
// Ends the Worker's thread immediately.
function quitProcessing(e) {
  if (worker != null) {
    worker.terminate();
  }
}
</script>

Note: You may notice a library or object in this code called Modernizr. Modernizr is a JavaScript library that is used to test a browser for the ability to perform certain pieces of functionality. A link to an MSDNMagazine article on Modernizr is available in the references at the end of this article.
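
If Modernizr isn't part of your project, a similar check can be approximated by looking for the Worker constructor directly. The following is only a sketch of that idea; the supportsWebWorkers helper and its globalObject parameter are hypothetical names, not part of the sample code or of Modernizr.

```javascript
// Hypothetical helper: reports whether the given global object exposes a
// Worker constructor. In a page, pass in window.
function supportsWebWorkers(globalObject) {
  return !!(globalObject && globalObject.Worker);
}

// Guarded so that the snippet also loads outside a browser.
if (typeof window !== "undefined") {
  var ww = supportsWebWorkers(window);
}
```

The result can stand in for the Modernizr.webworkers value used in the page script above.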

Now let’s look at the Web Worker file calc.js. In this file, the code handles the onmessage event. The function checks to be sure that a defined value has been passed in. With a defined value passed in, the process to calculate whether a number is a prime number begins.

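self.addEventListener('message', function (e) {
  var data = e.data;
  switch (data.cmd) {
    case 'start':
      var n = 1;
      // Test each successive integer. The label lets the inner loop jump
      // straight to the next candidate as soon as a divisor is found.
      search:
      while (true) {
        n += 1;
        for (var i = 2; i <= Math.sqrt(n); i += 1) {
          if (n % i === 0) {
            continue search;
          }
        }
        // Found a prime! Report it to the page.
        postMessage(n);
      }
      break;
    case 'stop':
      // Not reached while the loop above is running; the page ends the
      // search by calling worker.terminate() instead.
      self.close();
      break;
    default:
      self.postMessage("Command posted: " + data.msg);
  }
}, false);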

Figure 1. Worker Example

Note: In my code I show different ways of doing things. For example, some events are wired up using an assignment of an event while others have been wired up using the addEventListener syntax available in the DOM. This has been done to show options and not to show favoritism to one mechanism over another.

I’ve shown a simple example of how to implement Web Workers, but there are many other ways that developers can implement threading to improve the user experience. Any algorithm that can be divided into smaller parts is a candidate for using Web Workers.
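
As one illustration of dividing an algorithm into smaller parts, the prime search can be run in batches. Because the loop in calc.js never returns, the Worker cannot process another message (such as "stop") once the search starts, which is why the page calls terminate. The variation below is a sketch, not part of the sample code: it searches a batch at a time and yields between batches. The names isPrime, nextPrimes, BATCH_SIZE, and runBatch are hypothetical.

```javascript
// Returns true when n is prime (trial division up to the square root).
function isPrime(n) {
  if (n < 2) { return false; }
  for (var i = 2; i * i <= n; i += 1) {
    if (n % i === 0) { return false; }
  }
  return true;
}

// Returns the next `count` primes greater than `after`.
function nextPrimes(after, count) {
  var found = [];
  var n = after;
  while (found.length < count) {
    n += 1;
    if (isPrime(n)) { found.push(n); }
  }
  return found;
}

// Worker wiring: runs only inside a Worker, where importScripts exists.
if (typeof importScripts === "function") {
  var BATCH_SIZE = 100; // hypothetical number of primes per batch
  var last = 1;
  var running = false;

  var runBatch = function () {
    if (!running) { return; }
    var primes = nextPrimes(last, BATCH_SIZE);
    last = primes[primes.length - 1];
    self.postMessage(last);
    // Yield so that queued messages (such as "stop") can be handled.
    setTimeout(runBatch, 0);
  };

  self.addEventListener("message", function (e) {
    if (e.data.cmd === "start" && !running) {
      running = true;
      runBatch();
    } else if (e.data.cmd === "stop") {
      running = false;
      self.close();
    }
  }, false);
}
```

With this arrangement the page could send {"cmd":"stop"} through postMessage rather than terminating the Worker, at the cost of reporting one prime per batch instead of every prime.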

IndexedDB for Your Data Access Enjoyment

If you’ve been working with HTML5 (or been reading up on it), you’ve most likely heard of several Web-based client-side technologies related to storage:

  • Session storage A key-value data storage mechanism designed for the current session.
  • Local storage A key-value data store designed for access across Web browsing sessions.
  • WebSQL A Web standard for storing data in a SQL-based relational database with JavaScript. (Work on the WebSQL specification standard was stopped in November 2010 because of a lack of independent database implementations—only SQLite was available as a database at the time. The specification did not move forward to become a W3C recommendation.)
  • IndexedDB A proposed Web standard interface for a local database of records that can hold simple values and hierarchical objects. IndexedDB was initially proposed by Oracle in September 2009.

Developers working in the database world are familiar with Structured Query Language (SQL) in some form through their work in SQL, LINQ, Entity Framework, Hibernate, or similar approaches. Over the past few years, a new data storage movement, called NoSQL, has been making some headway, and IndexedDB borrows many ideas from the NoSQL movement. (For more information on NoSQL, check the references listed at the end of this article.) Just like in a common NoSQL data store, IndexedDB uses key-value pairs. The key-value pairs are saved in the data store and allow for relatively quick lookups.

One of the first things you should know about IndexedDB is that it has two official sets of APIs that effectively mirror each other: the first set is synchronous, and the other asynchronous. Only the asynchronous APIs are available in Internet Explorer (or any browser). The synchronous APIs are designed for running in Web Workers and in other situations where the code path is already asynchronous. Because developers will work with asynchronous APIs, they need to be aware of when events fire, how to handle these events, and how these events can be chained together.
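
Because nearly every asynchronous IndexedDB call follows the same pattern (make the call, then wait for the success or error event on the returned request), a small helper can reduce the repetition. The sketch below is one possible shape for such a helper; handleRequest and its done(error, result) callback are hypothetical names and are not part of the IndexedDB API.

```javascript
// Hypothetical helper: attaches both handlers to a request object and
// reports the outcome through a single callback, done(error, result).
function handleRequest(request, done) {
  request.onsuccess = function (evt) {
    done(null, evt.target.result);
  };
  request.onerror = function (evt) {
    done(evt.target.error);
  };
  return request;
}
```

A call might then look like handleRequest(objectStore.get(12), function (err, player) { ... }), with further IndexedDB calls chained inside the callback.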

Create a Database

Creating a database in IndexedDB is fairly simple. The window object now has an indexedDB object within it, and the indexedDB object contains the API calls for IndexedDB. (Older versions of browsers might contain versions of the indexedDB object that are prefixed, and they most likely contain versions of the API that are not up to date.)
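
If older browsers need to be accommodated, one common approach is to fall back to the vendor-prefixed names. The following is a minimal sketch; getIndexedDB is a hypothetical helper name, and a prefixed implementation may follow an older draft of the API, so code written against the current API is not guaranteed to work with it.

```javascript
// Hypothetical helper: returns the IndexedDB factory from a window-like
// object, trying the unprefixed name first and then the vendor prefixes.
function getIndexedDB(win) {
  return win.indexedDB || win.mozIndexedDB ||
         win.webkitIndexedDB || win.msIndexedDB || null;
}
```

In a page, the call would be getIndexedDB(window); a null result means that no implementation was found.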

Creating and opening a database are performed by the same operation. The database is opened via a call to the .open method, which takes two parameters: a database name and the version number that will be used. Keep in mind that the call to .open is asynchronous. It returns an object, and your code needs to handle the events on the object. The events that the .open method can fire are:

  • onsuccess The onsuccess method is called when the database is successfully opened. Within the onsuccess event, a reference to the opened database can be obtained via the .result property of the object returned.
  • onerror The onerror method is called when some type of problem occurs opening the database. (The onerror method should not be called very often.) The error can be obtained via the .error property of the event's .target.
  • onblocked The onblocked method is called when access to the database is somehow blocked. Typically, onblocked occurs when a database is open in another browser tab.
  • onupgradeneeded The onupgradeneeded event is called when a database is first created or is upgraded from one version to the next. For example, when an application has been sufficiently updated as to require a new version of the database, a new integer is passed in the .open command. When this new value is passed in and the .open command is called the first time, the onupgradeneeded event is called. This gives the application an opportunity to perform necessary updates.
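
To make the upgrade path more concrete, here is a sketch of an onupgradeneeded handler that checks the event's oldVersion property and applies only the steps that the existing database is missing. The upgradeDatabase name and the version 2 step (an index on age) are hypothetical and exist only for illustration; the players object store is the one used in the examples in this article.

```javascript
// Hypothetical upgrade routine, meant to be assigned to
// request.onupgradeneeded. evt.oldVersion is 0 for a brand-new database.
function upgradeDatabase(evt) {
  var db = evt.target.result;
  var store;
  if (evt.oldVersion < 1) {
    // Version 1: create the object store.
    store = db.createObjectStore("players", { keyPath: "num" });
  } else {
    // The store already exists; reach it through the upgrade transaction.
    store = evt.target.transaction.objectStore("players");
  }
  if (evt.oldVersion < 2) {
    // Version 2 (hypothetical): add an index on the age property.
    store.createIndex("age", "age", { unique: false });
  }
}

// In a page:
// var request = indexedDB.open("team", 2);
// request.onupgradeneeded = upgradeDatabase;
```

Comparing against oldVersion lets one handler serve both new visitors (who get every step) and returning visitors (who get only the newer steps).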

Here’s some code that shows an example of creating and opening a database:

var db;
var request = indexedDB.open("team", 1);
request.onsuccess = function (evt) {
  // Keep a reference to the opened database for later calls.
  db = request.result;
};
request.onerror = function (evt) {
  console.log("IndexedDB error: " + evt.target.error);
};

Adding Records

After a database is open and available, the next step is to add some records to the database. In this example I have the five starting members of my children’s basketball team from a few years ago. The data includes:

  • Num (an integer) A team member’s jersey number. The num property will be used as the primary key in this example.
  • Name (a string) The player’s name.
  • Age (an integer) The player’s age.
  • Position (a string) The player’s position on the team.

Let’s walk through the steps of adding some records:

  1. Open the data store. In the sample code included with this project, the button for running the creation code should be clicked when no data store currently exists; otherwise, an error will occur in the other parts of the example.
  2. In the onupgradeneeded event, a call is made to create an object store. The object store is named “players” and is defined to have a keypath of the num property. You can think of an object store as equivalent to a table in SQL Server. A keypath is conceptually similar to a primary key in a relational database.
  3. Create any necessary indexes. The example uses two indexes:
    • An index is created on the player’s name. This index is set not to be unique.
    • An index is created on the player’s position. There are five players, each with a separate position. For demonstration purposes, this index is set to be unique.
  4. The final step is to iterate through the data that exists and add that data to the object store. You do this via the .add(object) method. In this example, each object added is a plain JavaScript object. For CRUD-style operations, IndexedDB also exposes the following methods:
    • .put(object, key) Used to update an object (or to add it if it doesn't already exist).
    • .delete(key) Used to delete an object.
    • .get(key) Used to read an object.

Here’s the sample code. Notice that I’ve included the onupgradeneeded event.

var db;
var request = indexedDB.open("team", 1);
request.onsuccess = function (evt) {
  db = request.result;
};
request.onerror = function (evt) {
  console.log("IndexedDB error: " + evt.target.error);
};
request.onupgradeneeded = function (evt) {
  var players = [{ num: 10, name: "kMac", age: 16, position: "point guard" },
    { num: 12, name: "Brad", age: 15, position: "small forward" },
    { num: 23, name: "Josh", age: 15, position: "shooting guard" },
    { num: 32, name: "Patrick", age: 15, position: "power forward" },
    { num: 42, name: "Elo", age: 16, position: "center" }];
  // Create an objectStore to hold information about our team. We're 
  // going to use "num" as our keypath because it's guaranteed to be 
  // unique on our team. 
  var objectStore = evt.currentTarget
      .result.createObjectStore("players", { keyPath: "num" });
  // Create an index to search players by name. We may have duplicates 
  // so we can't use a unique index. 
  objectStore.createIndex("name", "name", { unique: false });
  // Create an index to search players by position. We want to ensure that 
  // no two players have the same position, so use a unique index. 
  objectStore.createIndex("position", "position", { unique: true });
  // Store values in the newly created objectStore. 
  for (var i = 0; i < players.length; i += 1) {
    var addRequest = objectStore.add(players[i]);
    addRequest.onsuccess = function (event) {
      // rd is assumed to be a page element for displaying results,
      // defined elsewhere in the sample.
      rd.innerHTML = "Added.";
    };
    addRequest.onerror = function (event) {
      // Somehow respond to the user that the data has not been added.
      // rd.innerHTML = "Error: " + event.target.error;
    };
  }
};

Querying for Data

So far we’ve seen how to create a database, create an object store, and add data to the object store. Now let’s take a look at how to query data. Here are the specific steps:

  1. Open a data store by using the .open() method call.
  2. In the onsuccess event, a transaction is created and then an object store reference is retrieved.
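  3. With the object store reference, the records can be queried by their num value. (Because num was defined as the key path in the previous section about adding data, it is queried on the object store itself; the name and position indexes created there are retrieved with the object store's .index method.)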
  4. In SQL, you add conditions to a SQL statement, but IndexedDB doesn’t have a query language for developers. In IndexedDB, queries are performed on an index by using an IDBKeyRange object. This object has four methods:
    • .only Used to return a single value.
    • .lowerBound Used to return a range of values beginning with the passed value.
    • .upperBound Used to return a range of values ending at the passed value.
    • .bound Used to return a range of values between values.
  5. Once the set of values to return is determined, a call is made to openCursor. This call returns a cursor object that has onsuccess and onerror events that your code can handle. Within the onsuccess method, there are three other important steps:
    1. Accessing the cursor that’s being processed. This cursor is referenced by the event.target.result property.
    2. Accessing a value that has been returned. You do this by reading a property of the cursor’s .value object, for example cursor.value.name.
    3. Iterating through the cursor. This step is performed via the .continue method.
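
To see what the open and closed bounds mean in practice, the following sketch mimics the range checks in plain JavaScript against the jersey numbers used in this article. It doesn't call IndexedDB at all: keyInRange and the range objects are stand-ins that borrow the lower, upper, lowerOpen, and upperOpen property names from IDBKeyRange.

```javascript
// Stand-in for an IDBKeyRange: an undefined bound means "no limit", and
// an open flag excludes the bound itself from the match.
function keyInRange(range, key) {
  if (range.lower !== undefined) {
    if (range.lowerOpen ? key <= range.lower : key < range.lower) {
      return false;
    }
  }
  if (range.upper !== undefined) {
    if (range.upperOpen ? key >= range.upper : key > range.upper) {
      return false;
    }
  }
  return true;
}

var jerseyNumbers = [10, 12, 23, 32, 42];

// Equivalent of IDBKeyRange.bound(9, 32, true, true): both ends excluded.
var bound = { lower: 9, upper: 32, lowerOpen: true, upperOpen: true };
var inBound = jerseyNumbers.filter(function (n) {
  return keyInRange(bound, n);
});
console.log(inBound); // 10, 12, 23

// Equivalent of IDBKeyRange.lowerBound(10, true): 10 itself is excluded.
var pastTen = { lower: 10, lowerOpen: true };
var afterTen = jerseyNumbers.filter(function (n) {
  return keyInRange(pastTen, n);
});
console.log(afterTen); // 12, 23, 32, 42
```

Passing false (or omitting the flag) for an open argument includes the bound, so bound(9, 32) without the flags would also match jersey number 32.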

One thing to notice is the use of the .continue method to move to the next record in the cursor. There are also other methods for iterating through the cursor and operating on the records. These include:

  • advance(long) Used to move the cursor forward by the specified number of records.
  • delete Used to delete the current object.
  • update Used to update the current object.

Here’s the code that demonstrates our query:

var dbName = "team";
var request = indexedDB.open(dbName, 1);
request.onsuccess = function (e) {
  try {
    var db = e.target.result;
    var transaction = db.transaction(["players"]);
    var objectStore = transaction.objectStore("players");
    // num is the key path of the players store (no index named "num" was
    // created earlier), so the key ranges are applied to the store itself.
    // Only match 12
    // var singleKeyRange = IDBKeyRange.only(12);
    // Match anything past 10, including 10
    // var lowerBoundKeyRange = IDBKeyRange.lowerBound(10);
    // Match anything past 10, but don't include 10
    // var lowerBoundOpenKeyRange = IDBKeyRange.lowerBound(10, true);
    // Match anything up to, but not including, 32
    // var upperBoundOpenKeyRange = IDBKeyRange.upperBound(32, true);
    // Match anything between 9 and 32, excluding both 9 and 32.
    var boundKeyRange = IDBKeyRange.bound(9, 32, true, true);
    var cursor1 = objectStore.openCursor(boundKeyRange);
    var cursor2 = objectStore.openCursor(IDBKeyRange.only(10));
    cursor1.onsuccess = function (event) {
      try {
        var cursor = event.target.result;
        if (cursor) {
          // Do something with the matches.
          // alert(cursor.value.name);
          cursor.continue();
        }
      }
      catch (err) {
        alert(err.message);
      }
    };
    cursor2.onsuccess = function (event) {
      // Points at the single record whose num is 10 (null if none exists).
      var cursor = event.target.result;
    };
  }
  catch (err) {
    alert(err.message);
  }
};

Opening a Transaction

You can open a transaction using one of several modes:

  • readonly This is the default opening mechanism; data can only be read.
  • readwrite Allows for the reading and writing of data.
  • versionchange Allows for the reading and writing of data and for the creation of object stores and indexes. This mode isn't requested directly; it is the mode of the transaction that runs during the onupgradeneeded event.
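
As a brief sketch of how a mode is requested, the helper below opens a readwrite transaction, adds one record, and reports back when the transaction as a whole has finished. The addPlayer name and its callback are hypothetical; db is assumed to be an open database that contains the players object store.

```javascript
// Hypothetical helper: adds one player inside a "readwrite" transaction
// and calls done when the whole transaction completes or fails.
function addPlayer(db, player, done) {
  var transaction = db.transaction(["players"], "readwrite");
  transaction.oncomplete = function () { done(null); };
  transaction.onerror = function (evt) { done(evt.target.error); };
  transaction.objectStore("players").add(player);
  return transaction;
}
```

Waiting for the transaction's complete event, rather than the individual request's success event, is one way to confirm that the change has been committed.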

Setting the Cursor Direction

In a SQL-based database, developers often use an “order by” command. An IndexedDB cursor uses the same concept. When you call openCursor in IndexedDB, you can pass in a second parameter. This parameter is a string and can have the following values:

  • next Sets a cursor to be opened from the lower bound on the cursor and allows for it to be iterated to the final upper value. The cursor can include any duplicate values.
  • nextunique Sets a cursor to be opened from the lower bound on the cursor and allows for it to be iterated to the final upper value. The cursor does not include duplicate values.
  • prev Sets a cursor to be opened from the upper bound on the cursor and allows for it to be iterated to the final lower value. The cursor can include any duplicate values.
  • prevunique Sets a cursor to be opened from the upper bound on the cursor and allows for it to be iterated to the final lower value. The cursor does not include any duplicate values.
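
Here is a sketch of how the direction parameter might be used. The collectValues helper is a hypothetical name; it walks a cursor in the requested direction and hands the collected values to a callback. The source argument is assumed to be an object store or an index.

```javascript
// Hypothetical helper: walks a cursor over `source` (an object store or
// an index) in the given direction and passes the values to done.
function collectValues(source, range, direction, done) {
  var values = [];
  var request = source.openCursor(range, direction);
  request.onsuccess = function (evt) {
    var cursor = evt.target.result;
    if (cursor) {
      values.push(cursor.value);
      cursor.continue();
    } else {
      // A null cursor means there are no more records.
      done(null, values);
    }
  };
  request.onerror = function (evt) { done(evt.target.error); };
  return request;
}
```

For example, collectValues(objectStore.index("name"), null, "prev", callback) would deliver the players in descending order of the name key; passing null for the range means that no records are filtered out.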

Deleting a Record

Deleting a record is rather easy. All that’s needed is to call the delete method in the object store while passing in the key of the object that needs to be removed. As with the other methods in IndexedDB, this is an asynchronous call, and code can be called after the event fires.

var request = db.transaction(["players"], 
    "readwrite").objectStore("players").delete(12);
request.onsuccess = function (event) {
    // player has been removed.
};

A slightly different calling procedure is used in this example. Here, the transaction, object store, and delete calls are combined into a single statement, and I’ve removed the nesting of event handlers that was included previously.

Deleting a Database

If you need to delete a data store, use the deleteDatabase method, as shown here:

var dbd = window.indexedDB.deleteDatabase("team");
dbd.onsuccess = function (e) {
  // rd is assumed to be a page element for displaying results.
  rd.innerHTML = "deleted";
};
dbd.onerror = function (e) {
  rd.innerHTML = "Error: " + e.target.error;
};

Security and Privacy

Whenever data is stored in a browser, the question of security comes up (or at least it should). Thankfully, IndexedDB is governed by the same-origin policy, which means that a data store can be accessed only by pages that share the origin (the same scheme, host, and port) of the page that created it. For example, an IndexedDB data store created by http://example.com/page1.html can be accessed only by another page from http://example.com.
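
The origin comparison can be sketched in a few lines. This illustration assumes a runtime that provides the URL constructor (older browsers, including Internet Explorer, may not); sameOrigin is a hypothetical helper name, and the browser performs the real check internally.

```javascript
// Returns true when both URLs share the same scheme, host, and port.
function sameOrigin(urlA, urlB) {
  return new URL(urlA).origin === new URL(urlB).origin;
}

console.log(sameOrigin("http://example.com/page1.html",
                       "http://example.com/page2.html")); // true
console.log(sameOrigin("http://example.com/page1.html",
                       "https://example.com/page1.html")); // false
console.log(sameOrigin("http://example.com/page1.html",
                       "http://example.com:8080/page1.html")); // false
```

A different scheme or port is enough to put two pages in different origins, so each would see its own separate set of IndexedDB databases.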

Along with security concerns, privacy issues arise when you store any type of personally identifiable data on any device (PC, phone, tablet) outside a controlled data store. Take care when storing any type of personally identifiable information in this way. One thought is to use IndexedDB as a short term cache for application data. Remember, if a malicious hacker has physical access to a device, there is a high likelihood that given enough time, the hacker can get access to the data stored on the device.

Wrapping Up

In this article, I’ve introduced you to the HTML5 IndexedDB and Web Workers. Thanks to the great support for HTML5 features in Internet Explorer 10, Windows 8, Windows RT, and Windows Phone, developers have the opportunity to build some really fantastic apps. I hope that you find this article helpful.

References

About the Author

Wallace McClure is a redneck from Tennessee who somehow found his way to Atlanta and Georgia Tech. He was lucky enough to graduate from there twice, with a BS and MS in electrical engineering. He’s always loved things that were new and different, which led to his love of writing software (starting in COBOL and x86 asm), digging into Microsoft's Web technologies, jumping whole-hog into the .NET Framework 1.0 beta, falling in love with mobile way back in 1999, and a whole host of things he probably shouldn't have done but did anyway. Somewhere along the way, he was contacted by someone representing a publisher that would eventually get purchased by John Wiley and Sons and folded into their Wrox division. Several books later, he’s run the gamut from software architecture, to scaling applications, ADO.NET, SQL Server, Oracle, Web, AJAX and mobile technologies. He’s worked for startup companies and many different organizations, all the way up through U.S. federal government agencies.

When not writing software, writing about software, talking about software or thinking he is a comedian, Wally can be found playing golf, in the gym or coaching basketball.


Many thanks to my friends Kevin Darty (@kdarty) and Stephen Long (@long2know) for reading through this article and providing some technology suggestions.