November 2012

Volume 27 Number 11

JavaScript Security - Web to Windows 8: Security

By Tim Kulp | November 2012

Years ago I thought it would be a good idea to learn how to play golf. Before I signed up for lessons at my local driving range, I had never picked up a golf club. At my first lesson, the instructor asked me if I had ever had lessons before or ever tried to play golf. When I told him no, he said, “Good! We won’t have any old habits to get out of your swing.”

Web developers transitioning from the browser to Windows Store apps bring certain habits with them. While Web developers can tap into their existing knowledge of JavaScript, some capabilities are new and require a shift in thinking. Security is one such fundamental difference. Many Web developers are in the habit of handing application security off to the server, reasoning, “Why bother? JavaScript can be easily bypassed.” On the Web, client-side security features are seen as improving usability without adding value to the overall security of the Web application.

With Windows 8, JavaScript plays an important part in the overall security of your app by providing the tools necessary to secure data, validate input and separate potentially malicious content. In this article, I show you how you can adjust some of the habits you bring from Web development so that you can produce more secure Windows Store apps using HTML5, JavaScript and the security features of the Windows Runtime.

Input Validation

Web developer says: JavaScript validation is for usability and doesn’t add to the application’s security.

Windows 8 developer says: Validation with HTML5 and JavaScript is your first line of defense against malicious content getting into your app.

For traditional Web applications, JavaScript is often just a gateway to the server. All important actions with the data, such as input validation and storage, occur on the server. Malicious attackers can disable JavaScript in their browser or bypass it entirely by submitting handcrafted HTTP requests, circumventing any client-side protections. In a Windows Store app, the developer can’t rely on a server to clean user input prior to acting on the data because there’s no server. When it comes to input validation, JavaScript and HTML5 are on their own.

In software security, input validation is a critical component for data integrity. Without it, attackers can use every input field as a possible attack vector into the Windows Store app. In the second edition of “Writing Secure Code” (Microsoft Press, 2003), authors Michael Howard and David LeBlanc provide what has become a mantra for managing input: “All input is evil until proven otherwise.”

You shouldn’t trust data until it’s proven to conform to “known good” data. When building an app, the developer knows what data from a specific field should look like (that is, an allow list) or at the very least what it shouldn’t have (that is, a deny list). In the world of input validation, always use an allow list when possible to restrict input to known good data. By allowing only data that you know is good, you reduce the possibility of missing a new or unknown way to represent bad data.
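The difference between the two approaches can be sketched in a few lines of JavaScript. The function names and the sample deny list here are illustrative assumptions, not part of the Contoso Health app:

```javascript
// Allow list: accept only input that matches a known good shape.
// The pattern (letters and spaces, 3 to 45 characters) is an example rule.
function isAllowedName(value) {
  return /^[A-Za-z ]{3,45}$/.test(value);
}

// Deny list: reject input containing known bad fragments.
// This sample list is deliberately incomplete, which is its weakness.
var knownBad = ["<script", "javascript:"];
function isDeniedName(value) {
  var lower = value.toLowerCase();
  return knownBad.some(function (bad) {
    return lower.indexOf(bad) !== -1;
  });
}

var input = '<img src=x onerror="alert(1)">';
console.log(isDeniedName(input));  // false: the deny list misses this payload
console.log(isAllowedName(input)); // false: the allow list rejects it anyway
```

The deny list would need a new entry for every new attack string, while the allow list keeps rejecting anything that falls outside the one shape it knows to be good.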

Constrain, Reject and Sanitize

How do developers reduce risk to their users by limiting input to known good data? They use the three stages of input validation shown in Figure 1 to reduce the risk of malicious content getting in to their app.

Figure 1 Input Validation (Image Based on Figure 4.4 from Chapter 4, “Design Guidelines for Secure Web Applications,” of “Improving Web Application Security: Threats and Countermeasures” at bit.ly/emYI5A)

Input validation begins with constraining data to what is “known good.” Web developers familiar with HTML5 can use their existing knowledge of its new input types and attributes to constrain data coming into their Windows Store apps. The key difference between the Web and Windows 8 is that a Windows Store app doesn’t have a server behind the scenes that checks input. Constraining data must happen in HTML5 or JavaScript.

Using HTML5, each field can easily be constrained to known good data. To illustrate examples in this article, I use the fictitious Contoso Health app, which stores personal health information for users. The Profile page of this app captures the user’s name, e-mail address, weight and height and provides a notes field for general information. As the developer, I know (in general) what good data looks like for each of these fields:

  • Name: Alphabetic characters with a few special characters, not to exceed 45 total characters. The name criteria are based on the target market for the app, the U.S. market.
  • E-mail Address: Input must be a valid e-mail address format.
  • Weight and Height: Numbers with associated labels to show that weight is in pounds and height is in feet and inches.
  • Notes: HTML content using the standard Contoso HTML editor.

For the Name input element, I need to limit what characters are valid for the field as well as how long the value can be. I can do this using two new attributes of the input tag: pattern and title.

Pattern is a regular expression to which the data entered must conform. MSHTML (the rendering engine used for HTML5 apps in Windows 8) verifies that data entered into the field matches the regular expression. If the user enters data that doesn’t conform to the regular expression pattern, submitting the form will fail and the user will be directed to correct the invalid field. For example, the Name field can be composed of alpha characters and spaces, and it must be three to 45 characters long. The following pattern value supports this:

<input type="text" id="txtName" name="txtName"
  pattern="^[A-Za-z ]{3,45}$" title="" />

Title is used to inform the user of what the system is expecting. In this case, something such as “Name must be three to 45 characters long using alphabetic characters or spaces” would explain the expected pattern. Nothing is more frustrating to users than having invalid input without knowing what valid input is. Be nice to your users and let them know what’s allowed. The title attribute does just that; it’s the explanation message that shows what’s expected in the field.

Patterns for the data fields including acceptable characters and length can be difficult to determine. You can find sample regular expressions in many great online resources, but always consult with your organization’s security team to see whether there is a standard to which you must conform. If you don’t have a security team or if your security team doesn’t have standards, resources such as RegExLib.com provide an excellent library of regular expressions you can use for your data validations.

Some fields are specific data types, such as numbers, dates and e-mail addresses. HTML5 comes to the rescue again with an army of new input types, such as email, tel, date, number and many more. Using these data input types, MSHTML checks whether what the user entered is valid data, without any regular expressions or JavaScript code necessary. The input element’s type attribute handles the new data types. (You can find more types and their uses at bit.ly/OH1xFf.) For example, to capture an e-mail address for the Profile page, I would set the type attribute to be email, as in the following example:

<input type="email" id="txtEmail" name="txtEmail" />

This field accepts a value only if it conforms to the format of a valid e-mail address. If MSHTML doesn’t recognize input as a valid e-mail address, a validation error displays on the field when the user attempts to submit the form. Using the new input types of HTML5 constrains the data to what you’re expecting without the hassle of complex JavaScript validation.

Some of the new input types also allow range restrictions using the new min and max attributes. As an example, because of the business rules, the people in our app must have a height between 3 and 8 feet. The following range restrictions can be used on the height field:

<input type="number" id="txtHeight" name="txtHeight" min="3" max="8" />

The examples provided use the four techniques to constrain data with the HTML5 input tag. By validating length (using a pattern), format (again, using the pattern), data type (using the new input types) and range (using min/max), you can constrain the data to be known good data. Not all attributes and types prompt you to correct them prior to submission. Make sure you validate your form’s contents with the checkValidity method (bit.ly/SgNgnA) just as you would Page.IsValid in ASP.NET. You might be wondering whether you can constrain data like this just by using JavaScript. Yes, you can, but using the HTML5 attributes reduces the overall code the developer needs to manage by handing all the heavy lifting over to the MSHTML engine.
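For comparison, here is a minimal sketch of what the same four checks might look like in plain JavaScript. The validateProfile function, the shape of the profile object and the simple e-mail expression are assumptions made for illustration; they don't reproduce the exact rules MSHTML applies:

```javascript
// Hypothetical helper mirroring the four HTML5 constraints in plain JavaScript.
// Field names and limits follow the Profile page examples above.
function validateProfile(profile) {
  var errors = [];
  // Length and format: letters and spaces, 3 to 45 characters.
  if (typeof profile.name !== "string" ||
      !/^[A-Za-z ]{3,45}$/.test(profile.name)) {
    errors.push("name");
  }
  // Data type: a deliberately simple e-mail shape check (an assumption,
  // not the exact rule MSHTML applies to type="email").
  if (typeof profile.email !== "string" ||
      !/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(profile.email)) {
    errors.push("email");
  }
  // Data type and range: a finite number between 3 and 8 feet.
  var height = Number(profile.height);
  if (!isFinite(height) || height < 3 || height > 8) {
    errors.push("height");
  }
  // An empty array means every field passed.
  return errors;
}

console.log(validateProfile({ name: "Tim Kulp", email: "tim@contoso.com", height: "6" }));
console.log(validateProfile({ name: "X", email: "not-an-email", height: "12" }));
```

Even this small version is more code to maintain than the attributes it replaces, which is the argument for letting the markup carry the constraints.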

Reject denies known bad (that is, a deny list) input. A good example of reject is creating a deny list of IP addresses that can’t connect to your Web application. Deny lists are useful when you have a somewhat fixed scope defined for what you want to block. As an example, consider sending e-mail to a group such as your development team and then specifically removing individuals from the development team e-mail list. In this example, you know which e-mail addresses you want to deny from the development team list. For secure software, you want to focus on constrain (an allow list) over reject (a deny list). Always remember that known bad data changes constantly as attackers find ever more creative ways to circumvent software defenses. In the preceding example, imagine new developers joining the development team and needing to vet whether they should be included in the e-mail. Constraints are much easier to manage in the long run and provide a more maintainable list as opposed to the thousands of items in a deny list.

Sometimes data contains both known good and known bad data. An example of this is HTML content. Some tags are approved to display while others are not. The process of filtering out or disabling the known bad data and allowing the approved data is known as sanitizing the input. The notes field in the Contoso Health app is a great example of this. Users can enter HTML tags through an HTML editor, but only certain HTML tags are rendered when the input is displayed in the app. Sanitizing input takes data that could be malicious and makes it safe by stripping unsafe content and rendering inert what isn’t explicitly approved. Windows Store apps can do this if you set the value of an HTML element using innerText (instead of innerHTML), which renders the HTML content as text instead of interpreting it as HTML. (Note that if the app sets the innerText of a script tag to JavaScript, executable script is produced.) JavaScript also provides another useful tool for sanitization: toStaticHTML.

Here’s sample code from the Profile page’s btnSave_Click handler:

function btnSave_Click(args) {
  var taintedNotes = document.getElementById("txtNotes").value;
  var sanitizedNotes = window.toStaticHTML(taintedNotes);
  document.getElementById("output").innerHTML = sanitizedNotes;
}

If the user enters the string

<strong>testing!</strong><script>alert("123! ");</script>

to txtNotes, the window.toStaticHTML method strips out the script tag and leaves only the approved strong tag. Using toStaticHTML strips any tag that isn’t on the approved safe list (another example of using an allow list), as well as any attribute that is unknown. Only known good data is kept in the output of the toStaticHTML method. You can find a complete listing of approved tags, attributes, CSS rules and properties at bit.ly/KNnjpF.
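To make the idea of rendering content inert concrete, the following sketch escapes markup characters so that the result displays as literal text, which is similar in effect to assigning innerText. The escapeHtml and sanitizeNotes names are hypothetical, and the fallback is a stand-in for hosts without toStaticHTML rather than a reimplementation of it:

```javascript
// Escape the characters that let text break out into markup, so the
// result displays as literal text instead of being interpreted as HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical wrapper: prefer toStaticHTML where the host provides it,
// and fall back to escaping everything where it does not.
function sanitizeNotes(tainted) {
  if (typeof toStaticHTML === "function") {
    return toStaticHTML(tainted);
  }
  return escapeHtml(tainted);
}

console.log(escapeHtml('<strong>testing!</strong><script>alert("123! ");</script>'));
```

The two paths behave differently on purpose: toStaticHTML keeps approved tags such as strong, while the fallback disables every tag, approved or not.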

Input validation reduces the risk of malicious content entering the system. Using HTML5 and toStaticHTML, the app can restrict input to known good data and remove or disable possibly malicious content without server intervention.

Now that Contoso Health is getting valid data, what do we do with sensitive data such as medical or financial information?

Sensitive Data Storage

Web developer says: Never store sensitive data on the client because secure storage is unavailable.

Windows 8 developer says: Sensitive data can be encrypted and securely stored through the Windows Runtime.

In the previous section, the Contoso Health app retrieved general profile information. As development continues, a medical history form is requested by the business sponsor. This form captures medical events that occur throughout a user’s life, such as the most recent doctor’s visit. Old rules for Web development say that storing sensitive information such as a user’s medical history on the client is a bad idea because of the possible exposure of the data. In Windows Store app development, sensitive data can be stored locally using the security features of the Windows Runtime.

To protect the user’s medical history, Contoso Health uses the WinRT Data Protection API. Encryption shouldn’t be the only part of your data-protection strategy (think Defense in Depth: layers of security instead of a single defense, such as using only encryption). Don’t forget other best practices surrounding sensitive data, such as accessing the data only when necessary and keeping sensitive data out of the cache. A great resource that lists many considerations for sensitive data is the MSDN Library article, “Improving Web Application Security: Threats and Countermeasures” (bit.ly/NuUe6w). Although this document focuses on Web development best practices, it provides a lot of excellent foundation knowledge that you can apply to any type of development.

The Medical History page in the Contoso Health app has a button named btnAddItem. When the user clicks btnAddItem, the app encrypts data entered into the Medical History form. To encrypt the Medical History information, the app uses the built-in WinRT Data Protection API. This simple encryption system allows developers to encrypt data quickly without the overhead of key management. Begin with an empty event handler for the btnAddItem click event. Then Contoso Health collects the form information and stores it in a JSON object. Inside the event handler, I add the code to quickly build out the JSON object:

var healthItem = {
  "prop1": window.toStaticHTML(document.getElementById("txt1").value),
  "prop2": window.toStaticHTML(document.getElementById("txt2").value)
};

The healthItem object represents the Medical History record the user has entered into the form. Encrypting healthItem begins with instantiating a DataProtectionProvider:

var dataProtectionProvider =
  new Windows.Security.Cryptography.DataProtection.DataProtectionProvider(
  "LOCAL=user");

The DataProtectionProvider constructor (for encryption) takes a string argument that determines what the Data Protection is associated with. In this case, I’m encrypting content to the local user. Instead of setting it to the local user, I could set it to the machine, a set of Web credentials, an Active Directory security principal or a few other options. You can find a list of protection description options at the Dev Center topic, “Protection Descriptors” (bit.ly/QONGdG). Which protection descriptor you use depends on your app’s requirements. At this point, the Data Protection Provider is ready to encrypt the data, but the data needs a slight change. Encryption algorithms work with buffers, not JSON, so the next step is to convert healthItem to a buffer:

var buffer =
  Windows.Security.Cryptography.CryptographicBuffer.convertStringToBinary(
    JSON.stringify(healthItem),
    Windows.Security.Cryptography.BinaryStringEncoding.utf8);

CryptographicBuffer has many objects and methods to work with buffers used in encryption and decryption. The first of these methods is convertStringToBinary, which takes a string (in this case, the string version of the JSON object) and converts it to an encoded buffer. The encoding used is set with the Windows.Security.Cryptography.BinaryStringEncoding object. In this example, I use UTF8 as the encoding for my data. The convertStringToBinary method returns a buffer based on the string data and the encoding specified. With the buffer ready to be encrypted and the Data Protection Provider instantiated, I’m ready to call the protectAsync method to encrypt the buffer:

dataProtectionProvider.protectAsync(buffer).then(
  function (encryptedBuffer) {
     SaveBufferToFile(encryptedBuffer);
});

The encryptedBuffer argument is the output of the protectAsync method and contains the encrypted version of the buffer. In other words, this is the encrypted data ready for storage. From here, encryptedBuffer is passed to the SaveBufferToFile method, which writes the encrypted data to a file in the app’s local folder.

Encryption for healthItem boils down to three lines of code:

  1. Instantiate the Data Protection Provider.
  2. Convert the data to a buffer.
  3. Call protectAsync to encrypt the data.
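The three steps can be gathered into one helper, as in this sketch. The protectJson name and its parameters are hypothetical; the Windows.* calls are the ones already used in this article, so the function runs only where the Windows Runtime is available:

```javascript
// Hypothetical helper that wraps the three encryption steps.
// Requires the Windows Runtime; `item` is any JSON-serializable object and
// `descriptor` is a protection descriptor string such as "LOCAL=user".
function protectJson(item, descriptor) {
  var crypto = Windows.Security.Cryptography;
  // 1. Instantiate the Data Protection Provider for the given descriptor.
  var provider = new crypto.DataProtection.DataProtectionProvider(descriptor);
  // 2. Convert the JSON string to a buffer.
  var buffer = crypto.CryptographicBuffer.convertStringToBinary(
    JSON.stringify(item), crypto.BinaryStringEncoding.utf8);
  // 3. Encrypt; the returned promise completes with the encrypted buffer.
  return provider.protectAsync(buffer);
}
```

A caller would chain on the result the same way the earlier code does, for example protectJson(healthItem, "LOCAL=user").then(SaveBufferToFile).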

Decrypting data is just as simple. The only changes are to use an empty constructor for the DataProtectionProvider and use the unprotectAsync method instead of the protectAsync method. The GetBufferFromFile method loads the encryptedBuffer variable from the file created in the SaveBufferToFile method:

function btnLoadItem_Click(args) {
  // The parameterless constructor is the one used for decryption.
  var dataProtectionProvider =
    new Windows.Security.Cryptography.DataProtection.DataProtectionProvider();
  var encryptedBuffer = GetBufferFromFile();
  dataProtectionProvider.unprotectAsync(encryptedBuffer).then(
    function (decryptedBuffer) {
      // TODO: Work with decrypted data
    });
}

Can developers use encryption with non-WinRT JavaScript? Yes! Is it as easy as three lines of code that provide excellent data protection? No! There are numerous challenges to encryption best practices in the browser, such as keeping the secret key a secret, as well as managing the file size of the algorithms necessary to have quality encryption. The WinRT Data Protection API as well as the other cryptography tools provided in the Windows.Security.Cryptography namespace make protecting your data simple. Using the security features of the Windows Runtime, developers can store sensitive data in their Windows Store app with confidence while keeping their cryptographic keys easy to manage.

Local vs. Web Contexts

Web developer says: Web apps execute external script references in the same origin as the application that calls the scripts.

Windows 8 developer says: Windows Store apps separate the local app package from external script references.

Web 2.0 has trained developers that content can come from your site, someone else’s site (via mashup) or user interaction. On the Web, content is a virtual free-for-all, with developers consuming script references and API data from third parties. Content delivery networks (CDNs) and online services such as Bing Maps take away the overhead of managing code libraries or big data repositories, allowing Web applications to easily snap in functionality. Lower overhead is a good thing, but with this benefit comes some risk.

As an example, imagine one of Contoso’s partners in the health-software industry is Litware Inc. Litware is releasing a new Exercise API and has provided Contoso Health developers with keys to consume a daily exercise data feed. If Contoso Health were a Web application, the development team could implement the Exercise API using a script reference like the following:

<script src="https://api.litware.com/devkey/exercise.js"></script>

Developers at Contoso trust Litware to provide great content and know it has great security practices. Unfortunately, Litware’s servers were compromised by a disgruntled developer and exercise.js was altered to have a startup script that displays a pop-up with a message saying, “Contoso Health needs to run maintenance; please download the following maintenance application.” The user, thinking this message is legitimate, was just tricked into downloading malware. Contoso’s developers were baffled—Litware uses great validation, so how could this breach have happened?

On the Web, scripts referenced in the manner just described execute with the same origin as a script on the same site. That means exercise.js (running as JavaScript) has unquestioned access to the DOM tree as well as any script object. As illustrated earlier, this can lead to serious security issues. To mitigate this risk, Windows 8 breaks app resources into two contexts, as illustrated in Figure 2.

Figure 2 Local vs. Web Context Features (Mashed from “Features and Restrictions by Context” [bit.ly/NZUyWt] and “Secure Development with HTML5” [http://www.scribd.com/doc/66045883/Secure-Development-of-Metro-Apps-With-Html5])

The local context can access the Windows Runtime as well as any resource included in the app package (such as HTML, script, CSS and app data stored in the app state directories) but can’t access remote HTML, JavaScript or CSS (as in the earlier exercise.js example). The top-level app in Windows 8 always runs in the local context. In Figure 2, ms-appx:// is used to resolve content in the local context. This scheme is used to reference content in the app package running within the local context. Often a third slash follows (ms-appx:///) to reference the package’s full name. For Web developers, this approach is similar to using the file:// protocol, where a third slash references the local file system (file:/// assumes file://END USER’s COMPUTER/ instead of file://REMOTE COMPUTER/).

The Web context allows developers to bring remote content into their Windows Store app through an iframe. Just like iframes in a Web browser, the content executing in the iframe is restricted from accessing resources outside of it, such as Windows Runtime and some features of Windows Library for JavaScript. (You can find a complete listing at bit.ly/PoQVOj.) The purpose of the Web context is to allow developers to reference third-party APIs such as Bing Maps or pull a library from a CDN into their app.

Using http:// or https:// as the source of an iframe automatically casts the contents of the iframe into the Web context. An iframe can also be a resource in the app package when you’re using ms-appx or ms-appx-web. When the source of an iframe references the ms-appx:// scheme, the iframe’s content runs in the local context. This allows developers to embed app package resources into an iframe while still having access to the features of the local context (such as Windows Runtime, Windows JavaScript API and so on). Another scheme available is ms-appx-web://, which allows local app package content to run in the Web context. This scheme is useful when you need to embed remote content within your markup, such as adding a Bing Search result (from the Bing Search API) of local hospitals based on the user’s location in the Contoso Health app. As a side note, whenever iframes are mentioned with HTML5, remember that you can use the sandbox attribute as extra protection for your app by limiting script execution of the content inside the iframe. You can find more information about the sandbox attribute at bit.ly/Ppbo1a.

Figure 3 shows the various schemes used in the local and Web contexts along with examples of their use.

Figure 3 Schemes with Context Examples

Scheme         | Content Location | Context | Example When Used
ms-appx://     | App package      | Local   | Load content into an iframe that needs to access the Windows Runtime or the full Windows JavaScript API.
ms-appx-web:// | App package      | Web     | Use content from a remote source as part of your Windows Store app interface, such as displaying a mapping widget or search results.
http://        | Remote           | Web     | Reference remote content such as a Web page or script file on another server.

Which context an iframe belongs to is based on how the content within it is referenced. In other words, the scheme determines the context. You can find more information about the schemes used in Windows 8 at bit.ly/SS711o.
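The rule that the scheme determines the context is simple enough to express as a lookup. This sketch uses a hypothetical contextForUrl helper and is only an illustration of Figure 3; Windows itself makes this decision, not app code:

```javascript
// Hypothetical lookup from URL scheme to execution context, per Figure 3.
var contextByScheme = {
  "ms-appx": "local",
  "ms-appx-web": "web",
  "http": "web",
  "https": "web"
};

function contextForUrl(url) {
  // A scheme is the text before the first colon.
  // Relative URLs have no scheme and are not resolved by this sketch.
  var match = /^([a-z][a-z0-9+.-]*):/i.exec(url);
  var scheme = match ? match[1].toLowerCase() : null;
  // Unknown or missing schemes return null so a caller can refuse them.
  if (scheme && contextByScheme.hasOwnProperty(scheme)) {
    return contextByScheme[scheme];
  }
  return null;
}

console.log(contextForUrl("ms-appx:///profile.html"));           // local
console.log(contextForUrl("ms-appx-web:///litwareHelper.html")); // web
```

A helper like this can be handy in debugging output, where seeing which context a frame ended up in often explains why a Windows Runtime call failed.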

Remember the Litware hack scenario that started this section? The Windows 8 separation of contexts will help constrain the cross-site scripting attack to the Web context where it doesn’t have access to either Windows Runtime or the app data for Contoso Health. In the Web context, modifying the local context isn’t an option. Communication between the contexts is possible, but you have control over what type of communication occurs.

Communicating Between Contexts

How does the top-level document communicate with an iframe running in the Web context? Using the postMessage features of HTML5, Windows Store apps can pass data between contexts. This allows developers to structure how the two origins communicate and to allow only known good providers (the allow list again) through to the local context. Pages that need to run in the Web context are referenced using an iframe with the src attribute set to http://, https:// or ms-appx-web://.

For the Contoso Health app, the system pulls fitness tips from the Litware Exercise API. Contoso Health’s development team has built the litwareHelper.html page, which is used to communicate with the Exercise API via the jQuery $.ajax method. Because of the remote resource (exercise.js), litwareHelper.html needs to execute in the Web context, which means that it needs to run within an iframe. Setting up the iframe is no different from any other Web application except for how the page is referenced. Because the litwareHelper.html page is part of the local app package but needs to run in the Web context, you load it using ms-appx-web:

<iframe id="litwareHelperFrame" src="ms-appx-web:///litwareHelper.html"></iframe>

The development team adds the following function to the local context page that sends the data request to the Web context page:

function btnGetFitTips_Click() {
  // Build the request for the Web context page.
  var msg = {
    term: document.getElementById("txtExerciseSearchTerm").value,
    itemCount: 25
  };
  var msgData = JSON.stringify(msg);
  // Target origin: only a page loaded from ms-appx-web receives the message.
  var domain = "ms-appx-web://" + document.location.host;
  try {
    var iframe = document.getElementById("litwareHelperFrame");
    iframe.contentWindow.postMessage(msgData, domain);
  }
  catch (ex) {
    document.getElementById("output").innerText = "Error has occurred!";
  }
}

The receiveMsg method processes the message from the local context. The argument of receiveMsg is the data provided to the postMessage event (in this case, the msgData variable) along with the message target, message origin and a few other pieces of information, as shown in Figure 4.

Figure 4 Processing with receiveMsg

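function receiveMsg(e) {
  // Accept messages only from the app's local context.
  if (e.origin === "ms-appx://" + document.location.host) {
    var output = null;
    var parameters = JSON.parse(e.data);
    // Encode user-supplied values before placing them in the URL path.
    var url = "https://api.litware-exercise.com/data/"
      + encodeURIComponent(parameters.term) +
      "/count/" + encodeURIComponent(parameters.itemCount);
    var options = {
      dataType: "jsonp",
      jsonpCallback: "jsonpCallback",
      success: function (results) {
        output = JSON.stringify(results.items);
        // Send the results back, addressed to the local context only.
        window.parent.postMessage(output, "ms-appx://"
          + document.location.host);
      },
      error: function (ex) {
        // The error is captured here but not reported to the local context.
        output = ex;
      }
    };
    $.ajax(url, options);
  }
}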

The first step in receiveMsg checks the origin of the postMessage. This is a critical security check to ensure that the message is coming from where it’s supposed to be originating. Remember that e.origin checks the domain and scheme of who sent the postMessage, which is why you’re checking for ms-appx (the local context address). After gathering the JSON data from the Litware API, the app passes the results back to the window.parent using a postMessage command. Notice in receiveMsg that the domain is set to ms-appx. This is the “to” address of where the message is going and shows that the data is returning to the local context. Data from the iframe needs to be consumed by resources in the local context. The dev team adds the processResult function to consume the data from the Web context back into the local context:

function processResult(e) {
  if (e.origin === "ms-appx-web://" + document.location.host) {
    document.getElementById("output").innerText = e.data;
  }
}

Again, always check the origin of the message event to ensure that only data from approved locations (that is, locations that are registered in the allow list) is being processed. Notice that in the processResult method the origin is the Web context scheme, ms-appx-web. The switch between schemes is a gotcha that is easy to overlook, leaving developers wondering during debugging where their message went.
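When more than one origin is approved, the check generalizes naturally to an explicit allow list. The following is a minimal sketch under that assumption; createMessageHandler and showTips are hypothetical names, and the handler also drops payloads that aren’t valid JSON:

```javascript
// Hypothetical factory: builds a message handler that accepts data only
// from origins on an allow list and only when the payload is valid JSON.
function createMessageHandler(allowedOrigins, onData) {
  return function (e) {
    // Ignore messages from any origin that is not explicitly approved.
    if (allowedOrigins.indexOf(e.origin) === -1) {
      return false;
    }
    var data;
    try {
      data = JSON.parse(e.data);
    } catch (ex) {
      // Malformed payloads are dropped rather than processed.
      return false;
    }
    onData(data);
    return true;
  };
}

// Possible usage in the local context (showTips is a hypothetical callback):
// window.addEventListener("message",
//   createMessageHandler(["ms-appx-web://" + document.location.host], showTips),
//   false);
```

Returning a Boolean is a design choice made for this sketch: it makes rejected messages visible while debugging, which helps with the scheme mix-up described above.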

Finally, to receive data from the Web context back to the local context page, you add an event handler for the message event. In the app.onactivated method, add the event listener to the window object:

window.addEventListener('message', processResult, false);

Separating the local and Web contexts by default reduces the risk of accidentally executing code from a source outside the Windows Store app. Using postMessage, developers can provide a communication channel between external script and the local scripts that compose an app.

Web to Windows 8: New Tools for Old Habits

Web developers now have access to familiar tools and new tools they can use to build secure Windows Store apps. Using existing skills, such as HTML5 input validation, ensures the integrity of data entering the app. New tools such as the Data Protection API (new for Windows Runtime) protect users’ confidential data with strong encryption that’s simple to implement. Using postMessage allows apps to tap into the thousands of JavaScript libraries and legacy code on the Web while keeping users safe from unintended code injections. All these elements work together to bring something important that’s often dismissed in JavaScript: security.

Windows 8 gives Web developers the chance to rethink some of their old habits. JavaScript is no longer a façade for the server, dismissed as an enhancement to usability and nothing more. JavaScript, the Windows Runtime and MSHTML provide the tools necessary to build security features into your Windows Store apps—no server necessary. As Web developers, we have a vast skillset to draw on, but we need to keep an eye on our old habits and turn them into opportunities to learn the new world of Windows 8.


Tim Kulp leads the development team at FrontierMEDEX in Baltimore, Md. You can find Kulp on his blog at seccode.blogspot.com or follow him on Twitter at Twitter.com/seccode, where he talks code, security and the Baltimore foodie scene.

Thanks to the following technical expert for reviewing this article: Scott Graham