A Microsoft Windows Azure primer: the basics

An excellent article about Windows Azure can be found at https://arstechnica.com.

 

Microsoft's Windows Azure cloud computing platform has been gaining steam since its launch to paying customers in February. Just last week it reached 10,000 customers; already, Azure is shaping up to be a strong contender in the nascent cloud computing market.

Though the cloud offerings available from companies like Google, Amazon, and Microsoft are broadly similar (each offers the basic building blocks of "computation," i.e. applications, and "storage"), the way in which these services are offered is quite different. There are other providers out there beyond these three, but these household names are broadly representative of the market, and arguably the most important in terms of market adoption and influence.

Cloud platform primer

At one end of the spectrum is Amazon's Elastic Compute Cloud product (EC2). EC2 gives users a range of operating system images that they can install into their virtual machine and then configure any which way they choose. Customers can even create and upload their own images. Software can be written in any environment that can run on the system image. Those images can be administered and configured using SSH, remote desktop, or whatever other mechanism is preferred. Want to install software onto the virtual machine? Just run the installer.

At the opposite end of the spectrum is Google's App Engine. App Engine software runs in a sandbox, providing only limited access to the underlying operating system. Applications are restricted to being written either in Java (or at least, languages targeting the JVM) or Python 2.5. The sandbox prevents basic operations like writing to disk or opening network sockets.

In the middle ground is Microsoft's Windows Azure. Azure gives no direct access to the operating system or the software running on top of it (it's some kind of Windows variant, optimized for scalability, running some kind of IIS-like Web server with a .NET runtime), but it places far fewer restrictions on application development than Google's environment does. Though .NET is, unsurprisingly, the preferred development platform, applications can be written using PHP, Java, or native code if preferred. The only restriction is that software must be deployable without installation: it has to support simply being copied to a directory and run from there.

The storage systems are similarly varied. Amazon has four storage systems. There's the widely used Simple Storage Service (S3), which offers simple storage of name-value pairs, where the value can be a BLOB (Binary Large OBject) up to 5 GB in size. There's Elastic Block Store (EBS), which offers virtual block devices that can be formatted as if they were hard disks, and used to store regular files. Amazon's Relational Database Service (RDS) provides a MySQL database. Finally, there's SimpleDB.

SimpleDB is a non-relational database, though it still stores data in tables. As in a conventional relational database, these tables hold entries (customers, orders, addresses, etc.), with each entry made up of a number of attributes (so, a customer might have a name, an address, and a telephone number). Unlike in a typical relational database, new attributes can be added without having to update the existing entries in a table. There are no relationships between tables, nor do they support that mainstay of relational database programming, the join. If applications want to access data from one table that corresponds to entries in another table, they will have to do the lookup manually.
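To make the "manual lookup" concrete, here is a minimal sketch using plain Python dictionaries as stand-ins for two non-relational tables. It does not use the SimpleDB API; the table contents and the helper name orders_with_customer_names are hypothetical, chosen only to illustrate the idea.

```python
# Two schema-less "tables": each entry is just a bag of attributes.
# These dicts are illustrative stand-ins, not a real SimpleDB client.
customers = {
    "c1": {"name": "Ada", "phone": "555-0100"},
    "c2": {"name": "Grace"},  # no phone attribute; entries need not match
}
orders = {
    "o1": {"customer_id": "c1", "item": "keyboard"},
    "o2": {"customer_id": "c2", "item": "monitor"},
    "o3": {"customer_id": "c1", "item": "mouse"},
}

# Adding a new attribute to one entry leaves every other entry untouched.
customers["c2"]["email"] = "grace@example.com"


def orders_with_customer_names(orders, customers):
    """Hand-written 'join': look up each order's customer by key."""
    result = []
    for order_id, order in sorted(orders.items()):
        customer = customers.get(order["customer_id"], {})
        result.append((order_id, customer.get("name"), order["item"]))
    return result


joined = orders_with_customer_names(orders, customers)
```

The application code, not the database, carries the relationship between the two tables. A relational database would express the same thing as a single join query.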

The reason for these reduced features is that by relaxing all the constraints that relational databases impose, the cloud provider has greater flexibility in query optimization and data storage. Though the limited features seem awfully spartan to someone with a relational database background, they are nonetheless rich enough for many applications.

Google's main storage offering is its datastore, which is broadly equivalent to SimpleDB. Google also has a service called blobstore, offering similar services to S3. The company's standard App Engine offering has no SQL database features at all, though a relational database is available for customers of the App Engine for Business platform, which is currently being previewed.

Microsoft's range of storage offerings is equivalent to Amazon's. Azure Storage includes both arbitrary binary storage, like S3, and table-based storage, like SimpleDB. It provides special support for storing VHD disk images which can then be mounted as drives, akin to Elastic Block Store. Finally, Microsoft has a version of SQL Server called SQL Azure that offers almost all the features of SQL Server to cloud applications.

Amazon and Microsoft both offer a queuing service, in addition to the conventional storage options. Queue services allow messages to be passed between applications in a reliable, asynchronous way. Producer applications put new messages into the queue, and consumer applications then read them off the queue at their leisure.
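The producer/consumer pattern described above can be sketched with Python's standard library. This is an in-process stand-in built on queue.Queue, not the Amazon or Azure queue API; the message strings and the function names (producer, consumer, run_demo) are assumptions made for illustration.

```python
import queue
import threading


def producer(q, messages):
    """Put each message on the queue, then a sentinel to signal completion."""
    for message in messages:
        q.put(message)
    q.put(None)  # sentinel: tells the consumer there is nothing more to read


def consumer(q, received):
    """Read messages off the queue at its own pace until the sentinel arrives."""
    while True:
        message = q.get()  # blocks until a message is available
        if message is None:
            break
        received.append(message)


def run_demo():
    q = queue.Queue()
    received = []
    consumer_thread = threading.Thread(target=consumer, args=(q, received))
    producer_thread = threading.Thread(
        target=producer, args=(q, ["order-1", "order-2", "order-3"])
    )
    consumer_thread.start()
    producer_thread.start()
    producer_thread.join()
    consumer_thread.join()
    return received


messages_seen = run_demo()
```

The producer never waits for the consumer to finish processing, which is the asynchronous decoupling a cloud queue provides between separate applications. A hosted queue service adds durability and cross-machine delivery, which this local sketch does not attempt to model.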

As a result of these different design decisions, the different platforms are optimized for different kinds of applications. EC2 makes it easy to deploy "regular" software into a cloud environment: install the software into an image, use EBS disks for data storage, and the application will be none the wiser. Conversely, applications on App Engine must be purpose-built for the platform.

Azure sits somewhere in the middle. Existing Web applications, in particular those using ASP.NET, should be easy enough to migrate to Azure. They can use regular SQL Server (or a close approximation thereof) and legacy native code if they have to, while the platform does away with many of the traditional management tasks, such as operating servers and administering databases.

It looks like this aspect of Azure appeals to many of Microsoft's customers, too; though scalability is probably the most often cited benefit of the cloud, about half of Azure installations are single-instance. That is, the applications are deployed onto only a single virtual machine. Azure is being used as a way of reducing the operational cost of running conventional applications, not as a way to scale up quickly and easily to multiple machines.

This certainly appeals to me. I'm not much of a Web app person, but the balance that Azure strikes between letting me use existing .NET skills and ignoring system administration is appealing. So let's take a look at Azure development in a bit more detail.

Full article can be found at https://arstechnica.com/microsoft/guides/2010/06/microsoft-azure-for-nubcakes.ars