Hadoop on Windows Azure Updated

HadoopOnAzure allows a user to run Hadoop on Microsoft Windows Azure as a service. It is currently in private CTP with limited capacity, and access is by invitation only. We added more capacity today, so you may sign up for this free service at https://connect.microsoft.com/SQLServer/Survey/Survey.aspx?SurveyID=13697. Please allow one week for us to approve your invitation.

To get a feel for what the service currently looks like, please take a look at these learning resources on WindowsAzure.com that I authored a few months ago. I would also love to get your feedback on additional content you would like to see for learning about Hadoop.

Hadoop on Windows Azure Tutorials

Introduction to Hadoop on Windows Azure

Running Hadoop Jobs on Windows Azure and Analyzing the Data with the Excel Hive Add-In

Hadoop on Windows Azure - Working With Data

Analyzing Twitter Movie Data with Hive (additional source at: https://github.com/wenming/BigDataSamples/tree/master/twittersample)

Simple Recommendation Engine using Apache Mahout

Additional Learning Resources

I also gave talks at TechEd this year; one of the sessions discusses usage scenarios for big data and Hadoop (samples at https://github.com/wenming/BigDataSamples).

TechEd Talk Video: Learn Big Data Application Development on Windows Azure

 

Now, on to today's announcement from Henry Zhang on our engineering team:

We just updated the Hadoop on Azure site with SU3 bits. Please see below for a list of changes. 

If you create a new cluster now, you will be running on the 1.0.1 Hadoop core bits. We now provide direct access to the cluster dashboard on the master node so you can manage your cluster and schedule jobs. Simply go to https://<clustername>.cloudapp.net, enter the user name and password you created the cluster with, and log in. The cluster dashboard offers the same familiar experience as before. You will also find preview bits of the PowerShell cmdlets and C# SDK for submitting jobs to your Azure cluster. Both toolkits can be downloaded from the 'Download' tab while you are on your cluster. Feedback is very welcome!
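As a small illustration of the dashboard address described above, here is a minimal Python sketch that builds the URL from a cluster name. The function name dashboard_url and the DNS-label check are assumptions made for this example; they are not part of the Hadoop on Azure tooling. The only detail taken from the announcement is the https://<clustername>.cloudapp.net pattern.

```python
import re

# Assumption: a cluster name must be usable as a single DNS label
# (letters, digits, hyphens; 1 to 63 characters; no leading or trailing hyphen).
_DNS_LABEL = re.compile(r"^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$", re.IGNORECASE)


def dashboard_url(cluster_name: str) -> str:
    """Return the dashboard address for a cluster (hypothetical helper)."""
    name = cluster_name.strip()
    if not _DNS_LABEL.match(name):
        raise ValueError(f"not a valid cluster name: {cluster_name!r}")
    # Host names are case-insensitive, so normalise to lower case.
    return f"https://{name.lower()}.cloudapp.net"


if __name__ == "__main__":
    # "mycluster" is a placeholder, not a real cluster.
    print(dashboard_url("mycluster"))
```

The sketch only constructs the address; logging in still happens in the browser with the user name and password chosen when the cluster was created.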

SU3: Publicly Visible Improvements

Hadoop core revved from 0.20.203.1 to 1.0.1 on Azure

Hadoop on Azure dashboard running on cluster master node, accessible via https://<clustername>.cloudapp.net      

REST APIs for Hadoop job submission, progress inquiry, and killing jobs (no web UI or remote desktop needed; check out the download link on your cluster)

PowerShell cmdlet and C# SDK support (SDK 1.0 and cmdlets available on the download page on your cluster)

Same familiar job submission experience in the dashboard web UI, now backed by the REST API in the background

Component      Version (SU2)    Version (SU3)
Hadoop Core    0.20.203.1       1.0.1
Hive           0.7.1            0.8.1
Pig            0.8.1            0.9.3
Mahout         0.5              0.5
Pegasus        2                2
SQOOP          1.3.1            1.4.2
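For readers who want to check which components actually changed between SU2 and SU3, the version table above can be captured as data. The following is a minimal Python sketch; the dictionary names and the changed_components helper are made up for this example, and the version strings are copied from the table as published.

```python
# Component versions as listed in the SU2/SU3 table.
SU2 = {
    "Hadoop Core": "0.20.203.1",
    "Hive": "0.7.1",
    "Pig": "0.8.1",
    "Mahout": "0.5",
    "Pegasus": "2",
    "SQOOP": "1.3.1",
}
SU3 = {
    "Hadoop Core": "1.0.1",
    "Hive": "0.8.1",
    "Pig": "0.9.3",
    "Mahout": "0.5",
    "Pegasus": "2",
    "SQOOP": "1.4.2",
}


def changed_components(old: dict, new: dict) -> dict:
    """Map each component whose version differs to its (old, new) pair."""
    return {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }


if __name__ == "__main__":
    for name, (before, after) in changed_components(SU2, SU3).items():
        print(f"{name}: {before} -> {after}")
```

Running it lists Hadoop Core, Hive, Pig, and SQOOP as updated, while Mahout and Pegasus stay at the same version.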

 

It took us a bit longer to get this release out the door, and we truly appreciate your patience and support. Please let us know if you have any questions.

 

Wen-ming Ye

twitter:  @wenmingye