What are the different Hadoop components and versions available with HDInsight?
Find out about the different service levels offered by HDInsight, as well as the versions of different Hadoop components included.
HDInsight Standard and HDInsight Premium
Azure HDInsight provides its big data cloud offerings in two categories: Standard and Premium. The following table lists the features that are available only as part of Premium. Features that are not explicitly called out in the table are available as part of Standard.
The HDInsight Premium offering is currently in Preview and available only for Linux clusters.
|HDInsight Premium feature||Description|
|Domain-joined HDInsight clusters||Join HDInsight clusters to Azure Active Directory (AAD) domains for enterprise-level security. You can configure a list of employees from your enterprise who can authenticate through Azure Active Directory to log on to the HDInsight cluster. The enterprise admin can also configure role-based access control for Hive security by using Apache Ranger, restricting each user's access to only the data that is needed. Finally, the admin can audit the data accessed by employees and any changes made to access control policies, achieving a high degree of governance over corporate resources. For more information, see Configure domain-joined HDInsight clusters.|
Cluster types supported for Premium
The following table lists the HDInsight cluster type and Premium support matrix.
|Cluster type||Standard||Premium|
|Hadoop||Yes||Yes (HDInsight 3.5 only)|
|Interactive Hive (Preview)||Yes||No|
|R Server (Preview)||Yes||No|
This table will be updated as more cluster types are included in HDInsight Premium.
Pricing and SLA
For information on pricing and SLA for HDInsight Premium, see HDInsight pricing.
Hadoop components available with different HDInsight versions
Azure HDInsight supports multiple Hadoop cluster versions that can be deployed at any time. Each version choice creates a specific version of the Hortonworks Data Platform (HDP) distribution and a set of components that are contained within that distribution. The component versions associated with HDInsight cluster versions are itemized in the following table. As of 09/14/2016, the default cluster version used by Azure HDInsight is 3.4, which is based on HDP 2.4.
The default version from the service may change without notice. If you have a version dependency, we recommend that you specify the version when you create clusters by using the .NET SDK, Azure PowerShell, or the Azure CLI.
|Component||HDInsight version 3.5||HDInsight version 3.4 (Default)||HDInsight Version 3.3||HDInsight Version 3.2||HDInsight Version 3.1||HDInsight Version 3.0|
|Hortonworks Data Platform||2.5||2.4||2.3||2.2||2.1.7||2.0|
|Apache Hadoop & YARN||2.7.3||2.7.1||2.7.1||2.6.0||2.4.0||2.2.0|
|Apache Hive & HCatalog||1.2.1.2.5||1.2.1||1.2.1||0.14.0||0.13.1||0.12.0|
|Apache Spark||1.6.2 + 2.0 (Linux only)||1.6.0 (Linux only)||1.5.2 (Linux only/Experimental build)||1.3.1 (Windows-only)||-||-|
Get current component version information
The component versions associated with HDInsight cluster versions may change in future updates to HDInsight. One way to determine the available components and to verify which versions are being used for a cluster is to use the Ambari REST API. The GetComponentInformation command can be used to retrieve information about a service component. For details, see the Ambari documentation. On a Windows-based cluster, another way to obtain this information is to log in to the cluster by using Remote Desktop and examine the contents of the "C:\apps\dist\" directory directly.
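As a rough illustration of the Ambari REST API approach, the following Python sketch builds an authenticated request for a single service component. The cluster name, the credentials, and the service and component names used in the usage note are placeholders. The URL layout (https://CLUSTERNAME.azurehdinsight.net/api/v1/clusters/CLUSTERNAME/...) is an assumption about how the cluster exposes Ambari, not something this article specifies, so check the Ambari documentation for the resources your cluster actually provides.

```python
import base64
import json
import urllib.request


def component_url(cluster_name, service, component):
    """Build the Ambari REST URL for one service component.

    Assumption: Ambari is reachable at https://<cluster>.azurehdinsight.net
    and the Ambari cluster name matches the HDInsight cluster name.
    """
    base = f"https://{cluster_name}.azurehdinsight.net/api/v1/clusters/{cluster_name}"
    return f"{base}/services/{service}/components/{component}"


def build_request(url, user, password):
    """Create a GET request that carries HTTP basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


def fetch_component(cluster_name, service, component, user, password):
    """Send the request and return the decoded JSON document."""
    url = component_url(cluster_name, service, component)
    request = build_request(url, user, password)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as fetch_component("mycluster", "HIVE", "HIVE_SERVER", "admin", "PASSWORD") would then return the JSON description of that component, assuming the cluster exposes a service and component with those names. The network call is kept in its own function so that the URL and header construction can be checked without contacting a cluster.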
See the HDInsight release notes for additional information on the latest versions of HDInsight.
Supported HDInsight versions
The following table lists the versions of HDInsight currently available, the corresponding Hortonworks Data Platform versions that they use, and their release dates. When known, their support expiration and deprecation dates are also provided. Please note the following:
- Highly available clusters with two head nodes are deployed by default for HDInsight 2.1 and above. They are not available for HDInsight 1.6 clusters.
- Once support has expired for a particular version, that version may no longer be available through the Azure portal. The following table indicates which versions are available on the Azure Classic Portal. Cluster versions continue to be available through the Version parameter in the Windows PowerShell New-AzureRmHDInsightCluster command and through the .NET SDK until their deprecation dates.
|HDInsight Version||HDP Version||VM OS||High Availability||Release Date||Available on Azure portal||Support Expiration Date||Deprecation Date|
|HDI 3.5||HDP 2.5||Ubuntu 16||Yes||9/30/2016||Yes|
|HDI 3.4||HDP 2.4||Ubuntu 14.04 LTS||Yes||03/29/2016||Yes||12/29/2016||1/9/2018|
|HDI 3.3||HDP 2.3||Ubuntu 14.04 LTS or Windows Server 2012 R2||Yes||12/02/2015||Yes||06/27/2016||07/31/2017|
|HDI 3.2||HDP 2.2||Ubuntu 12.04 LTS or Windows Server 2012 R2||Yes||2/18/2015||Yes||3/1/2016||04/01/2017|
|HDI 3.1||HDP 2.1||Windows Server 2012 R2||Yes||6/24/2014||No||05/18/2015||06/30/2016|
|HDI 3.0||HDP 2.0||Windows Server 2012 R2||Yes||02/11/2014||No||09/17/2014||06/30/2015|
|HDI 2.1||HDP 1.3||Windows Server 2012 R2||Yes||10/28/2013||No||05/12/2014||05/31/2015|
|HDI 1.6||HDP 1.1||-||No||10/28/2013||No||04/26/2014||05/31/2015|
HDI versions 3.2 and 3.3 nearing deprecation date
Support for HDI 3.2 clusters expired on 03/01/2016, and the version will be deprecated on 04/01/2017. Support for HDI 3.3 clusters expired on 06/27/2016, and the version will be deprecated on 07/31/2017. If you have an HDI 3.2 or HDI 3.3 cluster, upgrade it to HDI 3.5 (the latest version) soon.
The service-level agreement for HDInsight cluster versions
The SLA is defined in terms of a "Support Window". A Support Window is the period of time during which an HDInsight cluster version is supported by Microsoft Customer Service and Support. An HDInsight cluster is outside the Support Window if the Support Expiration Date for its version is earlier than the current date. A list of supported HDInsight cluster versions can be found in the table above. The support expiration date for a given HDInsight version X (once a newer X+1 version is available) is calculated as the later of:
- Formula 1: Add 180 days to the date HDInsight cluster version X was released.
- Formula 2: Add 90 days to the date HDInsight cluster version X+1 (the subsequent version after X) is made available in the Portal.
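The two formulas can be worked through in a few lines of Python. This is only a sketch: it assumes that the date a version becomes available in the Portal is the same as the release date listed in the supported-versions table, which this article does not state explicitly. The function name is illustrative.

```python
from datetime import date, timedelta


def support_expiration(release_x, release_next):
    """Return the later of (X release + 180 days) and (X+1 release + 90 days)."""
    formula_1 = release_x + timedelta(days=180)
    formula_2 = release_next + timedelta(days=90)
    return max(formula_1, formula_2)


# Release dates taken from the supported-versions table.
hdi_3_3 = date(2015, 12, 2)
hdi_3_4 = date(2016, 3, 29)
hdi_3_5 = date(2016, 9, 30)

print(support_expiration(hdi_3_4, hdi_3_5))  # 2016-12-29
print(support_expiration(hdi_3_3, hdi_3_4))  # 2016-06-27
```

Both results agree with the Support Expiration Date column for HDI 3.4 and HDI 3.3 in the table above. In each case Formula 2 gives the later date.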
The Deprecation Date is the date after which the cluster version cannot be created on HDInsight. Starting July 31, 2017, you cannot resize a cluster after its deprecation date.
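To show how the support expiration and deprecation dates combine, the following Python sketch classifies a cluster version on a given day. The dates are copied from the supported-versions table. The state names and the treatment of the boundary days (a version still counts as supported on its expiration date) are assumptions made for illustration only.

```python
from datetime import date

# (support expiration, deprecation) dates copied from the supported-versions table.
LIFECYCLE = {
    "3.4": (date(2016, 12, 29), date(2018, 1, 9)),
    "3.3": (date(2016, 6, 27), date(2017, 7, 31)),
    "3.2": (date(2016, 3, 1), date(2017, 4, 1)),
}


def lifecycle_state(version, today):
    """Classify a version as supported, unsupported, or deprecated.

    Assumption: a version remains in its current state on the boundary
    date itself and changes state the day after.
    """
    support_end, deprecation = LIFECYCLE[version]
    if today > deprecation:
        return "deprecated"   # the version can no longer be created
    if today > support_end:
        return "unsupported"  # outside the Support Window, still creatable
    return "supported"


print(lifecycle_state("3.3", date(2017, 1, 1)))  # unsupported
```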
Windows-based HDInsight clusters (including versions 2.1, 3.0, 3.1, 3.2, and 3.3) run on Azure Guest OS Family 4, which uses the 64-bit version of Windows Server 2012 R2 and supports .NET Framework 4.0, 4.5, 4.5.1, and 4.5.2.
HDInsight Deprecation on Windows
Starting with HDI version 3.4, we have released HDInsight only on Linux. Some HDInsight offerings are available for Linux only, such as Apache Ranger, HDInsight applications, and Azure Data Lake Store as the primary file system. Releasing on Linux has multiple advantages for customers:
- We can bring open source big data technology to market faster through the HDInsight service.
- There is a large community and ecosystem for support.
- The open source community actively develops Hadoop and newer big data technologies.
- The HDInsight service can focus more on open source big data technology.
To continue investing in open source big data technologies, future releases of HDInsight will be available only on Linux. There will not be any future release of HDInsight on Windows. The last release of HDInsight on Windows was HDI 3.3. Support for HDI 3.3 expired on 06/27/2016, and the version will be deprecated on 07/31/2017. To move from a Windows-based HDInsight cluster to a Linux-based cluster, see the migration guidance.
Hortonworks release notes associated with HDInsight versions
- HDInsight cluster version 3.4 uses a Hadoop distribution that is based on Hortonworks Data Platform 2.4. This is the default Hadoop cluster created when using the portal.
- HDInsight cluster version 3.3 uses a Hadoop distribution that is based on Hortonworks Data Platform 2.3.
- HDInsight cluster version 3.2 uses a Hadoop distribution that is based on Hortonworks Data Platform 2.2.
- HDInsight cluster version 3.1 uses a Hadoop distribution that is based on Hortonworks Data Platform 2.1.7. HDInsight 3.1 clusters created before 11/7/2014 were based on Hortonworks Data Platform 2.1.1.
- HDInsight cluster version 3.0 uses a Hadoop distribution that is based on Hortonworks Data Platform 2.0.
- HDInsight cluster version 2.1 uses a Hadoop distribution that is based on Hortonworks Data Platform 1.3.
- HDInsight cluster version 1.6 uses a Hadoop distribution that is based on Hortonworks Data Platform 1.1.