200 questions with Azure HDInsight tags

1 answer

Accessing ADLS Gen2 from an ESP-enabled HDInsight cluster

Hello all, I have configured an HDI cluster with ESP enabled, and from Ambari and the Azure console I can see that all the services are running fine. But when trying to execute a sample pi job, or even to list files using hdfs dfs -ls, I am getting the…

Azure HDInsight
An Azure managed cluster service for open-source analytics.
200 questions
asked 2021-10-06T03:29:11.663+00:00
VIVEK B 1 Reputation point
answered 2021-10-12T15:21:39.267+00:00
VIVEK B 1 Reputation point
1 answer

Can we add new blob storages to an existing HDI cluster without r

Hi Team, I have an existing cluster and have created 2 new blob storage accounts. I want to add these storage accounts to the existing HDI cluster. Is there any way to add them to the cluster and have them take effect without having to delete…

Azure HDInsight
asked 2021-10-01T17:57:42.57+00:00
Kinya Munini 1 Reputation point
commented 2021-10-11T09:00:50.483+00:00
PRADEEPCHEEKATLA-MSFT 81,391 Reputation points Microsoft Employee
1 answer

How to model thousands of files from Azure Data Lake Gen 2 as a single dataset for analysis?

Hi, I have an initial set of thousands of delimited files in an Azure Data Lake Gen 2 storage account. I need to read all these files and combine them into a single dataset for analysis. This dataset must be preserved for future files. After these files are processed,…

Azure Data Lake Storage
An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.
1,389 questions
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,524 questions
Azure HDInsight
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
2,005 questions
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
9,845 questions
asked 2021-09-21T21:30:45.83+00:00
Ramanathan Dakshina Murthy 21 Reputation points
commented 2021-09-28T04:30:39.677+00:00
svijay-MSFT 5,226 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

Unable to access 101-hdinsight-linux-add-edge-node template

I am trying to add an edge node to an HDInsight cluster using the link…

Azure HDInsight
asked 2021-09-24T10:06:03.717+00:00
Mahima Khatri 26 Reputation points
commented 2021-09-27T10:15:41.01+00:00
PRADEEPCHEEKATLA-MSFT 81,391 Reputation points Microsoft Employee
0 answers

Can't access the Hadoop services on HDInsight 4.0

Hi all, I deployed ESP Hadoop clusters in my VNet, but I can't access some Hadoop services, such as the NameNode UI and the Solr Ambari UI, because I can't reach the FQDN hn0-[clustername initial 6 characters].[AAD-DS DNS domain name] :…

Azure HDInsight
asked 2021-09-22T11:04:50.397+00:00
井上 樹/ITSUKI INOUE 1 Reputation point
commented 2021-09-27T10:08:37.617+00:00
PRADEEPCHEEKATLA-MSFT 81,391 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

Import HDI external Hive metadata DB into Synapse

I have an HDI 4.0 cluster that I am trying to turn off and move to Synapse. I have an external Hive DB. How can I import the Hive tables and metadata? I know I can first recreate the metadata by hand and then read from ADLS. I want to automate/eliminate…

Azure Synapse Analytics
Azure HDInsight
asked 2021-09-15T22:17:04.87+00:00
Pablo Barcenas 21 Reputation points
accepted 2021-09-22T01:06:12.86+00:00
Pablo Barcenas 21 Reputation points
1 answer

HDInsight cluster with ESP

I know that in order to create a Kerberized HDInsight cluster, we have to enable Azure Active Directory Domain Services. My question is: will it be possible to create a Kerberized HDInsight cluster with an external KDC server? By external KDC…

Azure HDInsight
asked 2021-09-14T06:10:33.95+00:00
VIVEK B 1 Reputation point
commented 2021-09-21T09:43:54.517+00:00
PRADEEPCHEEKATLA-MSFT 81,391 Reputation points Microsoft Employee
0 answers

Data migration between Kerberized CDH and ESP-enabled HDInsight

Hi all, we have a requirement to transfer HDFS data residing in our on-premises Kerberized CDH cluster (MIT KDC) into an HDInsight cluster with ESP enabled. The source will be CDH and the destination will be ADLS. How can I do that, since I see issues…

Azure HDInsight
asked 2021-09-15T12:46:07.89+00:00
VIVEK B 1 Reputation point
commented 2021-09-20T06:05:35.347+00:00
PRADEEPCHEEKATLA-MSFT 81,391 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

Need to delete an HDI cluster that is no longer in use

Hi, I need to know if there is any impact on storage or any other component when I delete an HDI cluster.

Azure HDInsight
asked 2021-09-13T13:39:16.21+00:00
Mukul.Tanwar 21 Reputation points
accepted 2021-09-15T10:02:46.653+00:00
Mukul.Tanwar 21 Reputation points
0 answers

HDInsight cluster deletion failure

Hello, I am currently facing an issue in which my on-demand HDInsight cluster is not automatically deleted after the job execution, and it is causing huge costs for me. I am looking for an automated process to delete the HDInsight cluster if there is no…

Azure HDInsight
asked 2021-09-13T09:08:52.557+00:00
Rinshad R 1 Reputation point
commented 2021-09-15T09:22:06.75+00:00
PRADEEPCHEEKATLA-MSFT 81,391 Reputation points Microsoft Employee
0 answers

PySpark HDInsight Data Factory environment variable

I'm facing a problem with PySpark, Data Factory, and HDInsight. I created an HDInsight cluster with 2 master and 2 slave nodes. I created environment variables on all servers like this: sudo echo 'TEST=server' >> /etc/environment. After that, on all servers I…

.NET
Microsoft Technologies based on the .NET software framework.
3,502 questions
Azure HDInsight
asked 2021-09-08T21:57:22.217+00:00
ALBERTO JUNIOR 1 Reputation point
commented 2021-09-14T07:03:51.75+00:00
PRADEEPCHEEKATLA-MSFT 81,391 Reputation points Microsoft Employee
0 answers

HDInsight cluster is failing when upgrading the version to 4.0

Hi Team, I am getting this error: Operation on target Txxxxxx failed: Hadoop job failed with exit code '2'. See…

Azure HDInsight
Azure Data Factory
Azure R Server for HDInsight
An Azure service that provides predictive analytics, machine learning, and statistical modeling for big data.
13 questions
asked 2021-08-02T10:50:45.45+00:00
Kajol Dhosewan 1 Reputation point
commented 2021-08-18T06:28:40.037+00:00
Kajol Dhosewan 1 Reputation point
3 answers One of the answers was accepted by the question author.

What is the difference between Azure Synapse Analytics, Azure Databricks, and Azure HDInsight?

Hello everybody, I'm running a project where we need to propose an Azure-based architecture to import data from an on-premises data warehouse (databases) into an Azure-based data platform. The data is meant to be exposed to company operators through a web…

Azure Data Lake Storage
Azure Synapse Analytics
Azure HDInsight
Azure Databricks
Azure Data Factory
asked 2020-08-06T12:48:15.723+00:00
CloudRock 366 Reputation points
answered 2021-08-12T18:40:38.117+00:00
Andrei Calin Juganaru 1 Reputation point
0 answers

Error with Hive Warehouse Connector Jar in Azure java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT

I have the following Azure environment: HDP Spark version 2.3.2.3.1.0.319-3, Hadoop version 3.1.0.319-3, Hive version 3.1.0. Can anyone please suggest which versions of the jars need to be used with the above configuration? I am using the following jar…

Azure HDInsight
asked 2021-07-30T13:45:09.27+00:00
Latif Mohammad Khan 1 Reputation point
commented 2021-08-11T01:23:31.147+00:00
HimanshuSinha-msft 19,386 Reputation points Microsoft Employee
1 answer

HDInsight cluster stuck in 'delete' state

I issued a delete request for an HDInsight cluster over 12 hours ago, and it still says 'Deleting'. Is there any way to complete the deletion of the cluster?

Azure HDInsight
asked 2021-07-30T12:35:29.703+00:00
EB 1 Reputation point
commented 2021-08-10T18:42:17.04+00:00
Saurabh Sharma 23,771 Reputation points Microsoft Employee
0 answers

How to do big data analysis on Append Block with HDInsight or any other alternative service on Azure? (source log data produced by Azure Monitor)

Hello, I enabled the diagnostic settings under Monitoring for a storage account, and the logs are sent to another storage account. The default type of the log JSON file is append blob rather than block blob, and it seems the type cannot be changed. …

Azure Monitor
An Azure service that is used to collect, analyze, and act on telemetry data from Azure and on-premises environments.
2,909 questions
Azure HDInsight
Azure Databricks
asked 2021-07-23T09:05:15.557+00:00
Ivy Feng 16 Reputation points
commented 2021-07-28T03:09:05.097+00:00
Ivy Feng 16 Reputation points
1 answer One of the answers was accepted by the question author.

Best practices for submitting Spark batch jobs in Azure HDInsight

Hi, I'm looking to submit my PySpark scripts in HDInsight. Currently, HDInsight provides Livy for job submission using curl. However, if I want to productionize it, what authentication mechanism should I use? Also, how can I check the progress of…

Azure HDInsight
asked 2021-07-19T08:22:48.617+00:00
Bipin Singh 21 Reputation points
accepted 2021-07-22T04:13:00.32+00:00
Bipin Singh 21 Reputation points
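
For readers following the Livy question above, here is a minimal sketch of what a curl-style batch submission looks like when built from Python. It assumes a non-ESP cluster that accepts HTTP basic authentication with the cluster login, and it is not taken from the accepted answer. The helper names, cluster name, credentials, and script path are all hypothetical. The request is only constructed, not sent.

```python
import base64
import json
import urllib.request


def build_livy_batch_request(cluster_name, user, password, script_path, args=None):
    """Build (but do not send) a POST request that submits a script as a Livy batch.

    Assumption: the cluster exposes Livy at /livy/batches and accepts basic auth.
    """
    url = f"https://{cluster_name}.azurehdinsight.net/livy/batches"
    # Livy's batch API takes the script location in "file" and its arguments in "args".
    payload = {"file": script_path, "args": list(args or [])}
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    headers = {
        "Content-Type": "application/json",
        "X-Requested-By": user,
        "Authorization": f"Basic {token}",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def batch_state_url(cluster_name, batch_id):
    """Return the URL that reports the state of a previously submitted batch."""
    return f"https://{cluster_name}.azurehdinsight.net/livy/batches/{batch_id}/state"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    req = build_livy_batch_request(
        "mycluster", "admin", "secret", "wasbs:///example/job.py", ["10"]
    )
    print(req.get_method(), req.full_url)
    print(batch_state_url("mycluster", 0))
```

Passing the request to urllib.request.urlopen would submit the job; polling the state URL with a GET is one way to track progress. In production the password would come from a secret store rather than from source code.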
1 answer One of the answers was accepted by the question author.

How to access an external storage account in HDInsight cluster without access key?

How can an HDInsight cluster read data from a private blob container that was not set as the default or an additional storage account during the cluster's creation? We don't want to use an access key. Can we add additional storage accounts after the cluster…

Azure Storage Accounts
Globally unique resources that provide access to data management services and serve as the parent namespace for the services.
2,803 questions
Azure HDInsight
asked 2021-07-14T09:55:18.937+00:00
Salander He 66 Reputation points Microsoft Employee
accepted 2021-07-16T03:04:53.627+00:00
Salander He 66 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

How To Create an Azure Cluster In Another Region Using Azure Data Factory

I'm trying to change the region where the HDInsight linked service creates an on-demand cluster. I want to change from East US to East US 2 (or West US), but I can't do it. I've added the Location key in the following places without success: …

Azure HDInsight
Azure Data Factory
asked 2021-06-30T04:59:57.097+00:00
Joaquin Chemile 41 Reputation points
accepted 2021-07-03T15:37:07.52+00:00
Joaquin Chemile 41 Reputation points
0 answers

LLAP is disabled and can't be enabled for some reason, and queries with joins take a long time to fetch results

LLAP was manually disabled in HDI 4.0, and now a query with joins takes 30 minutes longer to fetch 30 million rows.

Azure HDInsight
asked 2021-06-28T22:51:12.01+00:00
amrita 1 Reputation point
commented 2021-07-01T09:35:51.44+00:00
PRADEEPCHEEKATLA-MSFT 81,391 Reputation points Microsoft Employee