
3 Votes
DonHeerschap-7424 asked sowokie commented

Azure Synapse Spark Pool doesn't install packages with alternative index

I'm trying to install a PyPI package that is served from an alternative index: our own Azure DevOps Artifacts feed.

When I use this requirements.txt:

 azure-common==1.1.25
 azure-core==1.9.0
 azure-functions==1.4.0
 azure-identity==1.4.0
 azure-keyvault-secrets==4.2.0
 azure-storage-blob==12.6.0
 azure-storage-file-datalake==12.2.0
 google-api-core
 google-api-python-client
 google-auth
 google-auth-httplib2
 googleapis-common-protos
 oauth2client==4.1.3
 pyodbc==4.0.30
 pandas==1.1.3
 pyarrow==1.0.1
 pyspark
 ipython

everything installs fine (confirmed with the following code):

 import pkg_resources
 # Print every distribution visible to the current Python environment
 for d in pkg_resources.working_set:
     print(d)
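
A more targeted check than printing the whole working set is to compare the names in requirements.txt against what is actually installed. The following is a minimal sketch of that idea, assuming Python 3.8+ (for `importlib.metadata`). The helper names `normalize`, `requirement_names`, `installed_names` and `missing_packages` are hypothetical and made up for this illustration; they are not part of Synapse or pip.

```python
import re
from importlib import metadata  # standard library, Python 3.8+ (assumption)


def normalize(name):
    """Normalize a project name the way package indexes compare names (PEP 503)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_names(lines):
    """Extract project names from requirements.txt lines.

    Blank lines, comments, and option lines such as '-i <url>' are skipped.
    """
    names = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(normalize(match.group(0)))
    return names


def installed_names():
    """Return the normalized names of all distributions in this environment."""
    found = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            found.add(normalize(name))
    return found


def missing_packages(lines, installed=None):
    """List the requirement names that are not installed."""
    if installed is None:
        installed = installed_names()
    return [name for name in requirement_names(lines) if name not in installed]
```

Run in a notebook cell on the pool, `missing_packages(open("requirements.txt").read().splitlines())` would list only the packages that did not get installed (the file path here is an assumption; the requirements file may not be readable from the notebook).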

When I add a package with an alternative index to the end of the requirements.txt, none of the packages defined in the file get installed (I anonymized the alternative URL for security reasons):

 azure-common==1.1.25
 azure-core==1.9.0
 azure-functions==1.4.0
 azure-identity==1.4.0
 azure-keyvault-secrets==4.2.0
 azure-storage-blob==12.6.0
 azure-storage-file-datalake==12.2.0
 google-api-core
 google-api-python-client
 google-auth
 google-auth-httplib2
 googleapis-common-protos
 oauth2client==4.1.3
 pyodbc==4.0.30
 pandas==1.1.3
 pyarrow==1.0.1
 pyspark
 ipython
 -i https://<PAT_NAME>:<PAT_TOKEN>@pkgs.dev.azure.com/<ORG>/<TEAM>/_packaging/Sparkhouse/pypi/simple/
 sparkhouse

When I use the same requirements file for Azure Functions or locally, all packages install successfully.

Can you tell me why the packages don't install when the alternative index line is present? Also, are there any logs of the package installation available for the Spark pool in Azure Synapse, so I can check what is going wrong?

azure-synapse-analytics

Hello @DonHeerschap-7424,
Thanks for the question and for using the forum.

I am confident that accessing packages on a local instance is not supported, but I have reached out to the internal team to confirm this. To me, the architectures of Synapse and Azure Functions are very different, so I would say we are comparing apples to oranges.

Thanks
Himanshu


Hi @HimanshuSinha-MSFT,

Thanks for your reply and for reaching out to the internal team. Just to clarify: I'm aware that Azure Functions is very different. I only mentioned it to show that there is no corrupt package or dependency failure in the requirements.txt. That being said, I'm assuming Synapse uses either conda or pip (or something similar) to install the packages defined in the requirements.txt (similar to Databricks, where I can also install the package with the alternative index).

Also, I tried it with a public alternative index (see below), again without success:

 -i https://pypi.tuna.tsinghua.edu.cn/simple
 pendulum

Hope to hear from you soon!



1 Answer

1 Vote
HimanshuSinha-MSFT answered sowokie commented

Hello @DonHeerschap-7424,
Thanks for the insights. I heard back from the internal team, and they confirmed that alternative indexes are not supported at this time.

Regarding the logs: yes, the team is working on something that will be rolled out in the very near future.
Thanks
Himanshu
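
Since alternative indexes are not supported on the pool, it can help to check a requirements file for index options before uploading it. The following is only a sketch under stated assumptions: `-i`, `--index-url` and `--extra-index-url` are pip's documented index options for requirements files, while `INDEX_OPTIONS` and `index_option_lines` are hypothetical names made up for this example. How Synapse itself handles such lines is not documented in this thread.

```python
# pip's index-related options that may appear in a requirements file
INDEX_OPTIONS = ("-i", "--index-url", "--extra-index-url")


def index_option_lines(lines):
    """Return (line_number, text) for every line that sets a package index."""
    flagged = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line:
            continue
        # The option is the first token; also handle the '--index-url=<url>' form
        option = line.split()[0].split("=", 1)[0]
        if option in INDEX_OPTIONS:
            flagged.append((number, line))
    return flagged
```

An empty result means the file relies only on the default index. Note that a flagged line may contain credentials, so avoid printing it into shared notebook output.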


Hi @HimanshuSinha-MSFT, can you confirm that this functionality is still not implemented? If it is, can you point me to any documentation on it?

We have a pip package in our Azure DevOps Artifacts feed that we need to be able to point to and use.
