Hochmanj-5166 asked PRADEEPCHEEKATLA-MSFT commented

Connect to Spark Pools with VSCode

Does this tutorial still apply?

https://docs.microsoft.com/en-us/azure/synapse-analytics/spark/vscode-tool-synapse

I can't seem to get this to work following along at home.

I noticed that the tutorial says to use ms-python extension 2020.4.7. This version is a year old. I tried with 2020.4.7... as well as 2021.3.68.. which is the most recent. Neither one seems to work for me.

azure-synapse-analytics

Hello @Hochmanj-5166,

Thanks for the question and for using the Microsoft Q&A platform.

When you say "Neither one seems to work for me", could you share exactly what you tried and whether you see any error message while running the code?

Hochmanj-5166 answered PRADEEPCHEEKATLA-MSFT commented

I seem to have gotten it to work, but I'm not 100% sure how. I ended up using the Command Palette and trying settings on anything that came up related to Spark, and it eventually worked.

I don't know if this was relevant or coincidental, but it seemed to work after I selected "Spark / Hive: Link a cluster" and entered the Synapse URL as a generic Livy endpoint. It prompted me for admin and password information for basic auth; I entered dummy information and everything seemed to work after that.

The only thing I noticed is that mssparkutils is not in the namespace when running from VS Code. Is there a way to import it from a module such as os, sys, or some other one?
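
One thing that may be worth trying for the mssparkutils question: as far as I know, the Synapse documentation for MSSparkUtils shows it being imported from a `notebookutils` package in Python notebooks. The sketch below attempts that import and falls back cleanly if the package is missing. Whether `notebookutils` is importable in a session started from VS Code is an assumption that this thread does not confirm, and the helper name `load_mssparkutils` is made up for illustration.

```python
import importlib


def load_mssparkutils():
    """Return the mssparkutils object if it can be imported, else None."""
    try:
        # Assumption: the Spark pool's Python environment ships a
        # `notebookutils` package that exposes `mssparkutils`.
        module = importlib.import_module("notebookutils")
    except ImportError:
        return None
    return getattr(module, "mssparkutils", None)


mssparkutils = load_mssparkutils()
if mssparkutils is None:
    print("mssparkutils is not available in this session")
else:
    print("mssparkutils loaded")
```

If the import fails, the package is simply not present in the environment where the code runs, and the utilities would have to be used from a Synapse notebook instead.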


Hello @Hochmanj-5166,

Glad to know that your issue has been resolved. You can accept it as the answer. Thanks for sharing the solution, which might be beneficial to other community members reading this thread.

Microsoft Spark Utilities (MSSparkUtils) is a built-in package to help you easily perform common tasks. You can use MSSparkUtils to work with file systems, to get environment variables, to chain notebooks together, and to work with secrets. MSSparkUtils is available in PySpark (Python), Scala, and .NET Spark (C#) notebooks and Synapse pipelines in Azure Synapse Analytics.


Hello @Hochmanj-5166,

Following up to see if you got a chance to accept it as the answer.

Hochmanj-5166 answered PRADEEPCHEEKATLA-MSFT commented

When using the most up-to-date extensions, it seems that some features in the Python extension have since been split out into a separate Jupyter extension. For example, "Python: Select Interpreter to Start Server" is now "Jupyter: Select Interpreter to Start Server."

In any event, selecting "Synapse: PySpark Interactive" fails because several required packages are missing from the virtual environment created when following the tutorial instructions.

After manually installing those packages, it fails with the message:

"Failed to connect to remote Jupyter notebook. Check that the Jupyter Server URI setting has a valid running server specified. https://[clustername].dev.azuresynapse.com/ FetchError: request to https://[clustername].dev.azuresynapse.com/hub/api failed, reason: getaddrinfo ENOTFOUND [clustername].dev.azuresynapse.com"

I've confirmed the cluster is running from the monitor tab of the workspace.
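
The `getaddrinfo ENOTFOUND` part of that message means the DNS lookup for the host name failed, which happens before any connection to the cluster is attempted. A minimal sketch for checking name resolution from the same machine follows. The helper name `can_resolve` is made up for illustration, and `localhost` is only a stand-in for the host that appears in the Jupyter Server URI setting.

```python
import socket


def can_resolve(hostname, port=443):
    """Return True if the local resolver can turn hostname into an address."""
    try:
        socket.getaddrinfo(hostname, port)
    except socket.gaierror:
        # The lookup itself failed; no connection was ever attempted.
        return False
    return True


# Stand-in value: replace with the host name from the error message.
host = "localhost"
print(host, "resolves" if can_resolve(host) else "does not resolve")
```

If the host does not resolve here either, the problem is likely the name being used rather than the state of the cluster.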


If I try using ms-python 2020.4, as the tutorial calls for, it seems to introduce more complications. It's not clear to me whether I should also uninstall the separate Jupyter extension when using the older Python extension. In general, after trying various configurations, I either get an error similar to the one above, or I see an error that _ssl fails to import (although I'm able to import it in both my base REPL and a REPL inside the virtual env, so I'm not sure why that's happening).
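
Since `_ssl` imports fine in both REPLs, one possibility is that the extension launches a different interpreter than the one tested by hand. Below is a small sketch that can be run from each place (the REPLs, and whatever the extension starts) to compare results. The function name `describe_interpreter` is made up for illustration, and this only helps spot a mismatch; it does not fix one.

```python
import sys


def describe_interpreter():
    """Collect the interpreter path and whether the ssl module loads."""
    info = {"executable": sys.executable, "prefix": sys.prefix}
    try:
        import ssl  # importing ssl pulls in the compiled _ssl module
        info["ssl"] = ssl.OPENSSL_VERSION
    except ImportError as exc:
        info["ssl"] = f"import failed: {exc}"
    return info


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

Differing `executable` or `prefix` values between the two runs would point to the extension using another Python installation.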

Thanks for any help.

Also, I'm curious whether it's possible to connect to the cluster with just jupyter-console.exe alone, but I think I've seen in older posts that it's not currently possible.


Hello @Hochmanj-5166,

Thanks for sharing the detailed information.

This issue looks unusual. If you have a support plan, you may file a support ticket for a deeper investigation and immediate assistance.


Hello @Hochmanj-5166,

Did you get a chance to open a support ticket?
