ngelPrezMiguel-2774 asked · PRADEEPCHEEKATLA-MSFT commented

Synapse Spark Pools: support adding a DataLake folder to PYTHONPATH

Hello everyone,

I'm developing a PySpark application using the typical Python structure, where we can import modules from other parts of the code into a script, such as:

 from another_module.another_filename import MyFunction

These are functions that are either related to our existing architecture or to business, so we can't get rid of them.

When we have full control of our environment, it is as easy as adding the code folder to the PYTHONPATH, but I suspect this cannot be done with the current Synapse configuration.
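For context, this is roughly what the fix looks like when we control the environment ourselves; the directory layout here is a hypothetical example, not our real paths:

```python
import sys

# Hypothetical project layout:
#   /opt/project/code/another_module/__init__.py
#   /opt/project/code/another_module/another_filename.py  (defines MyFunction)
CODE_ROOT = "/opt/project/code"

# Prepend the code root so its packages take precedence; this is the
# programmatic equivalent of exporting PYTHONPATH=/opt/project/code.
if CODE_ROOT not in sys.path:
    sys.path.insert(0, CODE_ROOT)

# With the root on sys.path, the package-style import resolves:
#   from another_module.another_filename import MyFunction
```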

Right now, the Spark Job Definition template allows you to specify reference files. However, no matter where you store the files in the Data Lake, they all end up at the same level in a single folder on each Spark worker, without the hierarchy preserved, which makes the import shown above impossible.
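One workaround I am considering, sketched below under assumptions and not verified on Synapse: package the code as a .zip whose internal paths preserve the folder hierarchy, and ship the single archive instead of loose files. Python can import packages directly from a zip on sys.path, and Spark's `SparkContext.addPyFile` distributes an archive to workers intact. The paths and package names here are hypothetical:

```python
import os
import zipfile

def zip_package(src_root: str, zip_path: str) -> str:
    """Zip the contents of src_root so that package folders sit at the
    archive root (e.g. another_module/another_filename.py), preserving
    the module hierarchy that loose reference files lose."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for dirpath, _dirnames, filenames in os.walk(src_root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                # Store paths relative to src_root so imports resolve.
                zf.write(full, os.path.relpath(full, src_root))
    return zip_path

# Locally you can import straight from the archive:
#   sys.path.insert(0, "/path/to/code.zip")
# In a Spark job you would distribute it instead, e.g.:
#   spark.sparkContext.addPyFile("abfss://<container>@<account>.dfs.core.windows.net/code.zip")
# after which `from another_module.another_filename import MyFunction`
# should resolve on the workers, because the hierarchy survives inside the zip.
```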

If there is any way to overcome this, please let me know.

Thank you


azure-synapse-analytics

Hello @ngelPrezMiguel-2774,

Thanks for the question and using MS Q&A platform.

We are reaching out to the internal team to get more details on this ask. I will update you once I hear back from the team.


Hello @ngelPrezMiguel-2774,

We are still awaiting a response from the internal team. I will update you once I hear back.

Stay tuned!


0 Answers