
SampatVarun-0080 asked ·

Zipping and UnZipping files for Batch limitation in ADF

Hello,

I recently encountered the following error while running a Custom Activity on Azure Data Factory:

  • "Total size of resourceFiles cannot be more than 32768 characters"

For this error, I found many blog posts that state you must zip all the files, then unzip using the command for the Custom Activity. I came across the following posts:

  1. https://social.msdn.microsoft.com/Forums/en-US/0a191641-1e77-4eae-b33d-9a0a331628b5/advv2-custom-activity-with-many-dependencies-total-size-of-resourcefiles-cannot-be-more-than?forum=AzureDataFactory

  2. https://social.msdn.microsoft.com/Forums/en-US/ab57e810-94d7-4a48-a358-649f607c9717/azure-batch-resourcefiles-limitation?forum=azurebatch

  3. https://stackoverflow.com/questions/55995566/how-do-i-unzip-and-execute-a-batch-service-job-as-part-of-azure-data-factory

But I got a little confused and was hoping to get some clarification on the following questions:

  1. How do I run a Python script with some command line arguments (for example, my current command is "python main.py -o test123") along with unzipping files?

  2. When I unzip files on the Batch node, do I have to delete the files once the Custom Activity is done running? How is that taken care of?

  3. Does any activity on ADF 'zip' files for you?


If there's existing documentation for this, please guide me toward it. If not, any help will be appreciated!

Thank you, in advance!

Tags: azure-data-factory, azure-storage-accounts, azure-batch

@SampatVarun-0080 did my answer help you, or do you need more assistance?
If my answer solved your issue, please mark as accepted answer. Otherwise, please let me know how I may better assist you.


1 Answer

MartinJaffer-MSFT answered ·

Hello @SampatVarun-0080 and welcome to Microsoft Q&A. Thank you for your excellent question.

  1. According to the Stack Overflow post you referenced, the command would look something like cmd /c "Unzip.exe myZippedStuff && python mainPythonScript argument1 argument2 argument3". In your case, the part after && would be python main.py -o test123.

  2. Whether the node persists is determined by how you have configured the pool. You can have the instance deleted once the task completes.

  3. A Copy activity using two Binary datasets can compress or decompress files. To do this, set the compression type on each dataset accordingly: the zipped side gets the compression type of the archive, and the unzipped side should be None. When unzipping, leave the File part of the File path empty on the destination; otherwise not everything will be extracted. The contents are written to a folder with the same name as the zipped file. Specifying a Directory puts that folder in a sub-folder.
    [Screenshots: 55852-image.png, 55861-image.png, 55799-image.png]
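
If Unzip.exe is not available on the Batch node, the unzip-then-run step from point 1 can also be done with Python's standard library alone. The following is only a minimal sketch, not the method from the linked posts: the names unzip_and_run, bootstrap.py, and deps.zip are hypothetical, and it assumes Python is already installed on the node and that the archive is the single resource file sent with the activity.

```python
import subprocess
import sys
import zipfile
from pathlib import Path


def unzip_and_run(archive, dest, script, script_args):
    """Extract `archive` into `dest`, then run `script` there with `script_args`.

    All names here are illustrative assumptions, not part of ADF or Batch.
    Returns the exit code of the script.
    """
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)

    # Unpack every file in the archive into the destination folder.
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)

    # Run the script with the same interpreter that runs this bootstrap,
    # using the extraction folder as the working directory.
    cmd = [sys.executable, script, *script_args]
    completed = subprocess.run(cmd, cwd=dest, check=True)
    return completed.returncode


def main(argv):
    # Expected arguments: <archive> <script> [script arguments...]
    archive, script, *script_args = argv
    return unzip_and_run(archive, ".", script, script_args)
```

Saved as a hypothetical bootstrap.py that ends with sys.exit(main(sys.argv[1:])), the Custom Activity command would then be something like python bootstrap.py deps.zip main.py -o test123, so only bootstrap.py and deps.zip count toward the resourceFiles limit.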


