KinmondSarahJSITIITYFA-8321 asked EugeneLycenok-1882 commented

Error "Failure happened on 'sink' side SFTPServerNotReturnExpectedDataLength" when ingesting files from SFTP server

I'm getting a repeated error when attempting to batch ingest multiple PSV.GZ files from an SFTP server (Linux) into Azure. The pipeline connection has been checked (all fine), and a new connection was created for testing to rule this out. Same error every time.

We've moved files from an existing folder on the SFTP server to a new folder and tried again. It fails again. Ingesting a single file usually works, but due to the nature of our project we need to batch ingest a number of files daily.

There seems to be no limit on file size, so this shouldn't be an issue? The strange thing is that this worked fine for a while and has only just started failing.

Seems to fully read the total number of files to be ingested (e.g. 45 files in a folder) but only writes 43 before failing with the error.

Also, the failure always specifies the latest dated file in that folder.

[Image: 78262-image.png]

Another point to note: if we manually download the PSV.GZ files from the SFTP server (Linux) to a Windows machine and then re-upload them, ingestion works successfully. Obviously this is not a viable option or workaround going forward, but has anyone experienced the same issue? Any pointers or solutions? Many thanks!

azure-data-factory

Hi @kinmondsarahjsitiityfa-8321,

Welcome to Microsoft Q&A and thanks for reaching out.

Could you please share the failed activity runID and failed pipeline run ID for further analysis? And also please confirm what is the integration runtime being used on source as well as sink?

Thanks


Hi @kinmondsarahjsitiityfa-8321,

Just checking to see if you have had a chance to review my previous comment. If so, could you please share the additional details requested?

Thanks


Hi,
Details are:

RunID - 10be500e-9c8d-41b5-9e48-e5b6824251cd
Pipeline Name - PL_SFTP_DATALAKE_IQN_TO_RAW_PROD_TEST
Integration Runtime – AutoResolveIntegrationRuntime (Source and Sink)

PSV files have been compressed and are made available on the SFTP server using GZ format. This is crucial as some are sizeable and there is a need to retain & store historical files.

Any ideas? I've searched everywhere for some info/similar issues reported and can find nothing.
Thanks.
Sarah.

KranthiPakala-MSFT replied to KinmondSarahJSITIITYFA-8321

Hi @kinmondsarahjsitiityfa-8321,

Thanks for the details and response. I will work with the internal team to check the internal logs for this issue. In the meantime, could you please try a Self-Hosted Integration Runtime and see if that makes any difference?

Let us know how it goes.

Thanks

KranthiPakala-MSFT replied to KinmondSarahJSITIITYFA-8321

Hi @kinmondsarahjsitiityfa-8321,

The product team has confirmed that when reading the file xxxx_xxxxxx_2021_03_16_17_35_04_C1900.psv.gz, the SFTP server told ADF the file size was 8034674 bytes, but ADF only received 8005477 bytes, so ADF threw the exception SFTPServerNotReturnExpectedDataLength.

In the past we have encountered some SFTP servers returning the wrong size, causing the copy to fail. Could you please confirm which file size is correct?

If that is the case, then it is an SFTP server-side issue, and we cannot do much from the ADF standpoint.

Please let us know if you have any further queries.

Thanks
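The check described above can be sketched as follows. This is only an illustration of the general idea (read the stream to the end, then compare the byte count with the size the server advertised), not ADF's actual implementation; the function name `read_and_check` and the in-memory stream are hypothetical stand-ins for a real SFTP file handle.

```python
import io

def read_and_check(stream, reported_size, chunk_size=32768):
    """Read `stream` to EOF; return (bytes_received, shortfall).

    `reported_size` is whatever size the server advertised for the file
    (hypothetical parameter; with a real SFTP client it would come from
    the file's attributes).
    """
    received = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        received += len(chunk)
    return received, reported_size - received

# Simulate the figures quoted in this thread with an in-memory stream.
reported = 8034674
stream = io.BytesIO(b"\0" * 8005477)
received, shortfall = read_and_check(stream, reported)
print(received, shortfall)  # 8005477 29197
```

A non-zero shortfall is the condition that would make a strict client give up on the copy, whereas a more lenient client may simply accept whatever bytes arrive.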


Hmm, this is strange.
The (compressed) GZ file size on the SFTP server states: 7.66MB
I downloaded this file to my Windows machine. If I then check the Properties of that file in Windows Explorer, this is stated as:
Size: 7.63 MB (8,005,477 bytes)
Size on disk: 7.63 MB (8,007,680 bytes).

Could Azure have a problem processing compressed format files?
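As a quick sanity check on the figures in this thread, the two byte counts can be converted to megabytes. This sketch assumes both the server listing and Windows Explorer display binary megabytes (1024 × 1024 bytes); that holds for the Windows figure quoted above but is only an assumption for the server listing.

```python
MIB = 1024 * 1024  # assumption: "MB" in both displays means 1024 * 1024 bytes

def to_mib(num_bytes):
    """Convert a byte count to binary megabytes, rounded to two decimals."""
    return round(num_bytes / MIB, 2)

size_reported_to_adf = 8034674  # size the SFTP server reported, per the ADF logs
size_downloaded = 8005477       # size of the downloaded copy on Windows

print(to_mib(size_reported_to_adf))            # 7.66
print(to_mib(size_downloaded))                 # 7.63
print(size_reported_to_adf - size_downloaded)  # 29197
```

Under that assumption, the 7.66 MB shown on the server matches the size reported to ADF, and the 7.63 MB shown in Windows matches the bytes actually received, so the two sources disagree by 29,197 bytes.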


Hi,
Many thanks for the guidance so far. We are speaking with our counterpart (owning & hosting the SFTP server) to check this.
Could it be that the compression process used to package the file (which in raw format is a PSV file) is somehow corrupting the file size?

If the SFTP server stores the file as a particular size, what part of the Azure connection and ingestion process could be misinterpreting the file size?
How does Azure read/determine the file size initially? Technically, how does this work?

Thanks.
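One way to test whether a given copy of the file is an intact gzip archive is to decompress it to the end and see whether the stream terminates cleanly. The sketch below uses only Python's standard library; the function name `gzip_is_complete` is hypothetical, and it checks a copy already held in memory (for example, one downloaded manually), not the file as served over SFTP.

```python
import gzip
import io
import zlib

def gzip_is_complete(data: bytes) -> bool:
    """Return True if `data` decompresses cleanly as a gzip stream."""
    try:
        with gzip.GzipFile(fileobj=io.BytesIO(data)) as gz:
            # Read to the end so the CRC and length trailer are verified.
            while gz.read(65536):
                pass
        return True
    except (EOFError, OSError, zlib.error):
        # EOFError: stream ended early; OSError: bad header or CRC mismatch.
        return False

# Demonstration with synthetic pipe-separated data.
good = gzip.compress(b"a|b|c\n" * 1000)
print(gzip_is_complete(good))        # True
print(gzip_is_complete(good[:-10]))  # False
```

If the 8,005,477-byte download passes a check like this, that would suggest the delivered bytes form a complete archive and the larger size is the one in question.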


1 Answer

EugeneLycenok-1882 answered EugeneLycenok-1882 commented

Hi @kinmondsarahjsitiityfa-8321, @KranthiPakala-MSFT

Are there any updates on the issue?
We are experiencing exactly the same.
Note that WinSCP works just fine with the same credentials.

Kind Regards
Eugene


Hi @KinmondSarahJSITIITYFA-8321 and @EugeneLycenok-1882

Did you find a solution for this by any chance? Are there any additional properties we need to set on the source and sink side? In my case, a comma-delimited text file cannot be copied from SFTP to Blob storage when the file code is 101. Below is the command, for reference, showing the file properties on SFTP. Any help will be appreciated.

put [source file] [destination file],101,50,50,200

Thanks,
Sridhar


Hi @sridhar-4013,

Microsoft Support is looking into the issue on my side.
I'll give an update in this thread.

Kind Regards
Eugene
