PaulHernandez-8067 asked PRADEEPCHEEKATLA-MSFT commented

Synapse data flow Sink to delta table - vacuum not working as expected

Hi everyone,

We are writing to a Delta table in a Synapse data flow, using an inline dataset with the following settings:

128648-image.png


With overwrite set as the table action, we added a new snapshot to the target table every day by making a full load.

We left the vacuum setting at its default value of 0, which is supposed to mean 30 days.

The day after every run, the files from the previous load are marked for removal in the Delta log:

 {"remove":{"path":"part-00000-49dfde94-43b2-4444-a8b1-e683fb5552c5-c000.snappy.parquet","deletionTimestamp":1627905681270,"dataChange":true}}   
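A log entry like the one above can be checked by hand. The following is a minimal sketch, assuming the `deletionTimestamp` is in milliseconds since the Unix epoch and that a 30-day retention window applies; the helper names `removal_info` and `is_vacuum_candidate` are made up for illustration, and the comparison only approximates what a real vacuum decides.

```python
import json
from datetime import datetime, timedelta, timezone

# The log line quoted above, copied verbatim from the question.
log_line = (
    '{"remove":{"path":"part-00000-49dfde94-43b2-4444-a8b1-e683fb5552c5-c000.snappy.parquet",'
    '"deletionTimestamp":1627905681270,"dataChange":true}}'
)

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
RETENTION = timedelta(days=30)  # assumption: the 30-day default described in the question


def removal_info(line: str):
    """Return (path, deletion time in UTC) for a 'remove' action, else None."""
    action = json.loads(line).get("remove")
    if action is None:
        return None
    # deletionTimestamp is interpreted as milliseconds since the Unix epoch
    deleted_at = EPOCH + timedelta(milliseconds=action["deletionTimestamp"])
    return action["path"], deleted_at


def is_vacuum_candidate(deleted_at: datetime, now: datetime) -> bool:
    """True once the file has been marked as removed for longer than RETENTION."""
    return now - deleted_at > RETENTION


path, deleted_at = removal_info(log_line)
print(path, deleted_at.isoformat())
print(is_vacuum_candidate(deleted_at, datetime.now(timezone.utc)))
```

For the entry above, the timestamp falls on 2 August 2021 (UTC), so any check run more than 30 days later should report the file as a candidate for deletion.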

However, after a month, the files are still not removed from the table location.

I analyzed which files should be removed using a dry run:

128755-image.png

There are around 4000 files to be deleted.

If I execute a vacuum in a notebook then the files are removed.
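For reference, the notebook vacuum and the dry run can be sketched roughly as follows. This is an illustration only: the table path is a placeholder, `vacuum_statement` is a hypothetical helper, and the `spark` calls are left as comments because they assume the SparkSession that a Synapse notebook provides. The statement itself follows the Delta Lake `VACUUM ... RETAIN n HOURS [DRY RUN]` syntax.

```python
def vacuum_statement(table_path: str, retain_hours: int, dry_run: bool = False) -> str:
    """Build a Delta Lake VACUUM statement for a path-based table."""
    if retain_hours <= 0:
        raise ValueError("pass an explicit, positive retention in hours")
    sql = f"VACUUM delta.`{table_path}` RETAIN {retain_hours} HOURS"
    return (sql + " DRY RUN") if dry_run else sql


# Placeholder location; replace with the real storage path of the table.
TABLE_PATH = "abfss://container@account.dfs.core.windows.net/path/to/table"

print(vacuum_statement(TABLE_PATH, 720, dry_run=True))

# In a notebook, where `spark` is the provided SparkSession:
# spark.sql(vacuum_statement(TABLE_PATH, 720, dry_run=True)).show(truncate=False)  # list only
# spark.sql(vacuum_statement(TABLE_PATH, 720))                                     # delete
```

Running the dry run first lists the files that would be deleted without touching them, which is how the count of roughly 4000 files can be confirmed before the real vacuum.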

I would like to know why the vacuum from the data flow is not removing the data that exceeds the retention threshold. Am I missing something, or is this a bug?

Any information will be appreciated.

Best regards,

Paul Hernandez

azure-synapse-analytics

Hello @PaulHernandez-8067,

Thanks for bringing this to our attention.

We are reaching out to the internal team to get more details on this behaviour. I will update you once I hear back from the team.


Hello @PaulHernandez-8067,

We are still awaiting a response from the internal team. I will update you once I hear back from the team.


Hello @PaulHernandez-8067,

We haven't received a response from the internal team. For a deeper investigation and immediate assistance with this issue, you may file a support ticket if you have a support plan; otherwise, please let us know.


1 Answer

PaulHernandez-8067 answered PRADEEPCHEEKATLA-MSFT commented

Hi everyone, hi @PRADEEPCHEEKATLA-MSFT ,

I have an interesting finding.

We changed the vacuum value from 0 (which is supposed to mean 30 days) to 720 (the number of hours in 30 days), and it worked.

It seems that the default value of "0" is not taking effect.

The workaround for now is to set the retention you want explicitly in hours and avoid the default configuration.
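The conversion from days to hours is simple arithmetic; a small sketch follows, where `retention_hours` is just an illustrative helper name and not part of any Synapse API:

```python
from datetime import timedelta


def retention_hours(days: int) -> int:
    """Convert a retention period in days to whole hours."""
    return timedelta(days=days) // timedelta(hours=1)


print(retention_hours(30))  # 720
```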

BR.
Paul


Hello @PaulHernandez-8067,

Glad to know that your issue has been resolved, and thanks for sharing the solution, which might be beneficial to other community members reading this thread.
