dataflow sink to update the source file

Question

dataflow1 has the following:
source1 --> aggregate --> sink1

source1 --> dsDatacompanies
sink1 --> dsDatacompanies

Note that source1 reads a .csv file,
aggregate then gets the distinct rows,
and sink1 then writes to the same file that source1 reads.

Is this OK, or should the sink file be different from the source file?

Thank you

Accepted Answer

Hi @arkiboys ,

I would write to a separate output file rather than overwriting the source file. Note that you can still use the same dataset for both the SOURCE and the SINK, as long as you parameterize the dataset. For example, if you want to create the file in a different folder, parameterize the folder path; if you want to create it in the same folder under a different name, parameterize the file name. Thanks!
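To illustrate the idea outside of Data Factory, here is a minimal Python sketch of the same pattern: one path definition reused for source and sink, with the file name passed in as a parameter so the output never lands on the input. This is not ADF code. `dataset_path`, `write_distinct_rows`, and the file names are hypothetical names chosen for the example, and treating "distinct" as "identical across every column" is an assumption about what the aggregate does.

```python
import csv
from pathlib import Path


def dataset_path(folder: Path, file_name: str) -> Path:
    """Hypothetical stand-in for a parameterized dataset:
    one definition, resolved with different parameters per use."""
    return folder / file_name


def write_distinct_rows(source: Path, sink: Path) -> int:
    """Copy the header and the distinct data rows from source to sink.

    Returns the number of data rows written. Refuses to run when
    source and sink resolve to the same file.
    """
    if source.resolve() == sink.resolve():
        raise ValueError("sink must not be the same file as source")

    seen = set()
    written = 0
    with source.open(newline="") as src, sink.open("w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)

        # Assumption: the first row is a header and is copied as-is.
        header = next(reader, None)
        if header is not None:
            writer.writerow(header)

        for row in reader:
            key = tuple(row)  # a row is a duplicate only if every column matches
            if key not in seen:
                seen.add(key)
                writer.writerow(row)
                written += 1
    return written
```

A call such as `write_distinct_rows(dataset_path(folder, "companies.csv"), dataset_path(folder, "companies_distinct.csv"))` reads and writes through the same path definition with only the file name changed, which mirrors parameterizing the file name on a shared dataset. The source file stays untouched, so a failed or partial run can simply be repeated.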
