question

JohnK-3500 avatar image
0 Votes"
JohnK-3500 asked COZYROC answered

SSIS CSV Loading issue

I am looking for a C# solution to use a script task in SSIS so that I can load multiple csv files to SQL Tables
Now If I use a data flow task, then I have to create 1 DFT per file as each file has a different metadata ( unless there is some other way with dynamic DFT?)

So I checked the following solutions
http://www.techbrothersit.com/2016/03/how-to-create-tables-dynamically-from.html
http://www.techbrothersit.com/2016/04/how-to-load-flat-files-to-sql-server.html
One of this solution expects target table to be already present while other creates the table on the fly ( columns from the first line of the CSV) which is fine as well

Issue :-
Now my CSV files have comma as a delimiter and some of the columns have comma in the column value

eg some columns have numbers but with comma separated value eg 100,000
some columns have info about name of person which can be D'Souza
`
so the above scripts error out as they consider any comma as a field separator so error is thrown like count of values doesn't match the number or columns
OR the apostrophe character throws the error as invalid data

eg here's the structure of one csv file ( check lines 2,3)

Thanks in advance.


22805-image.png




sql-server-generalsql-server-transact-sqlazure-sql-databasesql-server-integration-services
image.png (8.7 KiB)
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Viorel-1 avatar image
0 Votes"
Viorel-1 answered Viorel-1 commented

The borrowed solutions use simple text reading operations that are not suitable for CSV files, which contain commas or quotation marks within values, or which contain multiline values.

If you want to process CSV files yourself, then consider a designed class: TextFieldParser [https://docs.microsoft.com/en-us/dotnet/api/microsoft.visualbasic.fileio.textfieldparser?view=netframework-4.8]. It can be used in C# but requires a reference to Microsoft.VisualBasic assembly.

You can find many samples, for example:

 using Microsoft.VisualBasic.FileIO;
 . . .
 using (var p = new TextFieldParser(@"C:\myfile.csv")
                                   {
                                          TextFieldType = FieldType.Delimited,
                                          Delimiters = new[] { "," },
                                          HasFieldsEnclosedInQuotes = true
                                   })
 {
        while (!p.EndOfData)
        {
               // read and split one line of fields
               string[] fields = p.ReadFields();
    
               // each fields[i] is a value of a column
               // . . .
        }
 }


It is not difficult to identify the first header row.

If you decide to build dynamic SQL, which is not always recommended, then replace each apostrophe (like “Brian D'Souza”) with two apostrophes using string.Replace function.

Try and debug it in a separate console application, in Visual Studio, before moving to SSIS.

But also check if SSIS contains other appropriate tasks.



· 8
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Thanks I am not a C# developer but this code seems easy as fields[i] refer to a column .I can use it to create an insert statement
so by using this class I din't have to handle regular expressions?

0 Votes 0 ·

[...] so by using this class I din't have to handle regular expressions?

This class can read valid CSV files and deliver the parsed values of the fields. You do not have to use Regular Expressions for parsing.

But if you want to process some specific extracted fields, (for example: get currency symbol from Total Amount Paid column), you can use Regular Expressions after you get the data with TextFieldParser.

0 Votes 0 ·

Thanks for a quick response. I tested and it worked like magic.

However, my intention is to have just single SSIS package with just 1 script task to load ALL tables.
However the method you suggested, I am getting the column values in an array fields[0],fields[1] etc which implies I need to create an insert statement with that array members in it which means if a file has 5 columns then my code should be different from the code which can load a file with 10 columns.
So any way to have one script which can generate a dynamic insert statement and load table with the same name of the file. ( just as in the example in one of the links I shared)

0 Votes 0 ·
Show more comments
COZYROC avatar image
0 Votes"
COZYROC answered

Hi,

I would recommend you check the commercial COZYROC Data Flow Task Plus. It is an extension of the standard Data Flow Task with support for dynamic columns metadata at runtime. You can process CSV files with any structure without a need to code or have programming skills. I would recommend you also check the demonstration videos we have posted.


5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.