SSIS CSV Loading issue

John K 186 Reputation points
2020-09-05T03:25:01.87+00:00

I am looking for a C# solution to use a script task in SSIS so that I can load multiple csv files to SQL Tables
Now If I use a data flow task, then I have to create 1 DFT per file as each file has a different metadata ( unless there is some other way with dynamic DFT?)

So I checked the following solutions
http://www.techbrothersit.com/2016/03/how-to-create-tables-dynamically-from.html
http://www.techbrothersit.com/2016/04/how-to-load-flat-files-to-sql-server.html
One of this solution expects target table to be already present while other creates the table on the fly ( columns from the first line of the CSV) which is fine as well

Issue :-
Now my CSV files have comma as a delimiter and some of the columns have comma in the column value

eg some columns have numbers but with comma separated value eg 100,000
some columns have info about name of person which can be D'Souza
`
so the above scripts error out as they consider any comma as a field separator so error is thrown like count of values doesn't match the number or columns
OR the apostrophe character throws the error as invalid data

eg here's the structure of one csv file ( check lines 2,3)

Thanks in advance.

22805-image.png

Azure SQL Database
SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
12,654 questions
SQL Server Integration Services
SQL Server Integration Services
A Microsoft platform for building enterprise-level data integration and data transformations solutions.
2,451 questions
Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,551 questions
0 comments No comments
{count} votes

Accepted answer
  1. Viorel 111.8K Reputation points
    2020-09-05T09:29:54.757+00:00

    The borrowed solutions use simple text reading operations that are not suitable for CSV files, which contain commas or quotation marks within values, or which contain multiline values.

    If you want to process CSV files yourself, then consider a designed class: TextFieldParser [https://learn.microsoft.com/en-us/dotnet/api/microsoft.visualbasic.fileio.textfieldparser?view=netframework-4.8]. It can be used in C# but requires a reference to Microsoft.VisualBasic assembly.

    You can find many samples, for example:

    using Microsoft.VisualBasic.FileIO;  
    . . .  
    using (var p = new TextFieldParser(@"C:\myfile.csv")  
                                      {  
                                             TextFieldType = FieldType.Delimited,  
                                             Delimiters = new[] { "," },  
                                             HasFieldsEnclosedInQuotes = true  
                                      })  
    {  
           while (!p.EndOfData)  
           {  
                  // read and split one line of fields  
                  string[] fields = p.ReadFields();  
      
                  // each fields[i] is a value of a column  
                  // . . .  
           }  
    }  
    

    It is not difficult to identify the first header row.

    If you decide to build dynamic SQL, which is not always recommended, then replace each apostrophe (like “Brian D'Souza”) with two apostrophes using string.Replace function.

    Try and debug it in a separate console application, in Visual Studio, before moving to SSIS.

    But also check if SSIS contains other appropriate tasks.


1 additional answer

Sort by: Most helpful
  1. Deleted

    This answer has been deleted due to a violation of our Code of Conduct. The answer was manually reported or identified through automated detection before action was taken. Please refer to our Code of Conduct for more information.


    Comments have been turned off. Learn more