question

SinghSg avatar image
0 Votes"
SinghSg asked MelissaMa-msft answered

T-SQL Import CSV file- empty columns Data type mismatch

Hi All

I am import csv file using TSQL Code

     INSERT INTO [TableName]
     SELECT *  FROM openrowset('MSDASQL'
         ,'Driver={Microsoft Access Text Driver (*.txt, *.csv)}'
         ,'select * from \\SupermanFile.csv')



Using some env variable SQL decides how many rows to see before deciding the columns data type...
Can someone pls point me where should I look for this information?

Why bz - a few rows are null after importing... that could be bz sql column might have integers initially and later it might have text.


Thanks

sql-server-transact-sql
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

YitzhakKhabinsky-0887 avatar image
0 Votes"
YitzhakKhabinsky-0887 answered YitzhakKhabinsky-0887 edited

Hi @SwatantraSingh-2674,

Here is a conceptual example for you.
This way you will have a full control on data types, header lines, errors, etc.

T-SQL

 SELECT *
 FROM  OPENROWSET(BULK 'e:\Temp\Quark.csv'
    , FORMATFILE = 'e:\Temp\Quark.xml'  
    , ERRORFILE = 'e:\Temp\Quark.err'
    , FIRSTROW = 2 -- real data starts on the 2nd row
    , MAXERRORS = 100
    ) AS tbl;

Quark.csv

 "ID"|"Name"|"Color"|"LogDate"|"Unknown"
 41|Orange|Orange|2018-09-09 16:41:02.000|
 42|Cherry, Banana|Red,Yellow||
 43|Apple|Yellow|2017-09-09 16:41:02.000|

Quark.xml

 <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <RECORD>
       <FIELD ID="1" xsi:type="CharTerm" TERMINATOR='|' MAX_LENGTH="70"/>
       <FIELD ID="2" xsi:type="CharTerm" TERMINATOR='|' MAX_LENGTH="70" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
       <FIELD ID="3" xsi:type="CharTerm" TERMINATOR='|' MAX_LENGTH="70" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
       <FIELD ID="4" xsi:type="CharTerm" TERMINATOR='|' MAX_LENGTH="70" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
       <FIELD ID="5" xsi:type="CharTerm" TERMINATOR='\r\n' MAX_LENGTH="70" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
    </RECORD>
    <ROW>
       <COLUMN SOURCE="1" NAME="ID" xsi:type="SQLVARYCHAR"/>
       <COLUMN SOURCE="2" NAME="Name" xsi:type="SQLVARYCHAR"/>
       <COLUMN SOURCE="3" NAME="Color" xsi:type="SQLVARYCHAR"/>
       <COLUMN SOURCE="4" NAME="LogDate" xsi:type="SQLVARYCHAR"/>
       <COLUMN SOURCE="5" NAME="Unknown" xsi:type="SQLVARYCHAR"/>
    </ROW>
 </BCPFORMAT>
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

MelissaMa-msft avatar image
0 Votes"
MelissaMa-msft answered

Hi @SwatantraSingh-2674,

Welcome to Microsoft Q&A!

Using some env variable SQL decides how many rows to see before deciding the columns data type...

Could you please provide more details about above? What did you refer 'env variable'?

that could be bz sql column might have integers initially and later it might have text

If you would like to check this one, you could refer below and check whether it is a little helpful.

Suppose the ID column was the column you referred.

  SELECT *  
     into #temp
     FROM openrowset('MSDASQL'
             ,'Driver={Microsoft Access Text Driver (*.txt, *.csv)}'
             ,'select * from \\SupermanFile.csv')
     select count(*) total,sum(case when ISNUMERIC(ID)=0 then 1 else 0 end) not_integer from #temp

You could get the total count of rows and the count of rows whose value contains text.

If the latter is larger, you could consider to change the data type of ID column from int to varchar.


Best regards,
Melissa


If the answer is helpful, please click "Accept Answer" and upvote it.
Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.