Woes importing Excel into SQL using SQL Server Import and Export Wizard

Zedster

Registered User.
Local time
Today, 07:31
Joined
Jul 2, 2019
Messages
169
I am trying to import data from an excel worksheet (WS) into SQL but I am having a number of issues.

The import export wizard is incorrectly identifying data types.

  1. There are two columns on the WS that contain only a heading, nulls and integer values but the wizard classes then as double. The destination table is set up as integer datatype. I can live with that one.
  2. There are five columns that each contain a heading a date in the format dd/mm/yyyy hh:ss and also some nulls. Three of these columns the import wizard correctly identifies as datetime, two of them it identifies as VarChar(255). The destination table is set up as datetime, which is causing problems.
  3. There is an internal notes field in my table currently set to varchar(max), the Excel field is identified as "long text". This field is throwing an error 0xc0204016 SSIS.Pipeline that the output column has a length that is not valid, the length must be between 0 and 4000.

I expected this process to be a five minute job, I have spent ten times that already. Is there any better alternative to importing Excel data into SQL server?

Any help appreciated.

NB the Excel data is fairly consistently formatted because it is an export from an online database.
 
You could convert the spreadsheet to csv and use BULK INSERT or bcp
They are very configurable with format files but poorly documented and infuriatingly finicky.

BCP can be made to generate a Format file but I found I had to manually edit it anyway because I was stuck with importing text. I used the text version of the Format file. It supports fixed width or delimited imports.

It gets really complex if the data values contain delimiter characters. For example commas inside money values. It doesn't understand the idea of quotes around values at all. The field terminators (delimiters) can be multicharacter so any quotes around values in the csv need to be incorporated in the field terminator strings.

Try to make the csv export form Excel as simple as possible. Drop the format the numbers with thousands separators.

On the bright side, bcp is amazingly fast both for import and export and is absolutely brilliant for export.
 
Most importing processes look at a small number of values in the first few rows and take a guess at the datatypes. It might help to sort the data in the spreadsheet so the datatypes of values of the first rows are absolutely unambiguous.
 
Most importing processes look at a small number of values in the first few rows and take a guess at the datatypes. It might help to sort the data in the spreadsheet so the datatypes of values of the first rows are absolutely unambiguous.
I did wonder that and will give a try, one problem I may have though is only 4 of the 5 date fields are ever used at once and the 5th is rarely used. So chances of getting 5th anywhere near top is slim.
 
You could convert the spreadsheet to csv and use BULK INSERT or bcp
They are very configurable with format files but poorly documented and infuriatingly finicky.

BCP can be made to generate a Format file but I found I had to manually edit it anyway because I was stuck with importing text. I used the text version of the Format file. It supports fixed width or delimited imports.

It gets really complex if the data values contain delimiter characters. For example commas inside money values. It doesn't understand the idea of quotes around values at all. The field terminators (delimiters) can be multicharacter so any quotes around values in the csv need to be incorporated in the field terminator strings.

Try to make the csv export form Excel as simple as possible. Drop the format the numbers with thousands separators.

On the bright side, bcp is amazingly fast both for import and export and is absolutely brilliant for export.

Not sure it will work for the very reason you state "..It gets really complex..", Ironically the file actually starts life as a CSV file and contains thousands of lines, but there is a notes field that will contain many instances of commas. For that reason I ruled out any form CSV import, figuring the commas could not be discriminated. So I saved it as an Excel file and set the field types, hoping that would make life easier. But clearly not!
 
I did wonder that and will give a try, one problem I may have though is only 4 of the 5 date fields are ever used at once and the 5th is rarely used. So chances of getting 5th anywhere near top is slim.
What about putting an easily recognised dummy record at the top then deleting from the table afterwards?
 
there is a notes field that will contain many instances of commas. For that reason I ruled out any form CSV import, figuring the commas could not be discriminated.
Use a different delimiter. eg a tilde (~) or a pipe (|)
 
Use a different delimiter. eg a tilde (~) or a pipe (|)
Will give that a try too, but the process is starting to get more manual than I would like. Export from web database as csv (the only format it allows), open in Excel, re-export as text file with different delimeter then import. In total there are 6 different CSV files I export from the web. The time starts to add up..
 
What about putting an easily recognised dummy record at the top then deleting from the table afterwards?
I can confirm that putting a dummy record in overcomes the date issue and allows import to proceed :) will need to create an excel function to do this for all six tables.
 
Or just change the wizard's dtsx to have everything be a text type of some kind - deal with it later in sql. That's almost always an element of my ETL strategy. I don't get fancy in the SSIS portion, even though you can; I keep it simple so that I can focus my effort on just one thing - TSQL, after it's imported to staging.

G mentioned bulk insert, I've used it a little, my main challenge in the corporate world was it requires what some MS considers to be an elevated permission and I've only been able to use it in my side jobs. Another challenge is coding to reference a file path that the server can actually see, may or may not be challenging in your environment, but often isn't possible for a dev to do in a strict silo'ed environment. And the column names have to all be identical, but if that's a problem you can also bulk insert into a View which helped me recently..

I definitely wouldn't deploy an ETL process that takes something directly from an Excel file if there was any way I could help it.
Change it to csv, at least some/any flat file where there is no ambiguity nor room for confusion about data types. Surely there is some .net process you could easily put in the ssis package to convert the excel to flat file. I would almost say that EVEN deploying a solution that automated Office on the server (something normally to be avoided) might be better than hooking ssis directly up to excel......IMHO.
 
Last edited:
For that reason I ruled out any form CSV import, figuring the commas could not be discriminated.
Why?

Did you test?

If the csv is properly formatted (i.e strings are within double quotes) then all the commas within the string field should not cause a problem to the importer.
 
I am trying to import data from an excel worksheet (WS) into SQL but I am having a number of issues.

The import export wizard is incorrectly identifying data types.

  1. There are two columns on the WS that contain only a heading, nulls and integer values but the wizard classes then as double. The destination table is set up as integer datatype. I can live with that one.
  2. There are five columns that each contain a heading a date in the format dd/mm/yyyy hh:ss and also some nulls. Three of these columns the import wizard correctly identifies as datetime, two of them it identifies as VarChar(255). The destination table is set up as datetime, which is causing problems.
  3. There is an internal notes field in my table currently set to varchar(max), the Excel field is identified as "long text". This field is throwing an error 0xc0204016 SSIS.Pipeline that the output column has a length that is not valid, the length must be between 0 and 4000.

I expected this process to be a five minute job, I have spent ten times that already. Is there any better alternative to importing Excel data into SQL server?

Any help appreciated.

NB the Excel data is fairly consistently formatted because it is an export from an online database.
I have similar problems, my "solution" is to group the records. my records are unique, so I will get the same number of records, but grouping seems to force Access/Excel to look at all records, which means the query will be slower, but it won't quickly decide on a data type.
 
Now I import Excel to SQL Server using OPENROWSET. It is the most straightforward technique I have found.

Code:
SELECT * 
INTO 
    database.schema.table

FROM OPENROWSET(
'Microsoft.ACE.OLEDB.12.0'
,'Excel 12.0;Database=path\filename.xlsx;HDR=YES'
,'SELECT * FROM [Sheet1$]')

Works on SQL Server 2016.
You may need to install the ACE redistributable on machines with older versions.
 

Users who are viewing this thread

Back
Top Bottom