How to automate importing 31 .csv files into SQL Server?

Hello,

I have a zip file that contains 31 .csv files. Each .csv file contains 1 million records, and together the files make up the entire table. (This dataset is provided by the US Government.) How would you go about importing this data into SQL Server?

When I import one .csv file using the SSMS flat file wizard, it takes around 5-10 minutes because I have to give each field the correct data type. (There are around 30 fields.)

This data dump gets updated on a weekly basis every Monday. How would you automate this import?

Would you combine the 31 .csv files into one .csv file and import the single file? If so, how would you combine the .csv files into one file? The .zip file containing the .csv files is 950 MB.

Thank you for your feedback.
 
I've never used SQL Server.
I would unzip the file to a dedicated folder.
Then I would walk that folder, processing each file in turn.
I would have a specification saved for the import and use that each time to import into the table.
You could create one huge file in DOS with something along the lines of the following, I believe:
Code:
copy file1.csv + file2.csv + file3.csv fullfile.csv
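
If the combining were done in code rather than at the DOS prompt, a hypothetical C# version of the same idea could also skip the repeated header rows (the folder and file names here are assumptions, not anything from the OP's setup):
Code:
// Concatenate every extracted .csv into one file, keeping the header row
// only from the first file.  Paths are hypothetical.
using System;
using System.IO;
using System.Linq;

string folder  = @"C:\Data\Unzipped";
string outFile = @"C:\Data\combined.csv";

using var writer = new StreamWriter(outFile);
bool headerWritten = false;

foreach (string csv in Directory.EnumerateFiles(folder, "*.csv").OrderBy(f => f))
{
    // Skip the header line on every file after the first one.
    foreach (string line in File.ReadLines(csv).Skip(headerWritten ? 1 : 0))
        writer.WriteLine(line);
    headerWritten = true;
}

Console.WriteLine($"Wrote {outFile}");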
 
Just thinking out loud but maybe try BCP?

Sent from phone...
 
For many years, I did exactly this process with about 30 CSV files each night at about 2 am, importing data into the SQL Server BE for my main school database. The import files were also very large, so the total process typically took around 45-60 minutes.

However, the process was completely automated and triggered by a scheduled task on the server:
1. In the unlikely event that any users were logged in via terminal server at 2 am, they were warned and then logged out.
2. The CSV files were exported from an external data source.
3. The files were then imported in turn into temp tables in Access, processed, and the related SQL Server tables were updated.
4. The whole process was logged and if any issues arose, details were also logged.
5. When complete, the DBA was automatically emailed with a summary of the log file which was one of the first items checked on arrival the next day.

The system has worked flawlessly for many years in multiple locations, with the only blips being occasional network outages.

I know nothing about BCP so can't comment on whether that would have been any better / easier to set up.
In any case, it appears it wasn't available when our system was first set up back around 2005!
 
No question about it, I would create an SSIS package. Who knows how often you'll have to repeat it in the future.
The OP might not have SSIS, unless he's using a paid SQL Server edition that includes it. There's also a steep learning curve involved with setting up SSIS packages. BCP should be sufficient for accomplishing the OP's objective on a recurring basis. It also loads data a lot faster than using the SSMS wizard.
 
Data Tools is free, and I was actually shocked at how small the learning curve is for setting up an SSIS package. You can learn how to do a basic one in about 5 minutes.
 
I have been studying BCP. I don't believe these files were created by BCP, and hence it's hard for me to import them using BCP. Below are the first two records in the .csv file. Do you still recommend proceeding with BCP?

Thank you

"MANIFEST TRACKING NUMBER","UPDATED DATE","SHIPPED DATE","RECEIVED DATE","STATUS","SUBMISSION TYPE","ORIGIN TYPE","GENERATOR ID","GENERATOR NAME","GENERATOR MAIL STREET NO","GENERATOR MAIL STREET1","GENERATOR MAIL STREET2","GENERATOR MAIL CITY","GENERATOR MAIL ZIP","GENERATOR MAIL STATE","GENERATOR LOCATION STREET NO","GENERATOR LOCATION STREET1","GENERATOR LOCATION STREET2","GENERATOR LOCATION CITY","GENERATOR LOCATION ZIP","GENERATOR LOCATION STATE","GENERATOR CONTACT COMPANY NAME","DES FACILITY ID","DES FACILITY NAME","DES FAC MAIL STREET NO","DES FAC MAIL STREET1","DES FAC MAIL STREET2","DES FAC MAIL CITY","DES FAC MAIL ZIP","DES FAC MAIL STATE","DES FAC LOCATION STREET NO","DES FAC LOCATION STREET1","DES FAC LOCATION STREET2","DES FAC LOCATION CITY","DES FAC LOCATION ZIP","DES FAC LOCATION STATE","DES FAC CONTACT COMPANY NAME","MANIFEST RESIDUE","REJECTION","TOTAL QUANTITY ACUTE KG","TOTAL QUANTITY ACUTE TONS","TOTAL QUANTITY HAZ KG","TOTAL QUANTITY HAZ TONS","TOTAL QUANTITY NON ACUTE KG","TOTAL QUANTITY NON ACUTE TONS","TOTAL QUANTITY NON HAZ KG","TOTAL QUANTITY NON HAZ TONS","TOTAL QUANTITY KG","TOTAL QUANTITY TONS","BROKER ID"

"000000000FLE","20211119","20210917","20210927","Signed","DataImage5Copy","Service","CAR000287201","AMAZON.COM SERVICES LLC (DPS3)","","2405 Conejo Spectrum Street","","Thousand Oaks","91320","CA","","2405 Conejo Spectrum Street","","Thousand Oaks","91320","CA","","CAD008364432","Rho Chem LLC","","425 Isis Ave.","","Inglewood","90301","CA","","425 Isis Ave.","","Inglewood","90301","CA","","N","N","0","0","0","0","0.90703004","0.001","0.90703004","0.001","0.90703004","0.001",""

"000000000JJK","20240719","20240628","20240628","Signed","DataImage5Copy","Web","MID087054078","GERDAU MACSTEEL INC","","3000 E FRONT ST","","MONROE","48161","MI","","3000 E FRONT ST","","MONROE","48161","MI","","ILD040891368","BEFESA ZINC US INC.","","2701 E 114TH ST","","CHICAGO","60617","IL","","2701 E 114TH ST","","CHICAGO","60617","IL","","N","N","0","0","20063.5036","22.12","20063.5036","22.12","0","0","20063.5036","22.12",""
 
I don't understand why you say the files are not created by BCP. The files are nevertheless created when running BCP, and you can use them, right? Are you using the command-line switches to specify the format file and other desired options?

https://learn.microsoft.com/en-us/s...-bcp-utility-sql-server?view=sql-server-ver17
I was trying to replicate outputting a simple person table with the above characteristics: all fields enclosed in double quotes, with support for commas within the data. I was not able to accomplish this type of output when studying the switches. Finally, I asked ChatGPT whether these files are created by BCP, and it said most likely not. A friend offered to help me write a custom solution as a C# console program. I am quite good with C# and will likely pursue that path next.

Thank you for everyone's feedback. I will update this thread when I reach a solution.

 
Aren't the two comma delimited csv file examples you previously posted the desired format you want for your load files?
 
I was successfully able to import multiple files using a C# console program. It works really well and gives me full control. Thank you for everyone's assistance. Knowing about the BCP option will come in handy in the future I am sure.
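
For anyone landing on this thread later, here is a minimal sketch (not the OP's actual code) of how such a console loader could be put together: unzip the weekly dump, walk the extracted folder, parse each quoted CSV, and bulk-load it into a staging table. The paths, connection string, and table name are assumptions.
Code:
using System;
using System.Data;
using System.IO;
using System.IO.Compression;
using Microsoft.Data.SqlClient;        // or System.Data.SqlClient
using Microsoft.VisualBasic.FileIO;    // TextFieldParser handles quoted fields containing commas

const string zipPath = @"C:\Data\manifest_dump.zip";   // hypothetical locations
const string workDir = @"C:\Data\Unzipped";
const string connStr = "Server=.;Database=Manifests;Integrated Security=true;TrustServerCertificate=true";
const string target  = "dbo.ManifestStaging";

if (Directory.Exists(workDir)) Directory.Delete(workDir, recursive: true);
ZipFile.ExtractToDirectory(zipPath, workDir);

foreach (string csv in Directory.EnumerateFiles(workDir, "*.csv"))
{
    using var parser = new TextFieldParser(csv)
    {
        TextFieldType = FieldType.Delimited,
        Delimiters = new[] { "," },
        HasFieldsEnclosedInQuotes = true
    };

    // The header row defines the columns; everything is loaded as text and
    // typed later in SQL (or converted here before the load).
    var table = new DataTable();
    foreach (string col in parser.ReadFields())
        table.Columns.Add(col, typeof(string));

    // NOTE: a DataTable keeps the sketch short; for 1-million-row files a
    // streaming IDataReader would be much lighter on memory.
    while (!parser.EndOfData)
        table.Rows.Add(parser.ReadFields());

    using var conn = new SqlConnection(connStr);
    conn.Open();
    using var bulk = new SqlBulkCopy(conn)
    {
        DestinationTableName = target,
        BatchSize = 50_000,
        BulkCopyTimeout = 0
    };
    // Assumes the staging table uses the same column names as the CSV header.
    foreach (DataColumn col in table.Columns)
        bulk.ColumnMappings.Add(col.ColumnName, col.ColumnName);

    bulk.WriteToServer(table);
    Console.WriteLine($"{Path.GetFileName(csv)}: {table.Rows.Count:N0} rows loaded");
}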
 
Aren't the two comma delimited csv file examples you previously posted the desired format you want for your load files?
Those were the files provided by the US government, which I need to import into my custom SQL Server table.
 
Data Tools is free, and I was actually shocked at how small the learning curve is for setting up an SSIS package. You can learn how to do a basic one in about 5 minutes.
The big advantage I see in SSIS vs BCP is that it can do transformations. The import file has dates such as "20211119", and without transformation that is not a valid date for a SQL Server date or datetime field.
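
For what it's worth, if the transformation ends up in code rather than in SSIS (the route the OP took with C#), a hypothetical helper for those yyyyMMdd strings is only a few lines:
Code:
// Hypothetical: convert the dump's yyyyMMdd strings (e.g. "20211119") to DateTime
// before the load, rather than relying on an implicit conversion during the insert.
using System;
using System.Globalization;

static DateTime? ParseDumpDate(string raw) =>
    string.IsNullOrWhiteSpace(raw)
        ? (DateTime?)null
        : DateTime.ParseExact(raw, "yyyyMMdd", CultureInfo.InvariantCulture);

Console.WriteLine(ParseDumpDate("20211119"));  // the parsed date, 19 Nov 2021
Console.WriteLine(ParseDumpDate(""));          // blank input comes back as null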
 
I respect that, as I know plenty of people take advantage of that. I tend to keep the packages VERY simple and land all the stuff (including junk) in a staging table, then deal with it in SQL. But you make a good point; that's yet another benefit of using SSIS for this task. BCP is a brutal little fu*ker where you hope that in one or two lines of code you can handle all contingencies (which of course you can't). In SSIS you can handle anything and everything, including funneling error rows off to a separate Exceptions table, etc.
 
