Optimize insert to Access DB

ishaywei

Joined Nov 29, 2006
Hi,

I'm using Access tables to store my data. To retrieve data I use the Jet ODBC engine from C++.

I want to insert a large number of records (about 20 million) into my database as fast as I can. If I use SQL syntax (INSERT INTO table ()...) it takes ages (about 500 records per second).

If I write a CSV file and then import it (via Access) it's much faster, but there I have two problems:
1. I don't know how to invoke the Access import tool from C++.
2. I don't think I can distribute the Access import tool with my product.

So my questions are:
1. Does anyone know a tool that inserts records in an optimized way?
2. How can I use the Access import tool from C++?
3. Can I use the Jet engine to import CSV files?

Thanks, ishay
 
Your observed rate of 500 records per second is actually not that bad. If you have 100-byte records, that would be 50K bytes/second input PLUS whatever disk-related operations are required for swapping, resizing, and extending the MDB file, which seriously rattles your disk. When writing that much data, Access has to ask the file system to extend the file. That involves mechanical seeks to the disk's volume control data to find the storage bitmap and update the file extent pointers.

If you can arrange exclusive use of the MDB file, and if you have a lot of swap space configured on your machine, you can try this:

First and foremost, DEFRAG the machine you are using.

Now, on the table into which you are doing the insert, determine which indexes it has. (Take GOOD notes.) Then remove all of them.

If possible, pre-sort the file on whatever will be your primary key. Import the data into the "flat" (unindexed) table.

Now re-create all the indexes. Expect this step to take some time.

Another thing you could do that might help, BUT it is VERY costly: If you have a way to put this file on a disk of its own (or a PARTITION of its own), you could reset the disk cluster size to give very large clusters, which will cause new cluster allocations to occur less often. That might help in the speed of file growth.

If you can do neither of these, then I don't know what else to suggest.

Also note that if you have 20M RECORDS of 100 bytes each, that is already about 2 GB of raw data, which is the MDB file-size limit, so you won't have ROOM for indexes anyway. If you have more than a couple of indexes, it is time to rethink your situation.
 
