Populating Microsoft SQL Server (Azure) table from Access - How to improve performance

Derevon

Hi everyone,

We have an Access table with around 60-70k records that needs to be inserted into an Azure SQL Server table.

Doing this with ordinary row-by-row inserts takes several hours, so my idea was to create a stored procedure and submit many rows at a time.

This worked just fine against an old non-Azure SQL Server, but with the new Azure SQL Server, most rows go missing along the way.

We used something like this:

Code:
Public Sub AddNewEntries(SQL As String)
    ' Executes a batch of T-SQL statements over an ADO connection
    Dim cmd As ADODB.Command
    Set cmd = New ADODB.Command
    cmd.ActiveConnection = [connection string]
    cmd.CommandType = adCmdText
    cmd.CommandText = SQL
    cmd.Execute
    Set cmd = Nothing
End Sub

This sub was called from another sub (simplified version):

Code:
Public Sub AddData()
    Dim rs As DAO.Recordset
    Dim SQL As String
    Dim c As Long

    c = 0
    Set rs = CurrentDb.OpenRecordset("SELECT * FROM [sourcetable]")
    SQL = ""
    While Not rs.EOF
        ' [dbo].[StoredProc1] is the stored procedure; its parameter list
        ' is built from the fields in the rs recordset
        SQL = SQL & "EXEC [dbo].[StoredProc1] (all the fields and data taken from the rs recordset go here);" & vbCrLf
        rs.MoveNext
        c = c + 1
        If c Mod 200 = 0 Then
            AddNewEntries SQL ' send a batch of 200 statements
            SQL = ""
            DoEvents
        ElseIf rs.EOF Then
            AddNewEntries SQL ' flush the final partial batch
        End If
    Wend
    rs.Close
    Set rs = Nothing
End Sub

The above code ran in under 10 minutes against the non-Azure SQL Server, but when I try the same thing on Azure it only adds around 12k out of 67k records. Every run ends with a different number of records inserted, so most records simply go missing.
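Note: one possible reason for the silent losses is that each EXEC in the concatenated batch returns its own rowcount result, and ADO only raises an error once it reaches that point in the result stream, so a failure partway through a 200-statement batch can pass unnoticed. Below is a minimal, more defensive sketch of AddNewEntries (the connection string is a placeholder, as above): it suppresses the per-statement rowcounts, wraps the batch in a transaction so a mid-batch error rolls the whole batch back, and executes without a recordset so errors surface immediately.

Code:
' Hedged sketch: make mid-batch failures visible instead of silent.
Public Sub AddNewEntriesSafe(SQL As String)
    Dim cmd As ADODB.Command
    Set cmd = New ADODB.Command
    cmd.ActiveConnection = [connection string] ' placeholder, as above
    cmd.CommandTimeout = 300                   ' allow large batches time to finish
    cmd.CommandType = adCmdText
    ' SET NOCOUNT ON: no per-EXEC rowcount results for ADO to stall on.
    ' SET XACT_ABORT ON: any runtime error aborts and rolls back the transaction.
    cmd.CommandText = "SET NOCOUNT ON; SET XACT_ABORT ON;" & vbCrLf & _
                      "BEGIN TRANSACTION;" & vbCrLf & _
                      SQL & vbCrLf & _
                      "COMMIT TRANSACTION;"
    cmd.Execute , , adExecuteNoRecords         ' synchronous; errors raise here
    Set cmd = Nothing
End Sub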

After some Googling I tried a sub like the one below. It works, but for some reason it took nearly 2 hours. For every 25 rows it executes the stored procedure calls as one 25-statement batch. If I run it with, say, a few hundred rows per batch, some rows go missing, even if the batch is only submitted once.

Code:
Public Sub AddDataV2()
    ' Requires a module-level API declaration for Sleep, e.g.:
    ' Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal ms As Long)
    Dim Cn As ADODB.Connection
    Dim rs As DAO.Recordset
    Dim SQL As String
    Dim c As Long

    c = 0
    Set Cn = New ADODB.Connection
    Cn.Open [Connection string]

    Set rs = CurrentDb.OpenRecordset("SELECT * FROM [sourcetable]")
    SQL = ""
    While Not rs.EOF
        SQL = SQL & "EXEC [dbo].[StoredProc1] (all the field names and data go here);" & vbCrLf
        rs.MoveNext
        c = c + 1
        If c Mod 25 = 0 Then
            Cn.Execute SQL, , adAsyncExecute
            ' Wait for the asynchronous batch to finish before building the next one
            Do While Cn.State = (adStateOpen + adStateExecuting)
                Sleep 10 ' 10 ms delay
            Loop
            SQL = ""
            DoEvents
        ElseIf rs.EOF Then
            Cn.Execute SQL, , adAsyncExecute
            ' Wait here too, or the connection below is torn down while the
            ' final batch may still be executing
            Do While Cn.State = (adStateOpen + adStateExecuting)
                Sleep 10
            Loop
        End If
    Wend
    rs.Close
    Set rs = Nothing
    Cn.Close
    Set Cn = Nothing
End Sub
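Note: per-row EXEC calls carry a lot of per-statement overhead over a WAN link. A common alternative for plain inserts is to batch rows into multi-row INSERT ... VALUES statements; SQL Server accepts up to 1000 row constructors per statement. A minimal sketch follows; dbo.TargetTable, the abbreviated column list, and the SqlQuote literal-formatting helper are hypothetical placeholders.

Code:
' Hedged sketch: multi-row INSERT ... VALUES batching instead of per-row EXEC.
' dbo.TargetTable and SqlQuote are hypothetical; SqlQuote must escape quotes
' and format dates/numbers as T-SQL literals.
Public Sub AddDataValuesBatch()
    Dim Cn As ADODB.Connection
    Dim rs As DAO.Recordset
    Dim SQL As String
    Dim c As Long

    Set Cn = New ADODB.Connection
    Cn.Open [connection string]
    Set rs = CurrentDb.OpenRecordset("SELECT * FROM [sourcetable]")

    Do While Not rs.EOF
        If Len(SQL) = 0 Then
            SQL = "INSERT INTO dbo.TargetTable ([Location],[BU] /* ... */) VALUES "
        Else
            SQL = SQL & ","
        End If
        SQL = SQL & "(" & SqlQuote(rs![Location]) & "," & SqlQuote(rs![BU]) & " /* ... */)"
        c = c + 1
        rs.MoveNext
        If c Mod 500 = 0 Or rs.EOF Then          ' max 1000 rows per VALUES list
            Cn.Execute SQL, , adExecuteNoRecords ' synchronous, so errors raise here
            SQL = ""
        End If
    Loop

    rs.Close
    Cn.Close
End Sub

This only applies if StoredProc1 does nothing beyond inserting the row; any extra logic in the procedure would have to move into a set-based statement on the server.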

Does anyone know of any way to do this in a more efficient manner while making sure no records go missing?

Thank you
 

Minty

Out of interest - How many fields?

What if you simply link to an empty staging Azure table - no indexes etc. (Indexes make bulk data inserts very slow over Azure.)
Do a simple insert query to that.

Then use a stored procedure to copy those records into your main table on the server.
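A minimal sketch of that staging pattern from the Access side; dbo_Staging_POData (the linked staging table), dbo.MainTable and usp_CopyStaging are hypothetical names.

Code:
' Hedged sketch of the staging-table approach.
Public Sub LoadViaStaging()
    Dim Cn As ADODB.Connection

    ' 1. Append everything to the linked, index-free staging table.
    CurrentDb.Execute "INSERT INTO [dbo_Staging_POData] SELECT * FROM [sourcetable]", _
                      dbFailOnError

    ' 2. Copy into the real table server-side in one set-based statement.
    Set Cn = New ADODB.Connection
    Cn.Open [connection string]
    Cn.Execute "EXEC dbo.usp_CopyStaging;", , adExecuteNoRecords
    ' usp_CopyStaging would do roughly:
    '   INSERT INTO dbo.MainTable (...) SELECT ... FROM dbo.Staging_POData;
    '   TRUNCATE TABLE dbo.Staging_POData;
    Cn.Close
End Sub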
 

Derevon

Minty said:
Out of interest - How many fields?

What if you simply link to an empty staging Azure table - no indexes etc. (Indexes make bulk data inserts very slow over Azure.)
Do a simple insert query to that.

Then use a stored procedure to copy those records into your main table on the server.

I was also thinking that maybe the reason it gets gradually slower is the composite primary key, so I tried dropping all keys, indexes, etc., but it seemed to have little impact on performance.

The first 1000 rows take around 22 seconds, and then the time just keeps climbing for every 1000 rows added, until even 100 rows take minutes.

c = 1000: 21.73s
c = 2000: 22.45s
c = 3000: 22.33s
c = 4000: 26.01s
c = 5000: 28.93s
c = 6000: 29.57s

I just don't get why performance declines so quickly as the table grows, given that there are no indexes.

Here are the fields included (26 in total):

[Location] [varchar](5) NOT NULL,
[BU] [varchar](5) NULL,
[ABNo] [int] NULL,
[Vendor] [nvarchar](100) NOT NULL,
[MailingName] [nvarchar](100) NULL,
[GLClass] [varchar](50) NOT NULL,
[AccountNo] [varchar](100) NULL,
[Currency] [varchar](3) NULL,
[Invoice] [varchar](50) NULL,
[OV] [int] NOT NULL,
[PO] [varchar](10) NOT NULL,
[POType] [varchar](5) NOT NULL,
[Line] [float] NOT NULL,
[SubLine] [int] NOT NULL,
[Item] [varchar](100) NULL,
[RecDate] [date] NULL,
[OpenQty] [float] NULL,
[OpenAmt] [float] NULL,
[RecQty] [float] NULL,
[RecAmt] [float] NULL,
[Div] [varchar](5) NULL,
[Originator] [nvarchar](50) NULL,
[ContainerID] [varchar](50) NULL,
[Status] [varchar](100) NULL,
[Comment] [nvarchar](max) NULL,
[User] [nvarchar](50) NULL

I now also tried changing the Comment column from nvarchar(max) to nvarchar(1000), which helped performance, but the speed still declines in the same way: the time per 1000 records roughly doubles between the first 1000 and records 14000-15000.

c = 1000: 16.28s
c = 2000: 12.98s
c = 3000: 13.36s
c = 4000: 13.75s
c = 5000: 14.16s
c = 6000: 14.82s
c = 7000: 18.73s
c = 8000: 21.61s
c = 9000: 20.42s
c = 10000: 21.77s
c = 11000: 24.49s
c = 12000: 27.55s
c = 13000: 29.08s
c = 14000: 31.57s
c = 15000: 33.64s
c = 16000: 37.93s
 

Minty

So how long does a single insert query to a new empty linked table take?
 

Derevon

I just did some testing with 5000 records and 10000 records using DoCmd.RunSQL "INSERT INTO [SQL SERVER TABLE] SELECT * FROM [ACCESS TABLE]", and these are my results:

5000 records took 208 seconds and 10000 records 411 seconds, so it seems to proceed at a roughly constant pace.

While much slower than the stored procedure at first, it looks like it overtakes it at some point thanks to that constant pace.
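One caveat with DoCmd.RunSQL for tests like this: if warnings are switched off, it can drop failing rows without telling you. CurrentDb.Execute with dbFailOnError raises an error instead, and makes timing easy. A minimal sketch (table names as in the post above; dbSeeChanges is needed if the linked table has an identity column):

Code:
' Hedged sketch: time the linked-table append and fail loudly on errors.
Public Sub TimeLinkedInsert()
    Dim t As Single
    t = Timer
    CurrentDb.Execute "INSERT INTO [SQL SERVER TABLE] SELECT * FROM [ACCESS TABLE]", _
                      dbFailOnError + dbSeeChanges
    Debug.Print "Elapsed: " & Format(Timer - t, "0.0") & "s"
End Sub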
 

Minty

Okay - 40 minutes - that still seems poor.

I would download Visual Studio and create an SSIS task to do the legwork.
I regularly truncate and reimport >750k records between two separate Azure databases, and it takes about 10 minutes.
I went that route as merging was taking hours; it was significantly quicker to simply delete and re-copy, then add the composite PK back on.
In total, including a merge on another 400k-row table, the whole import task only takes about 18 minutes.

I think it would be worth investigating.
 
