Porting Access VBA to VB.NET CLR Functions in SQL Server (1 Viewer)

Galaxiom

Super Moderator
Staff member
Local time
Tomorrow, 04:54
Joined
Jan 20, 2009
Messages
12,849
Today I opened a door in SQL Server and discovered a whole extra wing of machinery, particularly useful for Access VBA developers using SQL Server back ends. Turns out it is quite easy to port VBA code into SQL Server functions and gain a whole lot of performance improvement on the way.

CLR functions are imported to SQL Server databases from compiled dynamic link libraries (.dll) and work like any other inline function in TSQL. They can be very efficient because they are compiled executable code.

It is quite easy to make a dll in Visual Studio using VB.NET. For an experienced VBA developer, the learning curve to VB.NET is not particularly steep. Simple VBA code generally compiles when pasted into Visual Studio. The Intellisense literally tells you what to do to fix it if there are problems. One click on a menu and the dll is created.

From there, one line of SQL Creates an Assembly from the dll after which the dll file is no longer required. Another SQL command creates the CLR function from the Assembly.

In Visual Studio, start a Class Library project, paste your VBA function into the Class and include the word "Shared" into the declaration. The Intellisense is so good in Visual Studio that you can type in anywhere in the function declaration it will jump into the right place.:D

To be Shared, the function must be self contained, it cannot call another function in the class. Visual Studio will tell you this if your function includes outside calls.

You have to turn on CLR capabilities in SQL Server. (It will tell you this if it is off when you go to use the function.)

The trickiest part is the function declaration in SQL which must include the namespace property from the dll in the reference to the function's class. MSSQLSMS does not tell you this one. The namespace can be found in the Application tab of the Project Properties in Visual Studio.

Obviously the datatypes of the VB function must match the function declaration in SQL Server. Most of them are obvious if you are familiar with SQL datatypes. The SQL equivalent for a VB.NET string is nvarchar(max).

I Googled my way through with the help of the Intellisense and had my first VBA function running in SQL Server in a couple of hours. It's magic. Post back if you get stuck.

I will soon know more. I have a lot more VBA code I will be porting into SQL Server starting tomorrow morning.
 

sonic8

AWF VIP
Local time
Today, 18:54
Joined
Oct 27, 2015
Messages
998
Just a few thoughts on this matter...

Don't go overboard with your enthusiasm about this new toy.

Keep in mind that, as you found out already, you will need to enable the CLR integration in SQL-Server to use this. Many DBAs will point-blank refuse this for security concerns without any option to negotiate. - It's not my part to justify this, but with bigger clients this is a matter of policy and you can't do anything about it.
Azure SQL Databases have CLR integration disabled as well.

Be very selective about the VBA code you want to migrate. Most of the VBA that comes to mind at first, is probably better rewritten in T-SQL or removed from the data processing logic alltogether.
I didn't do any performance tests on this, yet. I guess, the performance of the CLR code in isolation is probably very good, but what happens when you mix it with complex queries and T-SQL processing?

During development, I feel the context switch between raw T-SQL and .Net code is yet another context switch I would avoid unless really necessary.

To be Shared, the function must be self contained, it cannot call another function in the class. Visual Studio will tell you this if your function includes outside calls.
Well, no. -Research the Shared keyword in VB.net.

Shared functions do not have to be "self contained". You just cannot access any members of your class without first creating an instance of the class.
 

Galaxiom

Super Moderator
Staff member
Local time
Tomorrow, 04:54
Joined
Jan 20, 2009
Messages
12,849
Be very selective about the VBA code you want to migrate. Most of the VBA that comes to mind at first, is probably better rewritten in T-SQL or removed from the data processing logic alltogether.

I started translating Damearu-Levenshtein Distance function to SQL then discovered TSQL doesn't have arrays which featured heavily in the code. Compiling the VBA was the easiest way forward.

I didn't do any performance tests on this, yet. I guess, the performance of the CLR code in isolation is probably very good, but what happens when you mix it with complex queries and T-SQL processing?

Set based processes where a database engine excels are obviously best done in SQL but the performance of CLR functions exceeds even the very best written SQL for procedural tasks like parsing strings.

During development, I feel the context switch between raw T-SQL and .Net code is yet another context switch I would avoid unless really necessary.

Really no different from writing a VBA function in Access.
 

sonic8

AWF VIP
Local time
Today, 18:54
Joined
Oct 27, 2015
Messages
998
I started translating Damearu-Levenshtein Distance function to SQL then discovered TSQL doesn't have arrays which featured heavily in the code. Compiling the VBA was the easiest way forward.
That is actually a very good example for functionality that absolutely does make sense to implement with .Net inside SQL-Server.


However, there are also implementations of the Levenshtein algorithm in T-SQL. So, if it were just for that single function (or very few in total), I would still not use the CLR for it to avoid the additional context switch to .Net.

(The requirement for the CLR-enbaled server are another matter on top of that.)
 

Galaxiom

Super Moderator
Staff member
Local time
Tomorrow, 04:54
Joined
Jan 20, 2009
Messages
12,849
I'll be interested to compare performance between them.

I tested with Australian Suburb-State records. The vast majority were strings between 11 and 22 characters.

These were produced as a DISTINCT list from a cartesian product of all Suburbs where the first letter was the same on both resulting in about 219K records. The condition also included that one string was greater than the other, eliminating exact matches and processing each pair of string once.

I set the max parameter as 60 on the SQL function to effectively disable it because the CLR version processes everything.

SQL function: 1 minute 20 seconds.

CLR function: 30 seconds.

I'll add a max parameter to the CLR function and have a good look at the SQL version to see if it has anything else that might improve performance in the CLR.

(Note the function used in the CLR is not exactly the one in the link I posted to the VBA version. I modified it to eliminate the called MaxOf and MinOf functions that were a legacy from its original invocation in Excel.)
 

Galaxiom

Super Moderator
Staff member
Local time
Tomorrow, 04:54
Joined
Jan 20, 2009
Messages
12,849
I also note the SQL version is only the Levenshtein Distance.

Damearu-Levenshtein is considerably better in that accounts for the reversal of two characters rather than characterizing it as two separate substitutions. The original article referred to in the SQL version notes also provides a D-L algorithm in both C# and SQL.

The D-L SQL version took just over 2 minutes on the same test as above.
 

Users who are viewing this thread

Top Bottom