Range of ASCII values in criteria (1 Viewer)

Would be interesting to compare the speed when using a single creation of the RegEx object as a module variable.

Code:
Dim re As Object
Dim Match As Variant

Public Sub RunMeOneTime()
    Set re = CreateObject("VBScript.RegExp")
End Sub

Public Function RegExIllegalSearch(ByVal SearchTarget As String) As Boolean

    re.Pattern = "^[ '\-\A-Za-z\xC0-\xFF]*$"
    
    For Each Match In re.Execute(SearchTarget)
        RegExIllegalSearch = Len(Match.Value)
    Next

End Function

Or using Early Binding (with a Reference to VBScript Regular Expressions)

Code:
Public Function RegExIllegalSearch(ByVal SearchTarget As String) As Boolean

Dim re As New VBScript.RegExp
Dim Match As Variant

    re.Pattern = "^[ '\-\A-Za-z\xC0-\xFF]*$"
    
    For Each Match In re.Execute(SearchTarget)
        RegExIllegalSearch = Len(Match.Value)
    Next

End Function
 
Colin,

Attached is a zip file of the database. A couple of forms were added to help describe the database. Some debug.print lines were made comments to reduce the output volume.
Procedure Reptest will run the compare 20 times with all output going to the immediate window.
You can move the output to a table to do the min/max/stdDev if necessary.
In several sample runs the difference between Regex and NoRegex execution times seems moot.

Here is a sample extract from the output

Code:
Processing table Worldcities with 12959 Records 
 from https://simplemaps.com/data/world-cities  
=========================================================================

QryRegex started  :26-Nov-19 9:09:03 AM
QryRegex ended     27  ticks
QryNoRegex started :26-Nov-19 9:09:03 AM
QryNoRegex ended   26 ticks

Processing table Worldcities with 12959 Records 
 from https://simplemaps.com/data/world-cities  
=========================================================================

QryRegex started  :26-Nov-19 9:09:03 AM
QryRegex ended     26  ticks
QryNoRegex started :26-Nov-19 9:09:03 AM
QryNoRegex ended   27 ticks

Processing table Worldcities with 12959 Records 
 from https://simplemaps.com/data/world-cities  
=========================================================================

QryRegex started  :26-Nov-19 9:09:03 AM
QryRegex ended     27  ticks
QryNoRegex started :26-Nov-19 9:09:03 AM
QryNoRegex ended   28 ticks

Processing table Worldcities with 12959 Records 
 from https://simplemaps.com/data/world-cities  
=========================================================================

QryRegex started  :26-Nov-19 9:09:03 AM
QryRegex ended     26  ticks
QryNoRegex started :26-Nov-19 9:09:03 AM
QryNoRegex ended   27 ticks

Processing table Worldcities with 12959 Records 
 from https://simplemaps.com/data/world-cities  
=========================================================================

QryRegex started  :26-Nov-19 9:09:03 AM
QryRegex ended     26  ticks
QryNoRegex started :26-Nov-19 9:09:03 AM
QryNoRegex ended   27 ticks

Hope this is helpful.

jack
 

Attachments

Hi Jack
Thanks to both you and Greg for both the code and dataset.
That was very helpful as I'm very much a novice with Regex.

I've made a few minor changes to your test app.
These were mainly cosmetic so I could save the results to a table but I've also left the debug code in place for those who prefer that.
However, I've also made some changes e.g. to explicitly close each query after each test loop.

More importantly, and unlike you, I've found a significant difference in the test speed for each approach.
The results were consistent for each test both of which I ran 100 times (5 loops) on 2 different PCs
The Regex results take far longer in each case - 6x longer on my desktop PC & 3.5x longer on my underpowered tablet

attachment.php
 

Attachments

  • AvgResults.PNG
    AvgResults.PNG
    5.9 KB · Views: 416
  • DataTest_v2.zip
    DataTest_v2.zip
    715.8 KB · Views: 289
Colin,

I saw that Regex was taking longer on the first run, but that subsequent runs were taking significantly less time. I took values from the repeated runs, and closed the queries at the end of the session. I guess that extra time was for the creation of the regex object??
 
There was a slight delay on the first test run as Access is creating the query execution plan (QEP) for each test. However its not that significant in my tests.

attachment.php


Compacting the utility will delete the QEP so Access will need to create it again on another run

However, partly avoid variations caused by the QEP, I normally use SQL statements in code when running speed comparison tests.

I was very surprised how much slower the Regex code was in practice
 

Attachments

  • TestResults.PNG
    TestResults.PNG
    28 KB · Views: 459
I was very surprised how much slower the Regex code was in practice

I guess what RegEx is doing is overkill. Its strength is finding very complex and repeating patterns.
 
The other technique I mentioned was joining to a table of disallowed characters with a Not Like condition. If the tested strings are limited to 255 ASCII characters then the number of disallowed characters isn't so large.

This way the test is done with the core set processing magic of a database engine instead of working through the strings character by character. It never ceases to amaze me how so much can be processed so quickly on an engine.

I use a similar technique to search hundreds of thousands of records for being like dozens of string patterns. Admittedly it runs in SQL Server but I did do earlier less ambitious version in Access. I think the performance in Access just looking for a hundred or so characters might still surprise.

It would be quite easy to populate the table of characters by deriving Chr(n) against a Tally table and testing the output with Jack's function.
 
As a novice user of Regex, I really couldn't say.
Perhaps you could supply a comparison based on a more complex use of Regex which might work to its strengths e.g as suggested in the last post?
 
The topic is old, but I would like to note that the result for the regex variant will look much better if you implement what @Galaxiom has already mentioned: Instead of reinitializing the regex object with each function call, you should use a persistent object => if 500 people enter the same room at the same time, you won't close the door behind each individual in order to open it again before the next person.

Original version of the function and two variants. These only implement some processing logic. The main part = finding the right search pattern has already been done.
Code:
Public Function RegExIllegalSearch(ByVal SearchTarget As String) As Boolean

10        On Error GoTo RegExIllegalSearch_Error
      Dim re As Object
      Dim Match As Variant

20        Set re = CreateObject("VBScript.RegExp")

30        re.Pattern = "^[ '\-\A-Za-z\xC0-\xFF]*$"
         
40        For Each Match In re.Execute(SearchTarget)
50            RegExIllegalSearch = Len(Match.Value)
60        Next

         
70        On Error GoTo 0
80        Exit Function

RegExIllegalSearch_Error:

90        MsgBox "Error " & Err.Number & " (" & Err.Description & ") in procedure RegExIllegalSearch, line " & Erl & "."

End Function

Deployment of the persistent object:
Code:
Public Function RegExIllegalSearch_V2(ByVal SearchTarget As String) As Boolean

10        On Error GoTo RegExIllegalSearch_Error
      Static re As Object
      Dim Match As Variant

20        If re Is Nothing Then Set re = CreateObject("VBScript.RegExp")

30        re.Pattern = "^[ '\-\A-Za-z\xC0-\xFF]*$"
         
40        For Each Match In re.Execute(SearchTarget)
50            RegExIllegalSearch_V2 = Len(Match.Value)
60        Next

         
70        On Error GoTo 0
80        Exit Function

RegExIllegalSearch_Error:

90        MsgBox "Error " & Err.Number & " (" & Err.Description & ") in procedure RegExIllegalSearch, line " & Erl & "."

End Function

The test method is sufficient to check whether something occurs. You don't have to create and evaluate a matchcollection object via Excute.
Code:
Public Function RegExIllegalSearch_V3(ByVal SearchTarget As String) As Boolean

10        On Error GoTo RegExIllegalSearch_Error
      Static re As Object

20        If re Is Nothing Then Set re = CreateObject("VBScript.RegExp")

30        re.Pattern = "^[ '\-\A-Za-z\xC0-\xFF]*$"
         
40
50        RegExIllegalSearch_V3 = re.Test(SearchTarget)
60

         
70        On Error GoTo 0
80        Exit Function

RegExIllegalSearch_Error:

90        MsgBox "Error " & Err.Number & " (" & Err.Description & ") in procedure RegExIllegalSearch, line " & Erl & "."

End Function
 
Last edited:

Users who are viewing this thread

Back
Top Bottom