Optimizing a query with multiple LIKE '%abc%'

ions

Access User
Local time
Today, 05:30
Joined
May 23, 2004
Messages
866
Hello MS Access Forum,

I have a stored procedure that relies on 4 LIKE '%abc%' statements. (See example code below) The table has ~850K records. Is there something that can be done to improve performance?

Thank you for your advice.

Code:
SELECT GENERATOR_ID
FROM dbo.tblGenerators
WHERE GENERATOR_NAME LIKE '%' + @SearchTerm + '%'
   OR GENERATOR_LOCATION_STREET1 LIKE '%' + @SearchTerm + '%'
   OR GENERATOR_LOCATION_CITY LIKE '%' + @SearchTerm + '%'
   OR GENERATOR_LOCATION_STATE LIKE '%' + @SearchTerm + '%';
 
Not really. Such queries cannot use indexes. They are "non-sargable" if using a 5-dollar word.

What I would do is use 4 search terms rather than one. After all, the user knows if they are searching for a name, street, city, or state.

State search field could be a dropdown of all the possible state names, and that would turn one of your LIKE statement into an = statement.
As another example, maybe you require City to be spelled from the first letter: PHOE is OK for Phoenix, but OENIX is not. That would eliminate the "%" in front of the search term, which would then allow an index on City to be used. Perhaps you can add a dropdown in front of the City field with options of Equals, Begins With, Contains, similar to what you see in Access and Excel.


One other suggestion: your query could have an extra field that is the concatenation of Name, Address, City, and State, and then you only need one LIKE clause, rather than 4.
 
Add something to the WHERE that can use an index to reduce the number to records to search. LIKE cannot use indexes with wild cards on the front.
 
One other suggestion: your query could have an extra field that is the concatenation of Name, Address, City, and State, and then you only need one LIKE clause, rather than 4.

Does SQL Server scan the entire table 4 times with 4 LIKEs?
 
To see how SQL Server resolves your query with 4 LIKEs, run the statement in SSMS and turn "Actual Execution Plan" on.
 
To see how SQL Server resolves your query with 4 LIKEs, run the statement in SSMS and turn "Actual Execution Plan" on.

It appears it's only one scan. Also it's much faster than I thought. I thought this was the bottle neck in my stored procedure but perhaps it's somewhere else.

1761850364187.png
 
The bottle neck was in other parts of my stored procedure. The speed is pretty good now (below 4 seconds). Below is the full stored procedure. Do you see any further efficiencies? I am really shocked how fast it is now. tblGenerators has 842,152 records and vwManifestLines has 4,164,527.

Thank you for your input.

Code:
ALTER   PROCEDURE [dbo].[GenHeadersByGenQF]
    @Year INT = 2024,
    @UserInput VARCHAR(250) = 'Walmart Chicago'
AS
BEGIN
    SET NOCOUNT ON;

    ;WITH SearchTerms AS
    (
        SELECT LTRIM(RTRIM(value)) AS Term
        FROM STRING_SPLIT(@UserInput, ' ')
        WHERE value <> ''
    ),
    GenMatches AS
    (
        SELECT g.GENERATOR_ID
        FROM dbo.tblGenerators g
        INNER JOIN SearchTerms st
            ON g.GENERATOR_NAME LIKE '%' + st.Term + '%'
            OR g.GENERATOR_LOCATION_STREET1 LIKE '%' + st.Term + '%'
            OR g.GENERATOR_LOCATION_CITY LIKE '%' + st.Term + '%'
            OR g.GENERATOR_LOCATION_STATE LIKE '%' + st.Term + '%'
        GROUP BY g.GENERATOR_ID
        HAVING COUNT(DISTINCT st.Term) = (SELECT COUNT(*) FROM SearchTerms)
    ),
    Totals AS
    (
        SELECT
            ml.GENERATOR_ID,
            SUM(ISNULL(ml.QUANTITY_HAZ_KG, 0)) AS decTotalKg,
            COUNT(*) AS intManifestLines
        FROM dbo.vwManifestLines ml
        INNER JOIN GenMatches gm ON ml.GENERATOR_ID = gm.GENERATOR_ID
        WHERE ml.SHIPPED_YEAR = @Year
        GROUP BY ml.GENERATOR_ID
    )
    SELECT
        g.ID,
        g.GENERATOR_ID,
        g.GENERATOR_NAME,
        g.GENERATOR_LOCATION_STREET_NO,
        g.GENERATOR_LOCATION_STREET1,
        g.GENERATOR_LOCATION_STREET2,
        g.GENERATOR_LOCATION_CITY,
        g.GENERATOR_LOCATION_ZIP,
        g.GENERATOR_LOCATION_STATE,
        g.Latitude,
        g.Longitude,
        t.decTotalKg,
        t.intManifestLines
    FROM tblGenerators g
    INNER JOIN Totals t ON g.GENERATOR_ID = t.GENERATOR_ID
    ORDER BY t.decTotalKg DESC, g.GENERATOR_NAME ASC;
END
 
Perhaps two queries , the first just the last %

LIKE '@SearchTerm + '%'

Which will use an index (assuming the field is indexed). And then a second query using

LIKE '%' + @SearchTerm'

On the results from the first query
 
Perhaps two queries , the first just the last %

LIKE '@SearchTerm + '%'

Which will use an index (assuming the field is indexed). And then a second query using

LIKE '%' + @SearchTerm'

On the results from the first query
You are right the State is "NY" or "MI" and should be an =. No need to make that a LIKE at all.

Thank you CJ
 
State is indexed. However, it's possible that the user may not enter a State. I think separating it into 2 queries may add too much complexity? Every search I tried now runs under 3 seconds and my machine is not that powerful.

I think I will search on the full State name because something like 'MI' or 'CA' is causing too many unnecessary hits.

Thank you for all the advice! Much appreciated.


Code:
SELECT g.GENERATOR_ID

        FROM dbo.tblGenerators g

        INNER JOIN SearchTerms st

            ON g.GENERATOR_LOCATION_STATE = st.Term

            OR g.GENERATOR_NAME LIKE '%' + st.Term + '%'

            OR g.GENERATOR_LOCATION_STREET1 LIKE '%' + st.Term + '%'

            OR g.GENERATOR_LOCATION_CITY LIKE '%' + st.Term + '%'

        GROUP BY g.GENERATOR_ID

        HAVING COUNT(DISTINCT st.Term) = (SELECT COUNT(*) FROM SearchTerms)
 
Finally query. I am still in shock how fast it is with almost a million Generator records.

Code:
CREATE OR ALTER   PROCEDURE [dbo].[GenHeadersByGenQF]
    @Year INT = 2024,
    @UserInput VARCHAR(250) = 'Walmart Chicago'
AS
BEGIN
    SET NOCOUNT ON;

    ;WITH SearchTerms AS
    (
        SELECT LTRIM(RTRIM(value)) AS Term
        FROM STRING_SPLIT(@UserInput, ' ')
        WHERE value <> ''
    ),
    GenMatches AS
    (
        SELECT g.GENERATOR_ID
        FROM dbo.tblGenerators g
        INNER JOIN dbo.tblStates s
            ON g.GENERATOR_LOCATION_STATE = s.State
        INNER JOIN SearchTerms st
            ON s.StateName LIKE st.Term + '%'
            OR g.GENERATOR_NAME LIKE '%' + st.Term + '%'
            OR g.GENERATOR_LOCATION_STREET1 LIKE '%' + st.Term + '%'
            OR g.GENERATOR_LOCATION_CITY LIKE '%' + st.Term + '%'
        GROUP BY g.GENERATOR_ID
        HAVING COUNT(DISTINCT st.Term) = (SELECT COUNT(*) FROM SearchTerms)

    ),
    Totals AS
    (
        SELECT
            ml.GENERATOR_ID,
            SUM(ISNULL(ml.QUANTITY_HAZ_KG, 0)) AS decTotalKg,
            COUNT(*) AS intManifestLines
        FROM dbo.vwManifestLines ml
        INNER JOIN GenMatches gm ON ml.GENERATOR_ID = gm.GENERATOR_ID
        WHERE ml.SHIPPED_YEAR = @Year
        GROUP BY ml.GENERATOR_ID
    )
    SELECT
        g.ID,
        g.GENERATOR_ID,
        g.GENERATOR_NAME,
        g.GENERATOR_LOCATION_STREET_NO,
        g.GENERATOR_LOCATION_STREET1,
        g.GENERATOR_LOCATION_STREET2,
        g.GENERATOR_LOCATION_CITY,
        g.GENERATOR_LOCATION_ZIP,
        s.StateName AS GENERATOR_LOCATION_STATE,
        g.Latitude,
        g.Longitude,
        t.decTotalKg,
        t.intManifestLines
    FROM tblGenerators g
    INNER JOIN Totals t
        ON g.GENERATOR_ID = t.GENERATOR_ID
    INNER JOIN tblStates s
        ON g.GENERATOR_LOCATION_STATE = s.State
    ORDER BY t.decTotalKg DESC, g.GENERATOR_NAME ASC;
END
 
Unless you can add a full text index to one of those columns, I don't think there's much that can be done.
 
Unless you can add a full text index to one of those columns, I don't think there's much that can be done.
The production version will be going on Azure and I believe only the high cost tiers allow full text index. Also when I attempted to implement full text index on my local machine I got the message that this feature is not installed.
 
SQL:
-- ...
    Totals AS
    (
        SELECT
            ml.GENERATOR_ID,
            SUM(ISNULL(ml.QUANTITY_HAZ_KG, 0)) AS decTotalKg,
-- ...
Do you need the ISNULL() here or is it another extra unnecessary function call?

I *think* SUM(ml.QUANTITY_HAZ_KG) AS decTotalKg should yield the same result.
 
The production version will be going on Azure and I believe only the high cost tiers allow full text index. Also when I attempted to implement full text index on my local machine I got the message that this feature is not installed.
I am happy to be proved wrong but I believe all versions of Azure SQL can have a full text index?

I can't see anything here
that suggest otherwise?
 

Users who are viewing this thread

Back
Top Bottom