Optimizing a query with multiple LIKE '%abc%' (2 Viewers)

I am happy to be proved wrong but I believe all versions of Azure SQL can have a full text index?

I can't see anything here
that suggest otherwise?

Minty you are right I can use it with my current Azure Database. AI was initially telling me I cannot. Why doesn't full text index come installed by default with the local SQL Server installation? I might explore full text index now.

1761919277102.png
 
Is this with a local dev server (or same LAN)?

Make sure you test over WAN if you are using Azure for a more realistic sense of how long it takes. (y)
This is from my local machine. My machine is almost 5 years old so I find performance is better on Azure. Note, I haven't deployed this specific stored procedure on Azure. That will happen in about 2 months.
 
Thanks for everyone's feedback. The original stored procedure is working well. I have one more sp below. It works great except when the user doesn't provide the State parameter. Then it takes around 12 seconds with one FWC. Otherwise it's always under 3 seconds. vwManifestLines has 2,862,958 records in 2024.

Thank you for any tweaks.

Code:
ALTER   PROCEDURE [dbo].[GenYearlySummary]
    @Year INT,
    @State VARCHAR(8) = NULL,
    @FWC1 VARCHAR(8) = NULL,
    @FWC2 VARCHAR(8) = NULL,
    @FWC3 VARCHAR(8) = NULL,
    @FWC4 VARCHAR(8) = NULL,
    @ContainerTypeCode VARCHAR(3) = NULL,
    @ManagementMethodCode VARCHAR(5) = NULL,
    @GenName VARCHAR(255) = NULL
AS
BEGIN
    SET NOCOUNT ON;

    ;WITH LineMatch AS
    (
        SELECT
            ID, -- unique ManifestDetail Line ID                             
            GENERATOR_ID,
            FEDERAL_WASTE_CODES,
            CONTAINER_TYPE_CODE,
            MANAGEMENT_METHOD_CODE,
            QUANTITY_HAZ_KG
        FROM dbo.vwManifestLines as l
        WHERE SHIPPED_YEAR = @Year
          AND (@State IS NULL OR GENERATOR_LOCATION_STATE = @State)
          AND (@GenName IS NULL OR GENERATOR_NAME LIKE '%' + @GenName + '%')
          AND (@ContainerTypeCode IS NULL OR CONTAINER_TYPE_CODE = @ContainerTypeCode)
          AND (@ManagementMethodCode IS NULL OR MANAGEMENT_METHOD_CODE = @ManagementMethodCode)
          AND (@FWC1 IS NULL OR EXISTS (SELECT 1 FROM dbo.vwManifestLinesFWC f WHERE f.ID = l.ID AND f.FederalWasteCode = @FWC1))
          AND (@FWC2 IS NULL OR EXISTS (SELECT 1 FROM dbo.vwManifestLinesFWC f WHERE f.ID = l.ID AND f.FederalWasteCode = @FWC2))
          AND (@FWC3 IS NULL OR EXISTS (SELECT 1 FROM dbo.vwManifestLinesFWC f WHERE f.ID = l.ID AND f.FederalWasteCode = @FWC3))
          AND (@FWC4 IS NULL OR EXISTS (SELECT 1 FROM dbo.vwManifestLinesFWC f WHERE f.ID = l.ID AND f.FederalWasteCode = @FWC4))
    ),
    Totals AS
    (
        SELECT
            GENERATOR_ID,
            FEDERAL_WASTE_CODES,
            CONTAINER_TYPE_CODE,
            MANAGEMENT_METHOD_CODE,
            SUM(QUANTITY_HAZ_KG) AS TotalKg,
            COUNT(*) AS ManifestLineCount
        FROM LineMatch
        GROUP BY
            GENERATOR_ID,
            FEDERAL_WASTE_CODES,
            CONTAINER_TYPE_CODE,
            MANAGEMENT_METHOD_CODE
    )
    SELECT
        CAST(@Year AS SMALLINT) AS SHIPPED_YEAR,
        t.FEDERAL_WASTE_CODES,
        t.MANAGEMENT_METHOD_CODE,
        t.CONTAINER_TYPE_CODE,
        ISNULL(t.TotalKg,0) AS TotalKg,
        t.ManifestLineCount,
        t.GENERATOR_ID,
        g.GENERATOR_NAME,
        g.GENERATOR_LOCATION_STATE,
        g.GENERATOR_LOCATION_STREET_NO,
        g.GENERATOR_LOCATION_STREET1,
        g.GENERATOR_LOCATION_CITY,
        g.GENERATOR_LOCATION_ZIP
    FROM Totals t
    INNER JOIN dbo.tblGenerators g
        ON t.GENERATOR_ID = g.GENERATOR_ID;
END
 
As noted, just turn on full text indexing. Full text indexing will pluck out each word, and thus any word any place in the text will use a high speed index.
As a result, you probably not require the % sign.
So, say for this text

The user purchased a dell computer

With full text indexing, then you can search for "dell" and the record will be found. And quite sure you can even use a % on the trailing side (suffix) of the keyword you are looking for.

Or you could use "dell*", and that would found say dellaware in the text (anywhere in the text as a separate word).

You can't of course use "*dell", say to match "bigdell", but if in the free text you have "big dell", then you ok.

(so, a *some word (prefix) does not work).

So, I do recommend that you try full text indexing - even the free edition of SQL express supports full text indexing....

You will of course have to change the syntax of the wild card search, but really nice is multiple key words can be "feed" to the condition in one shot...

Full text indexing can thus save the effort and trouble of rolling some custom system.....

And yes you have to "turn on this feature" before you can use it.....
If when having installed SQL or even Express edition? Yes, you have to re-launch the installer, and make sure the full text indexing was installed.
You also have to launch the SQL services config, and make sure that full text service is turned on...

While your results are rather good right now? That's because lots of memory can be allocated to the one user. When you have a "load" on the system, then some of those tables (or even some of the indexing) will not be cached as well in production - since the server is under load.

Of course, the other issue? Well, you have to change your SQL syntax for such searching to work....

The user purchased a dellware computer
So, to find dell in above?
You would use this
WHERE Comments like '%dell%'

But, with full text indexing, then this:

WHERE Contains([Comments], 'dell*')

And really nice? You can have multiple conditions, say this:

CONTAINS([Description], '("car" OR "truck" OR "bike")')

R
Albert
 
Thank you Albert. You convinced me. I will invest the time to learn full text indexing.
We just turned it on about 2 months ago. We have a screen in which comments for a project (one of multiple components for the project) could be searched. We also wanted to be able to search for a project name, and also that of a company name.
Hence, there was about 2-4 fields, and we always had a "wild card" search.

So, if some company was called

West End Ace furnace and HVAC service?

We want to be able to search for

Ace HVAC

So, we would take each search word (Ace and HVAC) and convert that into

WHERE CompanyName is like '%Ace%' OR CompanyName is like '%HVAC%'

So, our code does a "split" on each word entered, and then we create multiple "like" statements from each word.

(this means the order of words when searching for a company don't matter, since we split out/up each word into a search)

I could also type in

HVAC Ace

Such searching was starting to take about 20 seconds, or more -- closer to 30.

And with 100 users/desktops? (and running free SQL express).

Well, even when say a full table (or full index) scan occurs?

Well, that puts huge loads on SQL server - and stress on the caching of such tables into memory. So, with one user - SQL server can hold a lot of that "bad" search in memory, but with 100 users - you can't, and if you could, you still don't want all that memory being used.

So, we full text indexed the Company name, and we also full text indexed the project name, and then for sub components, there was several more comments and description fields (child rows of the project). So, all in all, the number of rows to search?

In project header - about180,000 rows.
In project components (child of above), about 400,000 rows.

So, after converting the search to full text?

It's gone from 20-30 seconds down to LESS then 1/2 second!!!
(in other words, no perceptible delay exists now....).

So, being able to type in "any" keyword of a company - and order of search words does not matter?
And same for searching project comments?

Full text index performance? It's simply like magic....

Here's a sample of one such search screen:

1762018968156.png
 
I am still in shock how fast it is with almost a million Generator records.
Storing only the needed data in sorted order with a cluster index, indexing the right fields, using full text indexing, and disabling transaction logging works wonders!
All you need to do now is to cleanse the data every time you import it.
 
Last edited:
Storing only the needed data in sorted order with a cluster index, indexing the right fields, using full text indexing, and disabling transaction logging works wonders!
All you need to do now is to cleanse the data every time you import it.
I have a demo on Friday going over if I am querying the data in a meaningful manner. That meeting will tell me a lot about how to structure indexes and what the user's find valuable.

Thanks BlueSpruce and yes turning off logging has been a game changer in terms of disk space.
 

Users who are viewing this thread

Back
Top Bottom