Nested Calc'd Columns - Parallelism, Indexing & Efficiency

dalski

Active member
Local time
Today, 03:03
Joined
Jan 5, 2025
Messages
357
Following this thread; I am trying to find the best execution plan to remain fully recalculable & benefit from caching with fast indexing for calculated columns. If I nest views I can index the first view (great) but any subsequent views thereafter loose the ability to index child-views as an indexed view cannot be created when a view is included in the select statement; amongst many other restrictions with an indexed view (not good).
I am trying to avoid already calculated results. So use a persistent UDF - then I have to feed many unnecessary variables to View_2, obfuscating the math & I'm also concerned what happens with persisted results. I don't know if they are cleared from memory in SQL Server/ what happens as the continuous build of endless results would then be slower to retrieve than a non-indexed lot of nested views (with un-indexed memory pointers; forcing full-scan's just to get a single result). Older versions of SQL Server using a scalar function is also bad because it stops the execution plan benefitting from parallelism. I don't know if Ver-22 is completely ridded of this?

The Problem
  • Tbl1 contains info to deduce calc1 calculated result; which is deterministic & other views feed off this calc. My plan was to have a view with this calculation & join the calc'd view column (in the view) to the original table with indexes in the view & table to maximise speed. Allows for automatic recalculation (opposed to a calc'd column in the table itself). BUT the problem arises in subsequent child-views which are dependant on this calc'd column in the view; see View_2:
  • View_2 - uses calc1's result & new info in Tbl2 to deduce calc2, also deterministic. There will be many feeding off calc1 & calc2 so not all of them will be deterministic.
I thought a calc'd table column in Tbl1 for calc1 (inside the table itself) may be an alternative as the child calculations are so dependant on calc1. However already open views failed to update calc1. Whereas if referencing calc1 in View1 did update in the already open views...
Also worried that many calc's benefit in storage space but increased cost on CPU for determining. I can reduce this by using Temp Tables I think with stored procedures to limit what is shown in a view & build a temporary view with indexes on launching the application I guess would be the way to do it.

Just found MSN's Guide On Query Processing Architecture which I hope bears fruit. Have researched inexing, B-Trees, row & batch execution...

Seems a very basic thing that many would have done before -
What is the best route for calculation when you need to reference that calculation in many other places?
 
Last edited:
How long does recalculating take? SQL Server will keep the data in the cache until there's enough data pressure to force it out. It doesn't look like you've made a case for storing calculated data yet. Normally you don't do that. If you create an indexed view, I think the engine stores the results (internally) as a table.
 
When query performance tuning, it is best to start simple to identify where the bottlenecks exist and then work your way from there. You note that you expect to have two layered views with the second view using the first view. Put that aside for now. First, focus on your first view and the underlying table. Some initial first things to try:
  • Run your first view and evaluate the performance of that. Is it running unsatisfactorily slow?
  • Now run your first view where you exclude your user-defined function from it. Did that significantly change performance? Is the view now running very fast compared to when you ran it with the udf? This will help identify whether there are any performance bottlenecks without the udf or if the bottleneck is the udf.
If possible, also show us the query from your first view and that will help in evaluating potential performance red flags or issues.

There will be additional steps from here of course including bringing view 2 into the picture, but this will give an initial baseline.
 
If you create an indexed view, I think the engine stores the results (internally) as a table.
1768207286844.png

👏 Seems correct MadPiet.

It doesn't look like you've made a case for storing calculated data yet. Normally you don't do that.
Thanks, I will not store calc'd data for now; appreciate your input.

How long does recalculating take?
identify where the bottlenecks exist and then work your way from there..
I have no real data in the application yet. In order for me to populate the data I need to build the queries & forms. Because the data is so inherent from different tables I need to build these in or
der to then populate the data. The query structure will affect the build of the application.

When query performance tuning, it is best to start simple to identify where the bottlenecks exist and then work your way from there. You note that you expect to have two layered views with the second view using the first view. Put that aside for now. First, focus on your first view and the underlying table.
I had first 2 views built & running well but with just a few records but with no real data in the app & recently discovering Indexed Views I thought Indexed Views must be better performance. Then discovered the many restrictions of Indexed Views; only Inner Joins permitted.

If possible, also show us the query from your first view and that will help in evaluating potential performance red flags or issues.
I may just have the values of the calculation in the views (which are for calc'd columns). Depends on how hard my joins are later with several sources... Obviously attempting to obfuscate real values something like below pseudo code:
Tbl1
IDProductNrNrOrders
Vw-1
SQL:
SELECT
    Cast(If NrOrders Is Null Condition1, Else NrOrders * ProductNr, Decimal()) AS OrderQty
FROM
    Tbl1
    INNER JOIN TblX ON TblXID = Tbl1ID
WHERE
    CONDITION 1 = SOMETHING
    CONDITION 1 = SOMETHING

Tbl2
Similar to Tbl1
Vw-2
SQL:
SELECT * ,
    Cast(If Tbl2..NrOrders Is Null Condition1, Else Tbl2.NrOrders * Tbl2.Cost, Decimal()) AS OrderQtyLv2
FROM
    Tbl2
    INNER JOIN Vw-1 ON Vw-1.ID = Vw-2.FK
WHERE
    CONDITION 1 = SOMETHING
    CONDITION 1 = SOMETHING

Maybe the indexes on the views are not required & Order By then avoids full scans as it knows they are stored in order. If that's the case then I think I might be ok.
 
Last edited:
Thanks Tom, I was casting a decimal; probs made a crap job of obfuscating my real code; sorry. The biggest questions I have atm is -

1 - Is there a benefit to indexing a view on an column which is already an index in the table?
2 - Is it worth all the aggravation, trying to separate & nest tables as much as possible to allow indexing on the resultant views.
 
I think the answers are No, No.
Have you tried reviewing the execution plan? And maybe also the Database Engine Tuning Advisor?
 
Thanks Tom, I did check the execution plan's, there were quite a few steps compared to basic tut's I've done, I've spent a good few days researching & nearly finished rewriting with indexed views. I'll test & post results but i only have a few rows.

EDIT - I am forgetting, it is necessary to create a clustered index on a view. Only then are you able to make nonclustered indexes on that view. As the ID field is the same field as in the table it is required it makes sense for me to use that ID. Hopefully it gets an execution benefit. No doubt I'm doing it wrong.
 
Last edited:
Thanks Tom, I was casting a decimal; probs made a crap job of obfuscating my real code; sorry. The biggest questions I have atm is -

1 - Is there a benefit to indexing a view on an column which is already an index in the table?
2 - Is it worth all the aggravation, trying to separate & nest tables as much as possible to allow indexing on the resultant views.
On #2 - Denny Cherry (among others) talks about crazy nesting of views. From a debugging perspective, it's a total nightmare. Whatever gains in performance you may get could be offset by the complexity of maintaining something like that.
 
Thanks MadPiet, yeah I've read that too, & probably the reason why I've been on it a few days. On a positive spin I'm trying to obfuscate code as much as possible so that's a positive. But it is quite awkward I must admit. I keep looking back & forth.
 
I may just have the values of the calculation in the views (which are for calc'd columns). Depends on how hard my joins are later with several sources... Obviously attempting to obfuscate real values something like below pseudo code:
.....
Maybe the indexes on the views are not required & Order By then avoids full scans as it knows they are stored in order. If that's the case then I think I might be ok.
Thanks for the additional information. I understand you are in the early stages. As a few initial things for consideration as you move forward with this:
  • Focus on Effectively Indexing the Table(s): SQL Server is very good with utilizing underlying table indexes when designed well with your specific query needs in mind. I have found that with good table indexing and query design, it is almost never necessary to index views (which are limiting and add complexity). SQL Server's query optimizer will utilize table indexes where it thinks they will be beneficial even in complex multi-layered queries. Good indexing however can be difficult to come up with. Some key factors include creating indexes that will help with lookups (fields that you expect will be used in WHERE clauses and/or in joins) and sorting (fields expected to be used in ORDER BY clauses).

  • For better query performance on large intensive queries, avoid User-Defined Functions (UDFs) when possible: UDFs create additional overhead and make it difficult for SQL Server to optimize query plans. Performing your calculation in the SQL query itself instead of using a UDF can result in much better performance as SQL Server can usually better optimize the query plan. Remember if your query is selecting 1 million rows, your UDF is going to be called 1 million times. Also, performance can also be very poor with UDFs and it is generally considered a very bad practice to use a UDF in a WHERE clause (SELECT....WHERE dbo.CalculateProjectedSales(ProductID) > 100000). UDFs can be very powerful and this is not to say they should never be used. But for large scale query performance considerations, they must be used very carefully and avoided in certain cases when possible.

  • Consider use of a Stored Procedure instead of a View in certain circumstances: Stored procedures can provide ways to optimize queries (and obfuscate logic from users if that is a necessity) for your application. If in your case the purpose of your query is for reporting purposes in your application (not to be used for editing records in a bound form), you can build a stored procedure that accepts input parameters to be used for filters (used in a WHERE clause) or sorting and dynamically builds and then returns the results of your SELECT query. Stored Procedures can be called in pass-through queries in MS Access or through ADO in VBA. This provides additional capabilities such as the ability to use #temp tables within your Stored Procedure as ways to further optimize performance.
 

Users who are viewing this thread

Back
Top Bottom