Author |
Topic |
cornall
Posting Yak Master
148 Posts |
Posted - 2012-02-03 : 06:04:09
|
Hi,I am working on a project where nested views seem to make sense for data agregation to prevent code duplication. However I understand that nested views perform poorly. How can I replace these with stored procs?Here is what I want to achive simplified down. I know for the below example it looks like table 2 and 3 should be combined but in reality they contain a number of differnt columns.Table1VerIdDeptIdCreatorIdTable2type1IdfkVerIdunitscostTable3type2IdfkVerIdunitscostview type1PriceSELECT type2Id ,fkVerId ,(units * cost) PriceFROM Table2 view type2PriceSELECT type2Id ,fkVerId ,(units * cost) PriceFROM Table3 view versionPriceSELECT VerId ,DeptId ,CreatorId ,SUM(v1.Price) + SUM(v2.Price) TotalPriceFROM Table1 LEFT JOIN type1Price v1 ON v1.fkVerId = VerId LEFT JOIN type2Price v2 ON v2.fkVerId = VerIdGROUP BY VerId ,DeptId ,CreatorId To me the above gives a nice simple way of getting the versionCost for either and individual version, a departments versions or a creators versions. It also allows me to get the type 1 and type 2 prices for a version from views type1Price and type2Price without duplicating the aggregation code in versionPrice.How and should this be replaced with stored procs? |
|
Kristen
Test
22859 Posts |
Posted - 2012-02-03 : 06:23:09
|
I can't see that a stored procedure is going to make any difference.How about putting an index on the View 9so that it is, in effect, cached rather than having to be "calculated" each time)The reason your versionPrice view is "slow" is that it has to add up all the child rows in Table2 and Table3. If that is too slow then you will need to "cache" them instead. For example, you could have triggers on Table2 and Table3 that maintain a "running total" in a new cache table (columns VerId, DeptId, TotalPrice) as the new rows are added, deleted or modified. |
 |
|
cornall
Posting Yak Master
148 Posts |
Posted - 2012-02-03 : 06:27:55
|
Unfortunatly in the real solution I join on tables in a lookup database so I can't index the views as schema binding isn't allowed. It just seems that when selecting from view versionPrice for a single version the performance is poor. Not keen on storing calculated values if I can avoid this as I understand this is also bad practice.Also in reality my structure is more complex and I nest up to 5 views to give totals at a number of levels. |
 |
|
Kristen
Test
22859 Posts |
Posted - 2012-02-03 : 06:49:11
|
"Not keen on storing calculated values if I can avoid this as I understand this is also bad practice"Why do you think its bad practice?Keeping a running total to reduce the amount of recalculate-every-time-it-is-needed seems like good practice to me!You can have a scheduled task that compares the running total table to freshly made totals and alerts if they are different - thus you will know if ever there is a "flaw" in your trigger.Just replace the VIEW with a persistent table [i.e. of the same name], and triggers on the underlying tables, no program changes needed :) |
 |
|
Kristen
Test
22859 Posts |
Posted - 2012-02-03 : 06:57:42
|
How about: try it, do some performance tests, put the view back if it is not hugely faster.EXEC sp_rename 'MyViewName', 'MyViewName_OLD', 'OBJECT'SELECT * INTO MyViewName FROM MyViewName_OLDCREATE [clustered] Primary Key on the appropriate columns test !(although these won't dynamically change as new data is added) |
 |
|
cornall
Posting Yak Master
148 Posts |
Posted - 2012-02-03 : 07:14:22
|
It will be much faster I do understand that. I am really trying to understand why nested views are so bad and how stored procs might be a better option. The issue I am getting with nested views seems to be that the query plan is pulling back all the rows of the nested view then doing the sort and filter on the final view.e.g.SELECT * FROM versionPrice WHERE VerId = 1 Pulls back everything from type1Price and type2Price thus doing (units * cost) on every row. Then selects the rows it needs.Why does the query planner make such bad descisions when using nested views. |
 |
|
Kristen
Test
22859 Posts |
Posted - 2012-02-03 : 07:44:24
|
You could try the composite SQL and see if you get a better query plan. And you could fiddle with it to see if you can improve it, and then change your VIEWs based on what you found.SELECT ... YourColumns ...FROM( SELECT VerId ,DeptId ,CreatorId ,SUM(v1.Price) + SUM(v2.Price) TotalPrice FROM Table1 LEFT JOIN ( SELECT type2Id ,fkVerId ,(units * cost) Price FROM Table2 ) AS v1 ON v1.fkVerId = VerId LEFT JOIN ( SELECT type2Id ,fkVerId ,(units * cost) Price FROM Table3 ) AS v2 ON v2.fkVerId = VerId GROUP BY VerId ,DeptId ,CreatorId) AS XWHERE ... you where clause ... but personally I would just change it. You are always going to be SUM()'ing all the rows in Table2 & Table3 every time you use the views. We never build subs-systems that do that, they just don't scale. Its fine for the occasional report, but not for anything which is used frequently. |
 |
|
cornall
Posting Yak Master
148 Posts |
Posted - 2012-02-03 : 07:51:42
|
Thanks what I want to understand is why does SQL server do the sum for all the rows rather than getting the rows it needs then doing the sum? |
 |
|
Kristen
Test
22859 Posts |
Posted - 2012-02-03 : 09:25:04
|
You can fiddle with the "unwound" query (i.e. replace all views with the actual code from the VIEW itself) and see how it performs.Hopefully it will perform exactly the same and then you can look at what indexes are / are not being used, try variations, add indexes, and so on until you get something that performs well.I can't help you - you've only got a simplistic example here (which is fine, but I very much doubt that it mirrors your real life query), and you need to try it with your data, schema, and so on. If you've got some specific questions from your findings I'm sure folk here will help.But if it was me I would build a cache table |
 |
|
cornall
Posting Yak Master
148 Posts |
Posted - 2012-02-03 : 10:34:14
|
Thanks you are right my actual query is much more complex. I will build the SQL and see how it goes. |
 |
|
cornall
Posting Yak Master
148 Posts |
|
Kristen
Test
22859 Posts |
Posted - 2012-02-03 : 13:22:18
|
Looks like an excellent solution, and thanks for reporting back.Like you, I don't understand why the nested views didn't allow the optimiser to "just solve this" - but that's life, and you wouldn't have been here if it had "just worked"You still have a problem (do you?) that the original code for the type1Price and type2Price VIEWs needs to be duplicated in the versionPrice view definition? (unless I have misunderstood), which would be a bit of a PITA and someone, at some time, will update one and forget to update the other ... but most of us have some of that sort of "If you change THIS you must also change THAT" comments in our code ...I'd still use a caching table!! |
 |
|
Kristen
Test
22859 Posts |
Posted - 2012-02-03 : 13:23:28
|
P.S. TBH I thought that the Query optimiser did JOIN Elimination on Views as a matter of course ... |
 |
|
yosiasz
Master Smack Fu Yak Hacker
1635 Posts |
Posted - 2012-02-03 : 14:12:46
|
we used cte for a similar thing. it worked great in the beginning but it could not scale as underlying tables grew. it might be sweet for small data set but make sure you test it with a larger data set. it bit us and we had to change things once it was deployed , it could not scaleIf you don't have the passion to help people, you have no passion |
 |
|
cornall
Posting Yak Master
148 Posts |
Posted - 2012-02-03 : 15:24:16
|
Hi,Thanks yosiasz I did think of that but at least it buys me some time! Probably in the end I will end up with some sort of calculated value being stored.Perhaps you know the answer to my next question does the use of CTE cause the whole undelying view to be loaded or does the query analyser realise I only want a subset of the CTE table and retrive that? I am guessing this may be why it isn't scalable?Kristen the real solution isn't exactly as above I just tried to simplify things so I don't think I have an issue! Although I may still end up with calcluated values being stored :-)Thanks |
 |
|
cornall
Posting Yak Master
148 Posts |
Posted - 2012-02-04 : 07:28:44
|
I loaded 20k quotes into the test system which is twice as many as the old system had in 7 years and it still ran ok and faster than before i changed to CTE so I reckon I might be ok.I can plan some archiving to denormalise old quotes into archive tables to keep it running like lightning. |
 |
|
Kristen
Test
22859 Posts |
Posted - 2012-02-04 : 13:24:45
|
Agreed, 20k is probably not enough to cause any significant slowdown ... although SUM'ing the child rows might cause blocking. Hopefully that won't be a side-effect problem for you, or you are already using Read Committed Snapshot? |
 |
|
cornall
Posting Yak Master
148 Posts |
Posted - 2012-02-04 : 13:45:48
|
Each quote is made up of versions which are made up of elements which are made up of cost types so the lower levels have several hundred thousand rows.There isn't much concurent access to quotes so hopefully locking shouldn't be a problem. I think archiving old quotes out of the system is the best bet to keep it quick. |
 |
|
|