Author |
Topic |
coderm3
Starting Member
3 Posts |
Posted - 2010-08-04 : 23:01:35
|
I have a question on what the optimal Clustered index design would be.Consider the following scenario:I have a table called "ConsumerPayments". With the following columns: AccountNum (int), Deleted (bit), PaymentDate (datetime), PaymentAmount (money), PaymentReversed (bit), PostedBy (int), ReversedBy (int) And then, the following three SQL statements would be ran equally as often against the database:SELECT SUM(PaymentAmount) FROM ConsumerPayments WHERE DateDiff(Month, PaymentDate, '7/1/2010')=0 AND Deleted=0SELECT SUM(PaymentAmount) FROM ConsumerPayments WHERE Deleted=0 AND AccountNum = 12345SELECT PaymentAmount, PostedBy FROM ConsumerPayments WHERE Deleted=0 AND AccountNum = 12345ORDER BY PaymentDate DESCMore often than not, the AccountNum in the SQL critria would be one of higher AccountNum's (i.e. more often to be 12345 than 123) and PaymentDate will more often than not be the most recent dates.Which of the following clustered indexes would make the most sense:Deleted(asc), AccountNum(desc), PaymentDate(desc)ORDeleted(asc), PaymentDate(desc), AccountNum(desc)ORPaymentDate(desc), AccountNum(desc), Deleted(asc)OR something else?I am at a loss.Thanks for your help!Jason |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2010-08-05 : 01:52:36
|
I wouldn't put the cluster on any combination. Too wide. Cluster should be narrow among other considerations. For the cluster I'd probably consider either AccountNum (if it's an identity) or the paymentdate.Then a nonclustered index on Deleted, AccountNum, PaymentDate. In that order.If paymentdate is used within date calculations in the where clause it's not SARGable and can't be used for an index seek. Also, if it's usually used in range filters it should be the last column.http://sqlinthewild.co.za/index.php/2009/01/19/index-columns-selectivity-and-equality-predicates/http://sqlinthewild.co.za/index.php/2009/02/06/index-columns-selectivity-and-inequality-predicates/--Gail ShawSQL Server MVP |
 |
|
AndrewMurphy
Master Smack Fu Yak Hacker
2916 Posts |
Posted - 2010-08-05 : 03:09:51
|
If paymentdate is used within date calculations as specified in the where clause it's not SARGable.There is an alternative way to do the DateDiff(Month, PaymentDate, '7/1/2010')=0 calculation which brings the date column into contention for use within an index. Check here for examples - members Kristen and the FAQ page may have links. |
 |
|
coderm3
Starting Member
3 Posts |
Posted - 2010-08-05 : 13:53:06
|
GilaMonster:The AcountNum isn't an identity. I actually have another column PaymentID that is an identity and is also the primary key. This is rarely used in a criteria and if it is, it is never a range it woud always only refernce one row (for updates queries).The AccountNum field in this table is not unique, there could me multiple instances as there could be multiple payments for that account.When you say filters should be in the last column, do you mean as part of the index or as a INCLUDE? I'd assume as part of the index so that it's sorted? Thanks for the articles - they help a lot. |
 |
|
russell
Pyro-ma-ni-yak
5072 Posts |
Posted - 2010-08-05 : 15:59:46
|
by the way, no point in indexing deleted. it's a bit field and indexing it is useless |
 |
|
coderm3
Starting Member
3 Posts |
Posted - 2010-08-05 : 22:59:56
|
Why is it that the SQL "Database Engine Tuning Advisor" always suggests indexes on bit fields? |
 |
|
Kristen
Test
22859 Posts |
Posted - 2010-08-06 : 02:15:17
|
If DELETED is 99% one thing, and 1% the other then, yes, an index on DELETED for the 1% value will probably be of value. But that index will never be used if you are looking for the 99% value.I'm sure that it can be more than 1% ... but I don't know how much more ... an index key column has to be "selective" to be used - otherwise the query planner will just look at the statistics and decide its way too many disk reads and too much disk head travel to look in the index, and then find that row in the table, so it will just scan the whole table instead ... plus you've still got all the overhead of disk space for the index and maintaining it whenever a row is inserted or deleted, or the DELETED column value changes.So ... index on BIT column very rarely of any worthwhile benefit, regardless of what the Index Wizard says |
 |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2010-08-06 : 06:26:24
|
quote: Originally posted by russell by the way, no point in indexing deleted. it's a bit field and indexing it is useless
I totally disagree. There's little point in a single-column index on a bit, but as part of a composite index is another matter.--Gail ShawSQL Server MVP |
 |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2010-08-06 : 06:28:34
|
quote: Originally posted by Kristen an index key column has to be "selective" to be used - otherwise the query planner will just look at the statistics and decide its way too many disk reads and too much disk head travel to look in the index, and then find that row in the table, so it will just scan the whole table instead
Yes, but...That's only true of a noncovering index. If an index is covering, SQL will use it no matter how non-selective it is. Also, the selectivity is not just evaluated on the first column of the index. SQL has density stats on all of the columns of the index.--Gail ShawSQL Server MVP |
 |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2010-08-06 : 06:32:16
|
quote: Originally posted by coderm3 The AcountNum isn't an identity. I actually have another column PaymentID that is an identity and is also the primary key. This is rarely used in a criteria and if it is, it is never a range it woud always only refernce one row (for updates queries).
Ok, so maybe leave the cluster on that, for the advantages of narrow, ever-increasing, etcquote: When you say filters should be in the last column, do you mean as part of the index or as a INCLUDE? I'd assume as part of the index so that it's sorted?
I said inequality filters should be the last column of the index key. So key column, not include column.I put deleted as the first column of the index because all the sample queries you gave filter on that plus one or more columns. Hence an index on (deleted, AccountNum, PaymentDate) will be seekable for all of the sample queries you gave. Swap the order of Deleted and AccountNum and that's no longer true (the first one would only be able to scan)--Gail ShawSQL Server MVP |
 |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2010-08-06 : 06:36:58
|
quote: Originally posted by coderm3 Why is it that the SQL "Database Engine Tuning Advisor" always suggests indexes on bit fields?
Because DTA is a pile of junk that shouldn't be trusted.--Gail ShawSQL Server MVP |
 |
|
Kristen
Test
22859 Posts |
Posted - 2010-08-06 : 08:23:38
|
Sorry, didn't read the whole thread and was referring to an index on just a single BIT column. Gail's points about covering index and composite indexes very valid (although I didn't know that SQL had statistics on the combination of key fields, I though it was only the first) |
 |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2010-08-06 : 11:53:59
|
It has the histogram only on the leading column. It has density measurements on all left-based subsets of the columns.--Gail ShawSQL Server MVP |
 |
|
russell
Pyro-ma-ni-yak
5072 Posts |
Posted - 2010-08-06 : 16:24:51
|
haha, i should've read your post more thoroughly Gail. I stand corrected. my apologies. |
 |
|
tkizer
Almighty SQL Goddess
38200 Posts |
Posted - 2010-08-06 : 16:47:32
|
quote: Originally posted by GilaMonster
quote: Originally posted by coderm3 Why is it that the SQL "Database Engine Tuning Advisor" always suggests indexes on bit fields?
Because DTA is a pile of junk that shouldn't be trusted.
Agreed. I never use it as a result. And I'm sick of seeing indexes on production systems that I've newly inherited that have dta indexes applied (you can tell by the name since my co-workers didn't change the names to reflect any kind of standard). The dta recommends duplicates and near duplicates way too often, so I'm constantly making recommendations to drop them and apply better ones.Tara KizerMicrosoft MVP for Windows Server System - SQL Serverhttp://weblogs.sqlteam.com/tarad/Subscribe to my blog |
 |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2010-08-07 : 05:01:36
|
quote: Originally posted by tkizer And I'm sick of seeing indexes on production systems that I've newly inherited that have dta indexes applied (you can tell by the name since my co-workers didn't change the names to reflect any kind of standard). The dta recommends duplicates and near duplicates way too often, so I'm constantly making recommendations to drop them and apply better ones.
Last year I was doing some performance tuning work for a client, monthly reviews and recommendations. I took a look at the DB one month and noticed 26 new indexes on a heavily inserted transaction table. All dta. Oddly enough the insert performance on that table had deteriorated massively.My sole recommendation for that month was 'drop all DTA indexes and never run DTA again'--Gail ShawSQL Server MVP |
 |
|
tkizer
Almighty SQL Goddess
38200 Posts |
|
|