Clustered Index Column Order - SQL Server Forums

Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

All Forums

General SQL Server Forums

New to SQL Server Administration

Clustered Index Column Order

Author

Topic

coderm3
Starting Member

3 Posts

Posted - 2010-08-04 : 23:01:35

I have a question on what the optimal Clustered index design would be.

Consider the following scenario:

I have a table called "ConsumerPayments".

With the following columns:
AccountNum (int),
Deleted (bit),
PaymentDate (datetime),
PaymentAmount (money),
PaymentReversed (bit),
PostedBy (int),
ReversedBy (int)

And then, the following three SQL statements would be ran equally as often against the database:

SELECT SUM(PaymentAmount) FROM ConsumerPayments WHERE DateDiff
(Month, PaymentDate, '7/1/2010')=0 AND Deleted=0

SELECT SUM(PaymentAmount) FROM ConsumerPayments WHERE Deleted=0 AND AccountNum = 12345

SELECT PaymentAmount, PostedBy FROM ConsumerPayments WHERE Deleted=0 AND AccountNum = 12345
ORDER BY PaymentDate DESC

More often than not, the AccountNum in the SQL critria would be one of higher AccountNum's (i.e. more often to be 12345 than 123) and PaymentDate will more often than not be the most recent dates.

Which of the following clustered indexes would make the most sense:
Deleted(asc), AccountNum(desc), PaymentDate(desc)

OR

Deleted(asc), PaymentDate(desc), AccountNum(desc)

OR

PaymentDate(desc), AccountNum(desc), Deleted(asc)

OR something else?

I am at a loss.

Thanks for your help!

Jason

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-08-05 : 01:52:36

I wouldn't put the cluster on any combination. Too wide. Cluster should be narrow among other considerations. For the cluster I'd probably consider either AccountNum (if it's an identity) or the paymentdate.

Then a nonclustered index on Deleted, AccountNum, PaymentDate. In that order.
If paymentdate is used within date calculations in the where clause it's not SARGable and can't be used for an index seek. Also, if it's usually used in range filters it should be the last column.

http://sqlinthewild.co.za/index.php/2009/01/19/index-columns-selectivity-and-equality-predicates/
http://sqlinthewild.co.za/index.php/2009/02/06/index-columns-selectivity-and-inequality-predicates/

--
Gail Shaw
SQL Server MVP

AndrewMurphy
Master Smack Fu Yak Hacker

2916 Posts

Posted - 2010-08-05 : 03:09:51

If paymentdate is used within date calculations as specified in the where clause it's not SARGable.

There is an alternative way to do the DateDiff(Month, PaymentDate, '7/1/2010')=0 calculation which brings the date column into contention for use within an index. Check here for examples - members Kristen and the FAQ page may have links.

coderm3
Starting Member

3 Posts

Posted - 2010-08-05 : 13:53:06

GilaMonster:

The AcountNum isn't an identity. I actually have another column PaymentID that is an identity and is also the primary key. This is rarely used in a criteria and if it is, it is never a range it woud always only refernce one row (for updates queries).

The AccountNum field in this table is not unique, there could me multiple instances as there could be multiple payments for that account.

When you say filters should be in the last column, do you mean as part of the index or as a INCLUDE? I'd assume as part of the index so that it's sorted?

Thanks for the articles - they help a lot.

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2010-08-05 : 15:59:46

by the way, no point in indexing deleted. it's a bit field and indexing it is useless

coderm3
Starting Member

3 Posts

Posted - 2010-08-05 : 22:59:56

Why is it that the SQL "Database Engine Tuning Advisor" always suggests indexes on bit fields?

Kristen
Test

22859 Posts

Posted - 2010-08-06 : 02:15:17

If DELETED is 99% one thing, and 1% the other then, yes, an index on DELETED for the 1% value will probably be of value. But that index will never be used if you are looking for the 99% value.

I'm sure that it can be more than 1% ... but I don't know how much more ... an index key column has to be "selective" to be used - otherwise the query planner will just look at the statistics and decide its way too many disk reads and too much disk head travel to look in the index, and then find that row in the table, so it will just scan the whole table instead ... plus you've still got all the overhead of disk space for the index and maintaining it whenever a row is inserted or deleted, or the DELETED column value changes.

So ... index on BIT column very rarely of any worthwhile benefit, regardless of what the Index Wizard says

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-08-06 : 06:26:24

quote:
Originally posted by russell

by the way, no point in indexing deleted. it's a bit field and indexing it is useless

I totally disagree. There's little point in a single-column index on a bit, but as part of a composite index is another matter.

--
Gail Shaw
SQL Server MVP

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-08-06 : 06:28:34

quote:
Originally posted by Kristen

an index key column has to be "selective" to be used - otherwise the query planner will just look at the statistics and decide its way too many disk reads and too much disk head travel to look in the index, and then find that row in the table, so it will just scan the whole table instead

Yes, but...

That's only true of a noncovering index. If an index is covering, SQL will use it no matter how non-selective it is. Also, the selectivity is not just evaluated on the first column of the index. SQL has density stats on all of the columns of the index.

--
Gail Shaw
SQL Server MVP

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-08-06 : 06:32:16

quote:
Originally posted by coderm3

The AcountNum isn't an identity. I actually have another column PaymentID that is an identity and is also the primary key. This is rarely used in a criteria and if it is, it is never a range it woud always only refernce one row (for updates queries).

Ok, so maybe leave the cluster on that, for the advantages of narrow, ever-increasing, etc

quote:
When you say filters should be in the last column, do you mean as part of the index or as a INCLUDE? I'd assume as part of the index so that it's sorted?

I said inequality filters should be the last column of the index key. So key column, not include column.

I put deleted as the first column of the index because all the sample queries you gave filter on that plus one or more columns. Hence an index on (deleted, AccountNum, PaymentDate) will be seekable for all of the sample queries you gave. Swap the order of Deleted and AccountNum and that's no longer true (the first one would only be able to scan)

--
Gail Shaw
SQL Server MVP

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-08-06 : 06:36:58

quote:
Originally posted by coderm3

Why is it that the SQL "Database Engine Tuning Advisor" always suggests indexes on bit fields?

Because DTA is a pile of junk that shouldn't be trusted.

--
Gail Shaw
SQL Server MVP

Kristen
Test

22859 Posts

Posted - 2010-08-06 : 08:23:38

Sorry, didn't read the whole thread and was referring to an index on just a single BIT column. Gail's points about covering index and composite indexes very valid (although I didn't know that SQL had statistics on the combination of key fields, I though it was only the first)

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-08-06 : 11:53:59

It has the histogram only on the leading column. It has density measurements on all left-based subsets of the columns.

--
Gail Shaw
SQL Server MVP

russell
Pyro-ma-ni-yak

5072 Posts

Posted - 2010-08-06 : 16:24:51

haha, i should've read your post more thoroughly Gail. I stand corrected. my apologies.

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-08-06 : 16:47:32

quote:
Originally posted by GilaMonster

quote:
Originally posted by coderm3

Why is it that the SQL "Database Engine Tuning Advisor" always suggests indexes on bit fields?

Because DTA is a pile of junk that shouldn't be trusted.

Agreed. I never use it as a result. And I'm sick of seeing indexes on production systems that I've newly inherited that have dta indexes applied (you can tell by the name since my co-workers didn't change the names to reflect any kind of standard). The dta recommends duplicates and near duplicates way too often, so I'm constantly making recommendations to drop them and apply better ones.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-08-07 : 05:01:36

quote:
Originally posted by tkizer

And I'm sick of seeing indexes on production systems that I've newly inherited that have dta indexes applied (you can tell by the name since my co-workers didn't change the names to reflect any kind of standard). The dta recommends duplicates and near duplicates way too often, so I'm constantly making recommendations to drop them and apply better ones.

Last year I was doing some performance tuning work for a client, monthly reviews and recommendations. I took a look at the DB one month and noticed 26 new indexes on a heavily inserted transaction table. All dta. Oddly enough the insert performance on that table had deteriorated massively.

My sole recommendation for that month was 'drop all DTA indexes and never run DTA again'

--
Gail Shaw
SQL Server MVP

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-08-07 : 12:26:23

I bet their SELECT queries were fast!

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog

Subscribe to SQLTeam.com

SQLTeam.com Articles via RSS

SQLTeam.com Weblog via RSS

- Advertisement -

Resources