Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Administration
 NonClustered Index

Author  Topic 

ms65g
Constraint Violating Yak Guru

497 Posts

Posted - 2010-02-07 : 14:36:43
Hi,
I want to create an index based on the following query for improving the performance. The execution plan shows "Scan" for "Trips" table. I have not created any Clustered/Nonclustered index on the tables without primary key. Please send a suggestion.


--The DDL
CREATE TABLE Cities
(city_id INT NOT NULL PRIMARY KEY,
city_name NVARCHAR(15) NOT NULL);

CREATE TABLE Trips
(trip_nbr INT NOT NULL PRIMARY KEY,
start_city_id INT NOT NULL,
end_city_id INT NOT NULL,
start_date DATE NULL
DEFAULT GETDATE()

--Check Constraint
CHECK (start_city_id <> end_city_id),

--Foreign Key Constraints
CONSTRAINT FK_start_city
FOREIGN KEY (start_city_id)
REFERENCES Cities (city_id),

CONSTRAINT FK_end_city
FOREIGN KEY (end_city_id)
REFERENCES Cities (city_id)
);

--The Query
SELECT T.*,
C1.city_name AS start_city_name,
C2.city_name AS end_city_name
FROM Trips T
JOIN Cities C1
ON T.start_city_id = C1.city_id
JOIN Cities C2
ON T.end_city_id = C2.city_id;

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-02-07 : 15:40:01
The City_ID in the cities table is the primary key, so that's already got an index (clustered by default).

As for the trips table, you're asking for the entire table back, there's no where clause that limits the rows. Hence this requires a clustered index (table) scan. I don't think there's any index that can make that perform better. The most optimal way to return the entire table (what you're asking for) is a clustered index scan.

Do you really need every single column and every single row from the trips table?

--
Gail Shaw
SQL Server MVP
Go to Top of Page

ms65g
Constraint Violating Yak Guru

497 Posts

Posted - 2010-02-07 : 19:53:59
Now with the conditions I filter rows. And select some columns instead of whole.
I have not a condition on clustered key (city_id) instead of it I have condition on city_name but execution plan shows "Clustered Index Seek", why? And what index is useful?

SELECT T.trip_nbr, 
T.start_date,
C1.city_name AS start_city_name,
C2.city_name AS end_city_name
FROM Trips T
JOIN Cities C1
ON T.start_city_id = C1.city_id
AND T.start_date BETWEEN '2000-01-01' AND '2009-30-12'
AND C1.city_name = 'city1'

JOIN Cities C2
ON T.end_city_id = C2.city_id;

Go to Top of Page

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-02-08 : 02:06:00
Conditions belong in the WHERE clause, not in the join

SELECT T.trip_nbr,
T.start_date,
C1.city_name AS start_city_name,
C2.city_name AS end_city_name
FROM Trips T
JOIN Cities C1
ON T.start_city_id = C1.city_id
JOIN Cities C2
ON T.end_city_id = C2.city_id
WHERE T.start_date BETWEEN '2000-01-01' AND '2009-30-12'
AND C1.city_name = 'city1';

If you look at the exec plan, you'll see that SQL is reading the Trips table and then looking up the city value for each row from the Cities table. The seek is based on the join condition.

To help this, try a nonclustered index on the Trips table on the start_date column

--
Gail Shaw
SQL Server MVP
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-02-08 : 03:39:14
Personally I would have put every condition test for [Cities] in the JOIN Cities clause (allows changing the JOIN to LEFT JOIN etc.), and conditions for [Trips] in the WHERE.

But its mostly personal choice, the optimiser will probably do the same thing whichever way the query is written.
Go to Top of Page

GilaMonster
Master Smack Fu Yak Hacker

4507 Posts

Posted - 2010-02-08 : 09:09:11
quote:
Originally posted by Kristen

Personally I would have put every condition test for [Cities] in the JOIN Cities clause (allows changing the JOIN to LEFT JOIN etc.), and conditions for [Trips] in the WHERE.


There's a foreign key constraint between Trips and Cities, it's impossible to have a city id in Trips that isn't in Cities.

If you put the condition in the join and then change to a left join, all trips in the period will be returned and only those that started in City1 will have a name, the rest will be null.
If the condition is in the WHERE, then only those trips within the period that started in City1 will be returned.

While I obviously can't say for sure, I'd guess that the latter is what the OP wants more likely than the former.

--
Gail Shaw
SQL Server MVP
Go to Top of Page

Kristen
Test

22859 Posts

Posted - 2010-02-08 : 09:35:56
Indeed. Not disputing any of that, and your points are very valid.

Just pointing out my style as an alternative.

Another thing that all-columns-with-joined-table helps with, IMHO, is if the JOIN needs to be commented out (e.g. for debugging). Otherwise there is the risk that the JOIN is commented out, and a line or two in the WHERE, and then someone (nameless!) forgets to comment the WHERE back in again ...
Go to Top of Page
   

- Advertisement -