Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
ms65g
Constraint Violating Yak Guru
497 Posts |
Posted - 2010-02-07 : 14:36:43
|
Hi,I want to create an index based on the following query for improving the performance. The execution plan shows "Scan" for "Trips" table. I have not created any Clustered/Nonclustered index on the tables without primary key. Please send a suggestion.--The DDLCREATE TABLE Cities(city_id INT NOT NULL PRIMARY KEY, city_name NVARCHAR(15) NOT NULL);CREATE TABLE Trips(trip_nbr INT NOT NULL PRIMARY KEY, start_city_id INT NOT NULL, end_city_id INT NOT NULL, start_date DATE NULL DEFAULT GETDATE() --Check Constraint CHECK (start_city_id <> end_city_id), --Foreign Key Constraints CONSTRAINT FK_start_city FOREIGN KEY (start_city_id) REFERENCES Cities (city_id), CONSTRAINT FK_end_city FOREIGN KEY (end_city_id) REFERENCES Cities (city_id) );--The QuerySELECT T.*, C1.city_name AS start_city_name, C2.city_name AS end_city_name FROM Trips T JOIN Cities C1 ON T.start_city_id = C1.city_id JOIN Cities C2 ON T.end_city_id = C2.city_id; |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2010-02-07 : 15:40:01
|
The City_ID in the cities table is the primary key, so that's already got an index (clustered by default).As for the trips table, you're asking for the entire table back, there's no where clause that limits the rows. Hence this requires a clustered index (table) scan. I don't think there's any index that can make that perform better. The most optimal way to return the entire table (what you're asking for) is a clustered index scan.Do you really need every single column and every single row from the trips table?--Gail ShawSQL Server MVP |
 |
|
ms65g
Constraint Violating Yak Guru
497 Posts |
Posted - 2010-02-07 : 19:53:59
|
Now with the conditions I filter rows. And select some columns instead of whole. I have not a condition on clustered key (city_id) instead of it I have condition on city_name but execution plan shows "Clustered Index Seek", why? And what index is useful?SELECT T.trip_nbr, T.start_date, C1.city_name AS start_city_name, C2.city_name AS end_city_name FROM Trips T JOIN Cities C1 ON T.start_city_id = C1.city_id AND T.start_date BETWEEN '2000-01-01' AND '2009-30-12' AND C1.city_name = 'city1' JOIN Cities C2 ON T.end_city_id = C2.city_id; |
 |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2010-02-08 : 02:06:00
|
Conditions belong in the WHERE clause, not in the joinSELECT T.trip_nbr, T.start_date, C1.city_name AS start_city_name, C2.city_name AS end_city_name FROM Trips T JOIN Cities C1 ON T.start_city_id = C1.city_id JOIN Cities C2 ON T.end_city_id = C2.city_idWHERE T.start_date BETWEEN '2000-01-01' AND '2009-30-12' AND C1.city_name = 'city1';If you look at the exec plan, you'll see that SQL is reading the Trips table and then looking up the city value for each row from the Cities table. The seek is based on the join condition.To help this, try a nonclustered index on the Trips table on the start_date column--Gail ShawSQL Server MVP |
 |
|
Kristen
Test
22859 Posts |
Posted - 2010-02-08 : 03:39:14
|
Personally I would have put every condition test for [Cities] in the JOIN Cities clause (allows changing the JOIN to LEFT JOIN etc.), and conditions for [Trips] in the WHERE.But its mostly personal choice, the optimiser will probably do the same thing whichever way the query is written. |
 |
|
GilaMonster
Master Smack Fu Yak Hacker
4507 Posts |
Posted - 2010-02-08 : 09:09:11
|
quote: Originally posted by Kristen Personally I would have put every condition test for [Cities] in the JOIN Cities clause (allows changing the JOIN to LEFT JOIN etc.), and conditions for [Trips] in the WHERE.
There's a foreign key constraint between Trips and Cities, it's impossible to have a city id in Trips that isn't in Cities.If you put the condition in the join and then change to a left join, all trips in the period will be returned and only those that started in City1 will have a name, the rest will be null.If the condition is in the WHERE, then only those trips within the period that started in City1 will be returned.While I obviously can't say for sure, I'd guess that the latter is what the OP wants more likely than the former.--Gail ShawSQL Server MVP |
 |
|
Kristen
Test
22859 Posts |
Posted - 2010-02-08 : 09:35:56
|
Indeed. Not disputing any of that, and your points are very valid.Just pointing out my style as an alternative.Another thing that all-columns-with-joined-table helps with, IMHO, is if the JOIN needs to be commented out (e.g. for debugging). Otherwise there is the risk that the JOIN is commented out, and a line or two in the WHERE, and then someone (nameless!) forgets to comment the WHERE back in again ... |
 |
|
|
|
|
|
|