Better way to rank subsets of data....

malachi151
Posting Yak Master

152 Posts

Posted - 2010-04-21 : 17:07:17

I'm looking for better ways to sequence data and to select the top values from subsets of data. With my current approach, the queries and cursors often take forever to run. I'm working with large databases with tens of millions of rows.

The scenario is if you have something like this:

1 - 1/1/2001
1 - 2/1/2001
1 - 3/1/2001
2 - 1/1/2001
2 - 2/1/2001
2 - 3/1/2001
3 - 1/1/2001
3 - 2/1/2001
3 - 3/1/2001

Now let's say that I want to add a third column that numbers the records by date within each key value, resulting in:

1 - 1/1/2001 - 1
1 - 2/1/2001 - 2
1 - 3/1/2001 - 3
2 - 1/1/2001 - 1
2 - 2/1/2001 - 2
2 - 3/1/2001 - 3
3 - 1/1/2001 - 1
3 - 2/1/2001 - 2
3 - 3/1/2001 - 3

Currently I do this with a cursor: I select the keys into the cursor, then loop through the key values, and for each one run a query that uses ROW_NUMBER to get the ranking number and update the ranking column with that value. This works, it's just slow.

There are similar cases where I need to select the top value from a subset of data. In a simple example like the one above, that would mean selecting the top date for each key and then updating some other table with that value, but the real cases are more complicated than that.

Below is a query that takes forever to run because of the WHERE clause, but I don't know of any other way to do it other than using a cursor, which I suspect is no faster.

Is there some technique that I'm totally missing here that provides a better way to do this?

UPDATE
	c
SET 
	c.is_gov_class_exp_pre_audit = 1
FROM 
	Class c
	INNER JOIN
		(SELECT audit_id
		FROM Class
		WHERE exposure_pre_audit != 0.0
		GROUP BY audit_id
		HAVING SUM(ISNULL(CONVERT(int,is_gov_class_exp_pre_audit),0)) = 0) sub1 
	ON c.audit_id = sub1.audit_id
WHERE
	c.class_code = (SELECT TOP 1 class_code
			FROM Class
			WHERE exposure_pre_audit != 0.0	AND audit_id = c.audit_id
			GROUP BY audit_id, class_code, loss_cost_onlvl
			ORDER BY SUM(exposure_pre_audit) DESC, loss_cost_onlvl DESC, class_code DESC)

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2010-04-21 : 23:47:16

Does the table already have the first two columns? Then it's just a matter of creating the third column and updating it as follows:


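-- Number the rows within each FirstCol value by SecondCol and store the number in ThirdCol
UPDATE t
SET t.ThirdCol = t.Seq
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY FirstCol ORDER BY SecondCol) AS Seq,
             ThirdCol
      FROM [Table]) t  -- [Table] is a placeholder name; TABLE is a reserved word, so it needs brackets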
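
To try the pattern on the sample data from the question, here is a small self-contained sketch. The temp table #Demo and the column names KeyCol, DateCol, Seq and rn are made up for illustration, and the dates are assumed to be month/day/year. The last query shows the related "top date per key" case by numbering newest first and keeping row 1.

```sql
-- Hypothetical demo table; all names here are illustrative, not from the real schema.
CREATE TABLE #Demo (KeyCol int NOT NULL, DateCol datetime NOT NULL, Seq int NULL);

-- SQL Server 2005 has no multi-row VALUES, so the rows are loaded with UNION ALL.
INSERT INTO #Demo (KeyCol, DateCol)
SELECT 1, '20010101' UNION ALL
SELECT 1, '20010201' UNION ALL
SELECT 1, '20010301' UNION ALL
SELECT 2, '20010101' UNION ALL
SELECT 2, '20010201' UNION ALL
SELECT 2, '20010301' UNION ALL
SELECT 3, '20010101' UNION ALL
SELECT 3, '20010201' UNION ALL
SELECT 3, '20010301';

-- Number the rows within each key, oldest date first, in one set-based statement.
UPDATE t
SET t.Seq = t.rn
FROM (SELECT Seq,
             ROW_NUMBER() OVER (PARTITION BY KeyCol ORDER BY DateCol) AS rn
      FROM #Demo) AS t;

-- Check the result: Seq restarts at 1 for every KeyCol.
SELECT KeyCol, DateCol, Seq
FROM #Demo
ORDER BY KeyCol, DateCol;

-- Top (latest) date per key: number newest first and keep only row 1.
SELECT d.KeyCol, d.DateCol
FROM (SELECT KeyCol, DateCol,
             ROW_NUMBER() OVER (PARTITION BY KeyCol ORDER BY DateCol DESC) AS rn
      FROM #Demo) AS d
WHERE d.rn = 1;

DROP TABLE #Demo;
```

On a large table, an index on the partitioning column followed by the ordering column may let the optimizer avoid a sort, but that depends on the actual schema and is worth checking in the execution plan.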

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/
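
For the UPDATE in the original question, the same PARTITION BY idea could replace the correlated TOP 1 subquery, which otherwise may be evaluated once per candidate row. The following is an untested sketch: it uses only the table and column names from the posted query, while the CTE names (totals, ranked) and the aliases (total_exposure, flagged, rn, audit_flagged) are invented here. Compare its results with the original statement on a copy of the data before relying on it.

```sql
-- The statement before a WITH must be terminated with a semicolon.
WITH totals AS (
    -- One row per audit_id / class_code / loss_cost_onlvl, as in the original GROUP BY.
    SELECT audit_id,
           class_code,
           loss_cost_onlvl,
           SUM(exposure_pre_audit) AS total_exposure,
           SUM(ISNULL(CONVERT(int, is_gov_class_exp_pre_audit), 0)) AS flagged
    FROM Class
    WHERE exposure_pre_audit != 0.0
    GROUP BY audit_id, class_code, loss_cost_onlvl
),
ranked AS (
    SELECT audit_id,
           class_code,
           -- rn = 1 marks the group the original TOP 1 ... ORDER BY would pick.
           ROW_NUMBER() OVER (PARTITION BY audit_id
                              ORDER BY total_exposure DESC,
                                       loss_cost_onlvl DESC,
                                       class_code DESC) AS rn,
           -- Per-audit total of the flag, replacing the HAVING ... = 0 subquery.
           SUM(flagged) OVER (PARTITION BY audit_id) AS audit_flagged
    FROM totals
)
UPDATE c
SET c.is_gov_class_exp_pre_audit = 1
FROM Class c
INNER JOIN ranked r
    ON r.audit_id = c.audit_id
   AND r.class_code = c.class_code
WHERE r.rn = 1
  AND r.audit_flagged = 0;
```

The intent is that Class is aggregated once and each audit's winning class_code is then matched with a plain join; whether that is actually faster on tens of millions of rows depends on indexing and should be confirmed with the execution plan.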

malachi151
Posting Yak Master

152 Posts

Posted - 2010-04-25 : 04:53:59

Thanks, PARTITION is what I needed and wasn't aware of!


visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2010-04-25 : 05:01:15

You're welcome.
