Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Grouping rows in SQL Server

Author  Topic 

cardos
Starting Member

2 Posts

Posted - 2012-04-16 : 01:27:55
I need to find a way of grouping rows that occur within 5 minutes of the latest timestamp in the group and the displayid and object id are the same.....
CREATE TABLE [dbo].[ExampleLog]( 
[ID] [uniqueidentifier] NOT NULL,
[EVENTNUMBER] [int] NULL,
[DISPLAYSERIAL] [int] NULL,
[DISPLAYID] [varchar](50) NULL,
[OBJECTSERIAL] [int] NULL,
[OBJECTRSSILEVEL] [int] NULL,
[OBJECTID] [varchar](50) NULL,
[TIMESTAMP] [datetime] NULL)


INSERT INTO [dbo].[ExampleLog] 
VALUES
(NewID(), 206, 6897913, 'HV 644', 6885819, 66, 'WAP 8', '2012-01-01 12:12:00'),
(NewID(), 206, 6897913, 'HV 644', 6885819, 66, 'WAP 8', '2012-01-01 12:13:00'),
(NewID(), 206, 6897913, 'HV 644', 6877807, 54, 'MV 20', '2012-01-01 12:16:00'),
(NewID(), 103, 6897913, 'HV 644', 6898052, 118, 'HV 62', '2012-01-01 12:20:00'),
(NewID(), 206, 6897913, 'HV 644', 6886130, 122, 'HV 62', '2012-01-01 12:21:00'),
(NewID(), 206, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:25:00'),
(NewID(), 204, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:31:00'),
(NewID(), 204, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:33:00'),
(NewID(), 104, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:39:00')


so using the dataset above, my results would ideally be grouped as follows;

Rows 1 and 2 are grouped because the objectid and displayid are the same and row 2 occurs within 5 minutes of row 1, there are no other rows in the dataset that contain the same displayid and objectid and a timestamp within 5 minutes of the group’s latest timestamp (row 2) so..

Row 3 is grouped by itself because there are no rows in the dataset where the displayid and objectid match within a 5 minute of that groups latest timestamp (row 3).

Row 4, 5 and 6 are grouped because the displayid's match, the objectid's match and the rows timestamps are within 5 minutes of the last row, row 5's timestamp is within 5 minutes of row 4's, and row 6 timestamp is within 5 minutes of row 5's.

Row 7 and 8 are grouped because the objectid's and displayid's are the same and row 8's timestamp is within 5 minutes of that groups latest timestamp (row 7).

Row 9 is grouped by itself because there are no rows in the dataset where the displayid and objectid match within a 5 minute of that groups latest timestamp

*I am using SQL Server 2008 R2

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2012-04-16 : 08:08:24
In the code below, the red 1000 is a hack to make sure the group Id's do not step on each other. If you make the group id's unique identifiers, that problem would go away.

;WITH A AS
(
SELECT *,ROW_NUMBER() OVER (PARTITION BY [DISPLAYID],[OBJECTID] ORDER BY [TIMESTAMP]) AS RN
FROM [ExampleLog]
),
B AS
(
SELECT *, 1000*ROW_NUMBER() OVER (ORDER BY RN) AS GroupId FROM A WHERE RN=1
UNION ALL
SELECT a.*,
CASE
WHEN DATEDIFF(minute,b.[TIMESTAMP],a.[TIMESTAMP]) <= 5 THEN b.GroupId
ELSE b.GroupId + 1
END
FROM
A
INNER JOIN B ON
a.[DISPLAYID] = b.[DISPLAYID] AND a.[OBJECTID] = b.[OBJECTID] AND a.RN = b.RN+1

)
SELECT *
FROM B
ORDER BY [TIMESTAMP]
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-04-16 : 11:38:20
see

http://visakhm.blogspot.com/2010/02/aggregating-data-over-time-slots.html

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2012-04-16 : 12:10:05
quote:
Originally posted by visakh16

see

http://visakhm.blogspot.com/2010/02/aggregating-data-over-time-slots.html

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/



Visakh, I don't think that will do it for the OP. OP is not looking for fixed time slots. Rather, s/he is trying to group islands where any given row is within five minutes of another row.

What I posted earlier would work, although treating the groups as islands as described in the "Islands—solution 3 using group identifier
based on ranking calculations" section in Itzik Ben-Gan's article might be more efficient. http://www.manning.com/nielsen/SampleChapter5.pdf
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2012-04-16 : 14:28:49
[code]
drop table #ExampleLog
CREATE TABLE #ExampleLog(
[ID] [uniqueidentifier] NOT NULL,
[EVENTNUMBER] [int] NULL,
[DISPLAYSERIAL] [int] NULL,
[DISPLAYID] [varchar](50) NULL,
[OBJECTSERIAL] [int] NULL,
[OBJECTRSSILEVEL] [int] NULL,
[OBJECTID] [varchar](50) NULL,
[TIMESTAMP] [datetime] NULL,
sequence int primary key not null identity(1,1),
groupid int null)



INSERT INTO #ExampleLog ([ID] ,
[EVENTNUMBER],
[DISPLAYSERIAL],
[DISPLAYID],
[OBJECTSERIAL],
[OBJECTRSSILEVEL],
[OBJECTID] ,
[TIMESTAMP]
)
select NewID(), 206, 6897913, 'HV 644' as displayid, 6885819, 66 , 'WAP 8' as objectid, '2012-01-01 12:12:00' as mytime union all
select NewID(), 206, 6897913, 'HV 644', 6885819, 66, 'WAP 8', '2012-01-01 12:13:00' union all
select NewID(), 206, 6897913, 'HV 644', 6877807, 54, 'MV 20', '2012-01-01 12:16:00' union all
select NewID(), 103, 6897913, 'HV 644', 6898052, 118, 'HV 62', '2012-01-01 12:20:00' union all
select NewID(), 206, 6897913, 'HV 644', 6886130, 122, 'HV 62', '2012-01-01 12:21:00' union all
select NewID(), 206, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:25:00' union all
select NewID(), 204, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:31:00' union all
select NewID(), 204, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:33:00' union all
select NewID(), 104, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:39:00'
order by displayid,objectid,mytime

select * from #ExampleLog
declare @anchor int,@groupid int, @timestamp as datetime,@displayid varchar(30),@objectid varchar(30)
set @groupid = 0

update a
set
@groupid = groupid = case when @objectid = objectid and @displayid = displayid and abs(DATEDIFF(minute,@TIMESTAMP,[timestamp])) <= 5 then @Groupid else @Groupid + 1 end
,@timestamp = [timestamp]
,@Displayid = displayid
,@Objectid = objectid
,@anchor = sequence
from #ExampleLog a

select * from #Examplelog order by timestamp
/*
ID EVENTNUMBER DISPLAYSERIAL DISPLAYID OBJECTSERIAL OBJECTRSSILEVEL OBJECTID TIMESTAMP sequence groupid
641B4DE5-3451-4897-B35F-590BBBDDE3D3 206 6897913 HV 644 6885819 66 WAP 8 2012-01-01 12:12:00.000 8 5
3F31F09D-2448-4F22-999F-DFF019C800A7 206 6897913 HV 644 6885819 66 WAP 8 2012-01-01 12:13:00.000 9 5
558857BA-A15D-48A8-8C11-ADFC63FB936C 206 6897913 HV 644 6877807 54 MV 20 2012-01-01 12:16:00.000 7 4
8D1BBE0F-A782-4E4D-B4FB-66894772FAA9 103 6897913 HV 644 6898052 118 HV 62 2012-01-01 12:20:00.000 1 1
7D2FA978-F2EE-4507-973A-F7E8BC30447A 206 6897913 HV 644 6886130 122 HV 62 2012-01-01 12:21:00.000 2 1
DAD47CDD-3B29-4F22-B5A6-1E76E80EDC13 206 6897913 HV 644 6886130 106 HV 62 2012-01-01 12:25:00.000 3 1
3D74DFDB-5ACA-4A2F-A281-C0D82E25B4DE 204 6897913 HV 644 6886130 106 HV 62 2012-01-01 12:31:00.000 4 2
727768FA-93CB-420E-B3E7-E28FA6C6F7E6 204 6897913 HV 644 6886130 106 HV 62 2012-01-01 12:33:00.000 5 2
6B533E4B-74D2-4063-B765-DA54D63335B3 104 6897913 HV 644 6886130 106 HV 62 2012-01-01 12:39:00.000 6 3
*/

[/code]


Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881
Go to Top of Page

cardos
Starting Member

2 Posts

Posted - 2012-04-17 : 19:11:54
Thanks very much for taking a look at my problem, it's much appreciated, this soloution was provided to me from a user of another forum and works well.

WITH 
cte_remove AS (
SELECT b.*
FROM dbo.examplelog a
INNER JOIN dbo.examplelog b ON
a.ID != b.ID AND
a.DISPLAYID = b.DISPLAYID AND
a.OBJECTID = b.OBJECTID AND
b.[timestamp] BETWEEN a.[timestamp] AND DATEADD(SECOND, 300, a.[timestamp])
),
cte_results AS (
SELECT * FROM dbo.examplelog EXCEPT
SELECT * FROM cte_remove
)
SELECT * FROM cte_results ORDER BY [timestamp]
GO
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2012-04-18 : 11:16:00
The query you provided is different than what was requested. It does not group, it simply returns the line item you can use for grouping. In order to group it you would need to join it back to the original table again.

The query I provided should outperform that one as well, but will provide GROUPED results as you requested (THE GROUPID FIELD I ADDED) to address your original post.


Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881
Go to Top of Page
   

- Advertisement -