Author |
Topic |
cardos
Starting Member
2 Posts |
Posted - 2012-04-16 : 01:27:55
|
I need to find a way of grouping rows that occur within 5 minutes of the latest timestamp in the group and the displayid and object id are the same..... CREATE TABLE [dbo].[ExampleLog]( [ID] [uniqueidentifier] NOT NULL, [EVENTNUMBER] [int] NULL, [DISPLAYSERIAL] [int] NULL, [DISPLAYID] [varchar](50) NULL, [OBJECTSERIAL] [int] NULL, [OBJECTRSSILEVEL] [int] NULL, [OBJECTID] [varchar](50) NULL, [TIMESTAMP] [datetime] NULL) INSERT INTO [dbo].[ExampleLog] VALUES (NewID(), 206, 6897913, 'HV 644', 6885819, 66, 'WAP 8', '2012-01-01 12:12:00'), (NewID(), 206, 6897913, 'HV 644', 6885819, 66, 'WAP 8', '2012-01-01 12:13:00'), (NewID(), 206, 6897913, 'HV 644', 6877807, 54, 'MV 20', '2012-01-01 12:16:00'), (NewID(), 103, 6897913, 'HV 644', 6898052, 118, 'HV 62', '2012-01-01 12:20:00'), (NewID(), 206, 6897913, 'HV 644', 6886130, 122, 'HV 62', '2012-01-01 12:21:00'), (NewID(), 206, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:25:00'), (NewID(), 204, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:31:00'), (NewID(), 204, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:33:00'), (NewID(), 104, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:39:00') so using the dataset above, my results would ideally be grouped as follows;Rows 1 and 2 are grouped because the objectid and displayid are the same and row 2 occurs within 5 minutes of row 1, there are no other rows in the dataset that contain the same displayid and objectid and a timestamp within 5 minutes of the group’s latest timestamp (row 2) so..Row 3 is grouped by itself because there are no rows in the dataset where the displayid and objectid match within a 5 minute of that groups latest timestamp (row 3).Row 4, 5 and 6 are grouped because the displayid's match, the objectid's match and the rows timestamps are within 5 minutes of the last row, row 5's timestamp is within 5 minutes of row 4's, and row 6 timestamp is within 5 minutes of row 5's.Row 7 and 8 are grouped because the objectid's and displayid's are the same and row 8's timestamp is within 5 minutes of that groups latest timestamp (row 7). Row 9 is grouped by itself because there are no rows in the dataset where the displayid and objectid match within a 5 minute of that groups latest timestamp*I am using SQL Server 2008 R2 |
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2012-04-16 : 08:08:24
|
In the code below, the red 1000 is a hack to make sure the group Id's do not step on each other. If you make the group id's unique identifiers, that problem would go away.;WITH A AS( SELECT *,ROW_NUMBER() OVER (PARTITION BY [DISPLAYID],[OBJECTID] ORDER BY [TIMESTAMP]) AS RN FROM [ExampleLog]),B AS( SELECT *, 1000*ROW_NUMBER() OVER (ORDER BY RN) AS GroupId FROM A WHERE RN=1 UNION ALL SELECT a.*, CASE WHEN DATEDIFF(minute,b.[TIMESTAMP],a.[TIMESTAMP]) <= 5 THEN b.GroupId ELSE b.GroupId + 1 END FROM A INNER JOIN B ON a.[DISPLAYID] = b.[DISPLAYID] AND a.[OBJECTID] = b.[OBJECTID] AND a.RN = b.RN+1 )SELECT *FROM BORDER BY [TIMESTAMP] |
 |
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
|
sunitabeck
Master Smack Fu Yak Hacker
5155 Posts |
Posted - 2012-04-16 : 12:10:05
|
quote: Originally posted by visakh16 seehttp://visakhm.blogspot.com/2010/02/aggregating-data-over-time-slots.html------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/
Visakh, I don't think that will do it for the OP. OP is not looking for fixed time slots. Rather, s/he is trying to group islands where any given row is within five minutes of another row.What I posted earlier would work, although treating the groups as islands as described in the "Islands—solution 3 using group identifier based on ranking calculations" section in Itzik Ben-Gan's article might be more efficient. http://www.manning.com/nielsen/SampleChapter5.pdf |
 |
|
Vinnie881
Master Smack Fu Yak Hacker
1231 Posts |
Posted - 2012-04-16 : 14:28:49
|
[code]drop table #ExampleLogCREATE TABLE #ExampleLog( [ID] [uniqueidentifier] NOT NULL, [EVENTNUMBER] [int] NULL, [DISPLAYSERIAL] [int] NULL, [DISPLAYID] [varchar](50) NULL, [OBJECTSERIAL] [int] NULL, [OBJECTRSSILEVEL] [int] NULL, [OBJECTID] [varchar](50) NULL, [TIMESTAMP] [datetime] NULL, sequence int primary key not null identity(1,1), groupid int null)INSERT INTO #ExampleLog ([ID] , [EVENTNUMBER], [DISPLAYSERIAL], [DISPLAYID], [OBJECTSERIAL], [OBJECTRSSILEVEL], [OBJECTID] , [TIMESTAMP])select NewID(), 206, 6897913, 'HV 644' as displayid, 6885819, 66 , 'WAP 8' as objectid, '2012-01-01 12:12:00' as mytime union all select NewID(), 206, 6897913, 'HV 644', 6885819, 66, 'WAP 8', '2012-01-01 12:13:00' union all select NewID(), 206, 6897913, 'HV 644', 6877807, 54, 'MV 20', '2012-01-01 12:16:00' union all select NewID(), 103, 6897913, 'HV 644', 6898052, 118, 'HV 62', '2012-01-01 12:20:00' union all select NewID(), 206, 6897913, 'HV 644', 6886130, 122, 'HV 62', '2012-01-01 12:21:00' union all select NewID(), 206, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:25:00' union all select NewID(), 204, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:31:00' union all select NewID(), 204, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:33:00' union all select NewID(), 104, 6897913, 'HV 644', 6886130, 106, 'HV 62', '2012-01-01 12:39:00'order by displayid,objectid,mytimeselect * from #ExampleLogdeclare @anchor int,@groupid int, @timestamp as datetime,@displayid varchar(30),@objectid varchar(30)set @groupid = 0update aset @groupid = groupid = case when @objectid = objectid and @displayid = displayid and abs(DATEDIFF(minute,@TIMESTAMP,[timestamp])) <= 5 then @Groupid else @Groupid + 1 end,@timestamp = [timestamp],@Displayid = displayid,@Objectid = objectid,@anchor = sequencefrom #ExampleLog aselect * from #Examplelog order by timestamp/*ID EVENTNUMBER DISPLAYSERIAL DISPLAYID OBJECTSERIAL OBJECTRSSILEVEL OBJECTID TIMESTAMP sequence groupid641B4DE5-3451-4897-B35F-590BBBDDE3D3 206 6897913 HV 644 6885819 66 WAP 8 2012-01-01 12:12:00.000 8 53F31F09D-2448-4F22-999F-DFF019C800A7 206 6897913 HV 644 6885819 66 WAP 8 2012-01-01 12:13:00.000 9 5558857BA-A15D-48A8-8C11-ADFC63FB936C 206 6897913 HV 644 6877807 54 MV 20 2012-01-01 12:16:00.000 7 48D1BBE0F-A782-4E4D-B4FB-66894772FAA9 103 6897913 HV 644 6898052 118 HV 62 2012-01-01 12:20:00.000 1 17D2FA978-F2EE-4507-973A-F7E8BC30447A 206 6897913 HV 644 6886130 122 HV 62 2012-01-01 12:21:00.000 2 1DAD47CDD-3B29-4F22-B5A6-1E76E80EDC13 206 6897913 HV 644 6886130 106 HV 62 2012-01-01 12:25:00.000 3 13D74DFDB-5ACA-4A2F-A281-C0D82E25B4DE 204 6897913 HV 644 6886130 106 HV 62 2012-01-01 12:31:00.000 4 2727768FA-93CB-420E-B3E7-E28FA6C6F7E6 204 6897913 HV 644 6886130 106 HV 62 2012-01-01 12:33:00.000 5 26B533E4B-74D2-4063-B765-DA54D63335B3 104 6897913 HV 644 6886130 106 HV 62 2012-01-01 12:39:00.000 6 3*/[/code] Success is 10% Intelligence, 70% Determination, and 22% Stupidity.\_/ _/ _/\_/ _/\_/ _/ _/- 881 |
 |
|
cardos
Starting Member
2 Posts |
Posted - 2012-04-17 : 19:11:54
|
Thanks very much for taking a look at my problem, it's much appreciated, this soloution was provided to me from a user of another forum and works well.WITH cte_remove AS ( SELECT b.* FROM dbo.examplelog a INNER JOIN dbo.examplelog b ON a.ID != b.ID AND a.DISPLAYID = b.DISPLAYID AND a.OBJECTID = b.OBJECTID AND b.[timestamp] BETWEEN a.[timestamp] AND DATEADD(SECOND, 300, a.[timestamp])),cte_results AS ( SELECT * FROM dbo.examplelog EXCEPT SELECT * FROM cte_remove )SELECT * FROM cte_results ORDER BY [timestamp]GO |
 |
|
Vinnie881
Master Smack Fu Yak Hacker
1231 Posts |
Posted - 2012-04-18 : 11:16:00
|
The query you provided is different than what was requested. It does not group, it simply returns the line item you can use for grouping. In order to group it you would need to join it back to the original table again.The query I provided should outperform that one as well, but will provide GROUPED results as you requested (THE GROUPID FIELD I ADDED) to address your original post. Success is 10% Intelligence, 70% Determination, and 22% Stupidity.\_/ _/ _/\_/ _/\_/ _/ _/- 881 |
 |
|
|
|
|