Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
Jaypoc
Yak Posting Veteran
58 Posts |
Posted - 2008-09-09 : 18:03:24
|
I have 2 views that show counts of the statistics in a log file. One shows them broken down by date, the other by hour. For some reason the two show different totals (when the columns are added together) I was curious where the discrepancy was. I have the ASP page add all the "count" elements together and the numbers are off. TABLE: access_logdatetime, datetime (Auto-populates with getDate())status, varchar (Action- LOAD, SEARCH <type>, POLLmessage, varchar (Data pertaining to the Action/status)Sample Data:9/5/2008 5:43:24 | LOAD | /9/6/2008 12:00:01 | LOAD | /help.html9/6/2008 4:18:44 | SEARCH CITY | PLAINVIEW9/7/2008 8:00:24 | SEARCH ZIP | 118039/7/2008 1:31:23 | POLL | ID12345VIEWS: (These are the actual views I'm using)CREATE VIEW dbo.log_Daily_COMBINEDASSELECT TOP 31 a.[Date], a.Hits AS hits, b.Hits AS searches, c.Hits AS pollsFROM (SELECT TOP 100 PERCENT CONVERT(varchar, datetime, 1) AS [Date], COUNT(*) AS Hits FROM dbo.access_log WHERE (status = 'LOAD') AND (message = '/') GROUP BY CONVERT(varchar, datetime, 1) ORDER BY Date DESC) a FULL OUTER JOIN (SELECT TOP 100 PERCENT CONVERT(varchar, datetime, 1) AS [Date], COUNT(*) AS Hits FROM dbo.access_log WHERE (status LIKE 'SEARCH%') GROUP BY CONVERT(varchar, datetime, 1) ORDER BY Date DESC) b ON a.[Date] = b.[Date] FULL OUTER JOIN (SELECT TOP 100 PERCENT CONVERT(varchar, datetime, 1) AS [Date], COUNT(*) AS Hits FROM dbo.access_log WHERE (status = 'POLL') GROUP BY CONVERT(varchar, datetime, 1) ORDER BY Date DESC) c ON a.[Date] = c.[Date]ORDER BY a.[Date] DESC and CREATE VIEW dbo.log_Hourly_statisticsASSELECT TOP 100 PERCENT q.hour_display, q.ct AS hits, w.ct AS searches, e.ct AS pollsFROM (SELECT TOP 100 PERCENT l.hour_display, COUNT(*) AS ct FROM (SELECT 0 AS start_hour, '00:00-00:59' AS hour_display UNION ALL SELECT 1, '01:00-01:59' UNION ALL SELECT 4, '04:00-04:59' UNION ALL SELECT 5, '05:00-05:59' UNION ALL SELECT 6, '06:00-06:59' UNION ALL SELECT 7, '07:00-07:59' UNION ALL SELECT 8, '08:00-08:59' UNION ALL SELECT 9, '09:00-09:59' UNION ALL SELECT 10, '10:00-10:59' UNION ALL SELECT 11, '11:00-11:59' UNION ALL SELECT 12, '12:00-12:59' UNION ALL SELECT 13, '13:00-13:59' UNION ALL SELECT 14, '14:00-14:59' UNION ALL SELECT 15, '15:00-15:59' UNION ALL SELECT 16, '16:00-16:59' UNION ALL SELECT 17, '17:00-17:59' UNION ALL SELECT 18, '18:00-18:59' UNION ALL SELECT 19, '19:00-19:59' UNION ALL SELECT 20, '20:00-20:59' UNION ALL SELECT 21, '21:00-21:59' UNION ALL SELECT 22, '22:00-22:59' UNION ALL SELECT 23, '23:00-24:00') l LEFT OUTER JOIN dbo.access_log a ON l.start_hour = DATEPART(hour, a.datetime) AND (a.status = 'LOAD') AND (a.message = '/') GROUP BY l.hour_display) q FULL OUTER JOIN (SELECT TOP 100 PERCENT l.hour_display, COUNT(*) AS ct FROM (SELECT 0 AS start_hour, '00:00-00:59' AS hour_display UNION ALL SELECT 1, '01:00-01:59' UNION ALL SELECT 4, '04:00-04:59' UNION ALL SELECT 5, '05:00-05:59' UNION ALL SELECT 6, '06:00-06:59' UNION ALL SELECT 7, '07:00-07:59' UNION ALL SELECT 8, '08:00-08:59' UNION ALL SELECT 9, '09:00-09:59' UNION ALL SELECT 10, '10:00-10:59' UNION ALL SELECT 11, '11:00-11:59' UNION ALL SELECT 12, '12:00-12:59' UNION ALL SELECT 13, '13:00-13:59' UNION ALL SELECT 14, '14:00-14:59' UNION ALL SELECT 15, '15:00-15:59' UNION ALL SELECT 16, '16:00-16:59' UNION ALL SELECT 17, '17:00-17:59' UNION ALL SELECT 18, '18:00-18:59' UNION ALL SELECT 19, '19:00-19:59' UNION ALL SELECT 20, '20:00-20:59' UNION ALL SELECT 21, '21:00-21:59' UNION ALL SELECT 22, '22:00-22:59' UNION ALL SELECT 23, '23:00-24:00') l LEFT OUTER JOIN dbo.access_log a ON l.start_hour = DATEPART(hour, a.datetime) AND a.status LIKE 'SEARCH%' GROUP BY l.hour_display) w ON q.hour_display = w.hour_display FULL OUTER JOIN (SELECT TOP 100 PERCENT l.hour_display, COUNT(*) AS ct FROM (SELECT 0 AS start_hour, '00:00-00:59' AS hour_display UNION ALL SELECT 1, '01:00-01:59' UNION ALL SELECT 4, '04:00-04:59' UNION ALL SELECT 5, '05:00-05:59' UNION ALL SELECT 6, '06:00-06:59' UNION ALL SELECT 7, '07:00-07:59' UNION ALL SELECT 8, '08:00-08:59' UNION ALL SELECT 9, '09:00-09:59' UNION ALL SELECT 10, '10:00-10:59' UNION ALL SELECT 11, '11:00-11:59' UNION ALL SELECT 12, '12:00-12:59' UNION ALL SELECT 13, '13:00-13:59' UNION ALL SELECT 14, '14:00-14:59' UNION ALL SELECT 15, '15:00-15:59' UNION ALL SELECT 16, '16:00-16:59' UNION ALL SELECT 17, '17:00-17:59' UNION ALL SELECT 18, '18:00-18:59' UNION ALL SELECT 19, '19:00-19:59' UNION ALL SELECT 20, '20:00-20:59' UNION ALL SELECT 21, '21:00-21:59' UNION ALL SELECT 22, '22:00-22:59' UNION ALL SELECT 23, '23:00-24:00') l LEFT OUTER JOIN dbo.access_log a ON l.start_hour = DATEPART(hour, a.datetime) AND a.status LIKE 'POLL%' GROUP BY l.hour_display) e ON w.hour_display = e.hour_displayORDER BY q.hour_display For the DAILY breakdown I get 854 Hits, 513 Searches and 216 PollsFor the HOURLY breakdown I get 841 Hits, 508 Searches and 217 PollsThey're close, but shouldn't they be exact? if not, how come? what did I miss?I appreciate a new set of eyes to analyze this for me (or recommend a better way to do it!) |
|
Jaypoc
Yak Posting Veteran
58 Posts |
Posted - 2008-09-10 : 10:19:43
|
I found one bug when checking for replies today. I noticed that the 2:00 and 3:00 hours were missing. I corrected that, but the numbers are still slightly off:DAILY : 937 HITS / 559 SEARCHES / 234 POLLSHOURLY: 925 HITS / 554 SEARCHES / 234 POLLS(Hey, the polls are counting properly. lol) |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-09-10 : 11:14:09
|
[code]CREATE VIEW dbo.Log_DailyASSELECT TOP 31 DATEADD(DAY, DATEDIFF(DAY, '19000101', Datetime), '19000101') AS Date, SUM(CASE WHEN Status = 'LOAD' AND Message = '/' THEN 1 ELSE 0 END) AS Hits, SUM(CASE WHEN Status LIKE 'SEARCH%' THEN 1 ELSE 0 END) AS Searches, SUM(CASE WHEN Status = 'POLL' THEN 1 ELSE 0 END) AS PollsFROM dbo.Access_LogWHERE Status = 'LOAD' AND Message = '/' OR Status LIKE 'SEARCH%' OR Status = 'POLL'GROUP BY DATEADD(DAY, DATEDIFF(DAY, '19000101', Datetime), '19000101')ORDER BY DATEADD(DAY, DATEDIFF(DAY, '19000101', Datetime), '19000101') DESC[/code] E 12°55'05.63"N 56°04'39.26" |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-09-10 : 11:19:41
|
The difference in your queries are that 1) first view has a.Status = 'Poll' and second view has a.Status LIKE 'POLL%'2) You do a COUNT(*) over a LEFT JOIN query. A match is counted once and so is a mismatch.3) the log table has records for more than one date, thus grouping over hour includes several days. E 12°55'05.63"N 56°04'39.26" |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2008-09-10 : 11:21:45
|
[code]CREATE VIEW dbo.Log_HourlyASSELECT l.Hour_Display, SUM(CASE WHEN a.Status = 'LOAD' AND a.Message = '/' THEN 1 ELSE 0 END) AS Hits, SUM(CASE WHEN a.Status LIKE 'SEARCH%' THEN 1 ELSE 0 END) AS Searches, SUM(CASE WHEN a.Status LIKE 'POLL%' THEN 1 ELSE 0 END) AS PollsFROM ( SELECT 0 AS Start_Hour, '00:00-00:59' AS Hour_Display UNION ALL SELECT 1, '01:00-01:59' UNION ALL SELECT 2, '02:00-02:59' UNION ALL SELECT 3, '03:00-03:59' UNION ALL SELECT 4, '04:00-04:59' UNION ALL SELECT 5, '05:00-05:59' UNION ALL SELECT 6, '06:00-06:59' UNION ALL SELECT 7, '07:00-07:59' UNION ALL SELECT 8, '08:00-08:59' UNION ALL SELECT 9, '09:00-09:59' UNION ALL SELECT 10, '10:00-10:59' UNION ALL SELECT 11, '11:00-11:59' UNION ALL SELECT 12, '12:00-12:59' UNION ALL SELECT 13, '13:00-13:59' UNION ALL SELECT 14, '14:00-14:59' UNION ALL SELECT 15, '15:00-15:59' UNION ALL SELECT 16, '16:00-16:59' UNION ALL SELECT 17, '17:00-17:59' UNION ALL SELECT 18, '18:00-18:59' UNION ALL SELECT 19, '19:00-19:59' UNION ALL SELECT 20, '20:00-20:59' UNION ALL SELECT 21, '21:00-21:59' UNION ALL SELECT 22, '22:00-22:59' UNION ALL SELECT 23, '23:00-24:00' ) AS lLEFT JOIN dbo.Access_Log AS a ON DATEPART(HOUR, a.Datetime) = l.Start_Hour AND ( a.Status = 'LOAD' AND a.Message = '/' OR a.Status LIKE 'SEARCH%' OR a.Status LIKE 'POLL%' )GROUP BY l.Hour_Display[/code] E 12°55'05.63"N 56°04'39.26" |
 |
|
Jaypoc
Yak Posting Veteran
58 Posts |
Posted - 2008-09-10 : 13:41:13
|
Peso, I replied already, but I just realized that all 3 posts were from you. I want to thank you. Prior to your post, I re-wrote the queries and then found the results were off because of a typo in the code itself (started the loop with MoveNext instead of MoveFirst causing it to skip the first row of records (0:00 - 0:59)Both your Query re-writes appear to be more efficient than mine so I may re-write them again, or at least study yours to improve my syntax and better understand more complex queries.As a note, I found the =POLL and LIKE POLL% differences (which did not affect the reults fortunately) and I did want all-time totals and not for the hourly breakdown.Thanks again! |
 |
|
|
|
|
|
|