Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 Transact-SQL (2000)
 Inconsitant Data from 2 Views using 1 Table.

Author  Topic 

Jaypoc
Yak Posting Veteran

58 Posts

Posted - 2008-09-09 : 18:03:24
I have 2 views that show counts of the statistics in a log file. One shows them broken down by date, the other by hour. For some reason the two show different totals (when the columns are added together) I was curious where the discrepancy was. I have the ASP page add all the "count" elements together and the numbers are off.

TABLE: access_log
datetime, datetime (Auto-populates with getDate())
status, varchar (Action- LOAD, SEARCH <type>, POLL
message, varchar (Data pertaining to the Action/status)

Sample Data:
9/5/2008 5:43:24 | LOAD | /
9/6/2008 12:00:01 | LOAD | /help.html
9/6/2008 4:18:44 | SEARCH CITY | PLAINVIEW
9/7/2008 8:00:24 | SEARCH ZIP | 11803
9/7/2008 1:31:23 | POLL | ID12345

VIEWS: (These are the actual views I'm using)



CREATE VIEW dbo.log_Daily_COMBINED
AS
SELECT TOP 31 a.[Date], a.Hits AS hits, b.Hits AS searches, c.Hits AS polls
FROM (SELECT TOP 100 PERCENT CONVERT(varchar, datetime, 1) AS [Date], COUNT(*) AS Hits
FROM dbo.access_log
WHERE (status = 'LOAD') AND (message = '/')
GROUP BY CONVERT(varchar, datetime, 1)
ORDER BY Date DESC) a FULL OUTER JOIN
(SELECT TOP 100 PERCENT CONVERT(varchar, datetime, 1) AS [Date], COUNT(*) AS Hits
FROM dbo.access_log
WHERE (status LIKE 'SEARCH%')
GROUP BY CONVERT(varchar, datetime, 1)
ORDER BY Date DESC) b ON a.[Date] = b.[Date] FULL OUTER JOIN
(SELECT TOP 100 PERCENT CONVERT(varchar, datetime, 1) AS [Date], COUNT(*) AS Hits
FROM dbo.access_log
WHERE (status = 'POLL')
GROUP BY CONVERT(varchar, datetime, 1)
ORDER BY Date DESC) c ON a.[Date] = c.[Date]
ORDER BY a.[Date] DESC


and


CREATE VIEW dbo.log_Hourly_statistics
AS
SELECT TOP 100 PERCENT q.hour_display, q.ct AS hits, w.ct AS searches, e.ct AS polls
FROM (SELECT TOP 100 PERCENT l.hour_display, COUNT(*) AS ct
FROM (SELECT 0 AS start_hour, '00:00-00:59' AS hour_display
UNION ALL
SELECT 1, '01:00-01:59'
UNION ALL
SELECT 4, '04:00-04:59'
UNION ALL
SELECT 5, '05:00-05:59'
UNION ALL
SELECT 6, '06:00-06:59'
UNION ALL
SELECT 7, '07:00-07:59'
UNION ALL
SELECT 8, '08:00-08:59'
UNION ALL
SELECT 9, '09:00-09:59'
UNION ALL
SELECT 10, '10:00-10:59'
UNION ALL
SELECT 11, '11:00-11:59'
UNION ALL
SELECT 12, '12:00-12:59'
UNION ALL
SELECT 13, '13:00-13:59'
UNION ALL
SELECT 14, '14:00-14:59'
UNION ALL
SELECT 15, '15:00-15:59'
UNION ALL
SELECT 16, '16:00-16:59'
UNION ALL
SELECT 17, '17:00-17:59'
UNION ALL
SELECT 18, '18:00-18:59'
UNION ALL
SELECT 19, '19:00-19:59'
UNION ALL
SELECT 20, '20:00-20:59'
UNION ALL
SELECT 21, '21:00-21:59'
UNION ALL
SELECT 22, '22:00-22:59'
UNION ALL
SELECT 23, '23:00-24:00') l LEFT OUTER JOIN
dbo.access_log a ON l.start_hour = DATEPART(hour, a.datetime) AND (a.status = 'LOAD') AND (a.message = '/')
GROUP BY l.hour_display) q FULL OUTER JOIN
(SELECT TOP 100 PERCENT l.hour_display, COUNT(*) AS ct
FROM (SELECT 0 AS start_hour, '00:00-00:59' AS hour_display
UNION ALL
SELECT 1, '01:00-01:59'
UNION ALL
SELECT 4, '04:00-04:59'
UNION ALL
SELECT 5, '05:00-05:59'
UNION ALL
SELECT 6, '06:00-06:59'
UNION ALL
SELECT 7, '07:00-07:59'
UNION ALL
SELECT 8, '08:00-08:59'
UNION ALL
SELECT 9, '09:00-09:59'
UNION ALL
SELECT 10, '10:00-10:59'
UNION ALL
SELECT 11, '11:00-11:59'
UNION ALL
SELECT 12, '12:00-12:59'
UNION ALL
SELECT 13, '13:00-13:59'
UNION ALL
SELECT 14, '14:00-14:59'
UNION ALL
SELECT 15, '15:00-15:59'
UNION ALL
SELECT 16, '16:00-16:59'
UNION ALL
SELECT 17, '17:00-17:59'
UNION ALL
SELECT 18, '18:00-18:59'
UNION ALL
SELECT 19, '19:00-19:59'
UNION ALL
SELECT 20, '20:00-20:59'
UNION ALL
SELECT 21, '21:00-21:59'
UNION ALL
SELECT 22, '22:00-22:59'
UNION ALL
SELECT 23, '23:00-24:00') l LEFT OUTER JOIN
dbo.access_log a ON l.start_hour = DATEPART(hour, a.datetime) AND a.status LIKE 'SEARCH%'
GROUP BY l.hour_display) w ON q.hour_display = w.hour_display FULL OUTER JOIN
(SELECT TOP 100 PERCENT l.hour_display, COUNT(*) AS ct
FROM (SELECT 0 AS start_hour, '00:00-00:59' AS hour_display
UNION ALL
SELECT 1, '01:00-01:59'
UNION ALL
SELECT 4, '04:00-04:59'
UNION ALL
SELECT 5, '05:00-05:59'
UNION ALL
SELECT 6, '06:00-06:59'
UNION ALL
SELECT 7, '07:00-07:59'
UNION ALL
SELECT 8, '08:00-08:59'
UNION ALL
SELECT 9, '09:00-09:59'
UNION ALL
SELECT 10, '10:00-10:59'
UNION ALL
SELECT 11, '11:00-11:59'
UNION ALL
SELECT 12, '12:00-12:59'
UNION ALL
SELECT 13, '13:00-13:59'
UNION ALL
SELECT 14, '14:00-14:59'
UNION ALL
SELECT 15, '15:00-15:59'
UNION ALL
SELECT 16, '16:00-16:59'
UNION ALL
SELECT 17, '17:00-17:59'
UNION ALL
SELECT 18, '18:00-18:59'
UNION ALL
SELECT 19, '19:00-19:59'
UNION ALL
SELECT 20, '20:00-20:59'
UNION ALL
SELECT 21, '21:00-21:59'
UNION ALL
SELECT 22, '22:00-22:59'
UNION ALL
SELECT 23, '23:00-24:00') l LEFT OUTER JOIN
dbo.access_log a ON l.start_hour = DATEPART(hour, a.datetime) AND a.status LIKE 'POLL%'
GROUP BY l.hour_display) e ON w.hour_display = e.hour_display
ORDER BY q.hour_display


For the DAILY breakdown I get 854 Hits, 513 Searches and 216 Polls
For the HOURLY breakdown I get 841 Hits, 508 Searches and 217 Polls
They're close, but shouldn't they be exact? if not, how come? what did I miss?

I appreciate a new set of eyes to analyze this for me (or recommend a better way to do it!)








Jaypoc
Yak Posting Veteran

58 Posts

Posted - 2008-09-10 : 10:19:43
I found one bug when checking for replies today. I noticed that the 2:00 and 3:00 hours were missing. I corrected that, but the numbers are still slightly off:

DAILY : 937 HITS / 559 SEARCHES / 234 POLLS
HOURLY: 925 HITS / 554 SEARCHES / 234 POLLS

(Hey, the polls are counting properly. lol)
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-09-10 : 11:14:09
[code]CREATE VIEW dbo.Log_Daily
AS

SELECT TOP 31 DATEADD(DAY, DATEDIFF(DAY, '19000101', Datetime), '19000101') AS Date,
SUM(CASE WHEN Status = 'LOAD' AND Message = '/' THEN 1 ELSE 0 END) AS Hits,
SUM(CASE WHEN Status LIKE 'SEARCH%' THEN 1 ELSE 0 END) AS Searches,
SUM(CASE WHEN Status = 'POLL' THEN 1 ELSE 0 END) AS Polls
FROM dbo.Access_Log
WHERE Status = 'LOAD' AND Message = '/'
OR Status LIKE 'SEARCH%'
OR Status = 'POLL'
GROUP BY DATEADD(DAY, DATEDIFF(DAY, '19000101', Datetime), '19000101')
ORDER BY DATEADD(DAY, DATEDIFF(DAY, '19000101', Datetime), '19000101') DESC[/code]

E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-09-10 : 11:19:41
The difference in your queries are that

1) first view has a.Status = 'Poll' and second view has a.Status LIKE 'POLL%'
2) You do a COUNT(*) over a LEFT JOIN query. A match is counted once and so is a mismatch.
3) the log table has records for more than one date, thus grouping over hour includes several days.


E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-09-10 : 11:21:45
[code]CREATE VIEW dbo.Log_Hourly
AS

SELECT l.Hour_Display,
SUM(CASE WHEN a.Status = 'LOAD' AND a.Message = '/' THEN 1 ELSE 0 END) AS Hits,
SUM(CASE WHEN a.Status LIKE 'SEARCH%' THEN 1 ELSE 0 END) AS Searches,
SUM(CASE WHEN a.Status LIKE 'POLL%' THEN 1 ELSE 0 END) AS Polls
FROM (
SELECT 0 AS Start_Hour, '00:00-00:59' AS Hour_Display UNION ALL
SELECT 1, '01:00-01:59' UNION ALL
SELECT 2, '02:00-02:59' UNION ALL
SELECT 3, '03:00-03:59' UNION ALL
SELECT 4, '04:00-04:59' UNION ALL
SELECT 5, '05:00-05:59' UNION ALL
SELECT 6, '06:00-06:59' UNION ALL
SELECT 7, '07:00-07:59' UNION ALL
SELECT 8, '08:00-08:59' UNION ALL
SELECT 9, '09:00-09:59' UNION ALL
SELECT 10, '10:00-10:59' UNION ALL
SELECT 11, '11:00-11:59' UNION ALL
SELECT 12, '12:00-12:59' UNION ALL
SELECT 13, '13:00-13:59' UNION ALL
SELECT 14, '14:00-14:59' UNION ALL
SELECT 15, '15:00-15:59' UNION ALL
SELECT 16, '16:00-16:59' UNION ALL
SELECT 17, '17:00-17:59' UNION ALL
SELECT 18, '18:00-18:59' UNION ALL
SELECT 19, '19:00-19:59' UNION ALL
SELECT 20, '20:00-20:59' UNION ALL
SELECT 21, '21:00-21:59' UNION ALL
SELECT 22, '22:00-22:59' UNION ALL
SELECT 23, '23:00-24:00'
) AS l
LEFT JOIN dbo.Access_Log AS a ON DATEPART(HOUR, a.Datetime) = l.Start_Hour
AND (
a.Status = 'LOAD' AND a.Message = '/'
OR a.Status LIKE 'SEARCH%'
OR a.Status LIKE 'POLL%'
)
GROUP BY l.Hour_Display[/code]

E 12°55'05.63"
N 56°04'39.26"
Go to Top of Page

Jaypoc
Yak Posting Veteran

58 Posts

Posted - 2008-09-10 : 13:41:13
Peso,

I replied already, but I just realized that all 3 posts were from you. I want to thank you. Prior to your post, I re-wrote the queries and then found the results were off because of a typo in the code itself (started the loop with MoveNext instead of MoveFirst causing it to skip the first row of records (0:00 - 0:59)

Both your Query re-writes appear to be more efficient than mine so I may re-write them again, or at least study yours to improve my syntax and better understand more complex queries.

As a note, I found the =POLL and LIKE POLL% differences (which did not affect the reults fortunately) and I did want all-time totals and not for the hourly breakdown.

Thanks again!
Go to Top of Page
   

- Advertisement -