Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
PeteFromOz
Starting Member
8 Posts |
Posted - 2012-03-14 : 20:05:31
|
Hi Folks,(if this looks familiar, I have actually posted it to one other forum, hope thats ok)I am currently producing a report in SQL Server 2005 which needs to aggregate the days difference between two dates The dates are arrived at by calling functions that go off and scan log type tables which are HUGE (58M records and counting).For example. @int1 = sum (dbo.fncGetWorkingDays (dbo.fncGetSpecialDate1 (d.doc_id),dbo.fncGetSpecialDate2 (d.doc_id))), @int2 = sum (datediff (dd,dbo.fncGetSpecialDate1 (d.doc_id),dbo.fncGetSpecialDate2(d.doc_id))) FROM docs d WHERE d.company=@strOrgID AND d.date BETWEEN @dateStart AND @dateEnd @strOrgID, @dateStart and @dateEnd are from the parent function which uses a (gasp) cursor to step through all the companies, but there are only a dozen or so.The code for fncGetSpecialDate1 is simple in and of itself, and gopes something like this... BEGIN DECLARE @tblMinIn TABLE (LogIndex VARCHAR (50)) INSERT INTO @tblMinIn (LogIndex) (SELECT Item FROM dbo.fncSplitCSV (@strMinIn, ',')) SET @dateStart = (SELECT min (al.stamp_date) FROM viewActionLog al WHERE al.log_index IN (SELECT LogIndex FROM @tblMinIn) AND al.doc_id = @strDocId) END The different fncGetSpecialDate functions call the above function with a specific string. The string contains the action log event codes that I am looking for. So it might be '1234,3456,4567' and I look through the action log to find the first instance of any of these events, then return the date of that event. The action log view has a id from the original object that is being logged, so if 20 events happened, then there is 20 rows with the same id... then an index code (which we are serching for) and a date (which we return). I created a view with the minimum number of rows (but still huge) and a unique clustered index on id,index,stampdate.My problem is that the functions fncGetSpecialDate1 and fncGetSpecialDate2 are very expensive as they are the ones that go off to trawl the huge log files (of which I have made an indexed view but are still >10M records). How can I restructure the above SELECT statement so I only have to call the functions once as inputs for both the other functions?Any advice on the aggregation or just plain optimisation strategy much appreciated.Cheers,Pete. |
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2012-03-15 : 16:20:38
|
I think issue is basically effort spend on parsing the delimited '1234,3456,4567' value and then comparing each and every values and getting a date corresponding to first occurance of themOne suggestion would be separate this part and prepopulate values to table with date against doc_id and then in above code just include a simple join to the earlier table to get date. That would save some amount of effort in calling this udf and doing this parsing inline for large dataset------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
PeteFromOz
Starting Member
8 Posts |
Posted - 2012-03-16 : 22:52:01
|
Thanks visakh,This is what I came up with... basically hard coding the event codes as they are pretty set anyway, and there are only a few sets I need to query (I guess it was just my OO training that led be down the path of making a generic, reusable function )CREATE FUNCTION dbo.fncGetSLATime (@strDocId VARCHAR (32), @dateStartDate DATETIME,--if I already know one of these dates then pass it in @dateEndDate DATETIME) RETURNS INTAS BEGIN DECLARE @intSLADays INT DECLARE @dateMaxIn DATETIME DECLARE @dateMaxOut DATETIME DECLARE @tblIn TABLE (Log_Index INT) INSERT INTO @tblIn (log_index) (SELECT 6012 UNION SELECT 5013 UNION SELECT 10507 UNION SELECT 1008 UNION SELECT 6007 UNION SELECT 5006) IF @dateStartDate IS NULL SET @dateMaxIn = (SELECT max (al.stamp_date) FROM dbo.viewReporting al INNER JOIN @tblIn ti ON al.log_index = ti.log_index WHERE al.doc_id = @strDocId) ELSE SET @dateMaxIn = @dateStartDate IF @dateEndDate IS NULL SET @dateMaxout = (SELECT max (al.stamp_date) FROM dbo.viewReporting al WHERE al.doc_id = @strDocId AND al.log_index = 10000) ELSE SET @dateMaxIn = @dateStartDate SET @intSLADays = dbo.fncGetWorkingDays (@dateStartDate, @dateEndDate) RETURN @intSLADays END Is that what you had in mind? I think it should be more efficient (?)Thanks again,Cheers,Pete. |
 |
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2012-03-16 : 23:00:38
|
i would have put the values in a table in main code itself rather than using a separate udf------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
PeteFromOz
Starting Member
8 Posts |
Posted - 2012-03-17 : 19:39:37
|
Hi visakh,Sorry, i am confused now... is there a significant overhead just calling a udf? I have to do this several times in my main code as i am aggrigatng (see the first bit of code in the original post), so i really only did this to make things neater, but my main issue now is definately performance.any inights much apreciatedCheers,Pete. |
 |
|
visakh16
Very Important crosS Applying yaK Herder
52326 Posts |
Posted - 2012-03-18 : 10:49:04
|
is there a significant overhead just calling a udf? there's a significant impact. why not create a permanent table and populate it first with required data one time. then use it across------------------------------------------------------------------------------------------------------SQL Server MVPhttp://visakhm.blogspot.com/ |
 |
|
|
|
|
|
|