Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Performance of merge statement vs separate DMLs?

Author  Topic 

strauss_jon
Starting Member

23 Posts

Posted - 2012-04-23 : 11:42:46
Hi there, thanks for taking the time to read this thread

We have a load of ETLs and I'm going through them all to replace the separate inserts and updates with merge statements for 'better' performance. I carried out some (preliminary, not exhaustive) tests and the merge is about 1/3 times slower than the individual DMLs. I've cleared the data and plan caches before each to ensure fairness (same instance/db) and the execution plan seems to be identical when i combine the insert and update plan with the merge's plan. Is there anything fundamentally different about the merge statement in terms of how it executes (except for the obvious stuff such as single pass only etc). Presumably though, if the execution plans are the same, they're both going to execute in the same way except that the merge does it in 1 pass so should be quicker?

Ive copied the merge code along with the separate DMLs below, nothing complicated. As easy an example as it can be:

-------------
SEPARATE DMLs
-------------

------
INSERT
------
INSERT INTO BookingServicePassengerStaging
(
action,
src_tbx_booking_id,
src_tbx_product_code,
src_tbx_item_no,
src_tbx_passenger_no
)
SELECT 'I',
L.BOOKING_ID,
L.BOOKING_ITEM_REF1,
L.BOOKING_ITEM_REF2,
L.PASSENGER_NO
FROM BookingServicePassengerTbxLoad L
LEFT JOIN BookingServicePassengerStaging S ON
L.BOOKING_ID = S.src_tbx_booking_id
AND L.BOOKING_ITEM_REF1 = S.src_tbx_product_code
AND L.BOOKING_ITEM_REF2 = S.src_tbx_item_no
AND L.PASSENGER_NO = S.src_tbx_passenger_no
WHERE S.src_tbx_booking_id IS NULL

------
UPDATE (soft deletes)
------
UPDATE S
SET S.[Action] = 'D'
FROM BookingServicePassengerStaging S
LEFT JOIN BookingServicePassengerTbxLoad L ON
L.BOOKING_ID = S.src_tbx_booking_id
AND L.BOOKING_ITEM_REF1 = S.src_tbx_product_code
AND L.BOOKING_ITEM_REF2 = S.src_tbx_item_no
AND L.PASSENGER_NO = S.src_tbx_passenger_no
WHERE L.BOOKING_ID IS NULL

MERGE
-----

MERGE STAGING_PRO..BookingServicePassengerStaging AS dest
USING
(
(SELECT ,
BOOKING_ID ,
BOOKING_ITEM_REF1,
BOOKING_ITEM_REF2,
PASSENGER_NO
FROM
Staging_Pro.dbo.BookingServicePassengerTbxLoad)
) AS src
ON
src.BOOKING_ID = dest.src_tbx_booking_id AND
src.BOOKING_ITEM_REF1 = dest.src_tbx_product_code AND
src.BOOKING_ITEM_REF2 = dest.src_tbx_item_no AND
src.PASSENGER_NO = dest.src_tbx_passenger_no
WHEN NOT MATCHED BY SOURCE THEN
UPDATE
SET [Action] = 'D'

WHEN NOT MATCHED BY TARGET
THEN INSERT (
[Action] ,
srC_tbx_BOOKING_ID ,
src_Tbx_product_code,
src_Tbx_item_no,

srC_tbx_PASSENGER_NO
)
VALUES ( 'I' ,
src.BOOKING_ID ,
BOOKING_ITEM_REF1,
BOOKING_ITEM_REF2,

src.PASSENGER_NO
) ;

Thanks in advance! Jon

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-04-23 : 11:47:24
I have always found MERGE statement should be slower compared to individual INSERT UPDATE statements having proper indexes in place.

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

sunitabeck
Master Smack Fu Yak Hacker

5155 Posts

Posted - 2012-04-23 : 11:50:25
quote:
Originally posted by visakh16

I have always found MERGE statement should be slower compared to individual INSERT UPDATE statements having proper indexes in place.

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/



Intuitively I would have thought MERGE to be more efficient/faster. But my intuition is always wrong, so no surprise there.

Is MERGE slower because it is one long transaction as opposed to multiple smaller transactions?

Regardless, I am going to cling on to MERGE if I can help - I like the syntax of it, and the fact that it is a single transaction.
Go to Top of Page

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2012-04-23 : 11:50:57
it depends on the size of the underlying table(s)
If your BookingServicePassengerTbxLoad, BookingServicePassengerStaging tables have are huge then you will most definitely have issues.

Notice your FROM Staging_Pro.dbo.BookingServicePassengerTbxLoad has no sorts of filter. How many rows does it have? Is it possible you could somehow filter it? Does it have indexes? etc etc


<><><><><><><><><><><><><><><><><>
If you don't have the passion to help people, you have no passion
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-04-23 : 11:56:25
quote:
Originally posted by yosiasz

it depends on the size of the underlying table(s)
If your BookingServicePassengerTbxLoad, BookingServicePassengerStaging tables have are huge then you will most definitely have issues.


Notice your FROM Staging_Pro.dbo.BookingServicePassengerTbxLoad has no sorts of filter. How many rows does it have? Is it possible you could somehow filter it? Does it have indexes? etc etc


<><><><><><><><><><><><><><><><><>
If you don't have the passion to help people, you have no passion


Definitely
The tables I tried on was huge transactional tables with high volume of transactions taking place daily. in such cases INSERT UPDATE proved much better option for me

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

strauss_jon
Starting Member

23 Posts

Posted - 2012-04-23 : 12:03:47
Thanks for your replies, much appreciated!

This is a like for like feasibility test to ensure it's worth doing so i wanted to compare the 2 scenarios on a big table (no filters) to be able to see the difference in performance. The table is 25 million rows strong. With regards to the indexes, a clustered index scan on both load and staging tbls with a hash match join, identical in both scenarios (don't know why it's not doing a merge join, that's something for another time).

Can you explain to me why it's a no-no for big tables as the separate inserts/updates still have to trawl over the same number of rows, twice.

Go to Top of Page

strauss_jon
Starting Member

23 Posts

Posted - 2012-04-23 : 12:10:47
"Is MERGE slower because it is one long transaction as opposed to multiple smaller transactions?"

--> Both inserts and updates are within the same transaction so should be no different to the merge in that respect.
Go to Top of Page

Sachin.Nand

2937 Posts

Posted - 2012-04-23 : 12:23:14
Found this

http://technet.microsoft.com/en-us/library/cc879317.aspx

See if that helps..

After Monday and Tuesday even the calendar says W T F ....
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-04-23 : 12:32:47
quote:
Originally posted by Sachin.Nand

Found this

http://technet.microsoft.com/en-us/library/cc879317.aspx

See if that helps..

After Monday and Tuesday even the calendar says W T F ....


Very informative article

Thanks for sharing

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

strauss_jon
Starting Member

23 Posts

Posted - 2012-04-23 : 12:39:33
Agreed, very useful. A brief look at the article it looks like the biggest gains for the typical scenario is with the index config. This design is already in place so not looking overly promising at the mo but will continue to investigate. Anyway, i'm off home and will pick this up again in the morning. Many thanks for your comments, if anyone has anything else to add, would appreciate as much info as poss. Have a good day :-)
Go to Top of Page
   

- Advertisement -