Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Update Help Required to Fix Duplicate Data Issues

Author  Topic 

CreativeNRG
Starting Member

44 Posts

Posted - 2012-02-08 : 15:46:53
My SQL skills are pretty limited, and I need to fix some issues with duplicated data caused by a software bug. I need to update all records in one table with the id of first occurrence of a value in another table if that makes sense.

So given that the tables are as follows (simplified for example):

Registrations
- EventHostId

EventHost
- Id
- Name

The EventHost table contains many duplicates so I need to select the id of the first Host in EventHosts that has the same name as the host referenced in Registrations.EventHostId. Then I need to update all Registrations with id's that are not the first selected in the earlier step but have the same [Name]. Following the success of this I would then like to delete all of the duplicates leaving just one of each host name in the EventHosts table.

Hopefully I have explained that well enough for someone to understand what I need to achieve.

TIA, Simon

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-02-08 : 16:11:39
[code]
;With Events
AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Id) AS Rn,Id,Name
FROM EventHost
)


UPDATE r
SET r.EventHostId = e2.Id
FROM Registrations r
INNER JOIN Events e1
ON e1.Id= r.EventHostId
INNER JOIN Events e2
ON e2.Name = e1.Name
AND e2.Rn=1
WHERE e2.Id <> r.EventHostId

DELETE t
FROM EventHost t
INNER JOIN Events e
ON e.Id = t.Id
AND e.Rn > 1
[/code]

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

CreativeNRG
Starting Member

44 Posts

Posted - 2012-02-08 : 16:41:01
Thank you I will try that and report back back on my success.

www.prolificnotion.co.uk
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-02-08 : 21:31:55
no problems
let us know how you got on

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

CreativeNRG
Starting Member

44 Posts

Posted - 2012-02-09 : 01:19:09
On executing the below I get no rows updated?

[quote]Originally posted by visakh16


;With Events
AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Id) AS Rn,Id,Name
FROM EventHost
)


UPDATE r
SET r.EventHostId = e2.Id
FROM Registrations r
INNER JOIN Events e1
ON e1.Id= r.EventHostId
INNER JOIN Events e2
ON e2.Name = e1.Name
AND e2.Rn=1
WHERE e2.Id <> r.EventHostId



www.prolificnotion.co.uk
Go to Top of Page

CreativeNRG
Starting Member

44 Posts

Posted - 2012-02-09 : 12:10:02
Thanks for the help so far but I'm currently a bit lost as to what I need to change in order for the changes to be applied. As I am unfamiliar with the syntax of the query you suggested - any help would be most appreciated.

www.prolificnotion.co.uk
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-02-09 : 14:24:42
can you show some sample data from the tables EventHost and Registrations? Re you sure you've multiple records with same Name in EventHosts?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

CreativeNRG
Starting Member

44 Posts

Posted - 2012-02-09 : 16:03:09
My apologies I restored the database and it has worked perfectly. Thank you so much for the help :)

www.prolificnotion.co.uk
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-02-09 : 17:42:37
welcome

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

CreativeNRG
Starting Member

44 Posts

Posted - 2012-02-10 : 04:30:12
I've taken your solution in principle and attempted to use it to fix other related issues caused by the software bug however this problem presents a different problem - when the update runs it violates a Primary Key constraint.

Example query:

;With Venues
AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Id) AS Rn,Id,Name
FROM [Venue]
)

UPDATE e
SET e.VenueId = e2.Id
FROM VenueEvent e
INNER JOIN Venues e1
ON e1.Id= e.VenueId
INNER JOIN Venues e2
ON e2.Name = e1.Name
AND e2.Rn=1
WHERE e2.Id <> e.VenueId

Error message:

Violation of PRIMARY KEY constraint 'PK_VenueEvent'. Cannot insert duplicate key in object 'dbo.VenueEvent'.

The table VenueEvent has a composite primary key.

www.prolificnotion.co.uk
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-02-10 : 09:26:23
which are columns that are part of 'PK_VenueEvent' primary key? Is VenueId column primary key of table VenueEvent ?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

CreativeNRG
Starting Member

44 Posts

Posted - 2012-02-10 : 09:30:26
Yes VenueId and EventId are both Primary keys for the VenueEvent table.

www.prolificnotion.co.uk
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-02-10 : 09:38:57
just analysed this update and i'm bit confused
can you explain the need of last update? whats purpose of updating venueid? so are you trying to map multiple venues to single venue?

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page

CreativeNRG
Starting Member

44 Posts

Posted - 2012-02-13 : 07:35:47
Sure, the same bug that affected EventHost table also affected the Venues table so we have lots of duplicate venues to consolidate in much the same was as you did before.

www.prolificnotion.co.uk
Go to Top of Page

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2012-02-13 : 09:46:19
quote:
Originally posted by CreativeNRG

Sure, the same bug that affected EventHost table also affected the Venues table so we have lots of duplicate venues to consolidate in much the same was as you did before.

www.prolificnotion.co.uk


thats not possible as far as venueid is primary key
you can make multiple records point to same venueid
so you need to modify primary key before applying such a change
just decide which all combinations of columns will identify a unique row and change primary key accordingly

------------------------------------------------------------------------------------------------------
SQL Server MVP
http://visakhm.blogspot.com/

Go to Top of Page
   

- Advertisement -