Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 How to find duplicated records regardless of the v

Author  Topic 

pras2007
Posting Yak Master

216 Posts

Posted - 2010-06-12 : 13:54:09
Hello All,

I have a table similar to the one shown below. The goal is to find duplicated values based on the entire records regardless what field the values are stored, excluding the ID field.

Ex:

Based on the table record 1 and 4 are duplicated, because they have the same values regardless of what field they are stored in.


ID Price1 Price2 Price3 Price4
1 $2.00 $3.00 $6.00 $8.00
2 $5.00 $7.00 $3.00 $7.00
3 $3.00 $6.00 $4.00 $1.00
4 $3.00 $8.00 $2.00 $6.00


Does anyone know how to write the query for this task?

Please advice.

Thanks

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2010-06-12 : 15:22:30
What datatype are the columns?


N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

pras2007
Posting Yak Master

216 Posts

Posted - 2010-06-12 : 15:31:57
The datatype of the colums are char.

Thanks
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2010-06-12 : 15:48:58
Oh, why..?
DECLARE	@Sample TABLE
(
ID INT,
Price1 VARCHAR(20),
Price2 VARCHAR(20),
Price3 VARCHAR(20),
Price4 VARCHAR(20)
)

INSERT @Sample
(
ID,
Price1,
Price2,
Price3,
Price4
)
SELECT 1, '$2.00', '$3.00', '$6.00', '$8.00' UNION ALL
SELECT 2, '$5.00', '$7.00', '$3.00', '$7.00' UNION ALL
SELECT 3, '$3.00', '$6.00', '$4.00', '$1.00' UNION ALL
SELECT 4, '$3.00', '$8.00', '$2.00', '$6.00'

-- Peso 1
;WITH cteSource(ID, thePrice, recID)
AS (
SELECT u.ID,
u.thePrice,
ROW_NUMBER() OVER (PARTITION BY u.ID ORDER BY u.thePrice) AS recID
FROM @Sample AS s
UNPIVOT (
thePrice
FOR theCol IN (s.Price1, s.Price2, s.Price3, s.Price4)
) AS u
)
SELECT ID,
CHECKSUM_AGG(CHECKSUM(*))
FROM cteSource
GROUP BY ID

-- Peso 2
;WITH cteSource(ID, thePrice, recID)
AS (
SELECT u.ID,
u.thePrice,
ROW_NUMBER() OVER (PARTITION BY u.ID ORDER BY u.thePrice) AS recID
FROM @Sample AS s
UNPIVOT (
thePrice
FOR theCol IN (s.Price1, s.Price2, s.Price3, s.Price4)
) AS u
)
SELECT s1.ID
FROM cteSource AS s1
LEFT JOIN cteSource AS s2 ON s2.recID = s1.recID
AND s2.thePrice = s1.thePrice
AND s2.ID <> s1.ID
GROUP BY s1.ID
HAVING COUNT(s1.recID) = COUNT(s2.recID)



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2010-06-12 : 15:52:27
[code]
declare @test table(id int, P1 char(5), P2 char(5), P3 char(5), P4 char(5))
insert @test
select 1, '$2.00', '$3.00', '$6.00', '$8.00' union all
select 2, '$5.00', '$7.00', '$3.00', '$7.00' union all
select 3, '$3.00', '$6.00', '$4.00', '$1.00' union all
select 4, '$3.00', '$8.00', '$2.00', '$6.00'

select * from @test t1
where exists(
select * from @test t2
where (t1.P1=t2.P1 or t1.P1=t2.P2 or t1.P1=t2.P3 or t1.P1=t2.P4)
and (t1.P2=t2.P1 or t1.P2=t2.P2 or t1.P2=t2.P3 or t1.P2=t2.P4)
and (t1.P3=t2.P1 or t1.P3=t2.P2 or t1.P3=t2.P3 or t1.P3=t2.P4)
and (t1.P4=t2.P1 or t1.P4=t2.P2 or t1.P4=t2.P3 or t1.P4=t2.P4)
and t1.id <> t2.id
)
[/code]


No, you're never too old to Yak'n'Roll if you're too young to die.
Go to Top of Page

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2010-06-12 : 16:18:38
Both solutions will give a wrong output if we add
SELECT 5, '$3.00', '$8.00', '$3.00', '$6.00'
to the test data.

Peter I think your approach with checksum can solve it.


No, you're never too old to Yak'n'Roll if you're too young to die.
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2010-06-12 : 16:41:52
A small rewrite of Peso 2
-- Peso 2
;WITH cteSource(ID, thePrice, recID)
AS (
SELECT u.ID,
u.thePrice,
ROW_NUMBER() OVER (PARTITION BY u.ID ORDER BY u.thePrice) AS recID
FROM @Sample AS s
UNPIVOT (
thePrice
FOR theCol IN (s.Price1, s.Price2, s.Price3, s.Price4)
) AS u
)
SELECT s1.ID
FROM cteSource AS s1
LEFT JOIN cteSource AS s2 ON s2.recID = s1.recID
AND s2.thePrice = s1.thePrice
AND s2.ID <> s1.ID
GROUP BY s1.ID,
s2.ID
HAVING COUNT(s2.ID) = 4



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page
   

- Advertisement -