 finding like duplicates

qwertyjjj
Posting Yak Master

131 Posts

Posted - 2008-07-22 : 12:46:52
We have some customer names in our database that are not identical even though we know they belong to the same customer. Usually the first x characters are the same, but not always.
Is there a way to find these rows by joining the table to itself?

e.g.
FRANK JENSON ACCOUNTS
FRANK JENSON LTD
FRANK JENSON LIMITED

These are all the same company, but we would like to find them programmatically.
There will be a few spurious ones, such as FRAN LTD, that might actually be different.

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2008-07-22 : 13:02:45
Use Soundex or similar logic.



E 12°55'05.25"
N 56°04'39.16"

visakh16
Very Important crosS Applying yaK Herder

52326 Posts

Posted - 2008-07-22 : 14:35:27
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=106824

qwertyjjj
Posting Yak Master

131 Posts

Posted - 2008-07-23 : 05:24:08
Does fuzzy logic work in SQL 2000? It seems to be available only as part of SQL Server Integration Services in SQL 2005.
So far I have the following, but it's not producing very exact results:

SELECT a.companyname, b.companyname
FROM companies a
INNER JOIN companies b ON SOUNDEX(a.[companyname]) = SOUNDEX(b.[companyname])
WHERE DIFFERENCE(a.[companyname], b.[companyname]) >= 3
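
Since the original question notes that the first x characters usually match, one alternative worth trying is a self-join on a fixed-length prefix. The following is only a rough sketch, not a method proposed by anyone in this thread: it reuses the companies table and companyname column from the query above, and the prefix length of 10 is an assumed starting value that would need tuning. It sticks to syntax available in SQL Server 2000.

```sql
-- Sketch only: pair up names that share the same leading characters.
-- The prefix length of 10 is an assumption; tune it against your data.
DECLARE @prefix_len int
SET @prefix_len = 10

SELECT a.companyname AS name_a,
       b.companyname AS name_b
FROM companies a
INNER JOIN companies b
    ON LEFT(a.companyname, @prefix_len) = LEFT(b.companyname, @prefix_len)
   AND a.companyname < b.companyname   -- skip self-matches and mirrored pairs
ORDER BY a.companyname, b.companyname
```

With the sample names from the first post, the three FRANK JENSON rows would pair with each other, while FRAN LTD would not match them because its first 10 characters differ. A shorter prefix catches more variants but also more false matches, so the output still needs a manual review.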

blindman
Master Smack Fu Yak Hacker

2365 Posts

Posted - 2008-07-23 : 12:19:56
SOUNDEX is not appropriate for string matching. It will not find spelling mistakes or variations.
Use this instead: http://sqlblindman.googlepages.com/fuzzysearchalgorithm
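
One way to see the limitation for yourself is to list each name next to its SOUNDEX code. This is a small diagnostic sketch, not the algorithm from the link above; it assumes the companies table and companyname column from the earlier query. SOUNDEX returns only a four-character code (a letter followed by three digits), so long names that start alike tend to collapse onto the same code no matter how they end.

```sql
-- Sketch only: show which names end up sharing a SOUNDEX code.
SELECT SOUNDEX(companyname) AS soundex_code,
       companyname
FROM companies
ORDER BY SOUNDEX(companyname), companyname

-- Count how many names fall under each code; large groups suggest
-- the code is too coarse to separate different companies.
SELECT SOUNDEX(companyname) AS soundex_code,
       COUNT(*) AS names_in_group
FROM companies
GROUP BY SOUNDEX(companyname)
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC
```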

e4 d5 xd5 Nf6