Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 General SQL Server Forums
 New to SQL Server Administration
 dedupe data

Author  Topic 

dtrivedi
Posting Yak Master

153 Posts

Posted - 2010-12-02 : 13:46:45
I have these three tables in SQL. I need to clean it up to send marketing a list by tomorrow

1. rental.dbo is a list of customers who received a thank you card on November 18.
2. mail.dbo is a list of customers who either need to be eliminated because they have already received a card OR have not received a card yet.
3. custmailing.dbo is a list of customers who need to be eliminated completely, regardless of whether they have received a card or not.

So essentially, all names from #3’s file need to be eliminated from #1’s file. Then all names from #1 need to be eliminated from #2. Please let me know if you need further explanation.

These are the names of the coulumns.
Customer Cust Adr1 Cust City Cust State Cust Zip STORE NUMBER/ED REP Item Class


I know it's going to be a join can someone help me dedupe statement?

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-12-02 : 13:51:36
We need sample data and expected output.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

dtrivedi
Posting Yak Master

153 Posts

Posted - 2010-12-02 : 14:00:51
This is the sample data for one of the databases. Each of them look like this but with diffrent customer information. Just want to delete the names from rental.dbo that are on custmailing.dbo (so the records in rental shouldn't have any names from custmailing.dbo)

The list is really long but i shortened it just for sample data

Customer Cust Adr1 Cust City Cust State Cust Zip STORE NUMBER/ED REP Item Class
BECKY WELLS 2167 JENNI LN WASHINGTON COURT HOUSE OH 43160-1689 Ed Rep #2954 CLARINET
STEVEN H ESKSTEIN 1208 BRAMBLE AVE WASHINGTON COURT HOUSE OH 43160-1086 Ed Rep #2954 ALTO SAX
DAWN R DAWSON 728 E PAINT ST WASHINGTON COURT HOUSE OH 43160-1512 Ed Rep #2954 TRUMPET
LOUIS LE 1338 BARCLAY DR CARROLLTON TX 75007-2870 Ed Rep #6954 VIOLA
ALICE BENWIDEZ 1737 E FRANKFORD RD APT 2304 CARROLLTON TX 75007-5613 Ed Rep #6954 VIOLA
LIEP NGUYEN 1403 HOMESTEAD LN CARROLLTON TX 75007-2952 Ed Rep #6954 VIOLA
SUZI DAVIS 3005 SIERRA DR CARROLLTON TX 75007-5646 Ed Rep #6954 VIOLIN
NANCY MORALES 3314 GREENGLEN CIR CARROLLTON TX 75007-2732 Ed Rep #6954 VIOLIN
JOSE ALBERTO GARCIA 3279 NORTHVIEW CARROLLTON TX 75007-3022 Ed Rep #6954 VIOLIN
SHAHIDA JIVANI 1120 MAC ARTHUR DR APT 1103 CARROLLTON TX 75007-4484 Music & Arts #6911 VIOLIN
LANICA M DORLEY 1820 E PETERS COLONY RD APT 3406 CARROLLTON TX 75007-3720 Ed Rep #6954 VIOLIN
ELVIRA ORTEGA 1711 BIG CANYON TRL CARROLLTON TX 75007-5019 Ed Rep #6954 VIOLIN
DOUGLAS EDMEADS 1604 SAINT JAMES DR CARROLLTON TX 75007-2913 Ed Rep #6954 VIOLIN
TRESSA EDELMAN 3020 ROCKETT DR CARROLLTON TX 75007-5216 Ed Rep #6954 VIOLIN
STEVE COUTOUMANOS 1649 BLACKSTONE DR CARROLLTON TX 75007-5122 Ed Rep #6954 VIOLIN
KIM GAMBINO 1827 WOODBURY CARROLLTON TX 75007-6105 Ed Rep #6954 VIOLIN
HAI TRAN 1501 SHONKA DR CARROLLTON TX 75007-2935 Ed Rep #6954 VIOLIN
STACY HUNT 929 PLANTATION DR LEWISVILLE TX 75067-6120 Ed Rep #6954 VIOLIN
BRENDA OWENS 1810 CASTILLE DR CARROLLTON TX 75007-3014 Ed Rep #6954 VIOLIN
DESIREA MYERS 4250 LAVACA TRL CARROLLTON TX 75010-4025 Ed Rep #6954 VIOLIN
BELINDA CEPEDA 1414 MONACO DR 75067 LEWISVILLE TX 75067-5616 Ed Rep #6954 VIOLIN
MILTON HERNANDEZ 1820 E PETERS COLONY RD CARROLLTON TX 75007-3728 Ed Rep #6954 VIOLIN
Go to Top of Page

dtrivedi
Posting Yak Master

153 Posts

Posted - 2010-12-02 : 14:20:42
maybe something like this??



select customer, count(*) as myDupes, (select top 1 customer from dbo.custmail c2 where c2.customer=c1.customer)
into #myTempTable from Mailer$ c1
group by Customer
having count(*) > 1
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-12-02 : 14:22:08
That sample data is useless to me. Please look at it and see that it's just a jumbled mess. Please provide INSERT INTO statements, DDL for your tables, and expected output.

Why do your tables have .dbo at the end of the name?

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

dtrivedi
Posting Yak Master

153 Posts

Posted - 2010-12-02 : 14:28:05
These tables were excel sheets
When I imported them in it came with the names
here are the insert scripts for each table

INSERT INTO [marketing].[dbo].[custmail]
([Customer]
,[Cust Adr1]
,[Cust City]
,[Cust State]
,[Cust Zip]
,[Store Number/Ed Rep]
,[Item Class])
VALUES
(<Customer, nvarchar(255),>
,<Cust Adr1, nvarchar(255),>
,<Cust City, nvarchar(255),>
,<Cust State, nvarchar(255),>
,<Cust Zip, nvarchar(255),>
,<Store Number/Ed Rep, float,>
,<Item Class, nvarchar(255),>)
GO

INSERT INTO [marketing].[dbo].[Mailer$]
([Customer]
,[Cust Adr1]
,[Cust City]
,[Cust State]
,[Cust Zip]
,[Store Number/Ed Rep]
,[Item Class])
VALUES
(<Customer, nvarchar(255),>
,<Cust Adr1, nvarchar(255),>
,<Cust City, nvarchar(255),>
,<Cust State, nvarchar(255),>
,<Cust Zip, nvarchar(255),>
,<Store Number/Ed Rep, float,>
,<Item Class, nvarchar(255),>)
GO


INSERT INTO [marketing].[dbo].[Rental$]
([Customer]
,[Cust Adr1]
,[Cust City]
,[Cust State]
,[Cust Zip]
,[STORE NUMBER/ED REP]
,[Item Class])
VALUES
(<Customer, nvarchar(255),>
,<Cust Adr1, nvarchar(255),>
,<Cust City, nvarchar(255),>
,<Cust State, nvarchar(255),>
,<Cust Zip, nvarchar(255),>
,<STORE NUMBER/ED REP, nvarchar(255),>
,<Item Class, nvarchar(255),>)
GO

All i want in the end is one table with the results for the remaining data. The output should be a new table which has no names from the custmail table and also no names from the mail table. the output should be the reamining data left over after those two changes

Go to Top of Page

dtrivedi
Posting Yak Master

153 Posts

Posted - 2010-12-02 : 14:28:13
does that make sense?
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-12-02 : 14:43:22
You need to add your sample data into the INSERT INTO statements for us.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

tkizer
Almighty SQL Goddess

38200 Posts

Posted - 2010-12-02 : 14:44:06
For an example, please see a thread where I asked for help a few years ago: http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=110247

By providing your problem in this form, notice how fast I got answers.

Tara Kizer
Microsoft MVP for Windows Server System - SQL Server
http://weblogs.sqlteam.com/tarad/

Subscribe to my blog
Go to Top of Page

dtrivedi
Posting Yak Master

153 Posts

Posted - 2010-12-03 : 08:40:33
I really don't understnad how to do that!
Go to Top of Page

dtrivedi
Posting Yak Master

153 Posts

Posted - 2010-12-03 : 08:50:50
since i can't do that can you atleast help me to tell me if this statement is right or wrong? i think i missed something?

SELECT a.Customer,a.[Cust Adr1] adr1, b.Customer,b.[Cust Adr1], c.Customer, c.[Cust Adr1], Count(*) as mydupes into
#mytemptable
FROM dbo.custmail a INNER JOIN
dbo.Mailer$ b ON a.Customer = b.Customer INNER JOIN
dbo.Rental$ c ON b.Customer = c.Customer
group by a.Customer,a.[Cust Adr1]
having COUNT(*) > 1
Go to Top of Page
   

- Advertisement -