Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
mattt
Posting Yak Master
194 Posts |
Posted - 2012-04-12 : 11:34:34
|
So I have this joining table:CREATE TABLE [TagMap]( [intItemId] [bigint] NOT NULL, [intTagId] [bigint] NOT NULL, CONSTRAINT [PK_TagMap_intItemId] PRIMARY KEY CLUSTERED ( [intItemId] ASC, [intTagId] ASC) My sproc discovers the tags that are associated with a seedItemId and then selects other intItemIds that are associated with those tags, like this:declare @baseTags table (intTagId bigint primary key clustered); INSERT INTO @baseTags (intTagId) SELECT TOP 20 t1.intTagId FROM TagMap t1 WHERE t1.intItemId = 776 SELECT TOP 500 t1.intItemId FROM TagMap t1 JOIN @baseTags t2 ON t1.intTagId = t2.intTagId GROUP BY t1.intItemId ORDER BY Count(*) DESC What I need to to is specify a minimum number of tag associations - say 10 - between two intItemIds below which a value is not returned. In other words in the TagMap table can find ten or more intTagIds that two intItemIds have in common, it's good and we select it - otherwise ignore it.Any suggestions on how to do this, please?Cheers,Matt |
|
yosiasz
Master Smack Fu Yak Hacker
1635 Posts |
Posted - 2012-04-12 : 13:46:12
|
Matt,Could you please provide some sample data and expected results?<><><><><><><><><><><><><><><><><>If you don't have the passion to help people, you have no passion |
 |
|
mattt
Posting Yak Master
194 Posts |
Posted - 2012-04-13 : 08:35:30
|
quote: Originally posted by yosiasz Matt,Could you please provide some sample data and expected results?
Sorry. Here you go:CREATE TABLE #TagMap( [intItemId] [bigint] NOT NULL, [intTagId] [bigint] NOT NULL, CONSTRAINT [PK_TagMap_intItemId] PRIMARY KEY CLUSTERED ( [intItemId] ASC, [intTagId] ASC))insert into #TagMap(intItemId, intTagId)values (1, 100),(1, 200),(1, 300),(2, 100),(2, 200),(2, 500),(2, 600),(3, 100),(3, 500),(3, 600)Assume the match threshold is two.If the seed is intItemId 1, then only intItemId 2 should be returned (it has two matching tag Ids: 100 and 200, while intItemId 3 has one: 100, which is below the threshold). If the seed is intItemId 2, then both intItemId 1 and 3 should be returned (intItemId 1 matches the tag Ids 100 and 200, while intItemId 3 matches the tag Ids 500 and 600).If the seed is intItemId 3, then only intItemId 2 should be returned (it has two matching tag Ids: 500 and 600, while intItemId has one: 100 which is below the threshold).Does that make it clearer? |
 |
|
yosiasz
Master Smack Fu Yak Hacker
1635 Posts |
Posted - 2012-04-13 : 10:58:30
|
check this out I had the same question you had a while back. When I saw yours I said almost identical to mine and posted it again and got a responsehttp://www.sqlteam.com/forums/topic.asp?TOPIC_ID=173550Credit goes to Bustaz Kooldeclare @Items TABLE (intItemId int, descr varchar(100))insert into @Items(intItemId, descr)values (1, 'Led Ziiplin'),(2, 'Run Dom M C'),(3, 'Manbungo')declare @Tags TABLE ([intTagId] int, descr varchar(100))insert into @Items(intItemId, descr)values (100, 'Classical/Metal'),(200, 'Blues/Concerto'),(300, 'Icelandic Rap'),(500, 'Mountain Goat Jazz'),(600, 'Polar Bear Rap')declare @TagMap TABLE ([intItemId] [bigint] NOT NULL,[intTagId] [bigint] NOT NULL)insert into @TagMap(intItemId, intTagId)values (1, 100),(1, 200),(1, 300),(2, 100),(2, 200),(2, 500),(2, 600),(3, 100),(3, 500),(3, 600)select * From @TagMap;with ItemsWithSameTagas ( select bt1.intItemId , bt1.intTagId, bt2.intItemId intItemId2 from @TagMap bt1 inner join @TagMap bt2 on bt1.intItemId <> bt2.intItemId and bt1.intTagId = bt2.intTagId ),ItemsWithMatchesas ( select intItemId, intItemId2, count(*) kount from ItemsWithSameTag group by intItemId, intItemId2 having count(*) = 2 )select b1.descr, b2.descr, bwfm.kountfrom @Items b1inner join ItemsWithMatches bwfm on bwfm.intItemId = b1.intItemIdinner join @Items b2 on bwfm.intItemId2 = b2.intItemId <><><><><><><><><><><><><><><><><>If you don't have the passion to help people, you have no passion |
 |
|
mattt
Posting Yak Master
194 Posts |
Posted - 2012-04-16 : 06:06:10
|
Thanks. Loving the music genres :) |
 |
|
Bustaz Kool
Master Smack Fu Yak Hacker
1834 Posts |
Posted - 2012-04-16 : 16:29:27
|
In re: "Polar Bear Rap"It's getting harder and harder to find any Polar Bear Rap; at least the good kind. Forget East Coast vs West Coast - These guys from the North Coast are the real ganstas. When they pop a cap you really get iced.=================================================There is a foolish corner in the brain of the wisest man. -Aristotle, philosopher (384-322 BCE) |
 |
|
yosiasz
Master Smack Fu Yak Hacker
1635 Posts |
Posted - 2012-04-16 : 16:55:24
|
:) word<><><><><><><><><><><><><><><><><>If you don't have the passion to help people, you have no passion |
 |
|
Bustaz Kool
Master Smack Fu Yak Hacker
1834 Posts |
Posted - 2012-04-16 : 16:58:16
|
I heard they were going to do a duet with Seal.=================================================There is a foolish corner in the brain of the wisest man. -Aristotle, philosopher (384-322 BCE) |
 |
|
yosiasz
Master Smack Fu Yak Hacker
1635 Posts |
Posted - 2012-04-16 : 16:59:21
|
that's coldYo VIP let's kick it<><><><><><><><><><><><><><><><><>If you don't have the passion to help people, you have no passion |
 |
|
Bustaz Kool
Master Smack Fu Yak Hacker
1834 Posts |
Posted - 2012-04-16 : 17:04:36
|
Vanilla Ice (I don't have a joke for that but I'm sure it's in there somewhere; you find it.)=================================================There is a foolish corner in the brain of the wisest man. -Aristotle, philosopher (384-322 BCE) |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2012-04-16 : 17:05:08
|
Why don't you try to write an indexed view?CREATE VIEW dbo.vwSimilaritiesWITH SCHEMABINDINGASSELECT tm1.intItemId AS ItemID, tm2.intItemId AS SimilarID, i.Descr, COUNT_BIG(*) AS ItemsFROM @TagMap AS tm1INNER JOIN @TagMap AS tm2 ON tm2.intTagId = tm1.intTagIdINNER JOIN @Items AS i ON i.intItemId = tm2.intItemIdWHERE tm1.intItemId <> tm2.intItemIdGROUP BY tm1.intItemId, tm2.intItemId, i.DescrGOCREATE UNIQUE CLUSTERED INDEX UCX_Similarities ON dbo.vwSimilarities(ItemID, SimilarID)GO Then you just get all similarites using the viewSELECT * FROM dbo.vwSimilarities WHERE ItemID = 1 AND Items >= 2 N 56°04'39.26"E 12°55'05.63" |
 |
|
yosiasz
Master Smack Fu Yak Hacker
1635 Posts |
Posted - 2012-04-16 : 17:05:22
|
did you miss it in my post?Yo VIP let's kick it<><><><><><><><><><><><><><><><><>If you don't have the passion to help people, you have no passion |
 |
|
yosiasz
Master Smack Fu Yak Hacker
1635 Posts |
Posted - 2012-04-16 : 17:06:57
|
Oh Peso got in there. Sorry Peso I was replying to Busta. View on it sounds real good idea and the COUNT_BIG is all new to me. Let me check it out.Thanks<><><><><><><><><><><><><><><><><>If you don't have the passion to help people, you have no passion |
 |
|
Vinnie881
Master Smack Fu Yak Hacker
1231 Posts |
Posted - 2012-04-16 : 17:16:51
|
Peso, why did you choose to use a count_big there? Just out of habit, or is there a benefit to it in this scenario over count? I get one returns a bigint and the other just a int, but was there a reason besides that? Success is 10% Intelligence, 70% Determination, and 22% Stupidity.\_/ _/ _/\_/ _/\_/ _/ _/- 881 |
 |
|
SwePeso
Patron Saint of Lost Yaks
30421 Posts |
Posted - 2012-04-17 : 03:11:51
|
I am used to have COUNT_BIG in aggregates due to the sizes of the databases I deal with (100 GB to 1.5 PB).You can have COUNT if you want to. N 56°04'39.26"E 12°55'05.63" |
 |
|
Vinnie881
Master Smack Fu Yak Hacker
1231 Posts |
Posted - 2012-04-17 : 11:55:51
|
Thanks! Normally when you specifically illustrate/suggests something there is always a reason for the exact code, and often something I learn from. I just wanted to make sure that the reason was simply habit and not something else I was missing, due to I do not see too many people use the count_big aggregate.Thanks again for the clarification! Success is 10% Intelligence, 70% Determination, and 22% Stupidity.\_/ _/ _/\_/ _/\_/ _/ _/- 881 |
 |
|
|
|
|
|
|