Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 Specify minimum number of join points

Author  Topic 

mattt
Posting Yak Master

194 Posts

Posted - 2012-04-12 : 11:34:34
So I have this joining table:

CREATE TABLE [TagMap](
[intItemId] [bigint] NOT NULL,
[intTagId] [bigint] NOT NULL,
CONSTRAINT [PK_TagMap_intItemId] PRIMARY KEY CLUSTERED
(
[intItemId] ASC,
[intTagId] ASC
)


My sproc discovers the tags that are associated with a seedItemId and then selects other intItemIds that are associated with those tags, like this:


declare @baseTags table (intTagId bigint primary key clustered);


INSERT INTO @baseTags (intTagId)
SELECT TOP 20 t1.intTagId
FROM TagMap t1
WHERE t1.intItemId = 776

SELECT TOP 500 t1.intItemId
FROM TagMap t1
JOIN @baseTags t2
ON t1.intTagId = t2.intTagId
GROUP BY t1.intItemId
ORDER BY Count(*) DESC


What I need to to is specify a minimum number of tag associations - say 10 - between two intItemIds below which a value is not returned. In other words in the TagMap table can find ten or more intTagIds that two intItemIds have in common, it's good and we select it - otherwise ignore it.

Any suggestions on how to do this, please?

Cheers,
Matt

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2012-04-12 : 13:46:12
Matt,

Could you please provide some sample data and expected results?



<><><><><><><><><><><><><><><><><>
If you don't have the passion to help people, you have no passion
Go to Top of Page

mattt
Posting Yak Master

194 Posts

Posted - 2012-04-13 : 08:35:30
quote:
Originally posted by yosiasz

Matt,

Could you please provide some sample data and expected results?



Sorry. Here you go:

CREATE TABLE #TagMap(
[intItemId] [bigint] NOT NULL,
[intTagId] [bigint] NOT NULL,
CONSTRAINT [PK_TagMap_intItemId] PRIMARY KEY CLUSTERED
(
[intItemId] ASC,
[intTagId] ASC
))

insert into #TagMap
(intItemId, intTagId)
values
(1, 100),(1, 200),(1, 300),
(2, 100),(2, 200),(2, 500),(2, 600),
(3, 100),(3, 500),(3, 600)

Assume the match threshold is two.

If the seed is intItemId 1, then only intItemId 2 should be returned (it has two matching tag Ids: 100 and 200, while intItemId 3 has one: 100, which is below the threshold).

If the seed is intItemId 2, then both intItemId 1 and 3 should be returned (intItemId 1 matches the tag Ids 100 and 200, while intItemId 3 matches the tag Ids 500 and 600).

If the seed is intItemId 3, then only intItemId 2 should be returned (it has two matching tag Ids: 500 and 600, while intItemId has one: 100 which is below the threshold).

Does that make it clearer?
Go to Top of Page

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2012-04-13 : 10:58:30
check this out I had the same question you had a while back. When I saw yours I said almost identical to mine and posted it again and got a response

http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=173550

Credit goes to Bustaz Kool


declare @Items TABLE (intItemId int, descr varchar(100))

insert into @Items
(intItemId, descr)
values
(1, 'Led Ziiplin'),
(2, 'Run Dom M C'),
(3, 'Manbungo')


declare @Tags TABLE ([intTagId] int, descr varchar(100))

insert into @Items
(intItemId, descr)
values
(100, 'Classical/Metal'),
(200, 'Blues/Concerto'),
(300, 'Icelandic Rap'),
(500, 'Mountain Goat Jazz'),
(600, 'Polar Bear Rap')

declare @TagMap TABLE (
[intItemId] [bigint] NOT NULL,
[intTagId] [bigint] NOT NULL)

insert into @TagMap
(intItemId, intTagId)
values
(1, 100),(1, 200),(1, 300),
(2, 100),(2, 200),(2, 500),(2, 600),
(3, 100),(3, 500),(3, 600)


select * From @TagMap



;with ItemsWithSameTag
as (
select bt1.intItemId , bt1.intTagId, bt2.intItemId intItemId2
from @TagMap bt1
inner join
@TagMap bt2
on bt1.intItemId <> bt2.intItemId
and bt1.intTagId = bt2.intTagId
),
ItemsWithMatches
as (
select intItemId, intItemId2, count(*) kount
from ItemsWithSameTag
group by intItemId, intItemId2
having count(*) = 2
)
select
b1.descr, b2.descr, bwfm.kount
from
@Items b1
inner join
ItemsWithMatches bwfm
on bwfm.intItemId = b1.intItemId
inner join
@Items b2
on bwfm.intItemId2 = b2.intItemId

<><><><><><><><><><><><><><><><><>
If you don't have the passion to help people, you have no passion
Go to Top of Page

mattt
Posting Yak Master

194 Posts

Posted - 2012-04-16 : 06:06:10
Thanks. Loving the music genres :)
Go to Top of Page

Bustaz Kool
Master Smack Fu Yak Hacker

1834 Posts

Posted - 2012-04-16 : 16:29:27
In re: "Polar Bear Rap"

It's getting harder and harder to find any Polar Bear Rap; at least the good kind. Forget East Coast vs West Coast - These guys from the North Coast are the real ganstas. When they pop a cap you really get iced.

=================================================
There is a foolish corner in the brain of the wisest man. -Aristotle, philosopher (384-322 BCE)
Go to Top of Page

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2012-04-16 : 16:55:24
:) word

<><><><><><><><><><><><><><><><><>
If you don't have the passion to help people, you have no passion
Go to Top of Page

Bustaz Kool
Master Smack Fu Yak Hacker

1834 Posts

Posted - 2012-04-16 : 16:58:16
I heard they were going to do a duet with Seal.

=================================================
There is a foolish corner in the brain of the wisest man. -Aristotle, philosopher (384-322 BCE)
Go to Top of Page

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2012-04-16 : 16:59:21
that's cold
Yo VIP let's kick it
<><><><><><><><><><><><><><><><><>
If you don't have the passion to help people, you have no passion
Go to Top of Page

Bustaz Kool
Master Smack Fu Yak Hacker

1834 Posts

Posted - 2012-04-16 : 17:04:36
Vanilla Ice (I don't have a joke for that but I'm sure it's in there somewhere; you find it.)

=================================================
There is a foolish corner in the brain of the wisest man. -Aristotle, philosopher (384-322 BCE)
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2012-04-16 : 17:05:08
Why don't you try to write an indexed view?
CREATE VIEW dbo.vwSimilarities
WITH SCHEMABINDING
AS
SELECT tm1.intItemId AS ItemID,
tm2.intItemId AS SimilarID,
i.Descr,
COUNT_BIG(*) AS Items
FROM @TagMap AS tm1
INNER JOIN @TagMap AS tm2 ON tm2.intTagId = tm1.intTagId
INNER JOIN @Items AS i ON i.intItemId = tm2.intItemId
WHERE tm1.intItemId <> tm2.intItemId
GROUP BY tm1.intItemId,
tm2.intItemId,
i.Descr
GO
CREATE UNIQUE CLUSTERED INDEX UCX_Similarities ON dbo.vwSimilarities(ItemID, SimilarID)
GO

Then you just get all similarites using the view

SELECT * FROM dbo.vwSimilarities WHERE ItemID = 1 AND Items >= 2



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2012-04-16 : 17:05:22
did you miss it in my post?

Yo VIP let's kick it

<><><><><><><><><><><><><><><><><>
If you don't have the passion to help people, you have no passion
Go to Top of Page

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2012-04-16 : 17:06:57
Oh Peso got in there. Sorry Peso I was replying to Busta. View on it sounds real good idea and the COUNT_BIG is all new to me. Let me check it out.

Thanks

<><><><><><><><><><><><><><><><><>
If you don't have the passion to help people, you have no passion
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2012-04-16 : 17:16:51
Peso, why did you choose to use a count_big there? Just out of habit, or is there a benefit to it in this scenario over count? I get one returns a bigint and the other just a int, but was there a reason besides that?


Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2012-04-17 : 03:11:51
I am used to have COUNT_BIG in aggregates due to the sizes of the databases I deal with (100 GB to 1.5 PB).
You can have COUNT if you want to.



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

Vinnie881
Master Smack Fu Yak Hacker

1231 Posts

Posted - 2012-04-17 : 11:55:51
Thanks! Normally when you specifically illustrate/suggests something there is always a reason for the exact code, and often something I learn from. I just wanted to make sure that the reason was simply habit and not something else I was missing, due to I do not see too many people use the count_big aggregate.

Thanks again for the clarification!


Success is 10% Intelligence, 70% Determination, and 22% Stupidity.
\_/ _/ _/\_/ _/\_/ _/ _/- 881
Go to Top of Page
   

- Advertisement -