Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2008 Forums
 Transact-SQL (2008)
 proximity, similarity, levenstein

Author  Topic 

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2012-04-12 : 15:18:32
Greetings,

Given the following data how can one go about trying to figure out only books have five terminologies in common .If less not interested


declare @books TABLE(book_id int, descr nvarchar(50))
insert into @books
SELECT 1, 'Gone with the wind'
UNION ALL
SELECT 2, 'Mars'
UNION ALL
SELECT 3, 'Do Androids Dream of Electric Sheep?'
UNION ALL
SELECT 4, 'Ender''s Game'
UNION ALL
SELECT 5, 'Dune'
UNION ALL
SELECT 6, 'Foundation'
UNION ALL
SELECT 7, 'Hitch Hiker''s Guide to the Galaxy'
UNION ALL
SELECT 8, '1984'
UNION ALL
SELECT 9, 'Star Wars'
UNION ALL
SELECT 10, 'Battlestar Galactica'
UNION ALL
SELECT 11, 'Fringe'
UNION ALL
SELECT 13, 'Prometheus'
UNION ALL
SELECT 14, 'Prometheus VI'
UNION ALL
SELECT 15, 'Contagion'
UNION ALL
SELECT 16, 'Deep Blue'
UNION ALL
SELECT 16, 'Hunger Games'

--SELECT * FROM @books
declare @terminologies TABLE(terminology_id int, descr nvarchar(50))
INSERT INTO @terminologies
SELECT 1, 'Futuristic'
UNION ALL
SELECT 2, 'Propulsion'
UNION ALL
SELECT 3, 'Guidance'
UNION ALL
SELECT 4, 'Life Support'
UNION ALL
SELECT 5, 'Cabin Structure'
UNION ALL
SELECT 6, 'Communications'
UNION ALL
SELECT 7, 'Thermal Protection'
UNION ALL
SELECT 8, 'Displays And Controls'
UNION ALL
SELECT 9, 'Space Craft'
UNION ALL
SELECT 10, 'Light Speed'
UNION ALL
SELECT 11, 'Warp Speed'
UNION ALL
SELECT 12, 'Jedi'
UNION ALL
SELECT 13, 'Force is weak with you rookie Jedi'
UNION ALL
SELECT 14, 'UnCivil War'

--SELECT * FROM @terminologies

DECLARE @book_terminologies TABLE(book_id int, terminology_id int)

INSERT INTO @book_terminologies
( book_id, terminology_id )
SELECT 1, 1
UNION ALL
SELECT 1, 2
UNION ALL
SELECT 1, 3
UNION ALL
SELECT 1, 4
UNION ALL
SELECT 1, 5
UNION ALL
SELECT 1, 6
UNION ALL
SELECT 1, 7
UNION ALL
SELECT 1, 8
UNION ALL
SELECT 1, 9
UNION ALL
SELECT 1, 10
UNION ALL


SELECT 2, 1
UNION ALL
SELECT 2, 2
UNION ALL
SELECT 2, 3
UNION ALL
SELECT 2, 4
UNION ALL
SELECT 2, 5
UNION ALL
SELECT 2, 6
UNION ALL
SELECT 2, 7
UNION ALL
SELECT 2, 8
UNION ALL
SELECT 2, 9
UNION ALL
SELECT 2, 10
UNION ALL
SELECT 3, 1
UNION ALL
SELECT 3, 2
UNION ALL
SELECT 3, 3
UNION ALL
SELECT 3, 4
UNION ALL
SELECT 3, 5


SELECT b.descr, b.book_id,
t.descr, t.terminology_id
FROM @books b

inner join @book_terminologies bt
on b.book_id = bt.book_id

inner join @terminologies t
on t.terminology_id = bt.terminology_id
order by b.book_id, t.terminology_id


<><><><><><><><><><><><><><><><><>
If you don't have the passion to help people, you have no passion

Bustaz Kool
Master Smack Fu Yak Hacker

1834 Posts

Posted - 2012-04-12 : 19:01:11
[CODE];with BooksWithSameTerminology
as (
select bt1.book_id, bt1.terminology_id, bt2.book_id book_id2
from @book_terminologies bt1
inner join
@book_terminologies bt2
on bt1.book_id < bt2.book_id
and bt1.terminology_id = bt2.terminology_id
),
BooksWithFiveMatches
as (
select book_id, book_id2, count(*) kount
from BooksWithSameTerminology
group by book_id, book_id2
having count(*) >= 5
)
select
b1.descr, b2.descr, bwfm.kount
from
@books b1
inner join
BooksWithFiveMatches bwfm
on bwfm.book_id = b1.book_id
inner join
@books b2
on bwfm.book_id2 = b2.book_id[/CODE]HTH

=================================================
There is a foolish corner in the brain of the wisest man. -Aristotle, philosopher (384-322 BCE)
Go to Top of Page

LoztInSpace
Aged Yak Warrior

940 Posts

Posted - 2012-04-13 : 02:33:09
I can't help feeling I'm missing something:

SELECT
b1.book_id, b2.book_id, COUNT(*)
FROM @books b1
inner join @book_terminologies t1
on b1.book_id = t1.book_id
inner join @book_terminologies t2
on t1.terminology_id = t2.terminology_id
inner join @books b2
on b2.book_id = t2.book_id
where b1.book_id<>b2.book_id
group by b1.book_id,b2.book_id
having COUNT(*)>=5


Note this assumes a distinct set of terminologies per book. If that's not the case we'd have to sub-select distinct terminologies instead of using the table directly:


SELECT
b1.book_id, b2.book_id, COUNT(*)
FROM @books b1
inner join (select distinct * from @book_terminologies) t1
on b1.book_id = t1.book_id
inner join (select distinct * from @book_terminologies) t2
on t1.terminology_id = t2.terminology_id
inner join @books b2
on b2.book_id = t2.book_id
where b1.book_id<>b2.book_id
group by b1.book_id,b2.book_id
having COUNT(*)>=5
Go to Top of Page

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2012-04-13 : 02:43:41
this is looking very nice Bustaz

Let me try it out. I assume you mean to say <> instead of>
quote:
Originally posted by Bustaz Kool

[CODE];with BooksWithSameTerminology
as (
select bt1.book_id, bt1.terminology_id, bt2.book_id book_id2
from @book_terminologies bt1
inner join
@book_terminologies bt2
on bt1.book_id < bt2.book_id
and bt1.terminology_id = bt2.terminology_id
),
BooksWithFiveMatches
as (
select book_id, book_id2, count(*) kount
from BooksWithSameTerminology
group by book_id, book_id2
having count(*) >= 5
)
select
b1.descr, b2.descr, bwfm.kount
from
@books b1
inner join
BooksWithFiveMatches bwfm
on bwfm.book_id = b1.book_id
inner join
@books b2
on bwfm.book_id2 = b2.book_id[/CODE]HTH

=================================================
There is a foolish corner in the brain of the wisest man. -Aristotle, philosopher (384-322 BCE)



<><><><><><><><><><><><><><><><><>
If you don't have the passion to help people, you have no passion
Go to Top of Page

Bustaz Kool
Master Smack Fu Yak Hacker

1834 Posts

Posted - 2012-04-13 : 12:05:10
quote:
I assume you mean to say <> instead of>


Where?

=================================================
There is a foolish corner in the brain of the wisest man. -Aristotle, philosopher (384-322 BCE)
Go to Top of Page

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2012-04-13 : 12:07:55

on bt1.book_id < bt2.book_id


<><><><><><><><><><><><><><><><><>
If you don't have the passion to help people, you have no passion
Go to Top of Page

Bustaz Kool
Master Smack Fu Yak Hacker

1834 Posts

Posted - 2012-04-13 : 12:17:40
The less than is there in order to impose a hierarchy on the books. Without it you'll get book X matching book Y AND book Y matching book X. If you want both, you're right to correct it to a not equal comparison.

=================================================
There is a foolish corner in the brain of the wisest man. -Aristotle, philosopher (384-322 BCE)
Go to Top of Page

yosiasz
Master Smack Fu Yak Hacker

1635 Posts

Posted - 2012-04-13 : 12:29:15
aha got it! Thanks BK

<><><><><><><><><><><><><><><><><>
If you don't have the passion to help people, you have no passion
Go to Top of Page
   

- Advertisement -