Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 join to a table via an attribute-value index

Author  Topic 

dlh
Starting Member

28 Posts

Posted - 2010-05-24 : 14:51:50
I'd appreciate any recommendations for how to tackle this problem.

I have the following tables:

TableA
contains Field1, Field2, Field3, ...

TableB
contains RecordID, ...

Index_TableB
contains RecordID (foreign key), IndexType, IndexValue

A_B_JoinCriteria
contains TableA_FieldName (quasi-foreign key from metadata), TableB_IndexType (foreign key)

A row in TableA corresponds to a row in TableB if and only if for all rows in A_B_JoinCriteria, there is a match between a field-value pair in TableA and an IndexType-IndexValue pair in Index_TableB.

The problem: for each record in TableB, count the number of matching records in TableA.

Complications: there may be zero to many rows in Index_TableB with any given IndexType, so we're not guaranteed existence or uniqueness of indexes. And I'm not allowed to store anything in permanent tables in the database; everything must be done on the fly.

Lamprey
Master Smack Fu Yak Hacker

4614 Posts

Posted - 2010-05-24 : 15:28:12
We are really going to need sample data and expected output to help you with this. I'm very unclear how TableA relates to table A_B_JoinCritera..

maybe this'll help:
http://weblogs.sqlteam.com/brettk/archive/2005/05/25/5276.aspx
Go to Top of Page

dlh
Starting Member

28 Posts

Posted - 2010-05-24 : 17:01:51
I was worried my description wouldn't be clear enough... I'll try to be more explicit.

My question: for each row in TableB, I want to count the number of "matching" rows in TableA, where "match" is defined by all the rows in my join-criteria table.

Pseudocode for creating the 4 tables:

CREATE TABLE TableA
(
RecordID INT PRIMARY KEY IDENTITY(1,1),
Field1 varchar(200),
Field2 varchar(200),
Field3 varchar(200),
...
)

CREATE TABLE TableB
(
RecordID INT PRIMARY KEY IDENTITY(1,1),
...
)

CREATE TABLE Index_TableB
(
RecordID INT REFERENCES TableB(RecordID),
IndexType varchar(50),
IndexValue varchar(200)
)

CREATE TABLE A_B_JoinCriteria
(
RecordID INT PRIMARY KEY IDENTITY(1,1),
TableA_FieldName varchar(50),
TableB_IndexType varchar(50) REFERENCES Index_TableB(IndexType)
)

The one constraint on the join-criteria table that cannot be explicitly specified is TableA_FieldName is restricted to column names from TableA, e.g. "Field1" or "Field2", etc.

Here's a simple illustration of data in each of the tables. For this example I'm going to rename Field1 -> Color, Field2 -> Size, Field3 -> Shape for extra clarity; note there could be any number of these fields.

INSERT INTO TableA (Color, Size, Shape)
SELECT 'Red', 'Medium', 'Circle' UNION ALL
SELECT 'Blue', 'Medium', 'Square' UNION ALL
SELECT 'Blue', 'Medium', 'Circle'

INSERT INTO TableB (...)
SELECT ... UNION ALL
SELECT ... UNION ALL
SELECT ...
--the contents of TableB aren't important, just the primary key identity.

INSERT INTO Index_TableB (RecordID, IndexType, IndexValue)
SELECT 1, 'Color', 'Red' UNION ALL
SELECT 1, 'Color', 'Blue' UNION ALL
SELECT 1, 'Size', 'Medium' UNION ALL
SELECT 2, 'Color', 'Blue' UNION ALL
SELECT 2, 'Size', 'Small' UNION ALL
SELECT 2, 'Shape', 'Circle' UNION ALL
SELECT 3, 'Color', 'Blue' UNION ALL
SELECT 3, 'Size', 'Medium' UNION ALL
SELECT 3, 'Shape', 'Square'

INSERT INTO A_B_JoinCriteria (TableA_FieldName, TableB_IndexType)
SELECT 'Color', 'Color' UNION ALL
SELECT 'Size', 'Size'

So to join rows in TableA with rows in TableB, this example requires a 'Color' match and a 'Size' match. In particular, look at row 1 in TableB which is 'Red', 'Blue' and 'Medium'. This matches all three records in TableA. So the count for that row would be 3. And look at row 2, which is 'Blue' and 'Small'. This matches 0 rows in TableA.

For now I'm assuming there will always be rows in Index_TableB for each IndexType in A_B_JoinCriteria.

********************

REVISION: I've solved the problem. Thanks for reading.
Go to Top of Page

Julien.Crawford
Starting Member

21 Posts

Posted - 2010-05-27 : 21:40:02
Well done - I've been monitoring this thread hoping to see the solution.
I was going to have a shot at this, but its been high pressure.
We use attribute tables and I'm keen to see your solution.

Any chance of posting?

Julien
Go to Top of Page

dlh
Starting Member

28 Posts

Posted - 2010-05-28 : 19:17:57
Sure. I'm away from my workstation so I haven't checked the following for typos.

First imagine a static A_B_JoinCriteria, i.e. join conditions that are known in advance and can be hard-coded. Say all three possible fields in the above example ('Color', 'Shape', and 'Size') are required for the TableA-to-TableB match.

Then the following query would give the counts of TableA matches for all rows in TableB:

SELECT
TableB.RecordID,
COUNT(TableA.RecordID)
FROM
TableA
INNER JOIN Index_TableB AS b1
ON b1.IndexValue = TableA.Color
AND b1.RecordID = b1.RecordID
INNER JOIN Index_TableB AS b2
ON b2.IndexValue = TableA.Size
AND b2.RecordID = b1.RecordID
INNER JOIN Index_TableB AS b3
ON b3.IndexValue = TableA.Shape
AND b3.RecordID = b1.RecordID
WHERE
1 = 1
AND b1.IndexType = 'Color'
AND b2.IndexType = 'Size'
AND b3.IndexType = 'Shape'
GROUP BY
TableB.RecordID

Notice that there are a couple redundancies in the above to emphasize symmetry. So the above text needs to be generated dynamically based on the contents of A_B_JoinCriteria.

DECLARE @SQL VARCHAR(MAX)

SELECT @SQL = 'SELECT
TableB.RecordID,
COUNT(TableA.RecordID)
FROM
TableA'


SELECT @SQL = @SQL + '
INNER JOIN Index_TableB AS b'
+ CAST(jc.RecordID AS VARCHAR) + '
ON b'
+ CAST(jc.RecordID AS VARCHAR) + '.IndexValue = TableA.' + jc.TableA_FieldName + '
AND b'
+ CAST(jc.RecordID AS VARCHAR) + '.RecordID = b1.RecordID'
FROM A_B_JoinCriteria AS jc

SELECT @SQL = @SQL + '
WHERE
1 = 1'


SELECT @SQL = @SQL + '
AND b'
+ CAST(jc.RecordID AS VARCHAR) + '.IndexType = ''' + jc.TableA_FieldName + ''''
FROM A_B_JoinCriteria AS jc

SELECT @SQL = @SQL + '
GROUP BY
TableB.RecordID'


EXEC(@SQL)

An added problem: I realized that the runtime can vary greatly based on the order of rows in A_B_JoinCriteria. With my actual data, it was 10 seconds at best and a half hour at worst. For greatest efficiency, A_B_JoinCriteria needs to be sorted so that the first row has the IndexType with the smallest average number of records in TableB sharing the same IndexValue.

I hope this helps and I hope I haven't made any mistakes.
Go to Top of Page

Julien.Crawford
Starting Member

21 Posts

Posted - 2010-05-31 : 19:04:56
Ah, Dynamic SQL.
I thought of that also.

I was thinking you would need to do some magic with to see if there was any non matching rows.
Or do a count(*) on the number of matching rows.
I'm not being very clear here - sorry.

I would love to have a go at this, I just don't have the time :-(
Go to Top of Page
   

- Advertisement -