Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
dlh
Starting Member
28 Posts |
Posted - 2010-05-24 : 14:51:50
|
| I'd appreciate any recommendations for how to tackle this problem.I have the following tables:TableAcontains Field1, Field2, Field3, ...TableBcontains RecordID, ...Index_TableBcontains RecordID (foreign key), IndexType, IndexValueA_B_JoinCriteriacontains TableA_FieldName (quasi-foreign key from metadata), TableB_IndexType (foreign key)A row in TableA corresponds to a row in TableB if and only if for all rows in A_B_JoinCriteria, there is a match between a field-value pair in TableA and an IndexType-IndexValue pair in Index_TableB.The problem: for each record in TableB, count the number of matching records in TableA.Complications: there may be zero to many rows in Index_TableB with any given IndexType, so we're not guaranteed existence or uniqueness of indexes. And I'm not allowed to store anything in permanent tables in the database; everything must be done on the fly. |
|
|
Lamprey
Master Smack Fu Yak Hacker
4614 Posts |
|
|
dlh
Starting Member
28 Posts |
Posted - 2010-05-24 : 17:01:51
|
I was worried my description wouldn't be clear enough... I'll try to be more explicit.My question: for each row in TableB, I want to count the number of "matching" rows in TableA, where "match" is defined by all the rows in my join-criteria table.Pseudocode for creating the 4 tables:CREATE TABLE TableA( RecordID INT PRIMARY KEY IDENTITY(1,1), Field1 varchar(200), Field2 varchar(200), Field3 varchar(200), ... )CREATE TABLE TableB( RecordID INT PRIMARY KEY IDENTITY(1,1), ...)CREATE TABLE Index_TableB( RecordID INT REFERENCES TableB(RecordID), IndexType varchar(50), IndexValue varchar(200))CREATE TABLE A_B_JoinCriteria( RecordID INT PRIMARY KEY IDENTITY(1,1), TableA_FieldName varchar(50), TableB_IndexType varchar(50) REFERENCES Index_TableB(IndexType)) The one constraint on the join-criteria table that cannot be explicitly specified is TableA_FieldName is restricted to column names from TableA, e.g. "Field1" or "Field2", etc.Here's a simple illustration of data in each of the tables. For this example I'm going to rename Field1 -> Color, Field2 -> Size, Field3 -> Shape for extra clarity; note there could be any number of these fields.INSERT INTO TableA (Color, Size, Shape)SELECT 'Red', 'Medium', 'Circle' UNION ALLSELECT 'Blue', 'Medium', 'Square' UNION ALLSELECT 'Blue', 'Medium', 'Circle'INSERT INTO TableB (...)SELECT ... UNION ALLSELECT ... UNION ALLSELECT ...--the contents of TableB aren't important, just the primary key identity.INSERT INTO Index_TableB (RecordID, IndexType, IndexValue)SELECT 1, 'Color', 'Red' UNION ALLSELECT 1, 'Color', 'Blue' UNION ALLSELECT 1, 'Size', 'Medium' UNION ALLSELECT 2, 'Color', 'Blue' UNION ALLSELECT 2, 'Size', 'Small' UNION ALLSELECT 2, 'Shape', 'Circle' UNION ALLSELECT 3, 'Color', 'Blue' UNION ALLSELECT 3, 'Size', 'Medium' UNION ALLSELECT 3, 'Shape', 'Square'INSERT INTO A_B_JoinCriteria (TableA_FieldName, TableB_IndexType)SELECT 'Color', 'Color' UNION ALLSELECT 'Size', 'Size' So to join rows in TableA with rows in TableB, this example requires a 'Color' match and a 'Size' match. In particular, look at row 1 in TableB which is 'Red', 'Blue' and 'Medium'. This matches all three records in TableA. So the count for that row would be 3. And look at row 2, which is 'Blue' and 'Small'. This matches 0 rows in TableA.For now I'm assuming there will always be rows in Index_TableB for each IndexType in A_B_JoinCriteria.********************REVISION: I've solved the problem. Thanks for reading. |
 |
|
|
Julien.Crawford
Starting Member
21 Posts |
Posted - 2010-05-27 : 21:40:02
|
| Well done - I've been monitoring this thread hoping to see the solution.I was going to have a shot at this, but its been high pressure.We use attribute tables and I'm keen to see your solution.Any chance of posting?Julien |
 |
|
|
dlh
Starting Member
28 Posts |
Posted - 2010-05-28 : 19:17:57
|
Sure. I'm away from my workstation so I haven't checked the following for typos.First imagine a static A_B_JoinCriteria, i.e. join conditions that are known in advance and can be hard-coded. Say all three possible fields in the above example ('Color', 'Shape', and 'Size') are required for the TableA-to-TableB match.Then the following query would give the counts of TableA matches for all rows in TableB:SELECT TableB.RecordID, COUNT(TableA.RecordID)FROM TableA INNER JOIN Index_TableB AS b1 ON b1.IndexValue = TableA.Color AND b1.RecordID = b1.RecordID INNER JOIN Index_TableB AS b2 ON b2.IndexValue = TableA.Size AND b2.RecordID = b1.RecordID INNER JOIN Index_TableB AS b3 ON b3.IndexValue = TableA.Shape AND b3.RecordID = b1.RecordIDWHERE 1 = 1 AND b1.IndexType = 'Color' AND b2.IndexType = 'Size' AND b3.IndexType = 'Shape'GROUP BY TableB.RecordID Notice that there are a couple redundancies in the above to emphasize symmetry. So the above text needs to be generated dynamically based on the contents of A_B_JoinCriteria.DECLARE @SQL VARCHAR(MAX)SELECT @SQL = 'SELECT TableB.RecordID, COUNT(TableA.RecordID)FROM TableA'SELECT @SQL = @SQL + ' INNER JOIN Index_TableB AS b' + CAST(jc.RecordID AS VARCHAR) + ' ON b' + CAST(jc.RecordID AS VARCHAR) + '.IndexValue = TableA.' + jc.TableA_FieldName + ' AND b' + CAST(jc.RecordID AS VARCHAR) + '.RecordID = b1.RecordID'FROM A_B_JoinCriteria AS jcSELECT @SQL = @SQL + 'WHERE 1 = 1'SELECT @SQL = @SQL + ' AND b' + CAST(jc.RecordID AS VARCHAR) + '.IndexType = ''' + jc.TableA_FieldName + ''''FROM A_B_JoinCriteria AS jcSELECT @SQL = @SQL + 'GROUP BY TableB.RecordID'EXEC(@SQL) An added problem: I realized that the runtime can vary greatly based on the order of rows in A_B_JoinCriteria. With my actual data, it was 10 seconds at best and a half hour at worst. For greatest efficiency, A_B_JoinCriteria needs to be sorted so that the first row has the IndexType with the smallest average number of records in TableB sharing the same IndexValue.I hope this helps and I hope I haven't made any mistakes. |
 |
|
|
Julien.Crawford
Starting Member
21 Posts |
Posted - 2010-05-31 : 19:04:56
|
| Ah, Dynamic SQL.I thought of that also.I was thinking you would need to do some magic with to see if there was any non matching rows.Or do a count(*) on the number of matching rows.I'm not being very clear here - sorry.I would love to have a go at this, I just don't have the time :-( |
 |
|
|
|
|
|
|
|