Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2005 Forums
 Transact-SQL (2005)
 SELECT IN misses NULLs; efficiently finding them

Author  Topic 

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-02-10 : 05:41:05
Please run the following block of code. I see that the NULL values are missed out of the final set of results, where I would expect NULL values to remain.


DECLARE @tbl1 TABLE (colA int)
DECLARE @tbl2 TABLE (colB int)

INSERT INTO @tbl1 (colA)
SELECT (1) UNION ALL
SELECT (2) UNION ALL
SELECT (2) UNION ALL
SELECT (3) UNION ALL
SELECT (3) UNION ALL
SELECT (3) UNION ALL
SELECT (NULL) UNION ALL
SELECT (NULL)

INSERT INTO @tbl2 (colB)
SELECT (1) UNION ALL
SELECT (2) UNION ALL
SELECT (3) UNION ALL
SELECT (NULL)

SELECT * FROM @tbl1
SELECT * FROM @tbl2
SELECT * FROM @tbl1 WHERE (colA IN (SELECT colB from @tbl2))


I find that I can force the NULL values to appear back in the results if I do this:

SELECT * FROM @tbl1 WHERE (COALESCE(colA, -1) IN (SELECT COALESCE(colB, -1) from @tbl2))


However, is this the best/most efficient way to get the NULL values back in? Any pointers would be appreciated.

webfred
Master Smack Fu Yak Hacker

8781 Posts

Posted - 2012-02-10 : 06:01:26
NULL means "Don't know the value" so the best is what IN() is doing -> ignore them, because it makes no sense to compare "Don't know" with another "Don't know".



No, you're never too old to Yak'n'Roll if you're too young to die.
Go to Top of Page

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-02-10 : 06:04:52
Okay great - thanks for the clarification on that.
So....moving on, is a COALESCE on both values (as in my last statement) the best way to get those NULLs back in to the results, or is there a better way? In my case I know exactly what NULL is referring to, so need to override the default IN behaviour.
Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2012-02-10 : 06:25:11
No. if you need to include NULL values then what you should do is use an OR predicate.

Something like

SELECT *
FROM @tbl1
WHERE
colA IN (SELECT colB from @tbl2)
OR [colA] IS NULL

It's more declarative (it reads better).

Also you can still use an index on the column for the IN check which you couldn't before (when you were wrapping the column in function calls).

You may find that the best (best index use) solution is to do the query with IN and then union another query with the NULL check.

Something like:

SELECT * FROM @tbl1 WHERE colA IN (SELECT colB from @tbl2)
UNION ALL SELECT * FROM @tbl1 WHERE [colA] IS NULL

This would let you use a FILTERED INDEX on the column (ColA) only for the values that are NULL.

does this make any sense?

Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
Go to Top of Page

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-02-10 : 06:43:31
Hi Charlie

Thanks for that - yes, it makes perfect sense. When I actually try out your method though, for some reason the NULLs don't show. Have I mistyped something?


SELECT
colA
FROM
@tbl1
where
colA IN (
select
colb
from
@tbl2 union all
select
colb
from
@tbl2
where
colB is NULL
)
Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2012-02-10 : 06:53:46
huh? I didn't say anything like that....

This doesn't make any logical sense

select
colb
from
@tbl2 union all
select
colb
from
@tbl2
where
colB is NULL

What you are doing there is:
1) Select the value of [colb] from EVERY ROW in the table (because there is no filter criteria
2) UNION that with the value of [colb] from ANY MATCHING Row WHERE [colb] IS NULL

THis means you are going to get NULLS back for (1) and for (2)

All the nulls will then not match anything in the outer select (because of the IN condition) and then you'll loose them again.


I think you are a little confused and I don't think the simplified examples are helping.

What are you actually trying to do?

Remember you can't compare NULLs with any predicate except for IS NULL or IS NOT NULL.

A NULL compared to anything else (even another NULL) will always result in a NULL answer.

So the following will not print anything for the first two IF conditions

DECLARE @a BIT = NULL
DECLARE @b BIT = NULL

IF @a = @b PRINT 'NULL equal to NULL'
IF @a <> @b PRINT 'NULL not equal to NULL'
IF @a IS NULL PRINT '@a IS NULL'
IF @b IS NULL PRINT '@a IS NULL'

Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
Go to Top of Page

SwePeso
Patron Saint of Lost Yaks

30421 Posts

Posted - 2012-02-10 : 07:15:11
The easiest way is to override the ANSI rules like this

SET ANSI_NULLS OFF -- Get rid of default setting

SELECT * FROM @tbl1 WHERE (colA IN (SELECT colB from @tbl2))

SET ANSI_NULLS ON -- Don't forget to switch on detfault setting



N 56°04'39.26"
E 12°55'05.63"
Go to Top of Page

Transact Charlie
Master Smack Fu Yak Hacker

3451 Posts

Posted - 2012-02-10 : 07:33:55
but where's the fun in that........? + it feels.... dirty somehow.

Charlie
===============================================================
Msg 3903, Level 16, State 1, Line 1736
The ROLLBACK TRANSACTION request has no corresponding BEGIN TRANSACTION
Go to Top of Page

R
Constraint Violating Yak Guru

328 Posts

Posted - 2012-02-10 : 08:10:49
Hi

Thanks for your further explanation. Yes I was getting confused with what I was trying to achieve. I'll stick with the colA IN (SELECT colB from @tbl2) OR [colA] IS NULL option for now as that works well enough.

Thanks again. That's another brain cell occupied with something useful...!
Go to Top of Page
   

- Advertisement -