sql中IN与EXISTS，NOT IN与NOT EXISTS的详细介绍

强调下：在含有NULL值的列的时候，就要小心了，not exists与not in在逻辑上是不等价的
--先创建2张表

代码如下

复制代码

use master;

if db_id('DbTest') is not null drop database DbTest;

create database DbTest;
go

use DbTest;
go

--创建Customers表
create table Customers
(
custid INT NOT NULL IDENTITY,
companyname NVARCHAR(40) NOT NULL,
country NVARCHAR(15) NOT NULL,
constraint pk_customer primary key(custid)
);

--创建Orders表

代码如下

复制代码

CREATE TABLE Orders
(
orderid INT NOT NULL IDENTITY,
custid INT NULL,

CONSTRAINT PK_Orders PRIMARY KEY(orderid),
CONSTRAINT FK_Orders_Customers FOREIGN KEY(custid)
REFERENCES Customers(custid),

);
set identity_insert Customers on;

INSERT INTO Customers(custid, companyname,country)
VALUES(1, N'大众', N'中国');
INSERT INTO Customers(custid, companyname,country)
VALUES(2, N'宝马', N'美国');
INSERT INTO Customers(custid, companyname,country)
VALUES(3, N'奔驰', N'中国');
INSERT INTO Customers(custid, companyname,country)
VALUES(4, N'奇瑞', N'德国');
INSERT INTO Customers(custid, companyname,country)
VALUES(5, N'福特', N'美国');

set identity_insert Customers off;
set identity_insert Orders on;
--custid代表员工号
INSERT INTO Orders(orderid, custid)
VALUES(1,1);
INSERT INTO Orders(orderid, custid)
VALUES(2,2);
INSERT INTO Orders(orderid, custid)
VALUES(3,3);
INSERT INTO Orders(orderid, custid)
VALUES(4,4);
INSERT INTO Orders(orderid, custid)
VALUES(5,5);
--查看表的数据
select custid,companyname,country from Customers;
select orderid,custid from Orders;
--插入数据成功

--咱们回到正题,比较Exists与in,not exists与 not in

--查询来自中国,而且下过订单的所有客户

代码如下	复制代码
select custid,companyname from Customers as C where country=N'中国' and exists (select * from Orders as O where O.custid=C.custid); --返回 --custid companyname --1 大众 --3 奔驰

--外部查询返回来自中国的客户信息,对于这个客户,exists谓词在Orders表查找是否至少存在一个与外部客户行信息相同的custid订单行

--用IN查询刚刚的需求

代码如下	复制代码
select custid,companyname from Customers as C where country=N'中国' and custid in(select custid from Orders);

--结果跟上面的返回一样的值

--下面的知识点我们需要认识到:
--当列表中有NULL时，in实际会产生一个UNKNOWN的结果，例如 a in(d,b,null)的结果是UNKNOWN,而a not in (d,b,null)返回的是not unknowd仍然是unknowd
--而not in与not exists则结果会很不同，例如a in(a,b,null)返回的是TRUE,而a not in(a,b,null)返回的肯定是not true即为false
--有了上面的认识，好继续开工了....
--我们现在向Orders表插入一行数据（6,null）

代码如下

复制代码

set identity_insert Orders on;

insert into Orders(orderid,custid) values(6,null);

set identity_insert Orders off;

set identity_insert Customers on;

insert into Customers(custid,companyname,country) values(7,N'雷克萨斯',N'美国');

set identity_insert Customers off;

select * from Orders;
select * from Customers;

--假设现在要返回来自美国且没有订单的客户

代码如下	复制代码
select custid,companyname from Customers as C where country=N'美国' and not exists (select * from Orders as O where O.custid=C.custid ); --返回 --custid companyname --7 雷克萨斯

--我们再用IN方法

代码如下	复制代码
select custid,companyname from Customers as C where country=N'美国' and custid not in(select custid from Orders);

--返回的结果为空!!!
--为什么呢??
--因为还记得我们刚插入的一行数据中custid为null么，这就是问题所在
--not in (select custid from Orders)返回的实际是unknown，所以返回结果集为空，除非你显示的规定custid不能为空

--下面是正确的解决方法

代码如下	复制代码
select custid,companyname from Customers as C where country=N'美国' and custid not in (select custid from Orders where custid is not null); --返回 --custid companyname --7 雷克萨斯

--所以在含有NULL值的列的时候，就要小心了，not exists与not in在逻辑上是不等价的

in适合内外表都很大的情况，exists适合外表结果集很小的情况。
exists 和 in 使用一例
===========================================================
今天市场报告有个sql及慢，运行需要20多分钟，如下：

代码如下	复制代码
update p_container_decl cd set cd.ANNUL_FLAG='0001',ANNUL_DATE = sysdate where exists( select 1 from ( select tc.decl_no,tc.goods_no from p_transfer_cont tc,P_AFFIRM_DO ad where tc.GOODS_DECL_NO = ad.DECL_NO and ad.DECL_NO = 'sssssssssssssssss' ) a where a.decl_no = cd.decl_no and a.goods_no = cd.goods_no )

上面涉及的3个表的记录数都不小，均在百万左右。根据这种情况，我想到了前不久看的tom的一篇文章，说的是exists和in的区别，
in 是把外表和那表作hash join，而exists是对外表作loop，每次loop再对那表进行查询。
这样的话，in适合内外表都很大的情况，exists适合外表结果集很小的情况。

而我目前的情况适合用in来作查询，于是我改写了sql，如下：

代码如下	复制代码
update p_container_decl cd set cd.ANNUL_FLAG='0001',ANNUL_DATE = sysdate where (decl_no,goods_no) in ( select tc.decl_no,tc.goods_no from p_transfer_cont tc,P_AFFIRM_DO ad where tc.GOODS_DECL_NO = ad.DECL_NO and ad.DECL_NO = ‘ssssssssssss’ )

让市场人员测试，结果运行时间在1分钟内。问题解决了，看来exists和in确实是要根据表的数据量来决定使用。

请注意not in 逻辑上不完全等同于not exists，如果你误用了not in，小心你的程序存在致命的BUG：

请看下面的例子：

代码如下

复制代码

create table t1 (c1 number,c2 number);
create table t2 (c1 number,c2 number);

insert into t1 values (1,2);
insert into t1 values (1,3);
insert into t2 values (1,2);
insert into t2 values (1,null);

select * from t1 where c2 not in (select c2 from t2);
no rows found
select * from t1 where not exists (select 1 from t2 where t1.c2=t2.c2);
c1 c2
1 3

正如所看到的，not in 出现了不期望的结果集，存在逻辑错误。如果看一下上述两个select语句的执行计划，也会不同。后者使用了hash_aj。
因此，请尽量不要使用not in(它会调用子查询)，而尽量使用not exists(它会调用关联子查询)。如果子查询中返回的任意一条记录含有空值，则查询将不返回任何记录，正如上面例子所示。
除非子查询字段有非空限制，这时可以使用not in ,并且也可以通过提示让它使用hasg_aj或merge_aj连接。

接着看

SQL中IN,NOT IN,EXISTS,NOT EXISTS的用法和差别:
　　IN:确定给定的值是否与子查询或列表中的值相匹配。
　　IN 关键字使您得以选择与列表中的任意一个值匹配的行。
　　当要获得居住在 California、Indiana 或 Maryland 州的所有作者的姓名和州的列表时，就需要下列查询：
　

代码如下	复制代码
SELECT ProductID, ProductName FROM Northwind.dbo.Products WHERE CategoryID = 1 OR CategoryID = 4 OR CategoryID = 5

　　然而，如果使用 IN，少键入一些字符也可以得到同样的结果：
　　

代码如下	复制代码
SELECT ProductID, ProductName FROM Northwind.dbo.Products WHERE CategoryID IN (1, 4, 5)

　　IN 关键字之后的项目必须用逗号隔开，并且括在括号中。
　　下列查询在 titleauthor 表中查找在任一种书中得到的版税少于 50% 的所有作者的 au_id，然后从 authors 表中选择 au_id 与
　　titleauthor 查询结果匹配的所有作者的姓名：
　

代码如下	复制代码
SELECT au_lname, au_fname FROM authors WHERE au_id IN (SELECT au_id FROM titleauthor WHERE royaltyper <50)

　　结果显示有一些作者属于少于 50% 的一类。
　　NOT IN:通过 NOT IN 关键字引入的子查询也返回一列零值或更多值。
　　以下查询查找没有出版过商业书籍的出版商的名称。
　　

代码如下	复制代码
SELECT pub_name FROM publishers WHERE pub_id NOT IN (SELECT pub_id FROM titles WHERE type = 'business')

　　使用 EXISTS 和 NOT EXISTS 引入的子查询可用于两种集合原理的操作：交集与差集。两个集合的交集包含同时属于两个原集合的所有元素。
　　差集包含只属于两个集合中的第一个集合的元素。
　　EXISTS:指定一个子查询，检测行的存在。
　　本示例所示查询查找由位于以字母 B 开头的城市中的任一出版商出版的书名：
　　

代码如下	复制代码
SELECT DISTINCT pub_name FROM publishers WHERE EXISTS (SELECT * FROM titles WHERE pub_id = publishers.pub_id AND type = 　　'business') 　　SELECT distinct pub_name FROM publishers WHERE pub_id IN (SELECT pub_id FROM titles WHERE type = 'business')

两者的区别:
　　EXISTS:后面可以是整句的查询语句如:SELECT * FROM titles
　　IN:后面只能是对单列:SELECT pub_id FROM titles
　　NOT EXISTS:
　　例如，要查找不出版商业书籍的出版商的名称：
　

代码如下	复制代码
SELECT pub_name FROM publishers WHERE NOT EXISTS (SELECT * FROM titles WHERE pub_id = publishers.pub_id AND type = 　　'business')

　　下面的查询查找已经不销售的书的名称:

代码如下	复制代码
SELECT title FROM titles WHERE NOT EXISTS (SELECT title_id FROM sales WHERE title_id = titles.title_id)

补充：数据库,Mssql