-
Notifications
You must be signed in to change notification settings - Fork 1
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1. 第6章集合内容
- Loading branch information
Showing
1 changed file
with
247 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,247 @@ | ||
# 集合 | ||
|
||
[TOC] | ||
|
||
## 1. 集合理论实践 | ||
|
||
1. 当对两个数据集合执行集合操作时, 必须首先应用下面的规范: | ||
|
||
* 两个数据集合必须具有同样数目的列 | ||
* 两个数据集中对应列的数据类必须是一样的(或者服务器能够将其中一张转换为另一种) | ||
|
||
|
||
2.对于两张表中的数据行, 如果每个 `列对` 都包含了同样的字符串, 数字或日期等, 那么它们可以被视为`重复行`. <br> | ||
3. 将多个查询的结果集使用集合操作符进行连接, 得到最终查询结果集的称为**复合查询** | ||
|
||
## 2. 集合操作符 | ||
|
||
> 结合操作符 | ||
> | ||
> * union 并操作 | ||
> * intersect 交操作 | ||
> * except 差操作 | ||
> 每个集合操作符可以有两种修饰符: | ||
> 一个表示包含重复项 | ||
> 另一个表示去除重复项(但不一定是所有的重复项) | ||
|
||
### 2.1 union 操作符 | ||
> 集合的并操作, 就是将两个集合的数据合并到一起, 成为一个大集合 | ||
> union 和 union all的区别: | ||
> union 对连接后的集合排序并去除重复项 | ||
> union all 保留重复项. 最终数据集的行数还是总是等于索要连接的各集合的行数之和. | ||
> [13.2.9.3 UNION语法 | ||
](https://dev.mysql.com/doc/refman/5.7/en/union.html) | ||
|
||
**union** | ||
> 将重复的行删除 | ||
```c | ||
mysql> select emp_id | ||
-> from employee | ||
-> where assigned_branch_id = 2 | ||
-> and (title='Teller' or title='Head Teller') // 所有Woburn支行的所有柜员 | ||
-> union // 去除重复行 | ||
-> select distinct open_emp_id // distinct 关键字去除第二个集合数据中重复的数据 | ||
-> from account | ||
-> where open_branch_id = 2; // 所有在Woburn支行开户的雇员 | ||
|
||
+--------+ | ||
| emp_id | | ||
+--------+ | ||
| 10 | | ||
| 11 | | ||
| 12 | | ||
| 20 | | ||
+--------+ | ||
4 rows in set (0.01 sec) | ||
|
||
mysql> | ||
``` | ||
**union all** | ||
> 结果集中不删除重复行 | ||
```c | ||
mysql> select emp_id | ||
-> from employee | ||
-> where assigned_branch_id = 2 | ||
-> and (title='Teller' or title='Head Teller') // 所有Woburn支行的所有柜员 | ||
-> union all // 不去除重复行 | ||
-> select distinct open_emp_id | ||
-> from account | ||
-> where open_branch_id = 2; // 所有在Woburn支行开户的雇员 | ||
+--------+ | ||
| emp_id | | ||
+--------+ | ||
| 10 | | ||
| 11 | | ||
| 12 | | ||
| 20 | | ||
| 10 | | ||
+--------+ | ||
5 rows in set (0.01 sec) | ||
``` | ||
|
||
---- | ||
|
||
### 2.2 intersect 操作符 | ||
> 集合交操作, 用于获取两个集合共有的数据.如果没有共有的数据就返回空集 | ||
> intersect操作符还未在MySQL中实现 | ||
> intersect 操作符去除了交集区域中所有重复的行(也就是说交集操作的结果集中不会有重复的行). | ||
> intersect all 操作符保留所有的交集数据, 不会去除重复行. | ||
**intersect** | ||
|
||
```c | ||
mysql> select emp_id, fname, lname | ||
-> from employee // 返回每个雇员的ID和姓名 | ||
-> intersect // 交操作 | ||
-> select cust_id, fname, lname | ||
-> from individual; // 返回每个客户的id和新项目 | ||
|
||
// 两个集合是完全不重合的, 因此以上的交集操作产生的结果为空集. | ||
|
||
``` | ||
|
||
```c | ||
mysql> select emp_id | ||
-> from employee | ||
-> where assigned_branch_id = 2 | ||
-> and (title = 'Teller' or title = 'Head Teller') // 返回 10, 11, 12, 20 | ||
-> intersect | ||
-> select distinct open_emp_id | ||
-> from account | ||
-> where open_branch_id = 2; // 返回 10 (distinct去除了重复的行) | ||
|
||
+--------+ | ||
| emp_id | | ||
+--------+ | ||
| 10 | // 所以交集的结果为 10 | ||
+--------+ | ||
|
||
// 注意: 结果集中的列名使用了第一个集合的列名 | ||
|
||
``` | ||
---- | ||
### 2.3 except 操作符 | ||
> 集合差操作. | ||
> except操作符也没有在MySQL中实现 | ||
> except操作符返回第一个表减去第二个表重合的元素后剩下的部分. | ||
> except操作在结果集中去除所有的相同数据. | ||
> except all 操作在结果集中根据重复数据在集合B中出现的次数进行删除. | ||
**举例** | ||
集合A中含有数据: 10, 11, 12, 10, 10 | ||
集合B中含有数据: 10, 10 | ||
操作 A except B 产生的结果是: 11, 12 | ||
操作 A except all B 产生的结果是: 10, 11, 12. | ||
**猜测** | ||
操作 B except A 产生的结果是: 空集 | ||
----- | ||
## 3. 集合操作规则 | ||
### 3.1 对复合查询结果排序 | ||
> 如果要对复合查询结果进行排序, 可以在**最后一个查询**后面增加`order by`子句 | ||
> 注意: 当在`order by`子句中指定要排序的列时, 需要从复合查询的**第一个查询中选择列名**. | ||
> 通常情况下, 复合查询中两个查询对应列的名字是相同的, 但并不是强制的. | ||
```c | ||
mysql> select emp_id, assigned_branch_id | ||
-> from employee | ||
-> where title = 'Teller' | ||
-> union // 并操作 | ||
-> select open_emp_id, open_branch_id | ||
-> from account | ||
-> where product_cd = 'SAV' | ||
-> order by emp_id; // 根据职员id号升序排序, 选择第一个查询中的 emp_id作为排序条件. | ||
+--------+--------------------+ | ||
| emp_id | assigned_branch_id | | ||
+--------+--------------------+ | ||
| 1 | 1 | | ||
| 7 | 1 | | ||
| 8 | 1 | | ||
| 9 | 1 | | ||
| 10 | 2 | | ||
| 11 | 2 | | ||
| 12 | 2 | | ||
| 14 | 3 | | ||
| 15 | 3 | | ||
| 16 | 4 | | ||
| 17 | 4 | | ||
| 18 | 4 | | ||
+--------+--------------------+ | ||
12 rows in set (0.01 sec) | ||
``` | ||
|
||
> 如果上面的复合查询中的排序条件使用第二个查询的`open_emp_id`列就会报错; | ||
```c | ||
mysql> select emp_id, assigned_branch_id | ||
-> from employee | ||
-> where title = 'Teller' | ||
-> union | ||
-> select open_emp_id, open_branch_id | ||
-> from account | ||
-> where product_cd = 'SAV' | ||
-> order by open_emp_id; | ||
ERROR 1054 (42S22): Unknown column 'open_emp_id' in 'order clause' | ||
mysql> | ||
|
||
``` | ||
|
||
**解决方法** | ||
> 针对两个集合中对应的列, 起相同的别名. 这样就不容易发生上面的错误了. | ||
```c | ||
mysql> select emp_id employee_id, assigned_branch_id branch_id // 起了别名: employee_id和branch_id | ||
-> from employee | ||
-> where title = 'Teller' | ||
-> union | ||
-> select open_emp_id employee_id, open_branch_id branch_id // 和第一个查询中对应的列起了相同的别名. | ||
-> from account | ||
-> where product_cd = 'SAV' | ||
-> order by employee_id; | ||
+-------------+-----------+ | ||
| employee_id | branch_id | | ||
+-------------+-----------+ | ||
| 1 | 1 | | ||
| 7 | 1 | | ||
| 8 | 1 | | ||
| 9 | 1 | | ||
| 10 | 2 | | ||
| 11 | 2 | | ||
| 12 | 2 | | ||
| 14 | 3 | | ||
| 15 | 3 | | ||
| 16 | 4 | | ||
| 17 | 4 | | ||
| 18 | 4 | | ||
+-------------+-----------+ | ||
12 rows in set (0.00 sec) | ||
|
||
mysql> | ||
``` | ||
---- | ||
### 3.2 集合操作符优先级 | ||
> 当查询中出现多个集合操作符时, 先出现的操作符优先级高于后出现的操作符优先级. | ||
> 但是根据ANSI SQL标准, 在调用集合操作时, intersect操作符比其他操作符具有更高的优先级.(也就是不受第一条规则的影响) | ||
> 可以使用圆括号提高集合操作符的优先级. | ||