上一篇对路径的生成进行了分析,通过make_one_rel最终拿到了一个带着路径的RelOptInfo。本篇针对带volatile函数的排序场景继续分析subquery_planner的后续流程。
subquery_planner
grouping_planner
query_planner
make_one_rel <<< 上一篇
// 后续流程 <<< 本篇
一句话总结:带有volatile的投影列会被SORT算子忽略,达到先排序在投影计算volatile的效果。
grouping_planner→make_one_rel
层层生成path
,每个path都会带pathtarget
(不一定是SQL中最后需要的target列表),一般都是层层继承上来的。make_one_rel
生成的最终path中,会忽略volatile
函数列,交给外层grouping_planner
函数处理,所以生成的path
中的pathtarget
都是看不到volatile函数列的。pathtarget
和make_sort_input_target
计算出来列表的是不是一样的
path
中的pathtarget会忽略volatile函数。make_sort_input_target
中的volatile函数正常也会被忽略掉(实例3),除非volatile函数就是排序列(实例4)。最终效果是,投影列有volatile
函数的SQL(函数非排序列),sort
节点会忽略这类函数的执行,sort结束后,在投影节点使用sort的结果集来计算这类函数。
实例3:
drop table student;
create table student(sno int primary key, sname varchar(10), ssex int);
insert into student values(1, 'stu1', 0);
insert into student values(2, 'stu2', 1);
insert into student values(3, 'stu3', 1);
insert into student values(4, 'stu4', 0);
drop table course;
create table course(cno int primary key, cname varchar(10), tno int);
insert into course values(20, 'meth', 10);
insert into course values(21, 'english', 11);
drop table teacher;
create table teacher(tno int primary key, tname varchar(10), tsex int);
insert into teacher values(10, 'te1', 1);
insert into teacher values(11, 'te2', 0);
drop table score;
create table score (sno int, cno int, degree int);
create index idx_score_sno on score(sno);
insert into score values (1, 20, 100);
insert into score values (1, 21, 89);
insert into score values (2, 20, 99);
insert into score values (2, 21, 90);
insert into score values (3, 20, 87);
insert into score values (3, 21, 20);
insert into score values (4, 20, 60);
insert into score values (4, 21, 70);
explain
SELECT STUDENT.sname, COURSE.cname, SCORE.degree
FROM STUDENT
LEFT JOIN SCORE ON STUDENT.sno = SCORE.sno
LEFT JOIN COURSE ON SCORE.cno = COURSE.cno;
QUERY PLAN
------------------------------------------------------------------------------
Hash Left Join (cost=69.50..110.65 rows=2040 width=80)
Hash Cond: (score.cno = course.cno)
-> Hash Right Join (cost=34.75..70.53 rows=2040 width=46)
Hash Cond: (score.sno = student.sno)
-> Seq Scan on score (cost=0.00..30.40 rows=2040 width=12)
-> Hash (cost=21.00..21.00 rows=1100 width=42)
-> Seq Scan on student (cost=0.00..21.00 rows=1100 width=42)
-> Hash (cost=21.00..21.00 rows=1100 width=42)
-> Seq Scan on course (cost=0.00..21.00 rows=1100 width=42)
grouping_planner
current_rel = query_planner(root, standard_qp_callback, &qp_extra);
final_target = create_pathtarget(root, root->processed_tlist);
STUDENT.sname
COURSE.cname
SCORE.degree
if (parse->sortClause)
make_sort_input_target
if (activeWindows)
...
if (have_grouping)
...
if (parse->hasTargetSRFs)
...
/* Apply scan/join target. */
scanjoin_target_same_exprs = list_length(scanjoin_targets) == 1
&& equal(scanjoin_target->exprs, current_rel->reltarget->exprs);
apply_scanjoin_target_to_paths(root, current_rel, scanjoin_targets,
scanjoin_targets_contain_srfs,
scanjoin_target_parallel_safe,
scanjoin_target_same_exprs);
if (have_grouping)
...
if (activeWindows)
...
if (parse->distinctClause)
...
if (parse->sortClause)
create_ordered_paths
final_rel = fetch_upper_rel(root, UPPERREL_FINAL, NULL);
foreach(lc, current_rel->pathlist)
if (parse->rowMarks)
create_lockrows_path
if (limit_needed(parse))
create_limit_path
add_path(final_rel, path);
grouping_planner函数执行结束,最后拼接的final_rel在upper_rels里面记录:
pathlist最上层是投影节点:
subquery_planner中后续处理流程:
计划生成步骤 | 作用 |
---|---|
root = subquery_planner | 优化器入口,返回PlannerInfo,里面记录了一个最终的RelOptInfo相当于一张逻辑表,每个ROI都记录了多个path,表示不同的计算路径 |
final_rel = fetch_upper_rel | 拿到最终的RelOptInfo |
best_path = get_cheapest_fractional_path | 在RelOptInfo中选择一个最优的path |
top_plan = create_plan→create_plan_recurse | 根据最优path生成计划 |
drop table student;
create table student(sno int primary key, sname varchar(10), ssex int);
insert into student values(1, 'stu1', 0);
insert into student values(2, 'stu2', 1);
insert into student values(3, 'stu3', 1);
insert into student values(4, 'stu4', 0);
drop table course;
create table course(cno int primary key, cname varchar(10), tno int);
insert into course values(20, 'meth', 10);
insert into course values(21, 'english', 11);
drop table teacher;
create table teacher(tno int primary key, tname varchar(10), tsex int);
insert into teacher values(10, 'te1', 1);
insert into teacher values(11, 'te2', 0);
drop table score;
create table score (sno int, cno int, degree int);
create index idx_score_sno on score(sno);
insert into score values (1, 20, 100);
insert into score values (1, 21, 89);
insert into score values (2, 20, 99);
insert into score values (2, 21, 90);
insert into score values (3, 20, 87);
insert into score values (3, 21, 20);
insert into score values (4, 20, 60);
insert into score values (4, 21, 70);
explain verbose
SELECT STUDENT.sname, COURSE.cname, SCORE.degree
FROM STUDENT
LEFT JOIN SCORE ON STUDENT.sno = SCORE.sno
LEFT JOIN COURSE ON SCORE.cno = COURSE.cno
ORDER BY COURSE.cno;
QUERY PLAN
--------------------------------------------------------------------------------------
Sort (cost=3.44..3.46 rows=8 width=19)
Output: student.sname, course.cname, score.degree, course.cno
Sort Key: course.cno
-> Hash Left Join (cost=2.14..3.32 rows=8 width=19)
Output: student.sname, course.cname, score.degree, course.cno
Inner Unique: true
Hash Cond: (score.cno = course.cno)
-> Hash Right Join (cost=1.09..2.21 rows=8 width=13)
Output: student.sname, score.degree, score.cno
Inner Unique: true
Hash Cond: (score.sno = student.sno)
-> Seq Scan on public.score (cost=0.00..1.08 rows=8 width=12)
Output: score.sno, score.cno, score.degree
-> Hash (cost=1.04..1.04 rows=4 width=9)
Output: student.sname, student.sno
-> Seq Scan on public.student (cost=0.00..1.04 rows=4 width=9)
Output: student.sname, student.sno
-> Hash (cost=1.02..1.02 rows=2 width=10)
Output: course.cname, course.cno
-> Seq Scan on public.course (cost=0.00..1.02 rows=2 width=10)
Output: course.cname, course.cno
grouping_planner
current_rel = query_planner(root, standard_qp_callback, &qp_extra);
final_target = create_pathtarget(root, root->processed_tlist);
if (parse->sortClause)
sort_input_target = make_sort_input_target(root, final_target, &have_postponed_srfs);
make_sort_input_target函数的作用:
生成的final_target和sort_input_target相同,因为没看到srf函数、易变函数。
final_target同sort_input_target | Var | 指向列 | sortgrouprefs |
---|---|---|---|
final_target->exprs->elements[0] | varno = 1, varattno = 2, vartype = 1043 | STUDENT.sname | 0 |
final_target->exprs->elements[1] | varno = 4, varattno = 2, vartype = 1043 | COURSE.cname | 0 |
final_target->exprs->elements[2] | varno = 2, varattno = 3, vartype = 23 | SCORE.degree | 0 |
final_target->exprs->elements[3] | varno = 4, varattno = 1, vartype = 23 | COURSE.cno | 1 |
grouping_planner继续执行,开始生成排序path:
...
if (parse->sortClause)
current_rel = create_ordered_paths(root,
current_rel,
final_target,
final_target_parallel_safe,
have_postponed_srfs ? -1.0 :
limit_tuples);
grouping_planner→create_ordered_paths
create_ordered_paths
// 创建一个排序节点
ordered_rel = fetch_upper_rel(root, UPPERREL_ORDERED, NULL);
// 拿到path入口,目前顶层是T_ProjectionPath,就一个节点
foreach(lc, input_rel->pathlist)
// 判断input_path->pathkeys是不是有序的?
// 因为现在计划树是hashjoin,每一列都是无序的,所以input_path->pathkeys是空的,需要排序
is_sorted = pathkeys_count_contained_in(root->sort_pathkeys, input_path->pathkeys, &presorted_keys);
if (is_sorted)
sorted_path = input_path;
else
sorted_path = (Path *) create_sort_path(root,
ordered_rel,
input_path,
root->sort_pathkeys,
limit_tuples);
sorted_path =
{ path =
{ type = T_SortPath,
pathtype = T_Sort,
parent = 0x2334030,
pathtarget = 0x2333ef0,
param_info = 0x0,
parallel_aware = false, parallel_safe = true, parallel_workers = 0,
rows = 8,
startup_cost = 3.4437500000000005,
total_cost = 3.4637500000000006,
pathkeys = 0x232e018},
subpath = 0x2333a00}
T_PathKey每个pathkey(排序列)都对应了一个T_EquivalenceClass,T_EquivalenceClass中记录了排序的具体信息。
{ type = T_PathKey,
pk_eclass = 0x232bf88,
pk_opfamily = 1976,
pk_strategy = 1,
pk_nulls_first = false}
T_EquivalenceClass中的ec_members记录了排序列信息Var{varno = 4, varattno = 1}
{ type = T_EquivalenceClass,
ec_opfamilies = 0x232ddf8, // List{ 1976 }
ec_collation = 0,
ec_members = 0x232df48, // List { EquivalenceMember }
// EquivalenceMember{
// type = T_EquivalenceMember,
// em_expr = 0x232de68, Var{varno = 4, varattno = 1}
// em_relids = 0x232de48,
// em_is_const = false,
// em_is_child = false,
// em_datatype = 23,
// em_jdomain = 0x2329158, em_parent = 0x0}
ec_sources = 0x0,
ec_derives = 0x0,
ec_relids = 0x232df28,
ec_has_const = false,
ec_has_volatile = false,
ec_broken = false,
ec_sortref = 1,
ec_min_security = 4294967295,
ec_max_security = 0,
ec_merged = 0x0}
生成排序节点后的计划:
drop table student;
create table student(sno int primary key, sname varchar(10), ssex int);
insert into student values(1, 'stu1', 0);
insert into student values(2, 'stu2', 1);
insert into student values(3, 'stu3', 1);
insert into student values(4, 'stu4', 0);
drop table course;
create table course(cno int primary key, cname varchar(10), tno int);
insert into course values(20, 'meth', 10);
insert into course values(21, 'english', 11);
drop table teacher;
create table teacher(tno int primary key, tname varchar(10), tsex int);
insert into teacher values(10, 'te1', 1);
insert into teacher values(11, 'te2', 0);
drop table score;
create table score (sno int, cno int, degree int);
create index idx_score_sno on score(sno);
insert into score values (1, 20, 100);
insert into score values (1, 21, 89);
insert into score values (2, 20, 99);
insert into score values (2, 21, 90);
insert into score values (3, 20, 87);
insert into score values (3, 21, 20);
insert into score values (4, 20, 60);
insert into score values (4, 21, 70);
explain verbose
SELECT STUDENT.sname, random(), SCORE.degree
FROM STUDENT
LEFT JOIN SCORE ON STUDENT.sno = SCORE.sno
LEFT JOIN COURSE ON SCORE.cno = COURSE.cno
ORDER BY COURSE.cno;
QUERY PLAN
--------------------------------------------------------------------------------------------
Result (cost=3.44..3.56 rows=8 width=21)
Output: student.sname, random(), score.degree, course.cno
-> Sort (cost=3.44..3.46 rows=8 width=13)
Output: student.sname, score.degree, course.cno
Sort Key: course.cno
-> Hash Left Join (cost=2.14..3.32 rows=8 width=13)
Output: student.sname, score.degree, course.cno
Inner Unique: true
Hash Cond: (score.cno = course.cno)
-> Hash Right Join (cost=1.09..2.21 rows=8 width=13)
Output: student.sname, score.degree, score.cno
Inner Unique: true
Hash Cond: (score.sno = student.sno)
-> Seq Scan on public.score (cost=0.00..1.08 rows=8 width=12)
Output: score.sno, score.cno, score.degree
-> Hash (cost=1.04..1.04 rows=4 width=9)
Output: student.sname, student.sno
-> Seq Scan on public.student (cost=0.00..1.04 rows=4 width=9)
Output: student.sname, student.sno
-> Hash (cost=1.02..1.02 rows=2 width=4)
Output: course.cno
-> Seq Scan on public.course (cost=0.00..1.02 rows=2 width=4)
Output: course.cno
make_one_rel前:
准备连接的RelOptInfo在simple_rel_array数组中,这里关注下三个RelOptInfo的reltarget:
(gdb) plist root->simple_rel_array[1]->reltarget->exprs
$67 = 2
$68 = {ptr_value = 0x3083218, int_value = 50868760, oid_value = 50868760, xid_value = 50868760}
$69 = {ptr_value = 0x30ab8b8, int_value = 51034296, oid_value = 51034296, xid_value = 51034296}
(gdb) p root->simple_rte_array[1]->relid
$70 = 16564
root→simple_rel_array[i] | simple_rel_array[i]→reltarget->exprs | relid |
---|---|---|
1 | varno = 1, varattno = 2, vartype = 1043 | 16564 student.sname |
1 | varno = 1, varattno = 1, vartype = 23 | 16564 student.sno |
2 | varno = 2, varattno = 3, vartype = 23 | 16579 score.degree |
2 | varno = 2, varattno = 1, vartype = 23 | 16579 score.cno |
2 | varno = 2, varattno = 2, vartype = 23 | 16579 score.sno |
4 | varno = 4, varattno = 1, vartype = 23 | 16569 course.cno |
SELECT STUDENT.sname, random(), SCORE.degree
FROM STUDENT
LEFT JOIN SCORE ON STUDENT.sno = SCORE.sno
LEFT JOIN COURSE ON SCORE.cno = COURSE.cno
ORDER BY COURSE.cno;
make_one_rel生成后:
final_rel->reltarget->exprs | 列 | |
---|---|---|
1 | varno = 1, varattno = 2, vartype = 1043 | 投影第1列:STUDENT.sname |
2 | varno = 2, varattno = 3, vartype = 23 | 投影第3列:SCORE.degree |
3 | varno = 4, varattno = 1, vartype = 23 | 排序列:COURSE.cno |
final_target = create_pathtarget(root, root->processed_tlist);
拿到的final_target:
final_target | Var / FuncExpr | 指向列 | sortgrouprefs |
---|---|---|---|
final_target->exprs->elements[0] | varno = 1, varattno = 2, vartype = 1043 | STUDENT.sname | 0 |
final_target->exprs->elements[1] | funcid = 1598, funcresulttype = 701 | random() | 0 |
final_target->exprs->elements[2] | varno = 2, varattno = 3, vartype = 23 | SCORE.degree | 0 |
final_target->exprs->elements[3] | varno = 4, varattno = 1, vartype = 23 | COURSE.cno | 1 |
make_sort_input_target
拿到的sort_input_target,过滤掉了random列:
sort_input_target | Var / FuncExpr | 指向列 | sortgrouprefs |
---|---|---|---|
sort_input_target->exprs->elements[0] | varno = 1, varattno = 2, vartype = 1043 | STUDENT.sname | 0 |
sort_input_target->exprs->elements[1] | varno = 2, varattno = 3, vartype = 23 | SCORE.degree | 0 |
sort_input_target->exprs->elements[2] | varno = 4, varattno = 1, vartype = 23 | COURSE.cno | 1 |
实例2中,apply_scanjoin_target_to_paths会先挂投影节点,后面的create_ordered_paths在创建顶层的排序节点,为什么这里的投影节点在最上层?因为有volatile函数在,需要先排序,在到投影节点上计算random函数
final_target = create_pathtarget(root, root->processed_tlist);
...
sort_input_target = make_sort_input_target(...);
...
grouping_target = sort_input_target;
...
scanjoin_target = grouping_target;
...
scanjoin_targets = list_make1(scanjoin_target);
...
scanjoin_target_same_exprs = list_length(scanjoin_targets) == 1
&& equal(scanjoin_target->exprs, current_rel->reltarget->exprs);
...
// 1 确定没有SRF list_length(scanjoin_targets) == 1
// 2 这里make_one_rel出来的current_rel和上面make_sort_input_target计算出来的投影列一样,都过滤掉了v函数,剩下三列
// scanjoin_target_same_exprs == true
scanjoin_target_same_exprs = list_length(scanjoin_targets) == 1
&& equal(scanjoin_target->exprs, current_rel->reltarget->exprs);
apply_scanjoin_target_to_paths(root, current_rel, scanjoin_targets,
scanjoin_targets_contain_srfs,
scanjoin_target_parallel_safe,
注意:
在apply_scanjoin_target_to_paths中:
apply_scanjoin_target_to_paths
...
...
foreach(lc, rel->pathlist)
{
Path *subpath = (Path *) lfirst(lc);
if (tlist_same_exprs)
// scanjoin_target->sortgrouprefs = [0, 0, 1] 表示第三列是排序列
// 因为现在的scanjoin_target(同sort_input_target)中只有三列,投影列1、3和排序列,参考上面sort_input_target表格。
subpath->pathtarget->sortgrouprefs = scanjoin_target->sortgrouprefs;
else
{
Path *newpath;
newpath = (Path *) create_projection_path(root, rel, subpath,
scanjoin_target);
lfirst(lc) = newpath;
}
}
继续成成排序node:
grouping_planner
...
if (parse->sortClause)
current_rel = create_ordered_paths(root,
current_rel,
final_target,
final_target_parallel_safe,
have_postponed_srfs ? -1.0 :
limit_tuples);
final_target | Var / FuncExpr | 指向列 | sortgrouprefs |
---|---|---|---|
final_target->exprs->elements[0] | varno = 1, varattno = 2, vartype = 1043 | STUDENT.sname | 0 |
final_target->exprs->elements[1] | funcid = 1598, funcresulttype = 701 | random() | 0 |
final_target->exprs->elements[2] | varno = 2, varattno = 3, vartype = 23 | SCORE.degree | 0 |
final_target->exprs->elements[3] | varno = 4, varattno = 1, vartype = 23 | COURSE.cno | 1 |
create_ordered_paths
ordered_rel = fetch_upper_rel(root, UPPERREL_ORDERED, NULL);
foreach(lc, input_rel->pathlist)
is_sorted = pathkeys_count_contained_in
if (is_sorted)
sorted_path = input_path;
else
sorted_path = (Path *) create_sort_path(...)
// 生成sorted_path
// {type = T_SortPath, pathtype = T_Sort, pathtarget = 三列 }
if (sorted_path->pathtarget != target)
sorted_path = apply_projection_to_path(root, ordered_rel, sorted_path, target);
// 生成投影列
// {type = T_ProjectionPath, pathtype = T_Result, pathtarget = 四列 }
最终生成PATH:
SELECT STUDENT.sname, random(), SCORE.degree
FROM STUDENT
LEFT JOIN SCORE ON STUDENT.sno = SCORE.sno
LEFT JOIN COURSE ON SCORE.cno = COURSE.cno
ORDER BY COURSE.cno;
最终效果:
drop table student;
create table student(sno int primary key, sname varchar(10), ssex int);
insert into student values(1, 'stu1', 0);
insert into student values(2, 'stu2', 1);
insert into student values(3, 'stu3', 1);
insert into student values(4, 'stu4', 0);
drop table course;
create table course(cno int primary key, cname varchar(10), tno int);
insert into course values(20, 'meth', 10);
insert into course values(21, 'english', 11);
drop table teacher;
create table teacher(tno int primary key, tname varchar(10), tsex int);
insert into teacher values(10, 'te1', 1);
insert into teacher values(11, 'te2', 0);
drop table score;
create table score (sno int, cno int, degree int);
create index idx_score_sno on score(sno);
insert into score values (1, 20, 100);
insert into score values (1, 21, 89);
insert into score values (2, 20, 99);
insert into score values (2, 21, 90);
insert into score values (3, 20, 87);
insert into score values (3, 21, 20);
insert into score values (4, 20, 60);
insert into score values (4, 21, 70);
explain verbose
SELECT STUDENT.sname, random(), SCORE.degree
FROM STUDENT
LEFT JOIN SCORE ON STUDENT.sno = SCORE.sno
LEFT JOIN COURSE ON SCORE.cno = COURSE.cno
ORDER BY random();
QUERY PLAN
--------------------------------------------------------------------------------
Sort (cost=2.35..2.37 rows=8 width=17)
Output: student.sname, (random()), score.degree
Sort Key: (random())
-> Hash Right Join (cost=1.09..2.23 rows=8 width=17)
Output: student.sname, random(), score.degree
Inner Unique: true
Hash Cond: (score.sno = student.sno)
-> Seq Scan on public.score (cost=0.00..1.08 rows=8 width=12)
Output: score.sno, score.cno, score.degree
-> Hash (cost=1.04..1.04 rows=4 width=9)
Output: student.sname, student.sno
-> Seq Scan on public.student (cost=0.00..1.04 rows=4 width=9)
Output: student.sname, student.sno
第一步:拿到RelOptInfo
current_rel = query_planner(root, standard_qp_callback, &qp_extra);
current_rel->reltarget中忽略了random函数:
{
type = T_PathTarget,
exprs =
{
Var{varno = 1, varattno = 2, vartype = 1043}, // STUDENT.sname
Var{varno = 2, varattno = 3, vartype = 23} // SCORE.degree
},
sortgrouprefs = 0x0 }
final_target = create_pathtarget(root, root->processed_tlist);
{
type = T_PathTarget,
exprs =
{
Var{varno = 1, varattno = 2, vartype = 1043}, // STUDENT.sname
FuncExpr {xpr = {type = T_FuncExpr}, funcid = 1598}, // random()
Var{varno = 2, varattno = 3, vartype = 23} // SCORE.degree
},
sortgrouprefs = [0, 1, 0]
}
sort_input_target = make_sort_input_target(root, final_target, &have_postponed_srfs);
{
type = T_PathTarget,
exprs =
{
Var{varno = 1, varattno = 2, vartype = 1043}, // STUDENT.sname
FuncExpr {xpr = {type = T_FuncExpr}, funcid = 1598}, // random()
Var{varno = 2, varattno = 3, vartype = 23} // SCORE.degree
},
sortgrouprefs = [0, 1, 0]
}
apply_scanjoin_target_to_paths执行后,增加投影节点:
{ path = {type = T_ProjectionPath, pathtype = T_Result }
{ path = {type = T_SortPath, pathtype = T_Sort }