Bug#36980184: Use first-row-cost instead of init-cost as cost
dimension in hypergraph [2/2, compare]

Make CompareAccessPaths() use first_row_cost() instead of init_cost()
as cost dimension.

When init_cost() is a cost dimension, the hypergraph optimizer will
keep the candidates that are the cheapest ones to initialize and read
zero rows from. A better dimension would be the cost of initializing
the path and reading the number of rows needed to reach the LIMIT in
the query. Currently, there is no estimate for the number of rows that
have to be read from a path to reach the LIMIT. But it's reasonable to
expect that more often than not, at least one row will be read. So
using first_row_cost() will in most cases be slightly better than
using init_cost().
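
To make the difference between the two dimensions concrete, here is a minimal sketch of one way a first-row cost could be derived. It assumes every row costs the same to produce, so the first row costs the init cost plus an even share of the remaining cost. The `PathCost` struct and `FirstRowCost()` function are hypothetical stand-ins for illustration, not the server's actual AccessPath code.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the cost estimates of an access path.
struct PathCost {
  double init_cost;    // cost incurred before the first row can be returned
  double total_cost;   // cost of initializing and reading all rows
  double output_rows;  // estimated number of rows produced
};

// Assumed model: the per-row cost is uniform, so reading the first row costs
// the init cost plus 1/output_rows of the remaining (total - init) cost.
double FirstRowCost(const PathCost &p) {
  assert(p.output_rows >= 0.0 && p.total_cost >= p.init_cost);
  // With at most one row, reading the first row is reading everything.
  if (p.output_rows <= 1.0) return p.total_cost;
  return p.init_cost + (p.total_cost - p.init_cost) / p.output_rows;
}
```

Under this assumption, a path with zero init cost, a total cost of 1000 and 100 rows has a first-row cost of 10, so it no longer looks free to start reading from.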

This change could give better plans for some queries using LIMIT.
Another effect is that it could reduce planning time: candidates with
a very high total cost and zero init cost can be rejected earlier,
because first_row_cost() includes a fraction of that very high total
cost. This has been seen to reduce planning time in some complex
queries.
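
The earlier rejection can be illustrated with a small dominance check over two cost dimensions. This is only a sketch: the `Candidate` struct, `Dominates()` and `ParetoFront()` are hypothetical names, and the real CompareAccessPaths() is assumed to weigh more dimensions than the two modelled here.

```cpp
#include <vector>

// Hypothetical candidate with two cost dimensions. early_cost holds either
// the init cost or the first-row cost, depending on the dimension in use.
struct Candidate {
  double early_cost;
  double total_cost;
};

// a dominates b if a is no worse in both dimensions and strictly better in
// at least one of them.
bool Dominates(const Candidate &a, const Candidate &b) {
  return a.early_cost <= b.early_cost && a.total_cost <= b.total_cost &&
         (a.early_cost < b.early_cost || a.total_cost < b.total_cost);
}

// Keeps only the candidates that are not dominated by any other candidate.
std::vector<Candidate> ParetoFront(const std::vector<Candidate> &in) {
  std::vector<Candidate> out;
  for (const Candidate &c : in) {
    bool dominated = false;
    for (const Candidate &other : in) {
      if (Dominates(other, c)) dominated = true;
    }
    if (!dominated) out.push_back(c);
  }
  return out;
}
```

Take path A with init cost 0 and total cost 1000, and path B with init cost 1 and total cost 20. With init cost as the dimension, neither dominates the other and both are kept. If A's first-row cost is 10 and B's is 2.9, B dominates A and A can be dropped immediately.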

Change-Id: Ida4fcf734b0e5973010891bb97ee0423c9ff7840
kahatlen committed Aug 30, 2024
1 parent 06ee851 commit 57341ee
Showing 7 changed files with 161 additions and 145 deletions.
112 changes: 56 additions & 56 deletions mysql-test/r/group_skip_scan_ext_hypergraph.result

Large diffs are not rendered by default.

108 changes: 54 additions & 54 deletions mysql-test/r/group_skip_scan_hypergraph.result
@@ -301,23 +301,23 @@ b i421 l421
b m422 p422
explain format=tree select a1,a2,b,min(c),max(c) from t1 where a1 < 'd' group by a1,a2,b;
EXPLAIN
-> Filter: (t1.a1 < 'd') (rows=3)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (NULL < a1 < 'd') (rows=9)
-> Filter: (t1.a1 < 'd') (rows=16)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (NULL < a1 < 'd') (rows=16)

explain format=tree select a1,a2,b,min(c),max(c) from t1 where a1 >= 'b' group by a1,a2,b;
EXPLAIN
-> Filter: (t1.a1 >= 'b') (rows=13)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over ('b' <= a1) (rows=13)
-> Filter: (t1.a1 >= 'b') (rows=16)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over ('b' <= a1) (rows=16)

explain format=tree select a1,a2,b, max(c) from t1 where a1 >= 'c' or a1 < 'b' group by a1,a2,b;
EXPLAIN
-> Filter: ((t1.a1 >= 'c') or (t1.a1 < 'b')) (rows=13)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (NULL < a1 < 'b') OR ('c' <= a1) (rows=13)
-> Filter: ((t1.a1 >= 'c') or (t1.a1 < 'b')) (rows=16)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (NULL < a1 < 'b') OR ('c' <= a1) (rows=16)

explain format=tree select a1, max(c) from t1 where a1 >= 'c' or a1 < 'b' group by a1,a2,b;
EXPLAIN
-> Filter: ((t1.a1 >= 'c') or (t1.a1 < 'b')) (rows=13)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (NULL < a1 < 'b') OR ('c' <= a1) (rows=13)
-> Filter: ((t1.a1 >= 'c') or (t1.a1 < 'b')) (rows=16)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (NULL < a1 < 'b') OR ('c' <= a1) (rows=16)

explain format=tree select a1,a2,b,min(c),max(c) from t1 where a1 >= 'c' or a2 < 'b' group by a1,a2,b;
EXPLAIN
@@ -326,58 +326,58 @@ EXPLAIN

explain format=tree select a1,a2,b, max(c) from t1 where a1 = 'z' or a1 = 'b' or a1 = 'd' group by a1,a2,b;
EXPLAIN
-> Filter: ((t1.a1 = 'z') or (t1.a1 = 'b') or (t1.a1 = 'd')) (rows=3)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (a1 = 'b') OR (a1 = 'd') OR (a1 = 'z') (rows=9)
-> Filter: ((t1.a1 = 'z') or (t1.a1 = 'b') or (t1.a1 = 'd')) (rows=16)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (a1 = 'b') OR (a1 = 'd') OR (a1 = 'z') (rows=16)

explain format=tree select a1,a2,b,min(c),max(c) from t1 where a1 = 'z' or a1 = 'b' or a1 = 'd' group by a1,a2,b;
EXPLAIN
-> Filter: ((t1.a1 = 'z') or (t1.a1 = 'b') or (t1.a1 = 'd')) (rows=3)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (a1 = 'b') OR (a1 = 'd') OR (a1 = 'z') (rows=9)
-> Filter: ((t1.a1 = 'z') or (t1.a1 = 'b') or (t1.a1 = 'd')) (rows=16)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (a1 = 'b') OR (a1 = 'd') OR (a1 = 'z') (rows=16)

explain format=tree select a1,a2,b, max(c) from t1 where (a1 = 'b' or a1 = 'd' or a1 = 'a' or a1 = 'c') and (a2 > 'a') group by a1,a2,b;
EXPLAIN
-> Filter: ((t1.a2 > 'a') and ((t1.a1 = 'b') or (t1.a1 = 'd') or (t1.a1 = 'a') or (t1.a1 = 'c'))) (rows=3)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (a1 = 'a' AND 'a' < a2) OR (a1 = 'b' AND 'a' < a2) OR (2 more) (rows=9)
-> Filter: ((t1.a2 > 'a') and ((t1.a1 = 'b') or (t1.a1 = 'd') or (t1.a1 = 'a') or (t1.a1 = 'c'))) (rows=16)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (a1 = 'a' AND 'a' < a2) OR (a1 = 'b' AND 'a' < a2) OR (2 more) (rows=16)

explain format=tree select a1,a2,b,min(c),max(c) from t1 where (a1 = 'b' or a1 = 'd' or a1 = 'a' or a1 = 'c') and (a2 > 'a') group by a1,a2,b;
EXPLAIN
-> Filter: ((t1.a2 > 'a') and ((t1.a1 = 'b') or (t1.a1 = 'd') or (t1.a1 = 'a') or (t1.a1 = 'c'))) (rows=3)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (a1 = 'a' AND 'a' < a2) OR (a1 = 'b' AND 'a' < a2) OR (2 more) (rows=9)
-> Filter: ((t1.a2 > 'a') and ((t1.a1 = 'b') or (t1.a1 = 'd') or (t1.a1 = 'a') or (t1.a1 = 'c'))) (rows=16)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (a1 = 'a' AND 'a' < a2) OR (a1 = 'b' AND 'a' < a2) OR (2 more) (rows=16)

explain format=tree select a1,min(c),max(c) from t1 where a1 >= 'b' group by a1,a2,b;
EXPLAIN
-> Filter: (t1.a1 >= 'b') (rows=13)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over ('b' <= a1) (rows=13)
-> Filter: (t1.a1 >= 'b') (rows=16)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over ('b' <= a1) (rows=16)

explain format=tree select a1, max(c) from t1 where a1 in ('a','b','d') group by a1,a2,b;
EXPLAIN
-> Filter: (t1.a1 in ('a','b','d')) (rows=13)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (a1 = 'a') OR (a1 = 'b') OR (a1 = 'd') (rows=13)
-> Filter: (t1.a1 in ('a','b','d')) (rows=16)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (a1 = 'a') OR (a1 = 'b') OR (a1 = 'd') (rows=16)

explain format=tree select a1,a2,b, max(c) from t2 where a1 < 'd' group by a1,a2,b;
EXPLAIN
-> Filter: (t2.a1 < 'd') (rows=21)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (NULL < a1 < 'd') (rows=21)
-> Filter: (t2.a1 < 'd') (rows=32.8)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (NULL < a1 < 'd') (rows=32.8)

explain format=tree select a1,a2,b,min(c),max(c) from t2 where a1 < 'd' group by a1,a2,b;
EXPLAIN
-> Filter: (t2.a1 < 'd') (rows=21)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (NULL < a1 < 'd') (rows=21)
-> Filter: (t2.a1 < 'd') (rows=32.8)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (NULL < a1 < 'd') (rows=32.8)

explain format=tree select a1,a2,b,min(c),max(c) from t2 where a1 >= 'b' group by a1,a2,b;
EXPLAIN
-> Filter: (t2.a1 >= 'b') (rows=24)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over ('b' <= a1) (rows=24)
-> Filter: (t2.a1 >= 'b') (rows=32.8)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over ('b' <= a1) (rows=32.8)

explain format=tree select a1,a2,b, max(c) from t2 where a1 >= 'c' or a1 < 'b' group by a1,a2,b;
EXPLAIN
-> Filter: ((t2.a1 >= 'c') or (t2.a1 < 'b')) (rows=26)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (NULL < a1 < 'b') OR ('c' <= a1) (rows=26)
-> Filter: ((t2.a1 >= 'c') or (t2.a1 < 'b')) (rows=32.8)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (NULL < a1 < 'b') OR ('c' <= a1) (rows=32.8)

explain format=tree select a1, max(c) from t2 where a1 >= 'c' or a1 < 'b' group by a1,a2,b;
EXPLAIN
-> Filter: ((t2.a1 >= 'c') or (t2.a1 < 'b')) (rows=26)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (NULL < a1 < 'b') OR ('c' <= a1) (rows=26)
-> Filter: ((t2.a1 >= 'c') or (t2.a1 < 'b')) (rows=32.8)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (NULL < a1 < 'b') OR ('c' <= a1) (rows=32.8)

explain format=tree select a1,a2,b,min(c),max(c) from t2 where a1 >= 'c' or a2 < 'b' group by a1,a2,b;
EXPLAIN
@@ -386,33 +386,33 @@ EXPLAIN

explain format=tree select a1,a2,b, max(c) from t2 where a1 = 'z' or a1 = 'b' or a1 = 'd' group by a1,a2,b;
EXPLAIN
-> Filter: ((t2.a1 = 'z') or (t2.a1 = 'b') or (t2.a1 = 'd')) (rows=13)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (a1 = 'b') OR (a1 = 'd') OR (a1 = 'z') (rows=13)
-> Filter: ((t2.a1 = 'z') or (t2.a1 = 'b') or (t2.a1 = 'd')) (rows=32.8)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (a1 = 'b') OR (a1 = 'd') OR (a1 = 'z') (rows=32.8)

explain format=tree select a1,a2,b,min(c),max(c) from t2 where a1 = 'z' or a1 = 'b' or a1 = 'd' group by a1,a2,b;
EXPLAIN
-> Filter: ((t2.a1 = 'z') or (t2.a1 = 'b') or (t2.a1 = 'd')) (rows=13)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (a1 = 'b') OR (a1 = 'd') OR (a1 = 'z') (rows=13)
-> Filter: ((t2.a1 = 'z') or (t2.a1 = 'b') or (t2.a1 = 'd')) (rows=32.8)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (a1 = 'b') OR (a1 = 'd') OR (a1 = 'z') (rows=32.8)

explain format=tree select a1,a2,b, max(c) from t2 where (a1 = 'b' or a1 = 'd' or a1 = 'a' or a1 = 'c') and (a2 > 'a') group by a1,a2,b;
EXPLAIN
-> Filter: ((t2.a2 > 'a') and ((t2.a1 = 'b') or (t2.a1 = 'd') or (t2.a1 = 'a') or (t2.a1 = 'c'))) (rows=12)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (a1 = 'a' AND 'a' < a2) OR (a1 = 'b' AND 'a' < a2) OR (2 more) (rows=12)
-> Filter: ((t2.a2 > 'a') and ((t2.a1 = 'b') or (t2.a1 = 'd') or (t2.a1 = 'a') or (t2.a1 = 'c'))) (rows=32.8)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (a1 = 'a' AND 'a' < a2) OR (a1 = 'b' AND 'a' < a2) OR (2 more) (rows=32.8)

explain format=tree select a1,a2,b,min(c),max(c) from t2 where (a1 = 'b' or a1 = 'd' or a1 = 'a' or a1 = 'c') and (a2 > 'a') group by a1,a2,b;
EXPLAIN
-> Filter: ((t2.a2 > 'a') and ((t2.a1 = 'b') or (t2.a1 = 'd') or (t2.a1 = 'a') or (t2.a1 = 'c'))) (rows=12)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (a1 = 'a' AND 'a' < a2) OR (a1 = 'b' AND 'a' < a2) OR (2 more) (rows=12)
-> Filter: ((t2.a2 > 'a') and ((t2.a1 = 'b') or (t2.a1 = 'd') or (t2.a1 = 'a') or (t2.a1 = 'c'))) (rows=32.8)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (a1 = 'a' AND 'a' < a2) OR (a1 = 'b' AND 'a' < a2) OR (2 more) (rows=32.8)

explain format=tree select a1,min(c),max(c) from t2 where a1 >= 'b' group by a1,a2,b;
EXPLAIN
-> Filter: (t2.a1 >= 'b') (rows=24)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over ('b' <= a1) (rows=24)
-> Filter: (t2.a1 >= 'b') (rows=32.8)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over ('b' <= a1) (rows=32.8)

explain format=tree select a1, max(c) from t2 where a1 in ('a','b','d') group by a1,a2,b;
EXPLAIN
-> Filter: (t2.a1 in ('a','b','d')) (rows=21)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (a1 = 'a') OR (a1 = 'b') OR (a1 = 'd') (rows=21)
-> Filter: (t2.a1 in ('a','b','d')) (rows=32.8)
-> Covering index skip scan for grouping on t2 using idx_t2_1 over (a1 = 'a') OR (a1 = 'b') OR (a1 = 'd') (rows=32.8)

select a1,a2,b,min(c),max(c) from t1 where a1 < 'd' group by a1,a2,b;
a1 a2 b min(c) max(c)
@@ -1797,7 +1797,7 @@ EXPLAIN

explain format=tree select sql_big_result distinct a1,a2,b from t2;
EXPLAIN
-> Covering index skip scan for deduplication on t2 using idx_t2_1 (rows=32)
-> Covering index skip scan for deduplication on t2 using idx_t2_1 (rows=32.8)

explain format=tree select sql_big_result distinct a1,a2,b from t2 where (a2 >= 'b') and (b = 'a');
EXPLAIN
@@ -1961,7 +1961,7 @@ EXPLAIN

explain format=tree select sql_big_result distinct a1,a2,b from t2;
EXPLAIN
-> Covering index skip scan for deduplication on t2 using idx_t2_1 (rows=32)
-> Covering index skip scan for deduplication on t2 using idx_t2_1 (rows=32.8)

explain format=tree select distinct a1,a2,b from t2 where (a2 >= 'b') and (b = 'a') group by a1,a2,b;
EXPLAIN
@@ -2099,23 +2099,23 @@ select 98 + count(distinct a1,a2,b) from t1 where (a1 > 'a') and (a2 > 'a');
104
explain format=tree select a1,a2,b, concat(min(c), max(c)) from t1 where a1 < 'd' group by a1,a2,b;
EXPLAIN
-> Filter: (t1.a1 < 'd') (rows=3)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (NULL < a1 < 'd') (rows=9)
-> Filter: (t1.a1 < 'd') (rows=16)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (NULL < a1 < 'd') (rows=16)

explain format=tree select concat(a1,min(c)),b from t1 where a1 < 'd' group by a1,a2,b;
EXPLAIN
-> Filter: (t1.a1 < 'd') (rows=3)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (NULL < a1 < 'd') (rows=9)
-> Filter: (t1.a1 < 'd') (rows=16)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (NULL < a1 < 'd') (rows=16)

explain format=tree select concat(a1,min(c)),b,max(c) from t1 where a1 < 'd' group by a1,a2,b;
EXPLAIN
-> Filter: (t1.a1 < 'd') (rows=3)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (NULL < a1 < 'd') (rows=9)
-> Filter: (t1.a1 < 'd') (rows=16)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (NULL < a1 < 'd') (rows=16)

explain format=tree select concat(a1,a2),b,min(c),max(c) from t1 where a1 < 'd' group by a1,a2,b;
EXPLAIN
-> Filter: (t1.a1 < 'd') (rows=3)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (NULL < a1 < 'd') (rows=9)
-> Filter: (t1.a1 < 'd') (rows=16)
-> Covering index skip scan for grouping on t1 using idx_t1_1 over (NULL < a1 < 'd') (rows=16)

explain format=tree select concat(ord(min(b)),ord(max(b))),min(b),max(b) from t1 group by a1,a2;
EXPLAIN
@@ -2689,7 +2689,7 @@ INSERT INTO t1 VALUES
(4), (2), (1), (2), (2), (4), (1), (4);
EXPLAIN FORMAT=TREE SELECT DISTINCT(a) FROM t1;
EXPLAIN
-> Covering index skip scan for deduplication on t1 using idx (rows=3.16)
-> Covering index skip scan for deduplication on t1 using idx (rows=4.9)

SELECT DISTINCT(a) FROM t1;
a
@@ -2698,7 +2698,7 @@ a
4
EXPLAIN FORMAT=TREE SELECT SQL_BIG_RESULT DISTINCT(a) FROM t1;
EXPLAIN
-> Covering index skip scan for deduplication on t1 using idx (rows=3.16)
-> Covering index skip scan for deduplication on t1 using idx (rows=4.9)

SELECT SQL_BIG_RESULT DISTINCT(a) FROM t1;
a
26 changes: 13 additions & 13 deletions mysql-test/r/hash_join_hypergraph.result
@@ -1841,20 +1841,20 @@ FROM
b AS subquery3_t1
);
EXPLAIN
-> Nested loop left join (rows=0.2)
-> Inner hash join (subquery1_t1.col_varchar = subquery1_t2.col_varchar) (rows=0.2)
-> Table scan on subquery1_t2 (rows=2)
-> Hash
-> Inner hash join (subquery1_t1.col_varchar = subquery1_t2.col_varchar) (rows=0.2)
-> Table scan on subquery1_t2 (rows=2)
-> Hash
-> Nested loop left join (rows=1)
-> Table scan on subquery1_t1 (rows=1)
-> Filter: <in_optimizer>(subquery1_t1.col_varchar,<exists>(select #4)) (rows=0.141)
-> Covering index lookup on table2 using <auto_key0> (col_varchar = subquery1_t1.col_varchar) (rows=0.141)
-> Materialize (rows=1.41)
-> Group (no aggregates) (rows=1.41)
-> Sort: subquery2_t1.col_varchar (rows=2)
-> Table scan on subquery2_t1 (rows=2)
-> Select #4 (subquery in condition; dependent)
-> Filter: (<cache>(subquery1_t1.col_varchar) = lower(subquery3_t1.pk)) (rows=1)
-> Table scan on subquery3_t1 (rows=1)
-> Filter: <in_optimizer>(subquery1_t1.col_varchar,<exists>(select #4)) (rows=0.141)
-> Covering index lookup on table2 using <auto_key0> (col_varchar = subquery1_t1.col_varchar) (rows=0.141)
-> Materialize (rows=1.41)
-> Group (no aggregates) (rows=1.41)
-> Sort: subquery2_t1.col_varchar (rows=2)
-> Table scan on subquery2_t1 (rows=2)
-> Select #4 (subquery in condition; dependent)
-> Filter: (<cache>(subquery1_t1.col_varchar) = lower(subquery3_t1.pk)) (rows=1)
-> Table scan on subquery3_t1 (rows=1)

SELECT
table1.col_varchar
10 changes: 5 additions & 5 deletions mysql-test/r/hypergraph_bugs.result
@@ -424,8 +424,10 @@ EXPLAIN
-> Inner hash join (no condition) (rows=10e-6)
-> Table scan on t6 (rows=1)
-> Hash
-> Nested loop inner join (FirstMatch) (rows=10e-6)
-> Limit: 1 row(s) (rows=100e-6)
-> Hash semijoin (FirstMatch) (no condition) (rows=10e-6)
-> Filter: (t1.f1 = t1.f1) (rows=0.1)
-> Table scan on t1 (rows=1)
-> Hash
-> Filter: (t2.f2 = t4.f2) (rows=100e-6)
-> Left hash join (no condition) (rows=0.001)
-> Filter: (t3.f1 = t2.f2) (rows=0.001)
@@ -435,8 +437,6 @@ EXPLAIN
-> Table scan on t3 (rows=1)
-> Hash
-> Table scan on t4 (rows=1)
-> Filter: (t1.f1 = t1.f1) (rows=0.1)
-> Table scan on t1 (rows=1)

Warnings:
Note 1276 Field or reference 'test.t1.f1' of SELECT #3 was resolved in SELECT #2
@@ -827,7 +827,7 @@ Table Op Msg_type Msg_text
test.t1 analyze status OK
EXPLAIN ANALYZE SELECT DISTINCT b FROM t1;
EXPLAIN
-> Covering index skip scan for deduplication on t1 using b (rows=2.24) (actual rows=5 loops=1)
-> Covering index skip scan for deduplication on t1 using b (rows=5) (actual rows=5 loops=1)

EXPLAIN ANALYZE SELECT DISTINCT c FROM t1;
EXPLAIN