Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-4686] SubQueryRemoveRule.matchJoin should correctly rewrite all sub-queries #2385

Open
wants to merge 13 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Reverting back bad merge for SubqueryRemoveRule
Turn back on RelOptRulesTest.testExpandJoinIn and testExpandJoinInComposite.
  • Loading branch information
jamesstarr committed Jul 7, 2021
commit f3013aed142b85c0e63128b47af9be6cbed18e6d
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@
import org.apache.calcite.rel.core.Project;
import org.apache.calcite.rel.metadata.RelMetadataQuery;
import org.apache.calcite.rex.LogicVisitor;
import org.apache.calcite.rex.RexBuilder;
import org.apache.calcite.rex.RexCorrelVariable;
import org.apache.calcite.rex.RexInputRef;
import org.apache.calcite.rex.RexLiteral;
Expand All @@ -47,6 +48,7 @@
import org.apache.calcite.util.Util;

import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableSet;

import java.util.ArrayList;
import java.util.List;
Expand Down Expand Up @@ -638,23 +640,46 @@ private static void matchFilter(SubQueryRemoveRule rule,

private static void matchJoin(SubQueryRemoveRule rule, RelOptRuleCall call) {
final Join join = call.rel(0);
switch (join.getJoinType()) {
case INNER:
case LEFT:
case ANTI:
case SEMI:
break;
default:
//Incorrect queries will be produced for full and right joins.
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we cannot treat RIGHT and FULL outer joins I think it makes sense to skip calling onMatch method altogether. Is it possible to move the check as rule predicate or apply the check as part of RelOptRule#matches?

final RexBuilder rexBuilder = call.builder().getRexBuilder();
final RelBuilder builder = call.builder();
final RexSubQuery e =
RexUtil.SubQueryFinder.find(join.getCondition());
assert e != null;
final RelOptUtil.Logic logic =
LogicVisitor.find(RelOptUtil.Logic.TRUE,
ImmutableList.of(join.getCondition()), e);
builder.push(join.getLeft());
builder.push(join.getRight());
final int fieldCount = join.getRowType().getFieldCount();
final Set<CorrelationId> variablesSet =
RelOptUtil.getVariablesUsed(e.rel);
final RexNode target = rule.apply(e, variablesSet,
logic, builder, 2, fieldCount);
final RexShuttle shuttle = new ReplaceSubQueryShuttle(e, target);
builder.join(join.getJoinType(), shuttle.apply(join.getCondition()));
builder.project(fields(builder, join.getRowType().getFieldCount()));
builder
.push(join.getLeft())
.push(join.getRight());
CorrelationId id = join.getCluster().createCorrel();
RexNode condition = RelOptUtil.correlateLeftShiftRight(rexBuilder,
join.getLeft(), id, join.getRight(), join.getCondition());
boolean found = false;
while (true) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen that matchFilter shares some similarities with this newly added code. Is it correct to assume that we need to do this also for matchProject? Should we log a follow-up JIRA?

final RexSubQuery subQuery = RexUtil.SubQueryFinder.find(condition);
if (subQuery == null) {
assert found;
break;
}
found = true;
final Set<CorrelationId> variablesSet =
RelOptUtil.getVariablesUsed(subQuery.rel);
final RelOptUtil.Logic logic =
LogicVisitor.find(RelOptUtil.Logic.TRUE, ImmutableList.of(condition), subQuery);
int offset = builder.peek().getRowType().getFieldCount();
final RexNode target = rule.apply(subQuery, variablesSet, logic,
builder, 1, offset);
final RexShuttle shuttle = new ReplaceSubQueryShuttle(subQuery, target);
condition = shuttle.apply(condition);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: The final at this level make the code more verbose with small benefit given that the scope is very small. In some cases removing it could make things fit in one line.

Copy link
Contributor Author

@jamesstarr jamesstarr Aug 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am normally not a big fan of final in this localized of context when all variables are treated as immutable, but when some variables are mutated, having final on all the immutable variables signals that the mutable variables are in fact mutable.

I will try to shift things to a single where possible.

}
builder
.filter(condition)
.join(join.getJoinType(), rexBuilder.makeLiteral(true), ImmutableSet.of(id))
.project(fields(builder, join.getRowType().getFieldCount()));
call.transformTo(builder.build());
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -5695,15 +5695,13 @@ private Sql checkSubQuery(String sql) {
.check();
}

@Disabled("[CALCITE-1045]")
@Test void testExpandJoinIn() {
final String sql = "select empno\n"
+ "from sales.emp left join sales.dept\n"
+ "on emp.deptno in (select deptno from sales.emp where empno < 20)";
checkSubQuery(sql).check();
}

@Disabled("[CALCITE-1045]")
@Test void testExpandJoinInComposite() {
final String sql = "select empno\n"
+ "from sales.emp left join sales.dept\n"
Expand Down
21 changes: 10 additions & 11 deletions core/src/test/resources/org/apache/calcite/test/RelOptRulesTest.xml
Original file line number Diff line number Diff line change
Expand Up @@ -3295,9 +3295,9 @@ LogicalProject(DEPTNO=[$7])
<![CDATA[
LogicalProject(EMPNO=[$0])
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[$9], NAME=[$10])
LogicalJoin(condition=[true], joinType=[left])
LogicalCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{7}])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
LogicalJoin(condition=[=($7, $11)], joinType=[inner])
LogicalJoin(condition=[=($cor0.DEPTNO, $2)], joinType=[inner])
zabetak marked this conversation as resolved.
Show resolved Hide resolved
LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
LogicalAggregate(group=[{0}])
LogicalProject(DEPTNO=[$7])
Expand Down Expand Up @@ -3329,14 +3329,13 @@ LogicalProject(EMPNO=[$0], DEPTNO=[$7])
<![CDATA[
LogicalProject(EMPNO=[$0])
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[$9], NAME=[$10])
LogicalJoin(condition=[true], joinType=[left])
LogicalCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{0}])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
LogicalJoin(condition=[AND(=($0, $11), =($9, $12))], joinType=[inner])
LogicalJoin(condition=[AND(=($cor0.EMPNO, $2), =($0, $3))], joinType=[inner])
LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
LogicalAggregate(group=[{0, 1}])
zabetak marked this conversation as resolved.
Show resolved Hide resolved
LogicalProject(EMPNO=[$0], DEPTNO=[$7])
LogicalFilter(condition=[<($0, 20)])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
LogicalProject(EMPNO=[$0], DEPTNO=[$7])
LogicalFilter(condition=[<($0, 20)])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
]]>
</Resource>
</TestCase>
Expand Down Expand Up @@ -3367,9 +3366,9 @@ LogicalProject(DEPTNO=[$7])
<![CDATA[
LogicalProject(EMPNO=[$0])
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[$9], NAME=[$10])
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[$9], NAME=[$10], $f0=[$11])
LogicalJoin(condition=[<($11, $12)], joinType=[left])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
LogicalJoin(condition=[true], joinType=[left])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
LogicalFilter(condition=[<($2, $3)])
LogicalJoin(condition=[true], joinType=[left])
LogicalJoin(condition=[true], joinType=[left])
LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
Expand Down