[CALCITE-6718] Optimize SubstitutionVisitor's splitFilter with early return and uniform simplification for equivalence checking #4078

hannerwang · 2024-12-06T05:18:01Z

What changes were proposed in this pull request?

Implement early return for materialized view range checking.
Apply uniform simplification for expression equivalence checking.

Why are the changes needed?

Early Return: Many materialized views do not have filters, so implementing early return can avoid unnecessary checks and improve performance.
Uniform Simplification: In user-customized applications, we cannot guarantee that both sides of the equivalence check have been simplified using simplifyUnknownAsFalse. Adding this ensures consistent behavior.
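
As a rough illustration of the uniform-simplification point, the sketch below uses plain Java lists of strings as a toy stand-in for predicates. It does not use Calcite; `normalize` and `structurallyEqual` are hypothetical names, and the sketch assumes the equivalence check is structural. Two conjunctions with the same meaning only compare equal once both sides have gone through the same normalization.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Toy model only; not Calcite code. */
class UniformNormalizationDemo {
  /** Sorts and de-duplicates conjuncts; a stand-in for simplification plus canonization. */
  static List<String> normalize(List<String> conjuncts) {
    return new ArrayList<>(new TreeSet<>(conjuncts));
  }

  /** Structural comparison: order and duplicates matter. */
  static boolean structurallyEqual(List<String> a, List<String> b) {
    return a.equals(b);
  }

  public static void main(String[] args) {
    List<String> query = Arrays.asList("y < 3", "x > 5", "x > 5");
    List<String> view = Arrays.asList("x > 5", "y < 3");
    // Raw against raw.
    System.out.println(structurallyEqual(query, view));
    // Only the view side normalized.
    System.out.println(structurallyEqual(query, normalize(view)));
    // Both sides normalized the same way.
    System.out.println(structurallyEqual(normalize(query), normalize(view)));
  }
}
```

Only the last comparison succeeds in this toy setup, which is the situation the PR description is concerned about when one side may arrive unsimplified.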

Does this PR introduce any user-facing change?

No.

sonarcloud · 2024-12-06T07:23:46Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
100.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

suibianwanwank · 2024-12-06T08:04:42Z

core/src/main/java/org/apache/calcite/plan/SubstitutionVisitor.java

-      if (!r.isAlwaysFalse() && isEquivalent(condition2, r)) {
+      RexNode simplifiedCond2 =
+          canonizeNode(rexBuilder, simplify.simplifyUnknownAsFalse(condition2));
+      if (!r.isAlwaysFalse() && isEquivalent(simplifiedCond2, r)) {


I agree with the early return; it cuts out some unnecessary steps. But simplifying condition2 again seems unnecessary, since condition2 is just canonizeNode applied to the already simplified condition.
How about using simplify instead of simplifyUnknownAsFalse for x2?

Actually I'm not sure why this function uses both simplify and simplifyUnknownAsFalse, so I made a conservative change to minimize impact. If possible, we could replace simplify with simplifyUnknownAsFalse, but that seems too radical.

I agree with @suibianwanwank: we shouldn't need another simplification of condition2, while the early return seems good.

Here we are dealing with filtering predicates, so I guess that simplifyUnknownAsFalse has the right semantics, rather than just simplify, but we need tests covering these changes and showing that we are doing the right thing on corner cases involving unknown and the early return condition.

@hannerwang, can you add them?

Hello @asolimando ,
I attempted to change simplifyUnknownAsFalse to simplify, but I encountered some issues.
For instance, with the expression x >= 5 and x > 5, the simplify method does not reduce it, whereas simplifyUnknownAsFalse simplifies it to x > 5. Additionally, expressions like x > 5 and x < 3 are simplified to x <> x when UNKNOWN is treated as UNKNOWN, and to false when UNKNOWN is treated as FALSE.
Therefore, we cannot simply replace simplifyUnknownAsFalse with simplify, as it would cause many tests to fail.
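
To make the difference between the two modes concrete, here is a small self-contained sketch of SQL three-valued logic in plain Java. It does not call Calcite; `Tri`, `gt`, `lt`, `and`, `predicate`, and `filterKeeps` are hypothetical names used only for illustration. For a NULL x, x > 5 and x < 3 evaluates to UNKNOWN rather than FALSE, so a rewrite that must preserve UNKNOWN cannot replace it with the literal false, whereas a filter only keeps rows for which the predicate is TRUE.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model of SQL three-valued logic; not Calcite code. */
class ThreeValuedDemo {
  enum Tri { TRUE, FALSE, UNKNOWN }

  static Tri of(boolean b) {
    return b ? Tri.TRUE : Tri.FALSE;
  }

  /** x > c, where a null x yields UNKNOWN. */
  static Tri gt(Integer x, int c) {
    return x == null ? Tri.UNKNOWN : of(x > c);
  }

  /** x < c, where a null x yields UNKNOWN. */
  static Tri lt(Integer x, int c) {
    return x == null ? Tri.UNKNOWN : of(x < c);
  }

  /** SQL AND: FALSE dominates, then UNKNOWN, otherwise TRUE. */
  static Tri and(Tri a, Tri b) {
    if (a == Tri.FALSE || b == Tri.FALSE) {
      return Tri.FALSE;
    }
    if (a == Tri.UNKNOWN || b == Tri.UNKNOWN) {
      return Tri.UNKNOWN;
    }
    return Tri.TRUE;
  }

  /** The predicate from the comment above: x > 5 AND x < 3. */
  static Tri predicate(Integer x) {
    return and(gt(x, 5), lt(x, 3));
  }

  /** A WHERE clause keeps a row only when the predicate is TRUE. */
  static boolean filterKeeps(Integer x) {
    return predicate(x) == Tri.TRUE;
  }

  public static void main(String[] args) {
    List<Integer> samples = Arrays.asList(null, 0, 4, 10);
    for (Integer x : samples) {
      System.out.println("x=" + x + " -> " + predicate(x) + ", kept=" + filterKeeps(x));
    }
  }
}
```

In this model the predicate is never TRUE, so a filter drops every row, yet its value for a NULL x is UNKNOWN and not FALSE. That matches the x <> x versus false outcomes described above.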

I was actually suggesting the opposite: simplifyUnknownAsFalse seems to be the right simplification in the context of filtering predicates. We could use it to simplify condition and target at the very beginning, which would keep them consistent with the simplification of x2 later on. To validate that we aren't missing anything here, I was suggesting that we need tests covering cases where UNKNOWN is involved.

If those tests already exist it's fine, we just need to validate that we are doing the right thing. I hope it's clearer now, I am sorry if I confused you with my previous comment.

NobiGo · 2024-12-11T10:58:12Z

core/src/main/java/org/apache/calcite/plan/SubstitutionVisitor.java

+    if (target.isAlwaysTrue()) {
+      return condition2;
+    }
+    target = simplify.simplify(target);


Can this target be 'true' after simplifying? Do we need to handle when the target is always false?
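
One way to reason about this question is with a toy expression model. The sketch below is not Calcite code: `Expr`, `Lit`, `Ref`, `And`, `simplify`, and `canReturnEarly` are hypothetical stand-ins, and the simplifier only drops TRUE operands from a conjunction. Under those assumptions it shows a target that is not recognized as always true until it has been simplified, which is why a second check after simplification may be worth considering.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy expression model; a stand-in for RexNode, not Calcite code. */
class EarlyReturnDemo {
  abstract static class Expr {
    boolean isAlwaysTrue() {
      return false;
    }
  }

  static final class Lit extends Expr {
    final boolean value;
    Lit(boolean value) {
      this.value = value;
    }
    @Override boolean isAlwaysTrue() {
      return value;
    }
  }

  static final class Ref extends Expr {
    final String name;
    Ref(String name) {
      this.name = name;
    }
  }

  static final class And extends Expr {
    final List<Expr> operands;
    And(Expr... operands) {
      this.operands = Arrays.asList(operands);
    }
  }

  /** Drops TRUE operands from an AND; an AND with nothing left collapses to TRUE. */
  static Expr simplify(Expr e) {
    if (!(e instanceof And)) {
      return e;
    }
    List<Expr> kept = new ArrayList<>();
    for (Expr op : ((And) e).operands) {
      Expr s = simplify(op);
      if (!s.isAlwaysTrue()) {
        kept.add(s);
      }
    }
    if (kept.isEmpty()) {
      return new Lit(true);
    }
    if (kept.size() == 1) {
      return kept.get(0);
    }
    return new And(kept.toArray(new Expr[0]));
  }

  /** Cheap check first, then a second check on the simplified target. */
  static boolean canReturnEarly(Expr target) {
    if (target.isAlwaysTrue()) {
      return true;
    }
    return simplify(target).isAlwaysTrue();
  }

  public static void main(String[] args) {
    Expr hiddenTrue = new And(new Lit(true), new Lit(true));
    System.out.println(hiddenTrue.isAlwaysTrue());
    System.out.println(canReturnEarly(hiddenTrue));
    System.out.println(canReturnEarly(new And(new Lit(true), new Ref("x"))));
  }
}
```

The always-false case raised in the question is not modeled here; it would need its own handling.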

hannerwang force-pushed the enhance_split_filter branch from 18bc5f7 to dd17183 Compare December 6, 2024 07:04

hannerwang force-pushed the enhance_split_filter branch from dd17183 to 9f87b8f Compare December 7, 2024 13:54

[CALCITE-6718] Optimize SubstitutionVisitor's splitFilter with early …

cf9b38d

…return and uniform simplification for equivalence checking

hannerwang force-pushed the enhance_split_filter branch from 9f87b8f to cf9b38d Compare December 7, 2024 14:49

suibianwanwank Dec 6, 2024

hannerwang Dec 6, 2024

asolimando Dec 6, 2024

hannerwang Dec 7, 2024

asolimando Dec 8, 2024

NobiGo Dec 11, 2024
