Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CALCITE-6718] Optimize SubstitutionVisitor's splitFilter with early return and uniform simplification for equivalence checking #4078

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

hannerwang
Copy link

What changes were proposed in this pull request?

  1. Implement early return for materialized view range checking.
  2. Apply uniform simplification for expression equivalence checking.

Why are the changes needed?

Early Return: Many materialized views do not have filters, so implementing early return can avoid unnecessary checks and improve performance.
Uniform Simplification: In user-customized applications, we cannot guarantee that both sides of the equivalence check have been simplified using simplifyUnknownAsFalse. Adding this ensures consistent behavior.

Does this PR introduce any user-facing change?

No.

@hannerwang hannerwang force-pushed the enhance_split_filter branch from 18bc5f7 to dd17183 Compare December 6, 2024 07:04
Copy link

sonarcloud bot commented Dec 6, 2024

if (!r.isAlwaysFalse() && isEquivalent(condition2, r)) {
RexNode simplifiedCond2 =
canonizeNode(rexBuilder, simplify.simplifyUnknownAsFalse(condition2));
if (!r.isAlwaysFalse() && isEquivalent(simplifiedCond2, r)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Early Return, it cuts out some unnecessary steps. But doing another simplify for condition2 seems unnecessary, condition2 only canonizeNode after simplify.
How about using simplify instead of simplifyUnknownAsFalse for x2 .

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I'm not sure why this function uses both simplify and simplifyUnknownAsFalse, so I made a conservative change to minimize impact. If possible, we could replace simplify with simplifyUnknownAsFalse, but that seems too radical.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @suibianwanwank, we shouldn't need another simplification on condition2 while early return seems good.

Here we are dealing with filtering predicates, so I guess that simplifyUnknownAsFalse has the right semantics, rather than just simplify, but we need tests covering these changes and showing that we are doing the right thing on corner cases involving unknown and the early return condition.

@hannerwang, can you add them?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @asolimando ,
I attempted to change simplifyUnknownAsFalse to simplify, but I encountered some issues.
For instance, with the expression x >= 5 and x > 5, the simplify method does not reduce it, whereas simplifyUnknownAsFalse simplifies it to x > 5. Additionally, expressions like x > 5 and x < 3 are simplified to x <> x when UNKNOWN is treated as UNKNOWN, and to false when UNKNOWN is treated as FALSE.
Therefore, we cannot simply replace simplifyUnknownAsFalse with simplify, as it would cause many tests to fail.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually suggesting the opposite, simplifyUnknownAsFalse seems the right simplification to be used in the context of filtering predicares, we could use it for simplifying condition and target at the very beginning to keep it consistent with the simplification of x2 later on. To validate that we aren't missing anything here I was suggesting that we need tests covering cases where Unknown is involved.

If those tests already exist it's fine, we just need to validate that we are doing the right thing. I hope it's clearer now, I am sorry if I confused you with my previous comment.

@hannerwang hannerwang force-pushed the enhance_split_filter branch from dd17183 to 9f87b8f Compare December 7, 2024 13:54
…return and uniform simplification for equivalence checking
@hannerwang hannerwang force-pushed the enhance_split_filter branch from 9f87b8f to cf9b38d Compare December 7, 2024 14:49
if (target.isAlwaysTrue()) {
return condition2;
}
target = simplify.simplify(target);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this target be 'true' after simplifying? Do we need to handle when the target is always false?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants