[SPARK-27917][SQL] canonical form of CaseWhen object is incorrect
## What changes were proposed in this pull request?

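The canonicalized form of a CaseWhen expression is not computed correctly: the attribute references inside its branches are left uncanonicalized.

For example, consider the CaseWhen object below: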
    val attrRef = AttributeReference("ACCESS_CHECK", StringType)()
    val caseWhenObj1 = CaseWhen(Seq((attrRef, Literal("A"))))

The **output** of caseWhenObj1.canonicalized is as below

**Before Fix**: CASE WHEN ACCESS_CHECK#0 THEN A END

**After Fix**: CASE WHEN none#0 THEN A END

So when the same attribute is referenced under a different name, as in the statements below, the semantic-equality check fails. semanticEquals returns true only if the canonicalized forms of both expressions are the same.

    val attrRef = AttributeReference("ACCESS_CHECK", StringType)()
    val aliasAttrRef = attrRef.withName("access_check")
    val caseWhenObj1 = CaseWhen(Seq((attrRef, Literal("A"))))
    val caseWhenObj2 = CaseWhen(Seq((aliasAttrRef, Literal("A"))))

**assert(caseWhenObj2.semanticEquals(caseWhenObj1)) fails before the fix**

**caseWhenObj1.canonicalized**

Before Fix: CASE WHEN ACCESS_CHECK#0 THEN A END
After Fix: CASE WHEN none#0 THEN A END

**caseWhenObj2.canonicalized**

Before Fix: CASE WHEN access_check#0 THEN A END
After Fix: CASE WHEN none#0 THEN A END
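
The before/after strings above can be reproduced with a toy model. The sketch below is plain Scala and does not use Spark's classes: `Attr`, `Lit`, `CaseWhenToy` and the free-standing `semanticEquals` are hypothetical stand-ins. It assumes only what the description states, namely that canonicalization replaces the attribute name while keeping its id and that semantic equality compares canonicalized forms.

```scala
object CanonicalSketch {
  // Toy expression tree; these types are illustrative, not Spark's.
  sealed trait Expr { def canonicalized: Expr }

  // An attribute has a cosmetic name and an id that identifies it.
  final case class Attr(name: String, exprId: Long) extends Expr {
    // Canonicalization erases the name and keeps only the id.
    def canonicalized: Expr = Attr("none", exprId)
  }

  final case class Lit(value: String) extends Expr {
    def canonicalized: Expr = this
  }

  final case class CaseWhenToy(branches: Seq[(Expr, Expr)]) extends Expr {
    // Both halves of every (condition, value) pair must be canonicalized.
    def canonicalized: Expr =
      CaseWhenToy(branches.map { case (c, v) => (c.canonicalized, v.canonicalized) })
  }

  // Two expressions are semantically equal when their canonical forms match.
  def semanticEquals(a: Expr, b: Expr): Boolean =
    a.canonicalized == b.canonicalized

  val upper: Expr = CaseWhenToy(Seq((Attr("ACCESS_CHECK", 0L), Lit("A"))))
  val lower: Expr = CaseWhenToy(Seq((Attr("access_check", 0L), Lit("A"))))
}
```

In this model `upper` and `lower` differ structurally but compare as semantically equal, which is the behaviour the fix restores for CaseWhen.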

## How was this patch tested?
Added a unit test to ConditionalExpressionSuite that checks semantic equality of CaseWhen expressions.

Closes apache#24766 from sandeep-katta/caseWhenIssue.

Authored-by: sandeep katta <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
sandeep-katta authored and dongjoon-hyun committed Jun 10, 2019
1 parent 95a9212 commit 773cfde
Showing 2 changed files with 27 additions and 0 deletions.
@@ -234,6 +234,8 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
    }
    def mapChild(child: Any): Any = child match {
      case arg: TreeNode[_] if containsChild(arg) => mapTreeNode(arg)
      // CaseWhen Case or any tuple type
      case (left, right) => (mapChild(left), mapChild(right))
      case nonChild: AnyRef => nonChild
      case null => null
    }
@@ -249,6 +251,7 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product {
        // `mapValues` is lazy and we need to force it to materialize
        m.mapValues(mapChild).view.force
      case arg: TreeNode[_] if containsChild(arg) => mapTreeNode(arg)
      case Some(child) => Some(mapChild(child))
      case nonChild: AnyRef => nonChild
      case null => null
    }
@@ -248,4 +248,28 @@ class ConditionalExpressionSuite extends SparkFunSuite with ExpressionEvalHelper
          .contains("CASE WHEN ... THEN struct<x:int> WHEN ... THEN struct<y:int> " +
            "ELSE struct<z:int> END"))
      }

      test("SPARK-27917 test semantic equals of CaseWhen") {
        val attrRef = AttributeReference("ACCESS_CHECK", StringType)()
        val aliasAttrRef = attrRef.withName("access_check")
        // Test for Equality
        var caseWhenObj1 = CaseWhen(Seq((attrRef, Literal("A"))))
        var caseWhenObj2 = CaseWhen(Seq((aliasAttrRef, Literal("A"))))
        assert(caseWhenObj1.semanticEquals(caseWhenObj2))
        assert(caseWhenObj2.semanticEquals(caseWhenObj1))
        // Test for inEquality
        caseWhenObj2 = CaseWhen(Seq((attrRef, Literal("a"))))
        assert(!caseWhenObj1.semanticEquals(caseWhenObj2))
        assert(!caseWhenObj2.semanticEquals(caseWhenObj1))
        // Test with elseValue with Equality
        caseWhenObj1 = CaseWhen(Seq((attrRef, Literal("A"))), attrRef.withName("ELSEVALUE"))
        caseWhenObj2 = CaseWhen(Seq((aliasAttrRef, Literal("A"))), aliasAttrRef.withName("elsevalue"))
        assert(caseWhenObj1.semanticEquals(caseWhenObj2))
        assert(caseWhenObj2.semanticEquals(caseWhenObj1))
        caseWhenObj1 = CaseWhen(Seq((attrRef, Literal("A"))), Literal("ELSEVALUE"))
        caseWhenObj2 = CaseWhen(Seq((aliasAttrRef, Literal("A"))), Literal("elsevalue"))
        // Test with elseValue with inEquality
        assert(!caseWhenObj1.semanticEquals(caseWhenObj2))
        assert(!caseWhenObj2.semanticEquals(caseWhenObj1))
      }
    }
