[SPARK-36920][SQL][FOLLOWUP] Fix input types of ABS(): numeric and ANSI intervals

### What changes were proposed in this pull request?
Change allowed input types of `Abs()` from:
```
NumericType + CalendarIntervalType + YearMonthIntervalType + DayTimeIntervalType
```
to
```
NumericType + YearMonthIntervalType + DayTimeIntervalType
```
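To make the change concrete, here is a minimal, self-contained Scala sketch of the two type collections. It is a toy model under stated assumptions: the type objects, `accepts`, and `simpleString` below are simplified stand-ins that reuse names from this patch, not Spark's actual classes, and the ` or `-joined rendering is an assumption chosen to mirror the error messages in this commit.

```scala
// Toy model only: these are NOT Spark's classes, just stand-ins with the same names.
sealed trait AbstractDataType { def simpleString: String }
case object NumericType extends AbstractDataType { val simpleString = "numeric" }
case object CalendarIntervalType extends AbstractDataType { val simpleString = "interval" }
case object DayTimeIntervalType extends AbstractDataType { val simpleString = "interval day to second" }
case object YearMonthIntervalType extends AbstractDataType { val simpleString = "interval year to month" }

final case class TypeCollection(types: Seq[AbstractDataType]) {
  // True when the given type is a member of this collection.
  def accepts(t: AbstractDataType): Boolean = types.contains(t)
  // Assumed rendering: member names joined with " or ", in declaration order.
  def simpleString: String = types.map(_.simpleString).mkString("(", " or ", ")")
}

object TypeCollections {
  // Numeric types plus the two ANSI interval types (what Abs accepts after this patch).
  val NumericAndAnsiInterval: TypeCollection =
    TypeCollection(Seq(NumericType, DayTimeIntervalType, YearMonthIntervalType))

  // The ANSI collection with the legacy interval type appended at the end.
  val NumericAndInterval: TypeCollection =
    TypeCollection(NumericAndAnsiInterval.types :+ CalendarIntervalType)
}

object AbsInputTypesDemo {
  def main(args: Array[String]): Unit = {
    import TypeCollections._
    // prints "(numeric or interval day to second or interval year to month)"
    println(NumericAndAnsiInterval.simpleString)
    // prints "(numeric or interval day to second or interval year to month or interval)"
    println(NumericAndInterval.simpleString)
    // prints "false": the legacy interval type is not in the ANSI-only collection
    println(NumericAndAnsiInterval.accepts(CalendarIntervalType))
  }
}
```

Because the legacy interval type is appended last in this model, it also moves to the end of the rendered list, which is consistent with the reordered messages in the updated test expectations of this commit.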

### Why are the changes needed?
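The changes make the error message clearer: a legacy `CalendarIntervalType` argument is now rejected with a data type mismatch error instead of failing with a `ClassCastException`.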

Before changes:
```sql
spark-sql> set spark.sql.legacy.interval.enabled=true;
spark.sql.legacy.interval.enabled	true
spark-sql> select abs(interval -10 days -20 minutes);
21/10/05 09:11:30 ERROR SparkSQLDriver: Failed in [select abs(interval -10 days -20 minutes)]
java.lang.ClassCastException: org.apache.spark.sql.types.CalendarIntervalType$ cannot be cast to org.apache.spark.sql.types.NumericType
	at org.apache.spark.sql.catalyst.util.TypeUtils$.getNumeric(TypeUtils.scala:77)
	at org.apache.spark.sql.catalyst.expressions.Abs.numeric$lzycompute(arithmetic.scala:172)
	at org.apache.spark.sql.catalyst.expressions.Abs.numeric(arithmetic.scala:169)
```

After:
```sql
spark.sql.legacy.interval.enabled	true
spark-sql> select abs(interval -10 days -20 minutes);
Error in query: cannot resolve 'abs(INTERVAL '-10 days -20 minutes')' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month) type, however, 'INTERVAL '-10 days -20 minutes'' is of interval type.; line 1 pos 7;
'Project [unresolvedalias(abs(-10 days -20 minutes, false), None)]
+- OneRowRelation
```
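The difference between the two outputs can be illustrated with a small, self-contained Scala sketch. It is a toy model, not Spark code: `numericOps`, `checkInput`, and `resolve` are hypothetical helpers, and the message text is only an approximation of the real one. The point it tries to show is that when the accepted-types list is too wide, the input check passes and the failure surfaces later as a cast error, whereas a narrower list rejects the input up front with a descriptive message.

```scala
import scala.util.Try

// Toy model only; the type names mirror the patch but none of this is Spark code.
sealed trait DataType { def name: String }
case object IntegerType extends DataType { val name = "int" }
case object CalendarIntervalType extends DataType { val name = "interval" }

object AbsCheckDemo {
  // Hypothetical stand-in for a numeric lookup: only numeric types have one.
  def numericOps(t: DataType): String = t match {
    case IntegerType => "integer arithmetic"
    case other => throw new ClassCastException(s"${other.name} cannot be cast to numeric")
  }

  // Analysis-style check: reject an unsupported type with a descriptive message.
  def checkInput(accepted: Seq[DataType], t: DataType): Either[String, DataType] = {
    val expected = accepted.map(_.name).mkString("(", " or ", ")")
    if (accepted.contains(t)) Right(t)
    else Left(s"argument 1 requires $expected type, however, input is of ${t.name} type")
  }

  // Check the input first, then look up arithmetic support for it.
  def resolve(accepted: Seq[DataType], t: DataType): Either[String, String] =
    checkInput(accepted, t).map(numericOps)

  def main(args: Array[String]): Unit = {
    val before = Seq(IntegerType, CalendarIntervalType) // legacy interval wrongly accepted
    val after = Seq(IntegerType)                        // legacy interval rejected up front

    // The check passes, so the failure only appears as a ClassCastException.
    println(Try(resolve(before, CalendarIntervalType)).isFailure) // prints "true"
    // The check fails first and reports which types are expected.
    println(resolve(after, CalendarIntervalType).isLeft)          // prints "true"
  }
}
```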

### Does this PR introduce _any_ user-facing change?
No, because the original changes of apache#34169 have not been released yet.

### How was this patch tested?
Checked manually on the command line; see the examples above.

Closes apache#34183 from MaxGekk/fix-abs-input-types.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
MaxGekk committed Oct 5, 2021
1 parent 6e8a462 commit 3ac0382
Showing 8 changed files with 36 additions and 32 deletions.
Original file line number Diff line number Diff line change
@@ -162,7 +162,7 @@ case class Abs(child: Expression, failOnError: Boolean = SQLConf.get.ansiEnabled

def this(child: Expression) = this(child, SQLConf.get.ansiEnabled)

- override def inputTypes: Seq[AbstractDataType] = Seq(TypeCollection.NumericAndInterval)
+ override def inputTypes: Seq[AbstractDataType] = Seq(TypeCollection.NumericAndAnsiInterval)

override def dataType: DataType = child.dataType

Original file line number Diff line number Diff line change
@@ -80,15 +80,19 @@ private[sql] class TypeCollection(private val types: Seq[AbstractDataType])
private[sql] object TypeCollection {

/**
- * Types that include numeric types and interval type. They are only used in unary_minus,
- * unary_positive, add and subtract operations.
+ * Types that include numeric types and ANSI interval types.
*/
- val NumericAndInterval = TypeCollection(
+ val NumericAndAnsiInterval = TypeCollection(
NumericType,
- CalendarIntervalType,
DayTimeIntervalType,
YearMonthIntervalType)

+ /**
+ * Types that include numeric and ANSI interval types, and additionally the legacy interval type.
+ * They are only used in unary_minus, unary_positive, add and subtract operations.
+ */
+ val NumericAndInterval = new TypeCollection(NumericAndAnsiInterval.types :+ CalendarIntervalType)

def apply(types: AbstractDataType*): TypeCollection = new TypeCollection(types)

def unapply(typ: AbstractDataType): Option[Seq[AbstractDataType]] = typ match {
Original file line number Diff line number Diff line change
@@ -78,9 +78,9 @@ class ExpressionTypeCheckingSuite extends SparkFunSuite {
assertErrorForDifferingTypes(BitwiseXor(Symbol("intField"), Symbol("booleanField")))

assertError(Add(Symbol("booleanField"), Symbol("booleanField")),
- "requires (numeric or interval or interval day to second or interval year to month) type")
+ "requires (numeric or interval day to second or interval year to month or interval) type")
assertError(Subtract(Symbol("booleanField"), Symbol("booleanField")),
- "requires (numeric or interval or interval day to second or interval year to month) type")
+ "requires (numeric or interval day to second or interval year to month or interval) type")
assertError(Multiply(Symbol("booleanField"), Symbol("booleanField")), "requires numeric type")
assertError(Divide(Symbol("booleanField"), Symbol("booleanField")),
"requires (double or decimal) type")
Original file line number Diff line number Diff line change
@@ -436,7 +436,7 @@ select +date '1999-01-01'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(+ DATE '1999-01-01')' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'DATE '1999-01-01'' is of date type.; line 1 pos 7
+ cannot resolve '(+ DATE '1999-01-01')' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'DATE '1999-01-01'' is of date type.; line 1 pos 7


-- !query
@@ -445,7 +445,7 @@ select +timestamp '1999-01-01'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(+ TIMESTAMP '1999-01-01 00:00:00')' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'TIMESTAMP '1999-01-01 00:00:00'' is of timestamp type.; line 1 pos 7
+ cannot resolve '(+ TIMESTAMP '1999-01-01 00:00:00')' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'TIMESTAMP '1999-01-01 00:00:00'' is of timestamp type.; line 1 pos 7


-- !query
@@ -462,7 +462,7 @@ select +map(1, 2)
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(+ map(1, 2))' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'map(1, 2)' is of map<int,int> type.; line 1 pos 7
+ cannot resolve '(+ map(1, 2))' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'map(1, 2)' is of map<int,int> type.; line 1 pos 7


-- !query
@@ -471,7 +471,7 @@ select +array(1,2)
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(+ array(1, 2))' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'array(1, 2)' is of array<int> type.; line 1 pos 7
+ cannot resolve '(+ array(1, 2))' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'array(1, 2)' is of array<int> type.; line 1 pos 7


-- !query
@@ -480,7 +480,7 @@ select +named_struct('a', 1, 'b', 'spark')
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(+ named_struct('a', 1, 'b', 'spark'))' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'named_struct('a', 1, 'b', 'spark')' is of struct<a:int,b:string> type.; line 1 pos 7
+ cannot resolve '(+ named_struct('a', 1, 'b', 'spark'))' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'named_struct('a', 1, 'b', 'spark')' is of struct<a:int,b:string> type.; line 1 pos 7


-- !query
@@ -489,7 +489,7 @@ select +X'1'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(+ X'01')' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'X'01'' is of binary type.; line 1 pos 7
+ cannot resolve '(+ X'01')' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'X'01'' is of binary type.; line 1 pos 7


-- !query
@@ -498,7 +498,7 @@ select -date '1999-01-01'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(- DATE '1999-01-01')' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'DATE '1999-01-01'' is of date type.; line 1 pos 7
+ cannot resolve '(- DATE '1999-01-01')' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'DATE '1999-01-01'' is of date type.; line 1 pos 7


-- !query
@@ -507,7 +507,7 @@ select -timestamp '1999-01-01'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(- TIMESTAMP '1999-01-01 00:00:00')' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'TIMESTAMP '1999-01-01 00:00:00'' is of timestamp type.; line 1 pos 7
+ cannot resolve '(- TIMESTAMP '1999-01-01 00:00:00')' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'TIMESTAMP '1999-01-01 00:00:00'' is of timestamp type.; line 1 pos 7


-- !query
@@ -516,4 +516,4 @@ select -x'2379ACFe'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(- X'2379ACFE')' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'X'2379ACFE'' is of binary type.; line 1 pos 7
+ cannot resolve '(- X'2379ACFE')' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'X'2379ACFE'' is of binary type.; line 1 pos 7
Original file line number Diff line number Diff line change
@@ -679,7 +679,7 @@ select timestamp'2011-11-11 11:11:11' + '1'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(TIMESTAMP '2011-11-11 11:11:11' + CAST('1' AS TIMESTAMP))' due to data type mismatch: '(TIMESTAMP '2011-11-11 11:11:11' + CAST('1' AS TIMESTAMP))' requires (numeric or interval or interval day to second or interval year to month) type, not timestamp; line 1 pos 7
+ cannot resolve '(TIMESTAMP '2011-11-11 11:11:11' + CAST('1' AS TIMESTAMP))' due to data type mismatch: '(TIMESTAMP '2011-11-11 11:11:11' + CAST('1' AS TIMESTAMP))' requires (numeric or interval day to second or interval year to month or interval) type, not timestamp; line 1 pos 7


-- !query
@@ -688,7 +688,7 @@ select '1' + timestamp'2011-11-11 11:11:11'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(CAST('1' AS TIMESTAMP) + TIMESTAMP '2011-11-11 11:11:11')' due to data type mismatch: '(CAST('1' AS TIMESTAMP) + TIMESTAMP '2011-11-11 11:11:11')' requires (numeric or interval or interval day to second or interval year to month) type, not timestamp; line 1 pos 7
+ cannot resolve '(CAST('1' AS TIMESTAMP) + TIMESTAMP '2011-11-11 11:11:11')' due to data type mismatch: '(CAST('1' AS TIMESTAMP) + TIMESTAMP '2011-11-11 11:11:11')' requires (numeric or interval day to second or interval year to month or interval) type, not timestamp; line 1 pos 7


-- !query
18 changes: 9 additions & 9 deletions sql/core/src/test/resources/sql-tests/results/literals.sql.out
Original file line number Diff line number Diff line change
@@ -436,7 +436,7 @@ select +date '1999-01-01'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(+ DATE '1999-01-01')' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'DATE '1999-01-01'' is of date type.; line 1 pos 7
+ cannot resolve '(+ DATE '1999-01-01')' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'DATE '1999-01-01'' is of date type.; line 1 pos 7


-- !query
@@ -445,7 +445,7 @@ select +timestamp '1999-01-01'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(+ TIMESTAMP '1999-01-01 00:00:00')' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'TIMESTAMP '1999-01-01 00:00:00'' is of timestamp type.; line 1 pos 7
+ cannot resolve '(+ TIMESTAMP '1999-01-01 00:00:00')' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'TIMESTAMP '1999-01-01 00:00:00'' is of timestamp type.; line 1 pos 7


-- !query
@@ -462,7 +462,7 @@ select +map(1, 2)
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(+ map(1, 2))' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'map(1, 2)' is of map<int,int> type.; line 1 pos 7
+ cannot resolve '(+ map(1, 2))' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'map(1, 2)' is of map<int,int> type.; line 1 pos 7


-- !query
@@ -471,7 +471,7 @@ select +array(1,2)
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(+ array(1, 2))' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'array(1, 2)' is of array<int> type.; line 1 pos 7
+ cannot resolve '(+ array(1, 2))' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'array(1, 2)' is of array<int> type.; line 1 pos 7


-- !query
@@ -480,7 +480,7 @@ select +named_struct('a', 1, 'b', 'spark')
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(+ named_struct('a', 1, 'b', 'spark'))' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'named_struct('a', 1, 'b', 'spark')' is of struct<a:int,b:string> type.; line 1 pos 7
+ cannot resolve '(+ named_struct('a', 1, 'b', 'spark'))' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'named_struct('a', 1, 'b', 'spark')' is of struct<a:int,b:string> type.; line 1 pos 7


-- !query
@@ -489,7 +489,7 @@ select +X'1'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(+ X'01')' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'X'01'' is of binary type.; line 1 pos 7
+ cannot resolve '(+ X'01')' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'X'01'' is of binary type.; line 1 pos 7


-- !query
@@ -498,7 +498,7 @@ select -date '1999-01-01'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(- DATE '1999-01-01')' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'DATE '1999-01-01'' is of date type.; line 1 pos 7
+ cannot resolve '(- DATE '1999-01-01')' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'DATE '1999-01-01'' is of date type.; line 1 pos 7


-- !query
@@ -507,7 +507,7 @@ select -timestamp '1999-01-01'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(- TIMESTAMP '1999-01-01 00:00:00')' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'TIMESTAMP '1999-01-01 00:00:00'' is of timestamp type.; line 1 pos 7
+ cannot resolve '(- TIMESTAMP '1999-01-01 00:00:00')' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'TIMESTAMP '1999-01-01 00:00:00'' is of timestamp type.; line 1 pos 7


-- !query
@@ -516,4 +516,4 @@ select -x'2379ACFe'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(- X'2379ACFE')' due to data type mismatch: argument 1 requires (numeric or interval or interval day to second or interval year to month) type, however, 'X'2379ACFE'' is of binary type.; line 1 pos 7
+ cannot resolve '(- X'2379ACFE')' due to data type mismatch: argument 1 requires (numeric or interval day to second or interval year to month or interval) type, however, 'X'2379ACFE'' is of binary type.; line 1 pos 7
Original file line number Diff line number Diff line change
@@ -679,7 +679,7 @@ select timestamp'2011-11-11 11:11:11' + '1'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(TIMESTAMP_NTZ '2011-11-11 11:11:11' + CAST('1' AS TIMESTAMP_NTZ))' due to data type mismatch: '(TIMESTAMP_NTZ '2011-11-11 11:11:11' + CAST('1' AS TIMESTAMP_NTZ))' requires (numeric or interval or interval day to second or interval year to month) type, not timestamp_ntz; line 1 pos 7
+ cannot resolve '(TIMESTAMP_NTZ '2011-11-11 11:11:11' + CAST('1' AS TIMESTAMP_NTZ))' due to data type mismatch: '(TIMESTAMP_NTZ '2011-11-11 11:11:11' + CAST('1' AS TIMESTAMP_NTZ))' requires (numeric or interval day to second or interval year to month or interval) type, not timestamp_ntz; line 1 pos 7


-- !query
@@ -688,7 +688,7 @@ select '1' + timestamp'2011-11-11 11:11:11'
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve '(CAST('1' AS TIMESTAMP_NTZ) + TIMESTAMP_NTZ '2011-11-11 11:11:11')' due to data type mismatch: '(CAST('1' AS TIMESTAMP_NTZ) + TIMESTAMP_NTZ '2011-11-11 11:11:11')' requires (numeric or interval or interval day to second or interval year to month) type, not timestamp_ntz; line 1 pos 7
+ cannot resolve '(CAST('1' AS TIMESTAMP_NTZ) + TIMESTAMP_NTZ '2011-11-11 11:11:11')' due to data type mismatch: '(CAST('1' AS TIMESTAMP_NTZ) + TIMESTAMP_NTZ '2011-11-11 11:11:11')' requires (numeric or interval day to second or interval year to month or interval) type, not timestamp_ntz; line 1 pos 7


-- !query
Original file line number Diff line number Diff line change
@@ -168,7 +168,7 @@ SELECT COUNT(*) OVER (PARTITION BY 1 ORDER BY cast(1 as string) DESC RANGE BETWE
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve 'RANGE BETWEEN CURRENT ROW AND CAST(1 AS STRING) FOLLOWING' due to data type mismatch: The data type of the upper bound 'string' does not match the expected data type '(numeric or interval or interval day to second or interval year to month)'.; line 1 pos 21
+ cannot resolve 'RANGE BETWEEN CURRENT ROW AND CAST(1 AS STRING) FOLLOWING' due to data type mismatch: The data type of the upper bound 'string' does not match the expected data type '(numeric or interval day to second or interval year to month or interval)'.; line 1 pos 21


-- !query
@@ -177,7 +177,7 @@ SELECT COUNT(*) OVER (PARTITION BY 1 ORDER BY cast('1' as binary) DESC RANGE BET
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve 'RANGE BETWEEN CURRENT ROW AND CAST(1 AS BINARY) FOLLOWING' due to data type mismatch: The data type of the upper bound 'binary' does not match the expected data type '(numeric or interval or interval day to second or interval year to month)'.; line 1 pos 21
+ cannot resolve 'RANGE BETWEEN CURRENT ROW AND CAST(1 AS BINARY) FOLLOWING' due to data type mismatch: The data type of the upper bound 'binary' does not match the expected data type '(numeric or interval day to second or interval year to month or interval)'.; line 1 pos 21


-- !query
@@ -186,7 +186,7 @@ SELECT COUNT(*) OVER (PARTITION BY 1 ORDER BY cast(1 as boolean) DESC RANGE BETW
struct<>
-- !query output
org.apache.spark.sql.AnalysisException
- cannot resolve 'RANGE BETWEEN CURRENT ROW AND CAST(1 AS BOOLEAN) FOLLOWING' due to data type mismatch: The data type of the upper bound 'boolean' does not match the expected data type '(numeric or interval or interval day to second or interval year to month)'.; line 1 pos 21
+ cannot resolve 'RANGE BETWEEN CURRENT ROW AND CAST(1 AS BOOLEAN) FOLLOWING' due to data type mismatch: The data type of the upper bound 'boolean' does not match the expected data type '(numeric or interval day to second or interval year to month or interval)'.; line 1 pos 21


-- !query
