[SPARK-25727][SQL] Add outputOrdering to otherCopyArgs in InMemoryRelation

## What changes were proposed in this pull request?
Add `outputOrdering` to `otherCopyArgs` in `InMemoryRelation` so that this field is copied during tree transformations.

```
    val data = Seq(100).toDF("count").cache()
    data.queryExecution.optimizedPlan.toJSON
```

The above code can generate the following error:

```
assertion failed: InMemoryRelation fields: output, cacheBuilder, statsOfPlanToCache, outputOrdering, values: List(count#178), CachedRDDBuilder(true,10000,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [value#176 AS count#178]
+- LocalTableScan [value#176]
,None), Statistics(sizeInBytes=12.0 B, hints=none)
java.lang.AssertionError: assertion failed: InMemoryRelation fields: output, cacheBuilder, statsOfPlanToCache, outputOrdering, values: List(count#178), CachedRDDBuilder(true,10000,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [value#176 AS count#178]
+- LocalTableScan [value#176]
,None), Statistics(sizeInBytes=12.0 B, hints=none)
	at scala.Predef$.assert(Predef.scala:170)
	at org.apache.spark.sql.catalyst.trees.TreeNode.jsonFields(TreeNode.scala:611)
	at org.apache.spark.sql.catalyst.trees.TreeNode.org$apache$spark$sql$catalyst$trees$TreeNode$$collectJsonValue$1(TreeNode.scala:599)
	at org.apache.spark.sql.catalyst.trees.TreeNode.jsonValue(TreeNode.scala:604)
	at org.apache.spark.sql.catalyst.trees.TreeNode.toJSON(TreeNode.scala:590)
```
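The assertion message lists four field names but only three values. A minimal sketch of that mismatch follows, assuming (as the field order in the error suggests) that `statsOfPlanToCache` and `outputOrdering` sit in a second, curried parameter list that `productIterator` does not cover. `Node`, `RelationBefore`, `RelationAfter`, `constructorParamNames`, and `namesMatchValues` are hypothetical stand-ins, not Spark classes or methods; the parameter names are listed by hand rather than discovered by reflection.

```scala
// Simplified stand-in for a Catalyst tree node; NOT Spark's actual TreeNode.
abstract class Node extends Product {
  // Names of every constructor parameter, including curried ones.
  // Hypothetical: listed by hand here to keep the sketch self-contained.
  def constructorParamNames: Seq[String]

  // Values from the curried parameter list, which productIterator omits.
  protected def otherCopyArgs: Seq[Any] = Nil

  // One value is expected for each constructor parameter name.
  def fieldValues: Seq[Any] = productIterator.toSeq ++ otherCopyArgs

  def namesMatchValues: Boolean =
    constructorParamNames.length == fieldValues.length
}

// Shape before the patch: outputOrdering is missing from otherCopyArgs.
case class RelationBefore(output: Seq[String], cacheBuilder: String)(
    val statsOfPlanToCache: String,
    val outputOrdering: Seq[String]) extends Node {
  def constructorParamNames: Seq[String] =
    Seq("output", "cacheBuilder", "statsOfPlanToCache", "outputOrdering")
  override protected def otherCopyArgs: Seq[Any] = Seq(statsOfPlanToCache)
}

// Shape after the patch: both curried arguments are reported.
case class RelationAfter(output: Seq[String], cacheBuilder: String)(
    val statsOfPlanToCache: String,
    val outputOrdering: Seq[String]) extends Node {
  def constructorParamNames: Seq[String] =
    Seq("output", "cacheBuilder", "statsOfPlanToCache", "outputOrdering")
  override protected def otherCopyArgs: Seq[Any] =
    Seq(statsOfPlanToCache, outputOrdering)
}

object CopyArgsDemo {
  def main(args: Array[String]): Unit = {
    val before = RelationBefore(Seq("count"), "builder")("stats", Nil)
    val after = RelationAfter(Seq("count"), "builder")("stats", Nil)
    println(before.namesMatchValues) // false: 4 names, 3 values
    println(after.namesMatchValues)  // true: 4 names, 4 values
  }
}
```

In this sketch the only difference between the two classes is the `otherCopyArgs` override, which corresponds to the one-line change in this commit.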

## How was this patch tested?

Added a test to `InMemoryColumnarQuerySuite`.

Closes apache#22715 from gatorsmile/copyArgs1.

Authored-by: gatorsmile <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
gatorsmile authored and dongjoon-hyun committed Oct 14, 2018
1 parent 6bbceb9 commit 6c3f2c6
Showing 2 changed files with 7 additions and 1 deletion.
@@ -206,7 +206,7 @@ case class InMemoryRelation(
outputOrdering).asInstanceOf[this.type]
}

-  override protected def otherCopyArgs: Seq[AnyRef] = Seq(statsOfPlanToCache)
+  override protected def otherCopyArgs: Seq[AnyRef] = Seq(statsOfPlanToCache, outputOrdering)

override def simpleString: String =
s"InMemoryRelation [${Utils.truncatedString(output, ", ")}], ${cacheBuilder.storageLevel}"
@@ -488,6 +488,12 @@ class InMemoryColumnarQuerySuite extends QueryTest with SharedSQLContext {
}
}

+  test("SPARK-25727 - otherCopyArgs in InMemoryRelation does not include outputOrdering") {
+    val data = Seq(100).toDF("count").cache()
+    val json = data.queryExecution.optimizedPlan.toJSON
+    assert(json.contains("outputOrdering") && json.contains("statsOfPlanToCache"))
+  }

test("SPARK-22673: InMemoryRelation should utilize existing stats of the plan to be cached") {
// This test case depends on the size of parquet in statistics.
withSQLConf(