[SPARK-39285][SQL] Spark should not check field names when reading files
### What changes were proposed in this pull request?
Spark should not check field names when reading data; this PR cleans up the code accordingly.

### Why are the changes needed?
Although Spark cannot write data with invalid column names, it can read data that already contains such names, so field names should not be checked when reading data.
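
A minimal sketch of the asymmetry this change preserves, assuming an active SparkSession `spark` (the paths and column name are hypothetical; Parquet historically rejects names containing characters such as `" ,;{}()\n\t="` on the write path):

```scala
// Hypothetical column name that the Parquet writer rejects.
val df = spark.range(1).toDF("col with space")

// Write path: checkFieldNames still runs (moved into FileFormatWriter
// by this PR), so this fails with an AnalysisException about the
// invalid column name.
df.write.parquet("/tmp/out")

// Read path: a file that already contains such a column name can
// still be read, which is why no field-name check belongs here.
spark.read.parquet("/tmp/existing_data_with_bad_name").show()
```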

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Manual test (MT).

Closes apache#36661 from AngersZhuuuu/SPARK-39285.

Authored-by: Angerszhuuuu <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
AngersZhuuuu authored and cloud-fan committed May 26, 2022
1 parent f673ebd commit 55ee406
Showing 2 changed files with 1 addition and 1 deletion.
DataSourceUtils.scala
@@ -92,7 +92,6 @@ object DataSourceUtils extends PredicateHelper {
         throw QueryCompilationErrors.dataTypeUnsupportedByDataSourceError(format.toString, field)
       }
     }
-    checkFieldNames(format, schema)
   }

   // SPARK-24626: Metadata files and temporary files should not be
FileFormatWriter.scala
@@ -163,6 +163,7 @@ object FileFormatWriter extends Logging {

     val dataSchema = dataColumns.toStructType
     DataSourceUtils.verifySchema(fileFormat, dataSchema)
+    DataSourceUtils.checkFieldNames(fileFormat, dataSchema)
     // Note: prepareWrite has side effect. It sets "job".
     val outputWriterFactory =
       fileFormat.prepareWrite(sparkSession, job, caseInsensitiveOptions, dataSchema)
