[SPARK-18764][CORE] Add a warning log when skipping a corrupted file
## What changes were proposed in this pull request?

It's better to log a warning when skipping a corrupted file. This helps when we want to let the job finish first, then find the corrupted files in the log and fix them.

## How was this patch tested?

Jenkins

Author: Shixiong Zhu <[email protected]>

Closes apache#16192 from zsxwing/SPARK-18764.
zsxwing committed Dec 7, 2016
1 parent f1fca81 commit dbf3e29
Showing 3 changed files with 9 additions and 2 deletions.
4 changes: 3 additions & 1 deletion core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala
@@ -259,7 +259,9 @@ class HadoopRDD[K, V](
         try {
           finished = !reader.next(key, value)
         } catch {
-          case e: IOException if ignoreCorruptFiles => finished = true
+          case e: IOException if ignoreCorruptFiles =>
+            logWarning(s"Skipped the rest content in the corrupted file: ${split.inputSplit}", e)
+            finished = true
         }
         if (!finished) {
           inputMetrics.incRecordsRead(1)
6 changes: 5 additions & 1 deletion core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala
@@ -189,7 +189,11 @@ class NewHadoopRDD[K, V](
         try {
           finished = !reader.nextKeyValue
         } catch {
-          case e: IOException if ignoreCorruptFiles => finished = true
+          case e: IOException if ignoreCorruptFiles =>
+            logWarning(
+              s"Skipped the rest content in the corrupted file: ${split.serializableHadoopSplit}",
+              e)
+            finished = true
         }
         if (finished) {
           // Close and release the reader here; close() will also be called when the task
@@ -139,6 +139,7 @@ class FileScanRDD(
           }
         } catch {
           case e: IOException =>
+            logWarning(s"Skipped the rest content in the corrupted file: $currentFile", e)
             finished = true
             null
         }
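
The pattern shared by the three hunks above can be illustrated outside Spark with a minimal sketch. Everything named here (`FlakyReader`, `SkipCorruptSketch`, `readAll`, the `warnings` buffer) is hypothetical and only stands in for the Hadoop record reader and Spark's `logWarning`; it is not the code from this commit.

```scala
import java.io.IOException
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a Hadoop record reader: yields records in order and
// throws IOException when it reaches index `failAt`, simulating a corrupted file.
// Pass a negative `failAt` for a reader that never fails.
final class FlakyReader(records: Seq[String], failAt: Int) {
  private var pos = 0
  def next(): Option[String] = {
    if (pos == failAt) throw new IOException(s"simulated corruption at record $pos")
    if (pos < records.length) {
      val r = records(pos)
      pos += 1
      Some(r)
    } else None
  }
}

object SkipCorruptSketch {
  // Collected messages stand in for logWarning so the sketch has no dependencies.
  val warnings = ArrayBuffer.empty[String]

  // Reads until the reader is exhausted. If `ignoreCorruptFiles` is set, an
  // IOException ends the read early and records a warning naming the file;
  // otherwise the guard does not match and the exception propagates.
  def readAll(name: String, reader: FlakyReader, ignoreCorruptFiles: Boolean): List[String] = {
    val out = ArrayBuffer.empty[String]
    var finished = false
    while (!finished) {
      try {
        reader.next() match {
          case Some(r) => out += r
          case None    => finished = true
        }
      } catch {
        case e: IOException if ignoreCorruptFiles =>
          warnings += s"Skipped the rest content in the corrupted file: $name (${e.getMessage})"
          finished = true
      }
    }
    out.toList
  }
}
```

With the flag on, the records read before the failure are kept and the warning names the file, which is what makes the skipped input findable afterwards. With the flag off, the guard on the `case` does not match and the `IOException` reaches the caller, as in the original code.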