Skip to content

Commit

Permalink
[SPARK-3778] newAPIHadoopRDD doesn't properly pass credentials for se…
Browse files Browse the repository at this point in the history
…cure hdfs

.this was apache/spark#2676

https://issues.apache.org/jira/browse/SPARK-3778

This affects if someone is trying to access secure hdfs something like:
val lines = {
val hconf = new Configuration()
hconf.set("mapred.input.dir", "mydir")
hconf.set("textinputformat.record.delimiter","\003432\n")
sc.newAPIHadoopRDD(hconf, classOf[TextInputFormat], classOf[LongWritable], classOf[Text])
}

Author: Thomas Graves <[email protected]>

Closes #4292 from tgravescs/SPARK-3788 and squashes the following commits:

cf3b453 [Thomas Graves] newAPIHadoopRDD doesn't properly pass credentials for secure hdfs on yarn
  • Loading branch information
tgravescs authored and JoshRosen committed Feb 3, 2015
1 parent eb0da6c commit c31c36c
Showing 1 changed file with 6 additions and 1 deletion.
7 changes: 6 additions & 1 deletion core/src/main/scala/org/apache/spark/SparkContext.scala
Original file line number Diff line number Diff line change
Expand Up @@ -804,6 +804,8 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
vClass: Class[V],
conf: Configuration = hadoopConfiguration): RDD[(K, V)] = {
assertNotStopped()
// The call to new NewHadoopJob automatically adds security credentials to conf,
// so we don't need to explicitly add them ourselves
val job = new NewHadoopJob(conf)
NewFileInputFormat.addInputPath(job, new Path(path))
val updatedConf = job.getConfiguration
Expand All @@ -826,7 +828,10 @@ class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationCli
kClass: Class[K],
vClass: Class[V]): RDD[(K, V)] = {
assertNotStopped()
new NewHadoopRDD(this, fClass, kClass, vClass, conf)
// Add necessary security credentials to the JobConf. Required to access secure HDFS.
val jconf = new JobConf(conf)
SparkHadoopUtil.get.addCredentials(jconf)
new NewHadoopRDD(this, fClass, kClass, vClass, jconf)
}

/** Get an RDD for a Hadoop SequenceFile with given key and value types.
Expand Down

0 comments on commit c31c36c

Please sign in to comment.