Spark 1490 Add kerberos support to the HistoryServer
Here I've added the ability for the History Server to log in from a Kerberos keytab file, so that it can be run as a superuser and stay up for a long period of time while reading the history files from HDFS.

Author: Thomas Graves <[email protected]>

Closes apache#513 from tgravescs/SPARK-1490 and squashes the following commits:

e204a99 [Thomas Graves] remove extra logging
5418daa [Thomas Graves] fix typo in config
0076b99 [Thomas Graves] Update docs
4d76545 [Thomas Graves] SPARK-1490 Add kerberos support to the HistoryServer
tgravescs authored and pwendell committed Apr 24, 2014
1 parent 78a49b2 commit bd37509
Showing 3 changed files with 44 additions and 0 deletions.
@@ -75,6 +75,10 @@ class SparkHadoopUtil {

  def getSecretKeyFromUserCredentials(key: String): Array[Byte] = { null }

  def loginUserFromKeytab(principalName: String, keytabFilename: String) {
    UserGroupInformation.loginUserFromKeytab(principalName, keytabFilename)
  }

}

object SparkHadoopUtil {
@@ -22,6 +22,7 @@ import scala.collection.mutable
import org.apache.hadoop.fs.{FileStatus, Path}

import org.apache.spark.{Logging, SecurityManager, SparkConf}
import org.apache.spark.deploy.SparkHadoopUtil
import org.apache.spark.scheduler._
import org.apache.spark.ui.{WebUI, SparkUI}
import org.apache.spark.ui.JettyUtils._
@@ -257,6 +258,7 @@ object HistoryServer {
  val STATIC_RESOURCE_DIR = SparkUI.STATIC_RESOURCE_DIR

  def main(argStrings: Array[String]) {
    initSecurity()
    val args = new HistoryServerArguments(argStrings)
    val securityManager = new SecurityManager(conf)
    val server = new HistoryServer(args.logDir, securityManager, conf)
@@ -266,6 +268,20 @@ object HistoryServer {
    while(true) { Thread.sleep(Int.MaxValue) }
    server.stop()
  }

  def initSecurity() {
    // If we are accessing HDFS and it has security enabled (Kerberos), we have to login
    // from a keytab file so that we can access HDFS beyond the kerberos ticket expiration.
    // As long as it is using Hadoop rpc (hdfs://), a relogin will automatically
    // occur from the keytab.
    if (conf.getBoolean("spark.history.kerberos.enabled", false)) {
      // if you have enabled kerberos the following 2 params must be set
      val principalName = conf.get("spark.history.kerberos.principal")
      val keytabFilename = conf.get("spark.history.kerberos.keytab")
      SparkHadoopUtil.get.loginUserFromKeytab(principalName, keytabFilename)
    }
  }

}
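
The control flow of `initSecurity()` can be sketched without Spark or Hadoop on the classpath. This is a minimal illustration, not the committed code: a plain `Map` stands in for `SparkConf`, the `login` parameter stands in for `SparkHadoopUtil.get.loginUserFromKeytab`, and the object name, the `required` helper, the error message, and the principal and keytab values are all hypothetical. It makes explicit the rule stated in the code comment: once Kerberos is enabled, both the principal and the keytab must be set.

```scala
// Stand-alone sketch of the initSecurity() logic. Assumptions: a Map replaces
// SparkConf, and `login` replaces the real keytab login call.
object InitSecuritySketch {

  def initSecurity(conf: Map[String, String], login: (String, String) => Unit): Unit = {
    // Mirrors conf.getBoolean("spark.history.kerberos.enabled", false).
    val enabled = conf.get("spark.history.kerberos.enabled").exists(_.toBoolean)
    if (enabled) {
      // Hypothetical helper: fail with a descriptive message if a required key is missing.
      def required(key: String): String =
        conf.getOrElse(key,
          throw new NoSuchElementException(s"$key must be set when kerberos is enabled"))

      login(
        required("spark.history.kerberos.principal"),
        required("spark.history.kerberos.keytab"))
    }
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical principal and keytab path, for illustration only.
    val conf = Map(
      "spark.history.kerberos.enabled" -> "true",
      "spark.history.kerberos.principal" -> "spark/historyserver.example.com@EXAMPLE.COM",
      "spark.history.kerberos.keytab" -> "/etc/security/keytabs/spark.history.keytab")

    // The stand-in login only reports what the real call would receive.
    initSecurity(conf, (principal, keytab) =>
      println(s"would log in as $principal using $keytab"))
  }
}
```

Passing the login function as a parameter keeps the sketch runnable and testable without a KDC; in the commit itself the call goes straight to `UserGroupInformation.loginUserFromKeytab`.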


24 changes: 24 additions & 0 deletions docs/monitoring.md
@@ -91,6 +91,30 @@ represents an application's event logs. This creates a web interface at
The port to which the web interface of the history server binds.
</td>
</tr>
<tr>
<td>spark.history.kerberos.enabled</td>
<td>false</td>
<td>
Indicates whether the history server should use Kerberos to log in. This is useful
if the history server is accessing HDFS files on a secure Hadoop cluster. If this is
true, it uses the configs <code>spark.history.kerberos.principal</code> and
<code>spark.history.kerberos.keytab</code>.
</td>
</tr>
<tr>
<td>spark.history.kerberos.principal</td>
<td>(none)</td>
<td>
Kerberos principal name for the History Server.
</td>
</tr>
<tr>
<td>spark.history.kerberos.keytab</td>
<td>(none)</td>
<td>
Location of the kerberos keytab file for the History Server.
</td>
</tr>
</table>
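
As an illustration of the three properties documented in the table, a configuration might look like the fragment below. The principal name and keytab path are hypothetical placeholders, and how the values reach the history server's `SparkConf` (for example as `-D` JVM system properties) depends on the Spark version and deployment.

```
spark.history.kerberos.enabled    true
spark.history.kerberos.principal  spark/historyserver.example.com@EXAMPLE.COM
spark.history.kerberos.keytab     /etc/security/keytabs/spark.history.keytab
```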

Note that in all of these UIs, the tables are sortable by clicking their headers,