pyspark-jupyter-cdh

PySpark Jupyter Notebook Daemon on Cloudera CDH

This repository contains the files needed to run Jupyter notebooks for PySpark as a daemon on a Cloudera CDH cluster, using the Anaconda parcel.

These instructions assume that Anaconda is already installed on your CDH cluster, for example by following these guides: http://blog.cloudera.com/blog/2016/02/making-python-on-apache-hadoop-easier-with-anaconda-and-cdh/ and http://www.cloudera.com/documentation/enterprise/latest/topics/spark_ipython.html

To enable the Jupyter notebook as a service on a host, do the following as root:

Copy the `pyspark-jupyter-cdh` file to `/etc/init.d` and the `pyspark-jupyter-cdh.sh` file to `/usr/local/sbin`.

Then make both files executable with `chmod +x /etc/init.d/pyspark-jupyter-cdh` and `chmod +x /usr/local/sbin/pyspark-jupyter-cdh.sh`.

Ensure that the user who will run the daemon has permission to execute `/usr/local/sbin/pyspark-jupyter-cdh.sh`.
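The copy and chmod steps above can be sketched as a small shell function. This is only an illustration: the function name `install_pyspark_jupyter` and its optional prefix argument are hypothetical and not part of this repository. The prefix exists so the steps can be tried in a scratch directory before touching the real system paths.

```shell
#!/bin/sh
# Sketch of the install steps; run from the repository checkout.
# install_pyspark_jupyter and its prefix argument are hypothetical helpers,
# not something shipped with this repository.
install_pyspark_jupyter() {
    # Empty prefix means the real /etc/init.d and /usr/local/sbin (needs root).
    prefix="${1:-}"

    mkdir -p "$prefix/etc/init.d" "$prefix/usr/local/sbin" || return 1

    # Copy the init script and the launcher script into place.
    cp pyspark-jupyter-cdh "$prefix/etc/init.d/" || return 1
    cp pyspark-jupyter-cdh.sh "$prefix/usr/local/sbin/" || return 1

    # Make both files executable.
    chmod +x "$prefix/etc/init.d/pyspark-jupyter-cdh" \
             "$prefix/usr/local/sbin/pyspark-jupyter-cdh.sh"
}
```

Calling `install_pyspark_jupyter` with no argument as root performs the real installation; calling it with a directory such as `/tmp/stage` writes the same layout under that directory instead.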

If you installed Anaconda from the parcel with its default settings, the service should run as the hdfs user without any changes. Otherwise, edit the /etc/init.d/pyspark-jupyter-cdh file and change the values below as needed:

export DAEMON_USER=hdfs

export DAEMON_NAME=pyspark-jupyter-cdh

export DAEMON_PATH=/var/jupyter

export DAEMON_PORT=8880
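If you prefer to script the change rather than open an editor, something like the following might work. It is a sketch under two assumptions: that each setting appears in the init script as a line beginning with `export NAME=` (as shown above, with no leading indentation), and that the new value contains no `|` or `&` characters. The helper name `set_daemon_value` is hypothetical.

```shell
#!/bin/sh
# set_daemon_value FILE NAME VALUE
# Hypothetical helper: rewrites a line of the form "export NAME=..." in FILE.
# Assumes the line starts at column 0 and VALUE contains no "|" or "&".
set_daemon_value() {
    file="$1"
    name="$2"
    value="$3"
    # -i.bak edits in place and keeps FILE.bak so the change can be reverted.
    sed -i.bak "s|^export ${name}=.*|export ${name}=${value}|" "$file"
}
```

For example, `set_daemon_value /etc/init.d/pyspark-jupyter-cdh DAEMON_PORT 8890` would move the notebook to port 8890 the next time the service is started.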

Start the service: `service pyspark-jupyter-cdh start`

Start the service automatically at boot: `chkconfig pyspark-jupyter-cdh on`

Point your web browser to your notebook server at http://hostname.domainname:8880, substituting your own host name and the DAEMON_PORT value if you changed it.
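Before opening a browser, it can be handy to check from a terminal that the notebook server is answering. The function below is a minimal sketch that assumes `curl` is installed; the name `notebook_is_up` is hypothetical, and the default port is simply the DAEMON_PORT value shown earlier.

```shell
#!/bin/sh
# notebook_is_up HOST [PORT]
# Hypothetical helper: returns 0 if an HTTP server on HOST:PORT answers with
# a status below 400 within 5 seconds, and non-zero otherwise.
notebook_is_up() {
    host="$1"
    port="${2:-8880}"
    # --fail turns HTTP errors (status 400 and above) into a non-zero exit.
    curl --silent --fail --max-time 5 --output /dev/null "http://${host}:${port}/"
}
```

A typical use would be `notebook_is_up hostname.domainname 8880 && echo "notebook reachable"`. A success only shows that something is serving HTTP on that port, not that it is this particular notebook.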

## Using Matplotlib

When importing matplotlib, add the following to the beginning of your Python code:

%matplotlib notebook
