I used a fork of the project https://github.com/apache/incubator-livy (include PR: https://github.com/apache/incubator-livy/pull/249/files)
And also some of the ideas of the guys from the project https://github.com/HPEEzmeral/spark-on-k8s
In the current project, I added a Dokerfile to build livy, spark, as well as helm chart to run in the k8s environment
The scheme of work is similar to the scheme in the Hadoop cluster:
To build, it is enough to build the Dockerfile in the root of the project.
docker build -t livy .
You need to create a Namespace spark in the l8y cluster
kubectl create ns spark
Then install helm chart:
cd helm/
helm upgrade --install livy . -n spark
After the pod is launched. It is necessary to organize access to port 8998. It does not matter how it will be done, through the forwarding port or intgerss.
To start a session, you need to send a POST request http://localhost:8998/sessions.
# Example
{"kind": "spark" , "driverMemory": "8g", "driverCores": 4, "executorMemory": "4g", "executorCores": 1, "numExecutors": 10}
Next you will see the formation of spark pods:
To work from Jupyter, you need to create kernels (https://github.com/jupyter-incubator/sparkmagic)