This initialization action installs Apache Livy on a master node within a Google Cloud Dataproc cluster.
You can use this initialization action to create a new Dataproc cluster with Livy installed:
- Use the `gcloud` command to create a new cluster with this initialization action:

      REGION=<region>
      CLUSTER_NAME=<cluster_name>
      gcloud dataproc clusters create ${CLUSTER_NAME} \
          --region ${REGION} \
          --initialization-actions gs://goog-dataproc-initialization-actions-${REGION}/livy/livy.sh

- To change the installed Livy version, use the `livy-version` metadata value:

      REGION=<region>
      CLUSTER_NAME=<cluster_name>
      gcloud dataproc clusters create ${CLUSTER_NAME} \
          --region ${REGION} \
          --initialization-actions gs://goog-dataproc-initialization-actions-${REGION}/livy/livy.sh \
          --metadata livy-version=0.7.0

- To change the version of Scala that Livy is linked against, use the `scala-version` metadata value:

      REGION=<region>
      CLUSTER_NAME=<cluster_name>
      gcloud dataproc clusters create ${CLUSTER_NAME} \
          --region ${REGION} \
          --initialization-actions gs://goog-dataproc-initialization-actions-${REGION}/livy/livy.sh \
          --metadata scala-version=2.12

- To change the Livy session timeout, use the `livy-timeout-session` metadata value:

      REGION=<region>
      CLUSTER_NAME=<cluster_name>
      gcloud dataproc clusters create ${CLUSTER_NAME} \
          --region ${REGION} \
          --initialization-actions gs://goog-dataproc-initialization-actions-${REGION}/livy/livy.sh \
          --metadata livy-timeout-session='3h'

- Once the cluster has been created, Livy is configured to run on port `8998` on the master node of the Dataproc cluster.
- To learn how to use Livy, read the documentation for the Livy REST API.
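
As a rough illustration of talking to Livy on port `8998`, the sketch below defines a few helper functions around `curl`. The function names are hypothetical and not part of this initialization action. It also assumes a single-master cluster whose master node is reachable as `<cluster_name>-m` from wherever the commands run (for example, from the master node itself over SSH); adjust the host name if your setup differs.

```shell
# Sketch only: hypothetical helpers for calling the Livy REST API.
# Assumption: the master node resolves as "<cluster_name>-m" and
# Livy listens on port 8998, as configured by this initialization action.

# Print the base URL of the Livy server for the given cluster name.
livy_url() {
  printf 'http://%s-m:8998' "$1"
}

# List the current Livy sessions (GET /sessions).
livy_list_sessions() {
  curl --silent --fail "$(livy_url "$1")/sessions"
}

# Start an interactive Spark session (POST /sessions).
livy_create_session() {
  curl --silent --fail -X POST \
    -H 'Content-Type: application/json' \
    -d '{"kind": "spark"}' \
    "$(livy_url "$1")/sessions"
}
```

After defining these functions in a shell, `livy_list_sessions "${CLUSTER_NAME}"` should return a JSON description of the current sessions if Livy is up, and `livy_create_session "${CLUSTER_NAME}"` requests a new Spark session.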