[SPARK-42656][CONNECT][FOLLOWUP] Fix the spark-connect script
### What changes were proposed in this pull request?
The spark-connect script is broken because it needs a jar at the end of the `spark-submit` command.
This change also ensures that, when Scala 2.13 is set, all commands in the scripts run with `-Pscala-2.13`.
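
For illustration, the profile lookup that the scripts perform can be sketched as a small shell function. This is a minimal sketch rather than the patch itself: `scala_profile_arg` and the throwaway pom fragment are hypothetical names introduced here, while the `grep | head | awk` pipeline mirrors the one in the diff.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the commit): print the sbt profile flag
# for the Scala binary version declared in a pom.xml.
scala_profile_arg() {
  local pom="$1"
  local ver
  # Take the first line mentioning scala.binary.version and split it on
  # '<' and '>'; field 3 is the text between the opening and closing tags.
  ver="$(grep "scala.binary.version" "$pom" | head -n1 | awk -F '[<>]' '{print $3}')"
  echo "-Pscala-${ver}"
}

# Demo against sample data (an assumption, not Spark's real pom.xml).
demo_pom="$(mktemp)"
printf '%s\n' \
  '<properties>' \
  '  <scala.binary.version>2.13</scala.binary.version>' \
  '</properties>' > "$demo_pom"
scala_profile_arg "$demo_pom"   # prints -Pscala-2.13
rm -f "$demo_pom"
```

Deriving the flag from `pom.xml` means the same script works for either Scala version without the caller passing a profile by hand.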

Example usage:
Start spark connect with default settings:
* `./connector/connect/bin/spark-connect-shell`
* or `./connector/connect/bin/spark-connect` (enter "q" followed by a newline to exit the program)

Start Scala client with default settings: `./connector/connect/bin/spark-connect-scala-client`

Start spark connect with extra configs:
* `./connector/connect/bin/spark-connect-shell --conf spark.connect.grpc.binding.port=8888`
* or `./connector/connect/bin/spark-connect --conf spark.connect.grpc.binding.port=8888`

Start Scala client with a connection string:
```
export SPARK_REMOTE="sc://localhost:8888/"
./connector/connect/bin/spark-connect-scala-client
```

### Why are the changes needed?
Bug fix

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Manually tested on 2.12 and 2.13 for all the scripts changed.

Test example with expected results:
`./connector/connect/bin/spark-connect-shell` :
<img width="1050" alt="Screen Shot 2023-03-08 at 2 14 31 PM" src="https://user-images.githubusercontent.com/4190164/223863343-d5d159d9-da7c-47c7-b55a-a2854c5f5d76.png">

Verify the spark connect server is started at the correct port, e.g.
```
> telnet localhost 15002
Trying ::1...
Connected to localhost.
Escape character is '^]'.
```
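
Where `telnet` is not installed, a similar check can be sketched with bash's `/dev/tcp` redirection. `port_open` is a hypothetical helper, not part of this patch, and it assumes the script is run with bash rather than plain `sh`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
# Uses bash's /dev/tcp pseudo-device; the subshell closes the socket on exit.
port_open() {
  local host="$1" port="$2"
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# 15002 is the port used in the telnet check.
if port_open localhost 15002; then
  echo "localhost:15002 is accepting connections"
else
  echo "nothing is listening on localhost:15002"
fi
```

This only confirms that something accepts TCP connections on the port; it does not prove that the listener is a Spark Connect server.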

`./connector/connect/bin/spark-connect`:
<img width="1680" alt="Screen Shot 2023-03-08 at 2 13 09 PM" src="https://user-images.githubusercontent.com/4190164/223863099-41195599-c49d-4db4-a1e2-e129a649cd81.png">
The server has started successfully once the last line of output appears.

`./connector/connect/bin/spark-connect-scala-client`:
<img width="1658" alt="Screen Shot 2023-03-08 at 2 11 58 PM" src="https://user-images.githubusercontent.com/4190164/223862992-c8a3a36a-9f69-40b8-b82e-5dab85ed14ce.png">
Verify that the client can run some simple queries.

Closes apache#40344 from zhenlineo/fix-scripts.

Authored-by: Zhen Li <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
zhenlineo authored and HyukjinKwon committed Mar 9, 2023
1 parent a77bb37 commit b5243d7
Showing 3 changed files with 26 additions and 14 deletions.
11 changes: 9 additions & 2 deletions connector/connect/bin/spark-connect
@@ -26,7 +26,14 @@ FWDIR="$(cd "`dirname "$0"`"/../../..; pwd)"
 cd "$FWDIR"
 export SPARK_HOME=$FWDIR

+# Determine the Scala version used in Spark
+SCALA_BINARY_VER=`grep "scala.binary.version" "${SPARK_HOME}/pom.xml" | head -n1 | awk -F '[<>]' '{print $3}'`
+SCALA_ARG="-Pscala-${SCALA_BINARY_VER}"
+
 # Build the jars needed for spark submit and spark connect
-build/sbt -Phive -Pconnect package
+build/sbt "${SCALA_ARG}" -Phive -Pconnect package

+# This jar is already in the classpath, but the submit commands wants a jar as the input.
+CONNECT_JAR=`ls "${SPARK_HOME}"/assembly/target/scala-"${SCALA_BINARY_VER}"/jars/spark-connect_*.jar | paste -sd ',' -`
+
-exec "${SPARK_HOME}"/bin/spark-submit --class org.apache.spark.sql.connect.SimpleSparkConnectService "$@"
+exec "${SPARK_HOME}"/bin/spark-submit "$@" --class org.apache.spark.sql.connect.SimpleSparkConnectService "$CONNECT_JAR"
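
The `ls ... | paste -sd ',' -` step in `CONNECT_JAR` can be tried in isolation. The sketch below uses a hypothetical helper name, `join_connect_jars`, and made-up jar file names in a temporary directory; only the pipeline itself is taken from the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every spark-connect_*.jar in a directory as a
# single comma-separated string, the way CONNECT_JAR is assembled.
join_connect_jars() {
  local jar_dir="$1"
  # ls prints one matching path per line; paste -s joins those lines into
  # one line, using ',' as the delimiter.
  ls "${jar_dir}"/spark-connect_*.jar | paste -sd ',' -
}

# Demo with placeholder files (the names are illustrative assumptions).
demo_dir="$(mktemp -d)"
touch "${demo_dir}/spark-connect_a.jar" "${demo_dir}/spark-connect_b.jar"
join_connect_jars "$demo_dir"
rm -rf "$demo_dir"
```

When only one jar matches, the output is just that jar's path, which is then passed to `spark-submit` as the application jar after the options.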
19 changes: 10 additions & 9 deletions connector/connect/bin/spark-connect-scala-client
@@ -34,17 +34,18 @@ FWDIR="$(cd "`dirname "$0"`"/../../..; pwd)"
 cd "$FWDIR"
 export SPARK_HOME=$FWDIR

-# Build the jars needed for spark connect JVM client
-build/sbt "sql/package;connect-client-jvm/assembly"
-
-CONNECT_CLASSPATH="$(build/sbt -DcopyDependencies=false "export connect-client-jvm/fullClasspath" | grep jar | tail -n1)"
-SQL_CLASSPATH="$(build/sbt -DcopyDependencies=false "export sql/fullClasspath" | grep jar | tail -n1)"
-
-INIT_SCRIPT="${SPARK_HOME}"/connector/connect/bin/spark-connect-scala-client.sc
-
 # Determine the Scala version used in Spark
 SCALA_BINARY_VER=`grep "scala.binary.version" "${SPARK_HOME}/pom.xml" | head -n1 | awk -F '[<>]' '{print $3}'`
 SCALA_VER=`grep "scala.version" "${SPARK_HOME}/pom.xml" | grep ${SCALA_BINARY_VER} | head -n1 | awk -F '[<>]' '{print $3}'`
 SCALA_BIN="${SPARK_HOME}/build/scala-${SCALA_VER}/bin/scala"
+SCALA_ARG="-Pscala-${SCALA_BINARY_VER}"
+
+# Build the jars needed for spark connect JVM client
+build/sbt "${SCALA_ARG}" "sql/package;connect-client-jvm/assembly"
+
+CONNECT_CLASSPATH="$(build/sbt "${SCALA_ARG}" -DcopyDependencies=false "export connect-client-jvm/fullClasspath" | grep jar | tail -n1)"
+SQL_CLASSPATH="$(build/sbt "${SCALA_ARG}" -DcopyDependencies=false "export sql/fullClasspath" | grep jar | tail -n1)"
+
+INIT_SCRIPT="${SPARK_HOME}"/connector/connect/bin/spark-connect-scala-client.sc

-exec "${SCALA_BIN}" -cp "$CONNECT_CLASSPATH:$SQL_CLASSPATH" -i $INIT_SCRIPT
+exec "${SCALA_BIN}" -cp "$CONNECT_CLASSPATH:$SQL_CLASSPATH" -i $INIT_SCRIPT
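
The two-step lookup behind `SCALA_VER` (find the binary version, then pick the full version that contains it) can be sketched on its own. `scala_full_version` is a hypothetical helper and the version numbers in the sample pom are illustrative assumptions; the pipelines follow the ones in the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the full Scala version whose line contains the
# binary version declared in the given pom.xml.
scala_full_version() {
  local pom="$1"
  local binary_ver
  # Step 1: the binary version, e.g. 2.12.
  binary_ver="$(grep "scala.binary.version" "$pom" | head -n1 | awk -F '[<>]' '{print $3}')"
  # Step 2: among the scala.version lines, keep the first one that mentions
  # the binary version and print the text between its tags.
  grep "scala.version" "$pom" | grep "${binary_ver}" | head -n1 | awk -F '[<>]' '{print $3}'
}

# Demo with a sample pom holding a default version and a second, profile-style
# entry (sample data, not Spark's real pom.xml).
demo_pom="$(mktemp)"
printf '%s\n' \
  '<scala.version>2.12.17</scala.version>' \
  '<scala.binary.version>2.12</scala.binary.version>' \
  '<scala.version>2.13.8</scala.version>' > "$demo_pom"
scala_full_version "$demo_pom"   # prints 2.12.17
rm -f "$demo_pom"
```

The script uses the resulting version to locate the Scala binary under `build/scala-${SCALA_VER}/bin/scala`.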
10 changes: 7 additions & 3 deletions connector/connect/bin/spark-connect-shell
@@ -26,7 +26,11 @@ FWDIR="$(cd "`dirname "$0"`"/../../..; pwd)"
 cd "$FWDIR"
 export SPARK_HOME=$FWDIR

-# Build the jars needed for spark shell and spark connect
-build/sbt -Phive -Pconnect package
+# Determine the Scala version used in Spark
+SCALA_BINARY_VER=`grep "scala.binary.version" "${SPARK_HOME}/pom.xml" | head -n1 | awk -F '[<>]' '{print $3}'`
+SCALA_ARG="-Pscala-${SCALA_BINARY_VER}"

-exec "${SPARK_HOME}"/bin/spark-shell --conf spark.plugins=org.apache.spark.sql.connect.SparkConnectPlugin "$@"
+# Build the jars needed for spark submit and spark connect
+build/sbt "${SCALA_ARG}" -Phive -Pconnect package
+
+exec "${SPARK_HOME}"/bin/spark-shell --conf spark.plugins=org.apache.spark.sql.connect.SparkConnectPlugin "$@"
