There are currently two ways to use TiSpark on SparkR: directly through the sparkR shell, or by submitting your own R script with spark-submit.
The first way is to use the sparkR shell directly. This is the simplest option; a decent Spark environment should be enough.
- Make sure you have the latest version of TiSpark and a jar with all of TiSpark's dependencies.
- Add the configurations listed in the README to your `$SPARK_HOME/conf/spark-defaults.conf` (a sample is shown after these steps).
- Run this command in your `$SPARK_HOME` directory:

  ```
  ./bin/sparkR --jars /where-ever-it-is/tispark-${name_with_version}.jar
  ```
- To use TiSpark, run these commands in the sparkR shell:

  ```R
  sql("use tpch_test")
  count <- sql("select count(*) from customer")
  head(count)
  ```
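The exact configuration keys to put in `spark-defaults.conf` are listed in the TiSpark README and depend on your TiSpark version. As a rough sketch only, assuming a TiSpark release that registers `TiExtensions` and a placement driver (PD) reachable at the placeholder address `127.0.0.1:2379`, the file might contain entries like:

```
# Placeholder values for illustration; copy the actual keys from the TiSpark README
spark.sql.extensions        org.apache.spark.sql.TiExtensions
spark.tispark.pd.addresses  127.0.0.1:2379
```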
The second way, through spark-submit, is useful when you want to execute your own R scripts.
- Create an R file named `test.R` as below:

  ```R
  library(SparkR)
  sparkR.session()
  sql("use tpch_test")
  count <- sql("select count(*) from customer")
  head(count)
  ```
- Prepare your TiSpark environment as above and execute:

  ```
  ./bin/spark-submit --jars /where-ever-it-is/tispark-${name_with_version}.jar test.R
  ```
- Result:

  ```
  +--------+
  |count(1)|
  +--------+
  |     150|
  +--------+
  ```
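If you want to work with the query result as an ordinary R object inside your script, SparkR's `collect()` brings the rows back to the driver as a local data frame. The snippet below is only a sketch that extends `test.R` and assumes the same `tpch_test` data as above:

```R
library(SparkR)
sparkR.session()

sql("use tpch_test")
count <- sql("select count(*) from customer")

# collect() materializes the SparkDataFrame as a local R data.frame
local_count <- collect(count)
print(local_count$`count(1)`)
```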