
XGBoost4J: Distributed XGBoost for Scala/Java


Documentation | Resources | Release Notes

XGBoost4J is the JVM package of XGBoost. It brings all the optimizations and power of XGBoost into the JVM ecosystem.

  • Train XGBoost models in Scala and Java with easy customization (see the sketch below).
  • Run distributed XGBoost natively on JVM frameworks such as Apache Flink and Apache Spark.
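
As a quick taste of the single-machine API, here is a minimal sketch of training and prediction with the XGBoost4J Scala API. The file paths, parameter values, and boosting-round count are illustrative placeholders, not part of this package.

import ml.dmlc.xgboost4j.scala.{DMatrix, XGBoost}

object TrainExample {
  def main(args: Array[String]): Unit = {
    // Load training and test data in LibSVM format (paths are placeholders)
    val trainMat = new DMatrix("train.libsvm")
    val testMat  = new DMatrix("test.libsvm")

    // Booster parameters; tune these for your own task
    val params = Map(
      "eta" -> 0.1,
      "max_depth" -> 3,
      "objective" -> "binary:logistic"
    )

    // Train for 10 boosting rounds, evaluating on the test set each round
    val booster = XGBoost.train(trainMat, params, 10, Map("test" -> testMat))

    // Predict on the test set and persist the model
    val predictions = booster.predict(testMat)
    booster.saveModel("model.bin")
  }
}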

You can find out more about XGBoost on the Documentation and Resources pages.

Add Maven Dependency

XGBoost4J, XGBoost4J-Spark, etc., in the Maven repository are compiled with g++-4.8.5.

Access release version

maven

<dependency>
    <groupId>ml.dmlc</groupId>
    <artifactId>xgboost4j</artifactId>
    <version>latest_version_num</version>
</dependency>

sbt

 "ml.dmlc" % "xgboost4j" % "latest_version_num"

For the latest release version number, please check here.

If you want to use xgboost4j-spark, simply replace xgboost4j with xgboost4j-spark.

Access SNAPSHOT version

You need to add GitHub as a repository:

maven:

<repository>
  <id>GitHub Repo</id>
  <name>GitHub Repo</name>
  <url>https://raw.githubusercontent.com/CodingCat/xgboost/maven-repo/</url>
</repository>

sbt:

resolvers += "GitHub Repo" at "https://raw.githubusercontent.com/CodingCat/xgboost/maven-repo/"

then add the dependency as follows:

maven

<dependency>
    <groupId>ml.dmlc</groupId>
    <artifactId>xgboost4j</artifactId>
    <version>latest_version_num</version>
</dependency>

sbt

 "ml.dmlc" % "xgboost4j" % "latest_version_num"

For the latest release version number, please check here.

If you want to use xgboost4j-spark, simply replace xgboost4j with xgboost4j-spark.

Examples

Full code examples for Scala, Java, Apache Spark, and Apache Flink can be found in the examples package.
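
For a flavor of the XGBoost4J-Spark API, here is a minimal, hedged sketch of fitting a classifier on a DataFrame. It assumes a version of xgboost4j-spark that provides XGBoostClassifier; the app name, input path, column names, and parameter values are placeholders.

import org.apache.spark.sql.SparkSession
import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier

object SparkTrainExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("xgboost4j-spark-example").getOrCreate()

    // Spark's libsvm reader produces a DataFrame with "label" and "features" columns
    val train = spark.read.format("libsvm").load("trainingset_libsvm")

    // Configure the classifier with booster and distribution parameters
    val classifier = new XGBoostClassifier(Map(
      "eta" -> 0.1,
      "max_depth" -> 3,
      "objective" -> "binary:logistic",
      "num_round" -> 10,
      "num_workers" -> 2
    )).setFeaturesCol("features").setLabelCol("label")

    // Fit on the DataFrame and save the resulting model
    val model = classifier.fit(train)
    model.write.overwrite().save("xgboost_model")
  }
}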

NOTE on LIBSVM Format:

There is an inconsistency between XGBoost4J-Spark and the other language bindings of XGBoost.

When users load a training or test set in LibSVM format with Spark, using the following code snippet:

spark.read.format("libsvm").load("trainingset_libsvm")

Spark assumes that the dataset is 1-based indexed. However, when you do prediction with other bindings of XGBoost (e.g., the Python API of XGBoost), XGBoost assumes that the dataset is 0-based indexed. This creates a pitfall for users who train a model with Spark but then predict on a dataset in the same format with other bindings of XGBoost.
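
For illustration only (the file content and paths below are hypothetical), the following sketch shows how the 1-based assumption manifests on the Spark side: given a LibSVM row such as "1 3:0.5", Spark stores the value 0.5 at vector position 2, whereas a binding that reads the same file as 0-based would place it at position 3, shifting every feature by one.

import org.apache.spark.sql.SparkSession
import org.apache.spark.ml.linalg.Vector

object LibsvmIndexingCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("libsvm-indexing")
      .master("local[*]")
      .getOrCreate()

    // Spark interprets LibSVM feature indices as 1-based
    val df = spark.read.format("libsvm").load("trainingset_libsvm")

    // The value written under index 3 in the file ends up at vector position 2
    val features = df.select("features").head().getAs[Vector](0)
    println(features(2))
  }
}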