XGBoost is an optimized distributed gradient boosting library designed to be highly efficient, flexible, and portable. It implements machine learning algorithms under the Gradient Boosting framework. XGBoost provides parallel tree boosting (also known as GBDT or GBM) that solves many data science problems quickly and accurately. The same code runs on major distributed environments (Hadoop, SGE, MPI) and can handle problems with billions of examples. XGBoost is part of the DMLC projects.
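To make the idea of tree boosting concrete, here is a minimal sketch in plain Python. It is not XGBoost's implementation or API: it assumes a single numeric feature, squared-error loss, and depth-1 trees (stumps), and all function names (`fit_stump`, `fit_boosted`, `predict`, `mse`) are hypothetical names chosen for illustration. Each round fits a small tree to the current residuals and adds a shrunken copy of it to the ensemble.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 tree: pick the threshold that minimizes squared error."""
    best = None
    # The largest distinct value cannot split the data, so skip it.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:
        # All feature values are identical: fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return xs[0], mean, mean
    return best[1], best[2], best[3]


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosted(xs, ys, n_rounds=50, learning_rate=0.3):
    """Gradient boosting for squared error (illustrative only)."""
    base = sum(ys) / len(ys)          # start from the mean of the targets
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual y - prediction.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Add a shrunken copy of the new tree to the running predictions.
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


if __name__ == "__main__":
    xs = [float(i) for i in range(10)]
    ys = [x * x for x in xs]
    model = fit_boosted(xs, ys)
    print("training MSE:", mse(model, xs, ys))
```

With more boosting rounds, the training error of this sketch keeps shrinking, which is the core behaviour that libraries such as XGBoost build on while adding regularization, parallel split finding, and sparse-aware data handling.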
- XGBoost brick
- XGBoost helped Vlad Mironov and Alexander Guschin win the CERN LHCb experiment Flavour of Physics competition. Check out the interview from Kaggle.
- XGBoost helped Mario Filho, Josef Feigl, Lucas, and Gilberto win the Caterpillar Tube Pricing competition. Check out the interview from Kaggle.
- XGBoost helped Halla Yang win the Recruit Coupon Purchase Prediction Challenge. Check out the interview from Kaggle.
- Current version xgboost-0.6 (brick)
- See Change log for details
- Easily accessible through CLI, python, R, Julia
- It's fast! Benchmark numbers comparing xgboost, H2O, Spark, R - benchm-ml numbers
- Memory efficient - Handles sparse matrices, supports external memory
- Accurate predictions; used extensively by data scientists and Kagglers - highlight links
- Distributed version runs on Hadoop (YARN), MPI, SGE etc., scales to billions of examples.
- To report bugs, please use the xgboost/issues page.
- For general questions, or to share your experience using xgboost, please use the XGBoost User Group.
XGBoost has been developed and used by a group of active community members. Everyone is more than welcome to contribute. Contributing is a way to make the project better and accessible to more users.
- Check out Feature Wish List to see what can be improved, or open an issue if you want something.
- Contribute to the documents and examples to share your experience with other users.
- Please add your name to CONTRIBUTORS.md after your patch has been merged.
© Contributors, 2015. Licensed under an Apache-2 license.