This is the documentation of the XGBoost library. XGBoost is short for eXtreme Gradient Boosting. It is a library designed and optimized for boosted tree algorithms. Its goal is to push the computation limits of machines to provide a scalable, portable and accurate library for large scale tree boosting.
This document is hosted at http://xgboost.readthedocs.org/. You can also browse most of the documents on GitHub directly.
The best way to get started with xgboost is through the examples. There are three types of examples you can find in xgboost.
- Tutorials are self-contained tutorials on complete data science tasks.
- XGBoost Code Examples are collections of code and benchmarks of xgboost.
- The walkthrough section walks you through specific API features.
- Highlight Solutions are presentations using xgboost to solve real world problems.
- These examples are usually more advanced. You can usually find state-of-the-art solutions to many problems and challenges here.
After you get familiar with the interface, check out the following additional resources:
- Frequently Asked Questions
- Learning what is behind: Introduction to Boosted Trees
- User Guide contains a comprehensive list of xgboost documents.
- Developer Guide
Tutorials are self-contained materials that teach you how to accomplish a complete data science task with xgboost; they are great resources for learning xgboost through real examples. If you have something that belongs here, send a pull request.
- Binary classification using XGBoost Command Line (CLI)
- This tutorial introduces the basic usage of the CLI version of xgboost.
- Introduction of XGBoost in Python (python)
- This tutorial introduces the Python package of xgboost.
- Introduction to XGBoost in R (R package)
- This is a general presentation about xgboost in R.
- Discover your data with XGBoost in R (R package)
- This tutorial explains feature analysis in xgboost.
- Understanding XGBoost Model on Otto Dataset (R package)
- This tutorial teaches you how to use xgboost to compete in the Kaggle Otto challenge.
This section collects blog posts, presentations and videos discussing how to use xgboost to solve interesting problems. If you think something belongs here, send a pull request.
- Kaggle CrowdFlower winner's solution by Chenglong Chen
- Kaggle Malware Prediction winner's solution
- Kaggle Tradeshift winning solution by daxiongshu
- Feature Importance Analysis with XGBoost in Tax audit
- Video tutorial: Better Optimization with Repeated Cross Validation and the XGBoost model
- Winning solution of Kaggle Higgs competition: what a single model can do