[UPDATE] Update rabit and threadlocal (dmlc#2114)
* [UPDATE] Update rabit and threadlocal

* minor fix to make build system happy

* upgrade requirement to g++4.8

* upgrade dmlc-core

* update travis
tqchen authored Mar 17, 2017
1 parent b0c972a commit d581a3d
Showing 28 changed files with 59 additions and 48 deletions.
6 changes: 5 additions & 1 deletion .travis.yml
@@ -41,17 +41,21 @@ matrix:
# dependent apt packages
addons:
  apt:
+    sources:
+    - ubuntu-toolchain-r-test
    packages:
    - doxygen
    - wget
    - libcurl4-openssl-dev
    - unzip
    - graphviz
+    - gcc-4.8
+    - g++-4.8

before_install:
  - source dmlc-core/scripts/travis/travis_setup_env.sh
  - export PYTHONPATH=${PYTHONPATH}:${PWD}/python-package
-  - echo "MAVEN_OPTS='-Xmx2048m -XX:MaxPermSize=1024m -XX:ReservedCodeCacheSize=512m -Dorg.slf4j.simpleLogger.defaultLogLevel=error'" > ~/.mavenrc
+  - echo "MAVEN_OPTS='-Xmx2048m -XX:MaxPermSize=1024m -XX:ReservedCodeCacheSize=512m -Dorg.slf4j.simpleLogger.defaultLogLevel=error'" > ~/.mavenrc

install:
- source tests/travis/setup.sh
2 changes: 1 addition & 1 deletion Makefile
@@ -48,7 +48,7 @@ export CXX = $(if $(shell which g++-6),g++-6,$(if $(shell which g++-mp-6),g++-mp
endif

export LDFLAGS= -pthread -lm $(ADD_LDFLAGS) $(DMLC_LDFLAGS) $(PLUGIN_LDFLAGS)
-export CFLAGS= -std=c++0x -Wall -Wno-unknown-pragmas -Iinclude $(ADD_CFLAGS) $(PLUGIN_CFLAGS)
+export CFLAGS= -std=c++11 -Wall -Wno-unknown-pragmas -Iinclude $(ADD_CFLAGS) $(PLUGIN_CFLAGS)
CFLAGS += -I$(DMLC_CORE)/include -I$(RABIT)/include
#java include path
export JAVAINCFLAGS = -I${JAVA_HOME}/include -I./java
4 changes: 4 additions & 0 deletions NEWS.md
@@ -8,6 +8,10 @@ This file records the changes in xgboost library in reverse chronological order.
- Specialized some prediction routine
* Automatically remove nan from input data when it is sparse.
- This can solve some of user reported problem of istart != hist.size
+* Minor fixes
+  - Thread-local variables are upgraded so they are automatically freed at thread exit.
+* Migrate to C++11
+  - The current master version now requires a C++11-enabled compiler (g++-4.8 or higher).

## v0.6 (2016.07.29)
* Version 0.5 is skipped due to major improvements in the core
4 changes: 2 additions & 2 deletions doc/build.md
@@ -42,7 +42,7 @@ Our goal is to build the shared library:

The minimal building requirement is

-- A recent c++ compiler supporting C++ 11 (g++-4.6 or higher)
+- A recent c++ compiler supporting C++ 11 (g++-4.8 or higher)

We can edit `make/config.mk` to change the compile options, and then build by
`make`. If everything goes well, we can go to the specific language installation section.
@@ -222,7 +222,7 @@ first follow [Building on OSX](#building-on-osx) to get the OpenMP enabled compi

### Installing the development version

-Make sure you have installed git and a recent C++ compiler supporting C++11 (e.g., g++-4.6 or higher).
+Make sure you have installed git and a recent C++ compiler supporting C++11 (e.g., g++-4.8 or higher).
On Windows, Rtools must be installed, and its bin directory has to be added to PATH during the installation.
And see the previous subsection for an OSX tip.

2 changes: 1 addition & 1 deletion rabit
2 changes: 1 addition & 1 deletion src/c_api/c_api.cc
@@ -385,7 +385,7 @@ XGB_DLL int XGDMatrixSliceDMatrix(DMatrixHandle handle,
src.CopyFrom(static_cast<std::shared_ptr<DMatrix>*>(handle)->get());
data::SimpleCSRSource& ret = *source;

-CHECK_EQ(src.info.group_ptr.size(), 0)
+CHECK_EQ(src.info.group_ptr.size(), 0U)
<< "slice does not support group structure";

ret.Clear();
2 changes: 1 addition & 1 deletion src/common/hist_util.cc
@@ -128,7 +128,7 @@ void GHistIndexMatrix::Init(DMatrix* p_fmat) {
}
index.resize(row_ptr.back());

-CHECK_GT(cut->cut.size(), 0);
+CHECK_GT(cut->cut.size(), 0U);
CHECK_EQ(cut->row_ptr.back(), cut->cut.size());

omp_ulong bsize = static_cast<omp_ulong>(batch.size);
2 changes: 1 addition & 1 deletion src/common/row_set.h
@@ -50,7 +50,7 @@ class RowSetCollection {
}
// initialize node id 0->everything
inline void Init() {
-CHECK_EQ(elem_of_each_node_.size(), 0);
+CHECK_EQ(elem_of_each_node_.size(), 0U);
const bst_uint* begin = dmlc::BeginPtr(row_indices_);
const bst_uint* end = dmlc::BeginPtr(row_indices_) + row_indices_.size();
elem_of_each_node_.emplace_back(Elem(begin, end));
8 changes: 4 additions & 4 deletions src/data/sparse_batch_page.h
@@ -207,14 +207,14 @@ class SparsePage::Writer {
* writing is done by another thread inside writer.
* \param page The page to be written
*/
-void PushWrite(std::unique_ptr<SparsePage>&& page);
+void PushWrite(std::shared_ptr<SparsePage>&& page);
/*!
* \brief Allocate a page to store results.
* This function can block when the writer is too slow and buffer pages
* have not yet been recycled.
* \param out_page Used to store the allocated pages.
*/
-void Alloc(std::unique_ptr<SparsePage>* out_page);
+void Alloc(std::shared_ptr<SparsePage>* out_page);

private:
/*! \brief number of allocated pages */
@@ -224,9 +224,9 @@
/*! \brief writer threads */
std::vector<std::unique_ptr<std::thread> > workers_;
/*! \brief recycler queue */
-dmlc::ConcurrentBlockingQueue<std::unique_ptr<SparsePage> > qrecycle_;
+dmlc::ConcurrentBlockingQueue<std::shared_ptr<SparsePage> > qrecycle_;
/*! \brief worker threads */
-std::vector<dmlc::ConcurrentBlockingQueue<std::unique_ptr<SparsePage> > > qworkers_;
+std::vector<dmlc::ConcurrentBlockingQueue<std::shared_ptr<SparsePage> > > qworkers_;
};
#endif // DMLC_ENABLE_STD_THREAD

2 changes: 1 addition & 1 deletion src/data/sparse_page_dmatrix.cc
@@ -254,7 +254,7 @@ void SparsePageDMatrix::InitColAccess(const std::vector<bool>& enabled,

{
SparsePage::Writer writer(name_shards, format_shards, 6);
-std::unique_ptr<SparsePage> page;
+std::shared_ptr<SparsePage> page;
writer.Alloc(&page); page->Clear();

double tstart = dmlc::GetTime();
2 changes: 1 addition & 1 deletion src/data/sparse_page_raw_format.cc
@@ -16,7 +16,7 @@ class SparsePageRawFormat : public SparsePage::Format {
public:
bool Read(SparsePage* page, dmlc::SeekStream* fi) override {
if (!fi->Read(&(page->offset))) return false;
-CHECK_NE(page->offset.size(), 0) << "Invalid SparsePage file";
+CHECK_NE(page->offset.size(), 0U) << "Invalid SparsePage file";
page->data.resize(page->offset.back());
if (page->data.size() != 0) {
CHECK_EQ(fi->Read(dmlc::BeginPtr(page->data),
12 changes: 6 additions & 6 deletions src/data/sparse_page_source.cc
@@ -18,7 +18,7 @@ SparsePageSource::SparsePageSource(const std::string& cache_info)
: base_rowid_(0), page_(nullptr), clock_ptr_(0) {
// read in the info files
std::vector<std::string> cache_shards = common::Split(cache_info, ':');
-CHECK_NE(cache_shards.size(), 0);
+CHECK_NE(cache_shards.size(), 0U);
{
std::string name_info = cache_shards[0];
std::unique_ptr<dmlc::Stream> finfo(dmlc::Stream::Create(name_info.c_str(), "r"));
@@ -85,7 +85,7 @@ const RowBatch& SparsePageSource::Value() const {

bool SparsePageSource::CacheExist(const std::string& cache_info) {
std::vector<std::string> cache_shards = common::Split(cache_info, ':');
-CHECK_NE(cache_shards.size(), 0);
+CHECK_NE(cache_shards.size(), 0U);
{
std::string name_info = cache_shards[0];
std::unique_ptr<dmlc::Stream> finfo(dmlc::Stream::Create(name_info.c_str(), "r", true));
@@ -102,7 +102,7 @@ bool SparsePageSource::CacheExist(const std::string& cache_info) {
void SparsePageSource::Create(dmlc::Parser<uint32_t>* src,
const std::string& cache_info) {
std::vector<std::string> cache_shards = common::Split(cache_info, ':');
-CHECK_NE(cache_shards.size(), 0);
+CHECK_NE(cache_shards.size(), 0U);
// read in the info files.
std::string name_info = cache_shards[0];
std::vector<std::string> name_shards, format_shards;
@@ -112,7 +112,7 @@ void SparsePageSource::Create(dmlc::Parser<uint32_t>* src,
}
{
SparsePage::Writer writer(name_shards, format_shards, 6);
-std::unique_ptr<SparsePage> page;
+std::shared_ptr<SparsePage> page;
writer.Alloc(&page); page->Clear();

MetaInfo info;
@@ -170,7 +170,7 @@ void SparsePageSource::Create(dmlc::Parser<uint32_t>* src,
void SparsePageSource::Create(DMatrix* src,
const std::string& cache_info) {
std::vector<std::string> cache_shards = common::Split(cache_info, ':');
-CHECK_NE(cache_shards.size(), 0);
+CHECK_NE(cache_shards.size(), 0U);
// read in the info files.
std::string name_info = cache_shards[0];
std::vector<std::string> name_shards, format_shards;
@@ -180,7 +180,7 @@ void SparsePageSource::Create(DMatrix* src,
}
{
SparsePage::Writer writer(name_shards, format_shards, 6);
-std::unique_ptr<SparsePage> page;
+std::shared_ptr<SparsePage> page;
writer.Alloc(&page); page->Clear();

MetaInfo info = src->info();
8 changes: 4 additions & 4 deletions src/data/sparse_page_writer.cc
@@ -32,7 +32,7 @@ SparsePage::Writer::Writer(
std::unique_ptr<SparsePage::Format> fmt(
SparsePage::Format::Create(format_shard));
fo->Write(format_shard);
-std::unique_ptr<SparsePage> page;
+std::shared_ptr<SparsePage> page;
while (wqueue->Pop(&page)) {
if (page.get() == nullptr) break;
fmt->Write(*page, fo.get());
@@ -47,20 +47,20 @@ SparsePage::Writer::Writer(
SparsePage::Writer::~Writer() {
for (auto& queue : qworkers_) {
// use nullptr to signal termination.
-std::unique_ptr<SparsePage> sig(nullptr);
+std::shared_ptr<SparsePage> sig(nullptr);
queue.Push(std::move(sig));
}
for (auto& thread : workers_) {
thread->join();
}
}

-void SparsePage::Writer::PushWrite(std::unique_ptr<SparsePage>&& page) {
+void SparsePage::Writer::PushWrite(std::shared_ptr<SparsePage>&& page) {
qworkers_[clock_ptr_].Push(std::move(page));
clock_ptr_ = (clock_ptr_ + 1) % workers_.size();
}

-void SparsePage::Writer::Alloc(std::unique_ptr<SparsePage>* out_page) {
+void SparsePage::Writer::Alloc(std::shared_ptr<SparsePage>* out_page) {
CHECK(out_page->get() == nullptr);
if (num_free_buffer_ != 0) {
out_page->reset(new SparsePage());
2 changes: 1 addition & 1 deletion src/gbm/gblinear.cc
@@ -176,7 +176,7 @@ class GBLinear : public GradientBooster {
if (model.weight.size() == 0) {
model.InitModel();
}
-CHECK_EQ(ntree_limit, 0)
+CHECK_EQ(ntree_limit, 0U)
<< "GBLinear::Predict ntrees is only valid for gbtree predictor";
std::vector<bst_float> &preds = *out_preds;
const std::vector<bst_float>& base_margin = p_fmat->info().base_margin;
2 changes: 1 addition & 1 deletion src/gbm/gbtree.cc
@@ -246,7 +246,7 @@ class GBTree : public GradientBooster {
new_trees.push_back(std::move(ret));
} else {
const int ngroup = mparam.num_output_group;
-CHECK_EQ(gpair.size() % ngroup, 0)
+CHECK_EQ(gpair.size() % ngroup, 0U)
<< "must have exactly ngroup*nrow gpairs";
std::vector<bst_gpair> tmp(gpair.size() / ngroup);
for (int gid = 0; gid < ngroup; ++gid) {
2 changes: 1 addition & 1 deletion src/learner.cc
@@ -243,7 +243,7 @@ class LearnerImpl : public Learner {
CHECK_NE(header, "bs64")
<< "Base64 format is no longer supported in brick.";
if (header == "binf") {
-CHECK_EQ(fp.Read(&header[0], 4), 4);
+CHECK_EQ(fp.Read(&header[0], 4), 4U);
}
}
// use the peekable reader.
2 changes: 1 addition & 1 deletion src/metric/elementwise_metric.cc
@@ -24,7 +24,7 @@ struct EvalEWiseBase : public Metric {
bst_float Eval(const std::vector<bst_float>& preds,
const MetaInfo& info,
bool distributed) const override {
-CHECK_NE(info.labels.size(), 0) << "label set cannot be empty";
+CHECK_NE(info.labels.size(), 0U) << "label set cannot be empty";
CHECK_EQ(preds.size(), info.labels.size())
<< "label and prediction size not match, "
<< "hint: use merror or mlogloss for multi-class classification";
4 changes: 2 additions & 2 deletions src/metric/multiclass_metric.cc
@@ -23,11 +23,11 @@ struct EvalMClassBase : public Metric {
bst_float Eval(const std::vector<bst_float> &preds,
const MetaInfo &info,
bool distributed) const override {
-CHECK_NE(info.labels.size(), 0) << "label set cannot be empty";
+CHECK_NE(info.labels.size(), 0U) << "label set cannot be empty";
CHECK(preds.size() % info.labels.size() == 0)
<< "label and prediction size not match";
const size_t nclass = preds.size() / info.labels.size();
-CHECK_GE(nclass, 1)
+CHECK_GE(nclass, 1U)
<< "mlogloss and merror are only used for multi-class classification,"
<< " use logloss for binary classification";
const bst_omp_uint ndata = static_cast<bst_omp_uint>(info.labels.size());
4 changes: 2 additions & 2 deletions src/metric/rank_metric.cc
@@ -84,7 +84,7 @@ struct EvalAuc : public Metric {
bst_float Eval(const std::vector<bst_float> &preds,
const MetaInfo &info,
bool distributed) const override {
-CHECK_NE(info.labels.size(), 0) << "label set cannot be empty";
+CHECK_NE(info.labels.size(), 0U) << "label set cannot be empty";
CHECK_EQ(preds.size(), info.labels.size())
<< "label size predict size not match";
std::vector<unsigned> tgptr(2, 0);
@@ -166,7 +166,7 @@ struct EvalRankList : public Metric {
std::vector<unsigned> tgptr(2, 0);
tgptr[1] = static_cast<unsigned>(preds.size());
const std::vector<unsigned> &gptr = info.group_ptr.size() == 0 ? tgptr : info.group_ptr;
-CHECK_NE(gptr.size(), 0) << "must specify group when constructing rank file";
+CHECK_NE(gptr.size(), 0U) << "must specify group when constructing rank file";
CHECK_EQ(gptr.back(), preds.size())
<< "EvalRanklist: group structure must match number of prediction";
const bst_omp_uint ngroup = static_cast<bst_omp_uint>(gptr.size() - 1);
2 changes: 1 addition & 1 deletion src/objective/multiclass_obj.cc
@@ -39,7 +39,7 @@ class SoftmaxMultiClassObj : public ObjFunction {
const MetaInfo& info,
int iter,
std::vector<bst_gpair>* out_gpair) override {
-CHECK_NE(info.labels.size(), 0) << "label set cannot be empty";
+CHECK_NE(info.labels.size(), 0U) << "label set cannot be empty";
CHECK(preds.size() == (static_cast<size_t>(param_.num_class) * info.labels.size()))
<< "SoftmaxMultiClassObj: label size and pred size does not match";
out_gpair->resize(preds.size());
8 changes: 4 additions & 4 deletions src/objective/regression_obj.cc
@@ -86,7 +86,7 @@ class RegLossObj : public ObjFunction {
const MetaInfo &info,
int iter,
std::vector<bst_gpair> *out_gpair) override {
-CHECK_NE(info.labels.size(), 0) << "label set cannot be empty";
+CHECK_NE(info.labels.size(), 0U) << "label set cannot be empty";
CHECK_EQ(preds.size(), info.labels.size())
<< "labels are not correctly provided"
<< "preds.size=" << preds.size() << ", label.size=" << info.labels.size();
@@ -168,7 +168,7 @@ class PoissonRegression : public ObjFunction {
const MetaInfo &info,
int iter,
std::vector<bst_gpair> *out_gpair) override {
-CHECK_NE(info.labels.size(), 0) << "label set cannot be empty";
+CHECK_NE(info.labels.size(), 0U) << "label set cannot be empty";
CHECK_EQ(preds.size(), info.labels.size()) << "labels are not correctly provided";
out_gpair->resize(preds.size());
// check if label in range
@@ -229,7 +229,7 @@ class GammaRegression : public ObjFunction {
const MetaInfo &info,
int iter,
std::vector<bst_gpair> *out_gpair) override {
-CHECK_NE(info.labels.size(), 0) << "label set cannot be empty";
+CHECK_NE(info.labels.size(), 0U) << "label set cannot be empty";
CHECK_EQ(preds.size(), info.labels.size()) << "labels are not correctly provided";
out_gpair->resize(preds.size());
// check if label in range
@@ -294,7 +294,7 @@ class TweedieRegression : public ObjFunction {
const MetaInfo &info,
int iter,
std::vector<bst_gpair> *out_gpair) override {
-CHECK_NE(info.labels.size(), 0) << "label set cannot be empty";
+CHECK_NE(info.labels.size(), 0U) << "label set cannot be empty";
CHECK_EQ(preds.size(), info.labels.size()) << "labels are not correctly provided";
out_gpair->resize(preds.size());
// check if label in range
2 changes: 1 addition & 1 deletion src/tree/param.h
@@ -204,7 +204,7 @@ struct TrainParam : public dmlc::Parameter<TrainParam> {
/*! \brief maximum sketch size */
inline unsigned max_sketch_size() const {
unsigned ret = static_cast<unsigned>(sketch_ratio / sketch_eps);
-CHECK_GT(ret, 0);
+CHECK_GT(ret, 0U);
return ret;
}
};
6 changes: 3 additions & 3 deletions src/tree/updater_colmaker.cc
@@ -159,7 +159,7 @@ class ColMaker: public TreeUpdater {
}
unsigned n = static_cast<unsigned>(param.colsample_bytree * feat_index.size());
std::shuffle(feat_index.begin(), feat_index.end(), common::GlobalRandom());
-CHECK_GT(n, 0)
+CHECK_GT(n, 0U)
<< "colsample_bytree=" << param.colsample_bytree
<< " is too small that no feature can be included";
feat_index.resize(n);
@@ -628,7 +628,7 @@ class ColMaker: public TreeUpdater {
if (param.colsample_bylevel != 1.0f) {
std::shuffle(feat_set.begin(), feat_set.end(), common::GlobalRandom());
unsigned n = static_cast<unsigned>(param.colsample_bylevel * feat_index.size());
-CHECK_GT(n, 0)
+CHECK_GT(n, 0U)
<< "colsample_bylevel is too small that no feature can be included";
feat_set.resize(n);
}
@@ -784,7 +784,7 @@ class DistColMaker : public ColMaker<TStats, TConstraint> {
DMatrix* dmat,
const std::vector<RegTree*> &trees) override {
TStats::CheckInfo(dmat->info());
-CHECK_EQ(trees.size(), 1) << "DistColMaker: only support one tree at a time";
+CHECK_EQ(trees.size(), 1U) << "DistColMaker: only support one tree at a time";
// build the tree
builder.Update(gpair, dmat, trees[0]);
//// prune the tree, note that pruner will sync the tree