From 01ff2b2c294426bf5a8936b484be917566caf9f0 Mon Sep 17 00:00:00 2001
From: nocluebutalotofit <45629760+nocluebutalotofit@users.noreply.github.com>
Date: Fri, 14 Jun 2024 12:09:27 +0200
Subject: [PATCH] [doc] FIx learning to rank (#10412)

---
 doc/tutorials/learning_to_rank.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/tutorials/learning_to_rank.rst b/doc/tutorials/learning_to_rank.rst
index 15a611bd0c32..0e1f5d1a09d0 100644
--- a/doc/tutorials/learning_to_rank.rst
+++ b/doc/tutorials/learning_to_rank.rst
@@ -65,7 +65,7 @@ The simplest way to train a ranking model is by using the scikit-learn estimator
 .. code-block:: python
 
   ranker = xgb.XGBRanker(tree_method="hist", lambdarank_num_pair_per_sample=8, objective="rank:ndcg", lambdarank_pair_method="topk")
-  ranker.fit(X, y, qid=qid[sorted_idx])
+  ranker.fit(X, y, qid=qid)
 
 Please note that, as of writing, there's no learning-to-rank interface in scikit-learn. As a result, the :py:class:`xgboost.XGBRanker` class does not fully conform the scikit-learn estimator guideline and can not be directly used with some of its utility functions. For instances, the ``auc_score`` and ``ndcg_score`` in scikit-learn don't consider query group information nor the pairwise loss. Most of the metrics are implemented as part of XGBoost, but to use scikit-learn utilities like :py:func:`sklearn.model_selection.cross_validation`, we need to make some adjustments in order to pass the ``qid`` as an additional parameter for :py:meth:`xgboost.XGBRanker.score`. Given a data frame ``X`` (either pandas or cuDF), add the column ``qid`` as follows:
 