Commit aeeac1c

ENH add D2 pinball score and D2 absolute error score (scikit-learn#22118)
1 parent bd9336d commit aeeac1c

File tree

7 files changed (+409, -37 lines)


doc/modules/classes.rst (+2)

@@ -998,6 +998,8 @@ details.
    metrics.mean_tweedie_deviance
    metrics.d2_tweedie_score
    metrics.mean_pinball_loss
+   metrics.d2_pinball_score
+   metrics.d2_absolute_error_score
 
 Multilabel ranking metrics
 --------------------------

doc/modules/model_evaluation.rst (+94, -31)
@@ -101,6 +101,9 @@ Scoring Function
 'neg_mean_poisson_deviance' :func:`metrics.mean_poisson_deviance`
 'neg_mean_gamma_deviance' :func:`metrics.mean_gamma_deviance`
 'neg_mean_absolute_percentage_error' :func:`metrics.mean_absolute_percentage_error`
+'d2_absolute_error_score' :func:`metrics.d2_absolute_error_score`
+'d2_pinball_score' :func:`metrics.d2_pinball_score`
+'d2_tweedie_score' :func:`metrics.d2_tweedie_score`
 ==================================== ============================================== ==================================
 
 
@@ -1969,7 +1972,8 @@ The :mod:`sklearn.metrics` module implements several loss, score, and utility
 functions to measure regression performance. Some of those have been enhanced
 to handle the multioutput case: :func:`mean_squared_error`,
 :func:`mean_absolute_error`, :func:`r2_score`,
-:func:`explained_variance_score` and :func:`mean_pinball_loss`.
+:func:`explained_variance_score`, :func:`mean_pinball_loss`, :func:`d2_pinball_score`
+and :func:`d2_absolute_error_score`.
 
 
 These functions have a ``multioutput`` keyword argument which specifies the
@@ -2371,8 +2375,8 @@ is defined as
     \sum_{i=0}^{n_\text{samples} - 1}
     \begin{cases}
     (y_i-\hat{y}_i)^2, & \text{for }p=0\text{ (Normal)}\\
-    2(y_i \log(y/\hat{y}_i) + \hat{y}_i - y_i), & \text{for}p=1\text{ (Poisson)}\\
-    2(\log(\hat{y}_i/y_i) + y_i/\hat{y}_i - 1), & \text{for}p=2\text{ (Gamma)}\\
+    2(y_i \log(y_i/\hat{y}_i) + \hat{y}_i - y_i), & \text{for }p=1\text{ (Poisson)}\\
+    2(\log(\hat{y}_i/y_i) + y_i/\hat{y}_i - 1), & \text{for }p=2\text{ (Gamma)}\\
     2\left(\frac{\max(y_i,0)^{2-p}}{(1-p)(2-p)}-
     \frac{y_i\,\hat{y}_i^{1-p}}{1-p}+\frac{\hat{y}_i^{2-p}}{2-p}\right),
     & \text{otherwise}
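The Gamma case (``power=2``) of the deviance above depends only on the ratio :math:`y_i/\hat{y}_i`, so rescaling targets and predictions together leaves it unchanged, which is the "sensitive only to relative errors" behavior the surrounding text describes. A minimal pure-Python sketch of this property (an illustration of the formula, not scikit-learn's `mean_tweedie_deviance` implementation):

```python
import math

def gamma_deviance(y_true, y_pred):
    # Gamma deviance (Tweedie power=2), per the formula above:
    # 2 * mean(log(y_hat/y) + y/y_hat - 1)
    n = len(y_true)
    return 2.0 * sum(math.log(p / y) + y / p - 1.0
                     for y, p in zip(y_true, y_pred)) / n

y = [1.0, 2.0, 4.0]
p = [1.5, 2.0, 3.0]

# Scaling y and y_hat by the same factor leaves the deviance unchanged:
# the power=2 deviance is sensitive only to relative errors.
d_raw = gamma_deviance(y, p)
d_scaled = gamma_deviance([100 * v for v in y], [100 * v for v in p])
print(abs(d_raw - d_scaled) < 1e-12)  # True
```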
@@ -2415,34 +2419,6 @@ the difference in errors decreases. Finally, by setting ``power=2``::
 we would get identical errors. The deviance when ``power=2`` is thus only
 sensitive to relative errors.
 
-.. _d2_tweedie_score:
-
-D² score, the coefficient of determination
--------------------------------------------
-
-The :func:`d2_tweedie_score` function computes the percentage of deviance
-explained. It is a generalization of R², where the squared error is replaced by
-the Tweedie deviance. D², also known as McFadden's likelihood ratio index, is
-calculated as
-
-.. math::
-
-  D^2(y, \hat{y}) = 1 - \frac{\text{D}(y, \hat{y})}{\text{D}(y, \bar{y})} \,.
-
-The argument ``power`` defines the Tweedie power as for
-:func:`mean_tweedie_deviance`. Note that for `power=0`,
-:func:`d2_tweedie_score` equals :func:`r2_score` (for single targets).
-
-Like R², the best possible score is 1.0 and it can be negative (because the
-model can be arbitrarily worse). A constant model that always predicts the
-expected value of y, disregarding the input features, would get a D² score
-of 0.0.
-
-A scorer object with a specific choice of ``power`` can be built by::
-
-  >>> from sklearn.metrics import d2_tweedie_score, make_scorer
-  >>> d2_tweedie_score_15 = make_scorer(d2_tweedie_score, power=1.5)
-
 .. _pinball_loss:
 
 Pinball loss
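One identity worth noting about the pinball loss whose section begins here: at ``alpha=0.5`` each pinball term is exactly half the absolute error, which is why the commit can make `d2_absolute_error_score` a special case of `d2_pinball_score` (the constant factor 1/2 cancels in the D² ratio). A pure-Python sketch of the identity, with illustrative values (not scikit-learn's `mean_pinball_loss` implementation):

```python
def pinball_loss(y_true, y_pred, alpha):
    # Mean pinball (quantile) loss at level alpha:
    # alpha * (y - p) when y >= p, (alpha - 1) * (y - p) otherwise.
    total = 0.0
    for y, p in zip(y_true, y_pred):
        d = y - p
        total += alpha * d if d >= 0 else (alpha - 1.0) * d
    return total / len(y_true)

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

mae = sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

# At alpha=0.5 each term is 0.5 * |y - p|, so the pinball loss is MAE / 2.
print(pinball_loss(y_true, y_pred, 0.5))  # 0.25
print(mae / 2)                            # 0.25
```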
@@ -2507,6 +2483,93 @@ explained in the example linked below.
 hyper-parameters of quantile regression models on data with non-symmetric
 noise and outliers.
 
+.. _d2_score:
+
+D² score
+--------
+
+The D² score computes the fraction of deviance explained.
+It is a generalization of R², where the squared error is replaced
+by a deviance of choice :math:`\text{dev}(y, \hat{y})`
+(e.g., Tweedie, pinball or mean absolute error). D² is a form of a *skill score*.
+It is calculated as
+
+.. math::
+
+  D^2(y, \hat{y}) = 1 - \frac{\text{dev}(y, \hat{y})}{\text{dev}(y, y_{\text{null}})} \,,
+
+where :math:`y_{\text{null}}` is the optimal prediction of an intercept-only model
+(e.g., the mean of `y_true` for the Tweedie case, the median for the absolute
+error and the alpha-quantile for the pinball loss).
+
+Like R², the best possible score is 1.0 and it can be negative (because the
+model can be arbitrarily worse). A constant model that always predicts
+:math:`y_{\text{null}}`, disregarding the input features, would get a D² score
+of 0.0.
+
+D² Tweedie score
+^^^^^^^^^^^^^^^^
+
+The :func:`d2_tweedie_score` function implements the special case of D²
+where :math:`\text{dev}(y, \hat{y})` is the Tweedie deviance, see
+:ref:`mean_tweedie_deviance`. It is also known as D² Tweedie and is related
+to McFadden's likelihood ratio index.
+
+The argument ``power`` defines the Tweedie power as for
+:func:`mean_tweedie_deviance`. Note that for `power=0`,
+:func:`d2_tweedie_score` equals :func:`r2_score` (for single targets).
+
+A scorer object with a specific choice of ``power`` can be built by::
+
+  >>> from sklearn.metrics import d2_tweedie_score, make_scorer
+  >>> d2_tweedie_score_15 = make_scorer(d2_tweedie_score, power=1.5)
+
+D² pinball score
+^^^^^^^^^^^^^^^^
+
+The :func:`d2_pinball_score` function implements the special case
+of D² with the pinball loss, see :ref:`pinball_loss`, i.e.:
+
+.. math::
+
+  \text{dev}(y, \hat{y}) = \text{pinball}(y, \hat{y}).
+
+The argument ``alpha`` defines the slope of the pinball loss as for
+:func:`mean_pinball_loss` (:ref:`pinball_loss`). It determines the
+quantile level ``alpha`` for which the pinball loss and also D²
+are optimal. Note that for `alpha=0.5` (the default) :func:`d2_pinball_score`
+equals :func:`d2_absolute_error_score`.
+
+A scorer object with a specific choice of ``alpha`` can be built by::
+
+  >>> from sklearn.metrics import d2_pinball_score, make_scorer
+  >>> d2_pinball_score_08 = make_scorer(d2_pinball_score, alpha=0.8)
+
+D² absolute error score
+^^^^^^^^^^^^^^^^^^^^^^^
+
+The :func:`d2_absolute_error_score` function implements the special case of
+D² with the absolute error, see :ref:`mean_absolute_error`:
+
+.. math::
+
+  \text{dev}(y, \hat{y}) = \text{MAE}(y, \hat{y}).
+
+Here are some usage examples of the :func:`d2_absolute_error_score` function::
+
+  >>> from sklearn.metrics import d2_absolute_error_score
+  >>> y_true = [3, -0.5, 2, 7]
+  >>> y_pred = [2.5, 0.0, 2, 8]
+  >>> d2_absolute_error_score(y_true, y_pred)
+  0.764...
+  >>> y_true = [1, 2, 3]
+  >>> y_pred = [1, 2, 3]
+  >>> d2_absolute_error_score(y_true, y_pred)
+  1.0
+  >>> y_true = [1, 2, 3]
+  >>> y_pred = [2, 2, 2]
+  >>> d2_absolute_error_score(y_true, y_pred)
+  0.0
+
 
 .. _clustering_metrics:
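The D² definition added in this file can be reproduced numerically. The sketch below (pure Python, for illustration only; not scikit-learn's `d2_pinball_score` implementation) computes D² with the pinball deviance, taking the alpha-quantile of `y_true` as `y_null`, and recovers the 0.764... value shown in the `d2_absolute_error_score` doctest:

```python
def pinball_loss(y_true, y_pred, alpha):
    """Mean pinball (quantile) loss at level alpha (pure-Python sketch)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        d = y - p
        total += alpha * d if d >= 0 else (alpha - 1.0) * d
    return total / len(y_true)

def quantile(values, alpha):
    """Empirical alpha-quantile with linear interpolation (median at 0.5)."""
    ys = sorted(values)
    idx = alpha * (len(ys) - 1)
    lo = int(idx)
    hi = min(lo + 1, len(ys) - 1)
    return ys[lo] + (idx - lo) * (ys[hi] - ys[lo])

def d2_pinball(y_true, y_pred, alpha=0.5):
    """D² = 1 - dev(y, y_pred) / dev(y, y_null), with pinball deviance."""
    y_null = [quantile(y_true, alpha)] * len(y_true)
    return 1.0 - (pinball_loss(y_true, y_pred, alpha)
                  / pinball_loss(y_true, y_null, alpha))

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(d2_pinball(y_true, y_pred))     # 0.7647..., the 0.764... of the docs
print(d2_pinball(y_true, y_true))     # perfect predictions -> 1.0
print(d2_pinball(y_true, [2.5] * 4))  # constantly predicting y_null -> 0.0
```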

doc/whats_new/v1.1.rst (+8)

@@ -751,6 +751,14 @@ Changelog
 instead of the finite approximation (`1.0` and `0.0` respectively) currently
 returned by default. :pr:`17266` by :user:`Sylvain Marié <smarie>`.
 
+- |Feature| :func:`d2_pinball_score` and :func:`d2_absolute_error_score`
+  calculate the :math:`D^2` regression score for the pinball loss and the
+  absolute error respectively. :func:`d2_absolute_error_score` is a special case
+  of :func:`d2_pinball_score` with a fixed quantile parameter `alpha=0.5`
+  for ease of use and discovery. The :math:`D^2` scores are generalizations
+  of the `r2_score` and can be interpreted as the fraction of deviance explained.
+  :pr:`22118` by :user:`Ohad Michel <ohadmich>`.
+
 - |Enhancement| :func:`metrics.top_k_accuracy_score` raises an improved error
   message when `y_true` is binary and `y_score` is 2d. :pr:`22284` by `Thomas Fan`_.
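The changelog's claim that the D² scores generalize `r2_score` can be illustrated directly: with the squared-error deviance (Tweedie ``power=0``) and a mean-predicting baseline, the D² formula is exactly R². A pure-Python sketch with illustrative values (not scikit-learn's `d2_tweedie_score` implementation):

```python
def mse(y_true, y_pred):
    # Mean squared error, the Tweedie deviance at power=0.
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)

def d2_squared_error(y_true, y_pred):
    """D² with dev = squared error; for this deviance y_null is the mean,
    so the score coincides with R²."""
    y_mean = sum(y_true) / len(y_true)
    baseline = [y_mean] * len(y_true)
    return 1.0 - mse(y_true, y_pred) / mse(y_true, baseline)

y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
print(d2_squared_error(y_true, y_pred))  # ~0.9486, same as r2_score
```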

sklearn/metrics/__init__.py (+4)

@@ -77,6 +77,8 @@
 from ._regression import mean_poisson_deviance
 from ._regression import mean_gamma_deviance
 from ._regression import d2_tweedie_score
+from ._regression import d2_pinball_score
+from ._regression import d2_absolute_error_score
 
 
 from ._scorer import check_scoring
@@ -115,6 +117,8 @@
     "consensus_score",
     "coverage_error",
     "d2_tweedie_score",
+    "d2_absolute_error_score",
+    "d2_pinball_score",
     "dcg_score",
     "davies_bouldin_score",
     "DetCurveDisplay",
