Update notebooks
lesteve committed Oct 12, 2022
1 parent 157ba58 commit c3e371b
Showing 1 changed file with 35 additions and 2 deletions.
37 changes: 35 additions & 2 deletions notebooks/metrics_classification.ipynb
@@ -502,6 +502,13 @@
" classifier, data_test, target_test, pos_label='donated',\n",
" marker=\"+\"\n",
")\n",
"disp = PrecisionRecallDisplay.from_estimator(\n",
" dummy_classifier, data_test, target_test, pos_label='donated',\n",
" color=\"tab:orange\", linestyle=\"--\", ax=disp.ax_)\n",
"plt.xlabel(\"Recall (also known as TPR or sensitivity)\")\n",
"plt.ylabel(\"Precision (also known as PPV)\")\n",
"plt.xlim(0, 1)\n",
"plt.ylim(0, 1)\n",
"plt.legend(bbox_to_anchor=(1.05, 0.8), loc=\"upper left\")\n",
"_ = disp.ax_.set_title(\"Precision-recall curve\")"
]
@@ -528,14 +535,36 @@
"and is named average precision (AP). With an ideal classifier, the average\n",
"precision would be 1.\n",
"\n",
"Notice that the AP of a `DummyClassifier`, used as a baseline to define the\n",
"chance level, coincides with the prevalence of the positive class, that is,\n",
"the number of samples in the positive class divided by the total number of\n",
"samples."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prevalence = (\n",
" target_test.value_counts()['donated'] / target_test.value_counts().sum()\n",
")\n",
"print(f\"Prevalence of the class 'donated': {prevalence:.2f}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The precision and recall metrics focus on the positive class. However, one\n",
"might be interested in the compromise between accurately discriminating the\n",
"positive class and accurately discriminating the negative class. The\n",
"statistics used for this are sensitivity and specificity. Sensitivity is just\n",
"another name for recall, whereas specificity measures the proportion of\n",
"correctly classified samples in the negative class, defined as TN / (TN +\n",
"FP). Similar to the precision-recall curve, sensitivity and specificity are\n",
-"generally plotted as a curve called the receiver operating characteristic\n",
+"generally plotted as a curve called the Receiver Operating Characteristic\n",
"(ROC) curve. Below is such a curve:"
]
},
@@ -553,8 +582,12 @@
"disp = RocCurveDisplay.from_estimator(\n",
" dummy_classifier, data_test, target_test, pos_label='donated',\n",
" color=\"tab:orange\", linestyle=\"--\", ax=disp.ax_)\n",
"plt.xlabel(\"False positive rate\")\n",
"plt.ylabel(\"True positive rate\\n(also known as sensitivity or recall)\")\n",
"plt.xlim(0, 1)\n",
"plt.ylim(0, 1)\n",
"plt.legend(bbox_to_anchor=(1.05, 0.8), loc=\"upper left\")\n",
-"_ = disp.ax_.set_title(\"ROC AUC curve\")"
+"_ = disp.ax_.set_title(\"Receiver Operating Characteristic curve\")"
]
},
{
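
Note on the new markdown cell in the second hunk: the claim that the AP of the dummy baseline equals the prevalence can be checked by hand. The sketch below is not scikit-learn's implementation; it is a plain-Python version of the step-wise sum AP = Σ (R_n − R_{n−1}) · P_n. It assumes the dummy classifier returns the same score for every sample, and the labels are made up for illustration (1 stands for 'donated').

```python
def average_precision(y_true, y_score):
    """Step-wise average precision: sum of (R_n - R_{n-1}) * P_n over thresholds."""
    n_pos = sum(y_true)
    # Each distinct score is a decision threshold, visited from highest to lowest.
    thresholds = sorted(set(y_score), reverse=True)
    ap, prev_recall = 0.0, 0.0
    for t in thresholds:
        # Labels of the samples predicted positive at this threshold.
        predicted_pos = [y for y, s in zip(y_true, y_score) if s >= t]
        tp = sum(predicted_pos)
        precision = tp / len(predicted_pos)
        recall = tp / n_pos
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


# Hypothetical test labels: 1 = 'donated', 0 = 'not donated'.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
# Assumption: a dummy baseline assigns one constant score to every sample.
constant_scores = [0.3] * len(y_true)

prevalence = sum(y_true) / len(y_true)
ap_dummy = average_precision(y_true, constant_scores)
print(f"prevalence = {prevalence:.2f}, dummy AP = {ap_dummy:.2f}")
```

With a constant score there is a single threshold, at which every sample is predicted positive: recall jumps from 0 to 1 and precision equals the fraction of positives, so the sum collapses to the prevalence.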

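Regarding the sensitivity and specificity paragraph: a small worked example may make the TN / (TN + FP) definition concrete. This is a minimal sketch in plain Python, not taken from the notebook; the helper `sensitivity_specificity` and the toy labels are hypothetical, and only the `'donated'` label name comes from the diff.

```python
def sensitivity_specificity(y_true, y_pred, pos_label="donated"):
    """Return (sensitivity, specificity) computed from confusion-matrix counts."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == pos_label:
            if pred == pos_label:
                tp += 1  # positive sample correctly detected
            else:
                fn += 1  # positive sample missed
        else:
            if pred == pos_label:
                fp += 1  # negative sample wrongly flagged
            else:
                tn += 1  # negative sample correctly rejected
    sensitivity = tp / (tp + fn)  # same quantity as recall / TPR
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Made-up labels and predictions, for illustration only.
y_true = ["donated", "not donated", "not donated",
          "donated", "not donated", "not donated"]
y_pred = ["donated", "not donated", "donated",
          "not donated", "not donated", "not donated"]

sens, spec = sensitivity_specificity(y_true, y_pred)
false_positive_rate = 1 - spec  # the x-axis of the ROC plot in the last hunk
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}, "
      f"FPR = {false_positive_rate:.2f}")
```

The last line ties the two axis labels added in the third hunk together: the ROC curve plots the true positive rate (sensitivity) against the false positive rate, which is one minus the specificity.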