update docs

bbayukari · Jul 20, 2024 · a6e82ba
1 parent 8e338f6
commit a6e82ba
Showing 6 changed files with 37 additions and 95 deletions.
diff --git a/docs/source/gallery/GeneralizedLinearModels/Inverse-gaussian-regression.ipynb b/docs/source/gallery/GeneralizedLinearModels/Inverse-gaussian-regression.ipynb
@@ -17,7 +17,7 @@
    "metadata": {},
    "source": [
     "\n",
-    "# Inverse Gaussian Regression\n"
+    "# Inverse Gaussian regression\n"
    ]
   },
   {

diff --git a/docs/source/gallery/GeneralizedLinearModels/gamma-regression.ipynb b/docs/source/gallery/GeneralizedLinearModels/gamma-regression.ipynb
@@ -6,8 +6,7 @@
    "id": "7f5e5d54",
    "metadata": {},
    "source": [
-    "\n",
-    "# Gamma Regression\n"
+    "# Gamma regression\n"
    ]
   },
   {
@@ -20,6 +19,14 @@
     "%matplotlib inline"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "634b02ca",
+   "metadata": {},
+   "source": [
+    "## Gamma regression"
+   ]
+  },
   {
    "attachments": {},
    "cell_type": "markdown",
@@ -35,7 +42,7 @@
    "id": "55dcc923",
    "metadata": {},
    "source": [
-    "## Introduction to Gamma Regression\n",
+    "### Introduction\n",
     "Gamma regression can be used when you have positive continuous response variables such as payments for insurance claims,\n",
     "or the lifetime of a redundant system.\n",
     "It is well known that the density of Gamma distribution can be represented as a function of\n",

diff --git a/docs/source/gallery/GeneralizedLinearModels/logistic-regression.ipynb b/docs/source/gallery/GeneralizedLinearModels/logistic-regression.ipynb
@@ -15,15 +15,7 @@
          "id": "7f5e5d54",
          "metadata": {},
          "source": [
-            "## Logistic Regressions"
-         ]
-      },
-      {
-         "cell_type": "markdown",
-         "id": "25f1338c",
-         "metadata": {},
-         "source": [
-            "### Part A, we would like to use an example to show how the sparse-constrained optimization for logistic regression works in our program."
+            "## Logistic regressions"
          ]
       },
       {
@@ -55,7 +47,7 @@
          "id": "c6ce97f7",
          "metadata": {},
          "source": [
-            "### Import necessary packages"
+            "We first import necessary packages. "
          ]
       },
       {
@@ -76,9 +68,7 @@
          "id": "8e3318e9",
          "metadata": {},
          "source": [
-            "### Generate the data\n",
-            "\n",
-            "Firstly, we define a data generator function to provide a way to generate suitable dataset for this task."
+            "Next, we define a data generator function that produces a suitable dataset for this task."
          ]
       },
       {
@@ -135,9 +125,7 @@
          "id": "61544564",
          "metadata": {},
          "source": [
-            "### Define function to calculate negative log-likelihood of logistic regression\n",
-            "\n",
-            "Secondly, we define the loss function `logistic_loss` accorting to [1](#loss) that matches the data generating function `make_logistic_data`."
+            "Secondly, we define the loss function `logistic_loss` according to [1](#loss) that matches the data generating function `make_logistic_data`."
          ]
       },
       {
@@ -157,9 +145,7 @@
          "id": "415b29a7",
          "metadata": {},
          "source": [
-            "### Use SIC to decide the optimal support size\n",
-            "\n",
-            "There are four types of information criterion can be implemented in `skscope.utilities`:\n",
+            "Here, we use SIC to decide the optimal support size. Four types of information criterion can be used from `skscope.utilities`:\n",
             "- Akaike information criterion (AIC)\n",
             "- Bayesian information criterion (BIC)\n",
             "- Extend BIC (EBIC)\n",
@@ -227,9 +213,7 @@
          "id": "71960a6d",
          "metadata": {},
          "source": [
-            "### More on the results\n",
-            "\n",
-            "We can plot the sparse signal recovering from the noisy observations to visualize the results."
+            "We can plot the sparse signal recovered from the noisy observations to visualize the results."
          ]
       },
       {
@@ -274,20 +258,12 @@
             "plt.show()"
          ]
       },
-      {
-         "cell_type": "markdown",
-         "id": "40afeb7f",
-         "metadata": {},
-         "source": [
-            "### Part B, we will use cross-validation to select the optimal support set and compare its runtime with that of SIC."
-         ]
-      },
       {
          "cell_type": "markdown",
          "id": "0a968a95",
          "metadata": {},
          "source": [
-            "#### Use SICto decide the optimal support size and Record the runtime of SIC."
+            "Because `skscope` also supports cross-validation (CV), we will use CV to select the optimal support set and compare its runtime with that of SIC. We first record the runtime of using SIC."
          ]
       },
       {
@@ -314,10 +290,6 @@
             "solver_ic = ScopeSolver(p, sparsity = range(10), sample_size = n, ic_method = SIC)\n",
             "params_ic = solver_ic.solve(logistic_loss, jit=True)\n",
             "\n",
-            "# Variable selection accuracy\n",
-            "print(\"True support set: \", (true_params.nonzero()[0]))\n",
-            "print(\"skscope estimated support set: \", (solver_ic.support_set))\n",
-            "\n",
             "# Calculate runtime\n",
             "runtime = time.time() - start_time\n",
             "print(\"Runtime of SIC:\", runtime, \"seconds\")"
@@ -328,7 +300,7 @@
          "id": "83cbd590",
          "metadata": {},
          "source": [
-            "#### Use CV to decide the optimal support size and record the runtime of CV."
+            "Next, we implement the loss function for using CV to decide the optimal support size and record the runtime of CV."
          ]
       },
       {
@@ -374,31 +346,25 @@
          "id": "a8cacd2e",
          "metadata": {},
          "source": [
-            "Comparing the results of SIC and CV criteria, we find that while maintaining high accuracy in variable selection, SIC exhibits a clear time advantage."
+            "Comparing the results of SIC and CV criteria, we find that while CV maintains high accuracy in variable selection, SIC exhibits a clear time advantage."
          ]
       },
       {
          "cell_type": "markdown",
          "id": "1f36004a",
          "metadata": {},
          "source": [
-            "### Part C, we compare the results under two different circumstances: using warmstart and not using warmstart."
-         ]
-      },
-      {
-         "cell_type": "markdown",
-         "id": "75304405",
-         "metadata": {},
-         "source": [
-            "#### Using warmstart "
+            "Finally, we compare the results under two different circumstances: using warm start and not."
          ]
       },
       {
          "cell_type": "markdown",
          "id": "f33fac23",
          "metadata": {},
          "source": [
-            "Hint: all solvers default to using warmstart, which can slightly prolong computation time if not utilized"
+            "> Hint: All solvers default to using warm start, which can slightly prolong computation time if not utilized.\n",
+            "\n",
+            "As warm start is the default strategy of `skscope`, using `skscope` with warm start works the same as before:"
          ]
       },
       {
@@ -418,32 +384,13 @@
          "source": [
             "# Record start time\n",
             "start_time = time.time()\n",
-            "\n",
             "solver_ws = ScopeSolver(p, sparsity = range(10), sample_size = n, cv = 5,\n",
             "                        split_method=lambda data, index: (data[0][index, :], data[1][index]))\n",
             "solver_ws.solve(logistic_loss_cv, jit=True, data=(X, y))\n",
             "\n",
             "# Calculate runtime\n",
             "runtime = time.time() - start_time\n",
-            "print(\"Runtime:\", runtime, \"seconds\")"
-         ]
-      },
-      {
-         "cell_type": "code",
-         "execution_count": 31,
-         "id": "6343e161",
-         "metadata": {},
-         "outputs": [
-            {
-               "name": "stdout",
-               "output_type": "stream",
-               "text": [
-                  "True support set:  [ 90  97 340 395 477]\n",
-                  "Estimated support set:  [ 90  97 340 395 477]\n"
-               ]
-            }
-         ],
-         "source": [
+            "print(\"Runtime:\", runtime, \"seconds\")\n",
             "print(\"True support set: \", (true_params.nonzero()[0]))\n",
             "print(\"Estimated support set: \", (solver_ws.support_set))"
          ]
@@ -453,7 +400,7 @@
          "id": "93acced2",
          "metadata": {},
          "source": [
-            "#### Not using warmstart"
+            "If we turn off the warm-start strategy in `skscope`, the code changes to:"
          ]
       },
       {
@@ -473,35 +420,23 @@
          "source": [
             "# Record start time\n",
             "start_time = time.time()\n",
-            "\n",
-            "\n",
             "solver_nws = ScopeSolver(p, sparsity = range(10), sample_size = n, cv = 5,\n",
             "                        split_method=lambda data, index: (data[0][index, :], data[1][index]))\n",
             "solver_nws.warm_start = False\n",
             "solver_nws.solve(logistic_loss_cv, jit=True, data=(X, y))\n",
             "# Calculate runtime\n",
             "runtime = time.time() - start_time\n",
-            "print(\"Runtime:\", runtime, \"seconds\")"
+            "print(\"Runtime:\", runtime, \"seconds\")\n",
+            "print(\"True support set: \", (true_params.nonzero()[0]))\n",
+            "print(\"Estimated support set: \", (solver_nws.support_set))"
          ]
       },
       {
-         "cell_type": "code",
-         "execution_count": 33,
-         "id": "295f2586",
+         "cell_type": "markdown",
+         "id": "2e1581d0",
          "metadata": {},
-         "outputs": [
-            {
-               "name": "stdout",
-               "output_type": "stream",
-               "text": [
-                  "True support set:  [ 90  97 340 395 477]\n",
-                  "Estimated support set:  [ 90  97 340 395 477]\n"
-               ]
-            }
-         ],
          "source": [
-            "print(\"True support set: \", (true_params.nonzero()[0]))\n",
-            "print(\"Estimated support set: \", (solver_nws.support_set))"
+            "We can see that enabling the warm-start strategy accelerates the computation."
          ]
       }
    ],
@@ -526,4 +461,4 @@
    },
    "nbformat": 4,
    "nbformat_minor": 5
-}
+}
diff --git a/docs/source/gallery/GeneralizedLinearModels/multinomial-logistic-regression.ipynb b/docs/source/gallery/GeneralizedLinearModels/multinomial-logistic-regression.ipynb
@@ -15,7 +15,7 @@
    "id": "f63a7bf3",
    "metadata": {},
    "source": [
-    "## Multinomial Logistic Regression"
+    "## Multinomial logistic regression"
    ]
   },
   {

diff --git a/docs/source/gallery/GeneralizedLinearModels/poisson-identity-link.ipynb b/docs/source/gallery/GeneralizedLinearModels/poisson-identity-link.ipynb
@@ -5,7 +5,7 @@
    "id": "f540b29b",
    "metadata": {},
    "source": [
-    "# multiple response non-negativity identity link Poisson model"
+    "# Multiple response non-negativity identity link Poisson model"
    ]
   },
   {

diff --git a/docs/source/gallery/GeneralizedLinearModels/poisson-regression.ipynb b/docs/source/gallery/GeneralizedLinearModels/poisson-regression.ipynb
@@ -16,7 +16,7 @@
          "metadata": {},
          "source": [
             "\n",
-            "# Poisson Regression\n"
+            "# Poisson regression\n"
          ]
       },
       {
@@ -336,4 +336,4 @@
    },
    "nbformat": 4,
    "nbformat_minor": 5
-}
+}