update docs
Mamba413 committed Jul 20, 2024
1 parent 8e338f6 commit a6e82ba
Showing 6 changed files with 37 additions and 95 deletions.
Original file line number Diff line number Diff line change
@@ -17,7 +17,7 @@
"metadata": {},
"source": [
"\n",
"# Inverse Gaussian Regression\n"
"# Inverse Gaussian regression\n"
]
},
{
13 changes: 10 additions & 3 deletions docs/source/gallery/GeneralizedLinearModels/gamma-regression.ipynb
@@ -6,8 +6,7 @@
"id": "7f5e5d54",
"metadata": {},
"source": [
"\n",
"# Gamma Regression\n"
"# Gamma regression\n"
]
},
{
@@ -20,6 +19,14 @@
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"id": "634b02ca",
"metadata": {},
"source": [
"## Gamma regression"
]
},
{
"attachments": {},
"cell_type": "markdown",
@@ -35,7 +42,7 @@
"id": "55dcc923",
"metadata": {},
"source": [
"## Introduction to Gamma Regression\n",
"### Introduction\n",
"Gamma regression can be used when you have positive continuous response variables such as payments for insurance claims,\n",
"or the lifetime of a redundant system.\n",
"It is well known that the density of Gamma distribution can be represented as a function of\n",
109 changes: 22 additions & 87 deletions docs/source/gallery/GeneralizedLinearModels/logistic-regression.ipynb
@@ -15,15 +15,7 @@
"id": "7f5e5d54",
"metadata": {},
"source": [
"## Logistic Regressions"
]
},
{
"cell_type": "markdown",
"id": "25f1338c",
"metadata": {},
"source": [
"### Part A, we would like to use an example to show how the sparse-constrained optimization for logistic regression works in our program."
"## Logistic regressions"
]
},
{
@@ -55,7 +47,7 @@
"id": "c6ce97f7",
"metadata": {},
"source": [
"### Import necessary packages"
"We first import necessary packages. "
]
},
{
@@ -76,9 +68,7 @@
"id": "8e3318e9",
"metadata": {},
"source": [
"### Generate the data\n",
"\n",
"Firstly, we define a data generator function to provide a way to generate suitable dataset for this task."
"Next, we define a data generator function to generate a suitable dataset for this task."
]
},
{
@@ -135,9 +125,7 @@
"id": "61544564",
"metadata": {},
"source": [
"### Define function to calculate negative log-likelihood of logistic regression\n",
"\n",
"Secondly, we define the loss function `logistic_loss` accorting to [1](#loss) that matches the data generating function `make_logistic_data`."
"Secondly, we define the loss function `logistic_loss` according to [1](#loss) that matches the data generating function `make_logistic_data`."
]
},
{
@@ -157,9 +145,7 @@
"id": "415b29a7",
"metadata": {},
"source": [
"### Use SIC to decide the optimal support size\n",
"\n",
"There are four types of information criterion can be implemented in `skscope.utilities`:\n",
"Here, we use SIC to decide the optimal support size. Four types of information criteria are available in `skscope.utilities`:\n",
"- Akaike information criterion (AIC)\n",
"- Bayesian information criterion (BIC)\n",
"- Extended BIC (EBIC)\n",
@@ -227,9 +213,7 @@
"id": "71960a6d",
"metadata": {},
"source": [
"### More on the results\n",
"\n",
"We can plot the sparse signal recovering from the noisy observations to visualize the results."
"To visualize the results, we plot the sparse signal recovered from the noisy observations."
]
},
{
@@ -274,20 +258,12 @@
"plt.show()"
]
},
{
"cell_type": "markdown",
"id": "40afeb7f",
"metadata": {},
"source": [
"### Part B, we will use cross-validation to select the optimal support set and compare its runtime with that of SIC."
]
},
{
"cell_type": "markdown",
"id": "0a968a95",
"metadata": {},
"source": [
"#### Use SICto decide the optimal support size and Record the runtime of SIC."
"Since `skscope` also supports cross-validation (CV), we will use CV to select the optimal support set and compare its runtime with that of SIC. We first record the runtime of using SIC. "
]
},
{
@@ -314,10 +290,6 @@
"solver_ic = ScopeSolver(p, sparsity = range(10), sample_size = n, ic_method = SIC)\n",
"params_ic = solver_ic.solve(logistic_loss, jit=True)\n",
"\n",
"# Variable selection accuracy\n",
"print(\"True support set: \", (true_params.nonzero()[0]))\n",
"print(\"skscope estimated support set: \", (solver_ic.support_set))\n",
"\n",
"# Calculate runtime\n",
"runtime = time.time() - start_time\n",
"print(\"Runtime of SIC:\", runtime, \"seconds\")"
@@ -328,7 +300,7 @@
"id": "83cbd590",
"metadata": {},
"source": [
"#### Use CV to decide the optimal support size and record the runtime of CV."
"Next, we implement the loss function for using CV to decide the optimal support size and record the runtime of CV."
]
},
{
@@ -374,31 +346,25 @@
"id": "a8cacd2e",
"metadata": {},
"source": [
"Comparing the results of SIC and CV criteria, we find that while maintaining high accuracy in variable selection, SIC exhibits a clear time advantage."
"Comparing the results of SIC and CV criteria, we find that while CV maintains high accuracy in variable selection, SIC exhibits a clear time advantage."
]
},
{
"cell_type": "markdown",
"id": "1f36004a",
"metadata": {},
"source": [
"### Part C, we compare the results under two different circumstances: using warmstart and not using warmstart."
]
},
{
"cell_type": "markdown",
"id": "75304405",
"metadata": {},
"source": [
"#### Using warmstart "
"Finally, we compare the results under two different circumstances: using warm start and not."
]
},
{
"cell_type": "markdown",
"id": "f33fac23",
"metadata": {},
"source": [
"Hint: all solvers default to using warmstart, which can slightly prolong computation time if not utilized"
"> Hint: All solvers use warm start by default; turning it off can slightly prolong computation time.\n",
"\n",
"As warm start is the default strategy of `skscope`, the usage of `skscope` with warm start is the same as before:"
]
},
{
@@ -418,32 +384,13 @@
"source": [
"# Record start time\n",
"start_time = time.time()\n",
"\n",
"solver_ws = ScopeSolver(p, sparsity = range(10), sample_size = n, cv = 5,\n",
" split_method=lambda data, index: (data[0][index, :], data[1][index]))\n",
"solver_ws.solve(logistic_loss_cv, jit=True, data=(X, y))\n",
"\n",
"# Calculate runtime\n",
"runtime = time.time() - start_time\n",
"print(\"Runtime:\", runtime, \"seconds\")"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "6343e161",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"True support set: [ 90 97 340 395 477]\n",
"Estimated support set: [ 90 97 340 395 477]\n"
]
}
],
"source": [
"print(\"Runtime:\", runtime, \"seconds\")\n",
"print(\"True support set: \", (true_params.nonzero()[0]))\n",
"print(\"Estimated support set: \", (solver_ws.support_set))"
]
@@ -453,7 +400,7 @@
"id": "93acced2",
"metadata": {},
"source": [
"#### Not using warmstart"
"If we turn off the warm-start strategy in `skscope`, the code changes to:"
]
},
{
@@ -473,35 +420,23 @@
"source": [
"# Record start time\n",
"start_time = time.time()\n",
"\n",
"\n",
"solver_nws = ScopeSolver(p, sparsity = range(10), sample_size = n, cv = 5,\n",
" split_method=lambda data, index: (data[0][index, :], data[1][index]))\n",
"solver_nws.warm_start = False\n",
"solver_nws.solve(logistic_loss_cv, jit=True, data=(X, y))\n",
"# Calculate runtime\n",
"runtime = time.time() - start_time\n",
"print(\"Runtime:\", runtime, \"seconds\")"
"print(\"Runtime:\", runtime, \"seconds\")\n",
"print(\"True support set: \", (true_params.nonzero()[0]))\n",
"print(\"Estimated support set: \", (solver_nws.support_set))"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "295f2586",
"cell_type": "markdown",
"id": "2e1581d0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"True support set: [ 90 97 340 395 477]\n",
"Estimated support set: [ 90 97 340 395 477]\n"
]
}
],
"source": [
"print(\"True support set: \", (true_params.nonzero()[0]))\n",
"print(\"Estimated support set: \", (solver_nws.support_set))"
"We can see that enabling the warm-start strategy accelerates the computation. "
]
}
],
@@ -526,4 +461,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}
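
Reviewer note on the logistic-regression notebook: the diff above refers to a loss function `logistic_loss` (the negative log-likelihood of logistic regression), but its body is not part of the changed lines. For readers who want to see what such a loss computes, here is a minimal sketch in plain Python. It is not the notebook's code: the names `logistic_nll` and `log1p_exp` are hypothetical, labels are assumed to be coded as 0/1, and the toy data is made up purely for illustration.

```python
import math


def log1p_exp(z):
    """Numerically stable log(1 + exp(z)) (hypothetical helper)."""
    if z > 0:
        return z + math.log1p(math.exp(-z))
    return math.log1p(math.exp(z))


def logistic_nll(beta, X, y):
    """Negative log-likelihood of logistic regression, assuming y in {0, 1}.

    Computes sum_i [ log(1 + exp(x_i . beta)) - y_i * (x_i . beta) ].
    """
    total = 0.0
    for x_i, y_i in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, x_i))  # linear predictor x_i . beta
        total += log1p_exp(eta) - y_i * eta
    return total


# Toy data (illustrative only): three samples, two features.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1, 0, 1]

loss_at_zero = logistic_nll([0.0, 0.0], X, y)   # equals n * log(2) when beta = 0
loss_at_fit = logistic_nll([2.0, -2.0], X, y)   # coefficients that agree with the labels
print(loss_at_zero, loss_at_fit)
```

With all coefficients at zero every sample contributes `log(2)`, which gives a quick sanity check; coefficients that agree with the labels should yield a smaller value.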
@@ -15,7 +15,7 @@
"id": "f63a7bf3",
"metadata": {},
"source": [
"## Multinomial Logistic Regression"
"## Multinomial logistic regression"
]
},
{
@@ -5,7 +5,7 @@
"id": "f540b29b",
"metadata": {},
"source": [
"# multiple response non-negativity identity link Poisson model"
"# Multiple response non-negativity identity link Poisson model"
]
},
{
@@ -16,7 +16,7 @@
"metadata": {},
"source": [
"\n",
"# Poisson Regression\n"
"# Poisson regression\n"
]
},
{
@@ -336,4 +336,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}
