`samplefit`: Random Sample Reliability

samplefit is a Python library to assess sample fit, as opposed to model fit, via the Random Sample Reliability algorithm as developed by Okasa & Younge (2022). samplefit is built upon the statsmodels library (Seabold & Perktold, 2010) and follows the same command workflow.

AUTHOR:  Gabriel Okasa & Kenneth A. Younge
SOURCE:  https://github.com/okasag/samplefit
LICENSE: Access to this code is provided under an MIT License.

Repo maintainer: Gabriel Okasa ([email protected])

Introduction

Researchers frequently test model fit by holding the data constant and varying the model. We propose Random Sample Reliability (RSR) as a computational framework to test sample fit by holding the model constant and varying the data. RSR re-samples the data to estimate the reliability of each observation in a sample. It can be used to score the reliability of every observation within the sample, to test the sensitivity of results to atypical observations via an annealing procedure, and to estimate a weighted fit that makes the analysis more robust.
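
To make the idea of "holding the model constant and varying the data" concrete, the sketch below repeatedly fits the same simple linear model on random subsamples and records how well each held-out observation is predicted. It is a generic resampling illustration, not the RSR algorithm of Okasa & Younge (2022); the function names `ols_line` and `resampled_scores`, the scoring rule, and the toy data are all assumptions made for this example.

```python
import random


def ols_line(xs, ys):
    """Closed-form simple regression y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def resampled_scores(xs, ys, n_draws=500, frac=0.5, seed=0):
    """Hypothetical per-observation score in [0, 1] (higher = better predicted).

    The model (a straight line) stays fixed; only the data used to fit it vary.
    This is an illustration of the general idea, NOT the samplefit scoring rule.
    """
    rng = random.Random(seed)
    n = len(xs)
    k = max(3, int(frac * n))
    err_sum = [0.0] * n
    err_cnt = [0] * n
    for _ in range(n_draws):
        idx = rng.sample(range(n), k)  # random subsample without replacement
        a, b = ols_line([xs[i] for i in idx], [ys[i] for i in idx])
        chosen = set(idx)
        for i in range(n):
            if i not in chosen:  # evaluate only on held-out observations
                err_sum[i] += abs(ys[i] - (a + b * xs[i]))
                err_cnt[i] += 1
    # An observation that was never held out keeps an error of 0.0.
    mean_err = [s / c if c else 0.0 for s, c in zip(err_sum, err_cnt)]
    worst = max(mean_err) or 1.0
    return [1.0 - e / worst for e in mean_err]


# Toy data: a clean line with one planted atypical observation at index 7.
xs = [float(i) for i in range(20)]
ys = [2.0 + 3.0 * x + 0.1 * ((i % 3) - 1) for i, x in enumerate(xs)]
ys[7] += 40.0

scores = resampled_scores(xs, ys)
least_reliable = scores.index(min(scores))
```

In this toy setting the planted observation receives the lowest score, because a line fitted without it predicts it poorly in every draw.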

Detailed documentation of the samplefit library is available here.

A poster describing the RSR algorithm is available here.

Installation

To clone the samplefit repository, run:

git clone https://github.com/okasag/samplefit.git

The required modules can be installed by navigating to the root of this project and executing the following command: `pip install -r requirements.txt`.

Example

The example below demonstrates the workflow of using the samplefit library in conjunction with the well-known statsmodels library.

Import libraries:

import samplefit as sf
import statsmodels.api as sm

Get data:

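# Download the Boston housing data from the R package MASS (requires internet access)
boston = sm.datasets.get_rdataset("Boston", "MASS")
# Outcome: median home value; regressor: average number of rooms per dwelling
Y = boston.data['medv']
X = boston.data['rm']
# Add an intercept column to the regressors
X = sm.add_constant(X)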

Assess model fit:

model = sm.OLS(endog=Y, exog=X)
model_fit = model.fit()
model_fit.summary()

Assess sample fit:

sample = sf.RSR(linear_model=model)
sample_fit = sample.fit()
sample_fit.summary()
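
The introduction mentions a weighted fit that is more robust to atypical observations. How samplefit computes its weights is not described here, so the following is only a minimal sketch of weighted least squares for a single regressor, assuming that per-observation reliability scores are used as weights. The helper `wls_line` and the weight values are hypothetical.

```python
def wls_line(xs, ys, ws):
    """Weighted simple regression y = a + b*x minimising sum(w * residual**2)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw  # weighted mean of x
    my = sum(w * y for w, y in zip(ws, ys)) / sw  # weighted mean of y
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return my - b * mx, b


# Toy data: y = 1 + 2x with one atypical observation at the last index.
xs = [float(i) for i in range(10)]
ys = [1.0 + 2.0 * x for x in xs]
ys[9] += 30.0

# Equal weights reproduce ordinary least squares.
equal_fit = wls_line(xs, ys, [1.0] * 10)

# Hypothetical reliability scores: the atypical observation is down-weighted.
weights = [1.0] * 9 + [0.05]
weighted_fit = wls_line(xs, ys, weights)

print("OLS slope:     ", round(equal_fit[1], 3))
print("weighted slope:", round(weighted_fit[1], 3))
```

Down-weighting the atypical observation pulls the estimated slope back toward the value of 2 that generated the remaining data.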

Assess sample reliability:

sample_scores = sample.score()
sample_scores.plot()

Assess sample sensitivity:

sample_annealing = sample.anneal()
sample_annealing.plot()
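
The annealing procedure itself is not spelled out in this README. One plausible reading of a sensitivity test for atypical observations is to remove the least reliable observations one at a time and track how the estimates move. The sketch below follows that reading only; `drop_path`, the scores, and the toy data are assumptions for illustration, not the behaviour of `sample.anneal()`.

```python
def ols_line(xs, ys):
    """Closed-form simple regression y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def drop_path(xs, ys, scores, max_drop):
    """Refit after removing the 0, 1, ..., max_drop lowest-scored observations.

    Returns a list of (number_dropped, intercept, slope) tuples.
    """
    order = sorted(range(len(xs)), key=lambda i: scores[i])  # least reliable first
    path = []
    for k in range(max_drop + 1):
        keep = sorted(order[k:])
        a, b = ols_line([xs[i] for i in keep], [ys[i] for i in keep])
        path.append((k, a, b))
    return path


# Toy data: y = 1 + 2x with two atypical observations (indices 3 and 10).
xs = [float(i) for i in range(12)]
ys = [1.0 + 2.0 * x for x in xs]
ys[3] += 20.0
ys[10] -= 15.0

# Hypothetical reliability scores that flag the two atypical observations.
scores = [0.9] * 12
scores[3] = 0.1
scores[10] = 0.2

path = drop_path(xs, ys, scores, max_drop=3)
for k, a, b in path:
    print(f"dropped={k}  intercept={a:.3f}  slope={b:.3f}")
```

If the slope stabilises after a few removals, as it does here once both flagged observations are gone, the original estimate was driven by a small number of atypical points.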

References

Okasa, Gabriel, and Kenneth A. Younge. “Random Sample Reliability.” Working Paper. 2022.
Seabold, Skipper, and Josef Perktold. “statsmodels: Econometric and statistical modeling with python.” Proceedings of the 9th Python in Science Conference. 2010.
