a Python library for the assessment of sample fit in econometric models

samplefit: Random Sample Reliability

samplefit is a Python library to assess sample fit, as opposed to model fit, via the Random Sample Reliability algorithm as developed by Okasa & Younge (2022). samplefit is built upon the statsmodels library (Seabold & Perktold, 2010) and follows the same command workflow.

Copyright (c) 2022 Gabriel Okasa & Kenneth A. Younge.

AUTHOR:  Gabriel Okasa & Kenneth A. Younge
SOURCE:  https://github.com/okasag/samplefit
LICENSE: Access to this code is provided under an MIT License.

Repo maintainer: Gabriel Okasa ([email protected])

Introduction

Researchers frequently test model fit by holding the data constant and varying the model. We propose Random Sample Reliability (RSR) as a computational framework to test sample fit by holding the model constant and varying the data. RSR re-samples the data to estimate the reliability of the observations in a sample. It can be used to score the reliability of every observation within the sample, to test the sensitivity of results to atypical observations via an annealing procedure, and to estimate a weighted fit for a more robust analysis.
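To make the idea of holding the model constant and varying the data concrete, here is a minimal sketch in plain Python. It is not the RSR algorithm of Okasa & Younge (2022), whose details are not reproduced in this README. It assumes a much simpler scheme: a one-regressor least-squares line is fitted repeatedly on random half-samples, and each observation is scored by its average absolute prediction error over the draws in which it was held out. The helper names (fit_line, resampling_scores), the half-sample split, and the synthetic data are all hypothetical and are not part of the samplefit API.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Closed-form least squares for y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def resampling_scores(xs, ys, n_draws=200, seed=0):
    """Average held-out absolute error per observation.

    Assumption of this sketch: a larger value marks a less typical
    observation. The model (a straight line) never changes; only the
    subsample used to fit it does.
    """
    rng = random.Random(seed)
    n = len(xs)
    errors = [[] for _ in range(n)]
    for _ in range(n_draws):
        train = set(rng.sample(range(n), n // 2))
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in range(n):
            if i not in train:
                errors[i].append(abs(ys[i] - (a + b * xs[i])))
    return [mean(e) if e else float("nan") for e in errors]


# Synthetic data: y = 1 + 2x with one deliberately atypical observation.
xs = [float(i) for i in range(20)]
ys = [1.0 + 2.0 * x for x in xs]
ys[7] += 15.0

scores = resampling_scores(xs, ys)
worst = max(range(len(scores)), key=lambda i: scores[i])
print("least typical observation:", worst)
```

In this toy setting the shifted observation receives the largest score, because every line fitted without it misses it by the full shift. The scores produced by samplefit itself come from the library's own procedure and may be defined differently.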

Detailed documentation of the samplefit library is available here.

Installation

To clone the samplefit repository, run:

git clone https://github.com/okasag/samplefit.git

To install the required modules, navigate to the root of the project and run: pip install -r requirements.txt

Example

The example below demonstrates the workflow of using the samplefit library in conjunction with the well-known statsmodels library.

Import libraries:

import samplefit as sf
import statsmodels.api as sm

Get data:

boston = sm.datasets.get_rdataset("Boston", "MASS")
Y = boston.data['medv']
X = boston.data['rm']
X = sm.add_constant(X)

Assess model fit:

model = sm.OLS(endog=Y, exog=X)
model_fit = model.fit()
model_fit.summary()

Assess sample fit:

sample = sf.RSR(linear_model=model)
sample_fit = sample.fit()
sample_fit.summary()

Assess sample reliability:

sample_scores = sample.score()
sample_scores.plot()

Assess sample sensitivity:

sample_annealing = sample.anneal()
sample_annealing.plot()
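The sensitivity step can be pictured with a second plain-Python sketch. It is only an illustration of the general idea of checking how an estimate moves as atypical observations are removed. It is not the annealing procedure implemented by sample.anneal(), which this README does not describe in detail. The sketch assumes that observations are dropped one at a time, from the least to the most typical, and that the slope is re-estimated after each drop. The helper names (fit_line, slope_path), the max_drop parameter, and the residual-based stand-in scores are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Closed-form least squares for y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def slope_path(xs, ys, scores, max_drop=5):
    """Slope estimates after dropping the k highest-scoring observations,
    for k = 0, 1, ..., max_drop (a higher score means less typical here)."""
    order = sorted(range(len(xs)), key=lambda i: scores[i], reverse=True)
    path = []
    for k in range(max_drop + 1):
        dropped = set(order[:k])
        keep = [i for i in range(len(xs)) if i not in dropped]
        _, b = fit_line([xs[i] for i in keep], [ys[i] for i in keep])
        path.append(b)
    return path


# Synthetic data: y = 1 + 2x with one atypical, high-leverage observation.
xs = [float(i) for i in range(20)]
ys = [1.0 + 2.0 * x for x in xs]
ys[18] += 15.0

# Stand-in scores: absolute residuals from the full-sample fit.
a, b = fit_line(xs, ys)
scores = [abs(y - (a + b * x)) for x, y in zip(xs, ys)]

path = slope_path(xs, ys, scores)
print([round(value, 3) for value in path])
```

A path that settles quickly after the first few drops suggests that the estimate is driven by a small number of observations. A path that stays flat from the start suggests that it is not.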

References

  • Okasa, Gabriel, and Kenneth A. Younge. “Random Sample Reliability.” Working Paper. 2022.
  • Seabold, Skipper, and Josef Perktold. “statsmodels: Econometric and statistical modeling with python.” Proceedings of the 9th Python in Science Conference. 2010.
