This repository has been archived by the owner on Sep 11, 2023. It is now read-only.
Object Storage
Frank Noe edited this page Apr 16, 2016 · 10 revisions
What? Implement a method to easily save/load PyEMMA high-level objects to/from disk.
Why? To facilitate interactive or scripted analysis with large datasets. Without adaptations, the behavior of the pickle module is not well-defined, because it is not defined a priori which attributes are data belonging to an object and which are just links to other resources.
- Estimation parameters get_params() and set_params(): Input parameters used to construct the estimator object.
- Estimation state get_state() and set_state(): State variables set by estimation. This includes estimates that connect data and model, such as convergence information.
Can be mixed in to estimator or standalone.
- Model parameters get_model_params() and set_model_params(): Estimated or set parameters of the model.
All of these are subclasses of Models and inherit the model I/O properties.
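The three groups of accessors listed above can be illustrated with a minimal sketch. ToyModel and ToyEstimator are hypothetical names invented for this example and are not PyEMMA classes; the transition-matrix estimate is a plain row-normalized count matrix, chosen only to have something to store.

```python
import numpy as np


class ToyModel:
    """Hypothetical model: holds only estimated (or user-set) parameters."""

    def __init__(self, P=None):
        self.P = P

    def get_model_params(self):
        return {'P': self.P}

    def set_model_params(self, P=None):
        self.P = P


class ToyEstimator:
    """Hypothetical estimator: input parameters plus state set by estimation."""

    def __init__(self, lag=1):
        self.lag = lag          # estimation parameter (constructor input)
        self.converged = None   # estimation state (set by estimate())
        self.model = None       # the estimated model

    def get_params(self):
        return {'lag': self.lag}

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)

    def get_state(self):
        return {'converged': self.converged}

    def set_state(self, **state):
        for name, value in state.items():
            setattr(self, name, value)

    def estimate(self, dtraj):
        # Count transitions at the given lag and row-normalize the counts.
        dtraj = np.asarray(dtraj)
        n = dtraj.max() + 1
        C = np.zeros((n, n))
        for i, j in zip(dtraj[:-self.lag], dtraj[self.lag:]):
            C[i, j] += 1
        self.model = ToyModel(C / C.sum(axis=1, keepdims=True))
        self.converged = True
        return self


est = ToyEstimator(lag=1).estimate([1, 0, 0, 0, 1, 1, 0])
```

With this split, saving "just the model" only needs get_model_params(), while saving the estimator needs get_params() and get_state() as well.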
Estimator/Model save and load:
import pyemma
from pyemma import msm
# save parametrized estimator
mle = msm.estimate_markov_model([1, 0, 0, 0, 1, 1, 0], 1)
mle.save('msm_mle.pyemma')
# load parametrized estimator
mle_recovered = pyemma.load('msm_mle.pyemma')
mle_recovered.cktest(2)  # this works if estimation data was stored too
# save just the model
mle.model.save('msm.pyemma')
# load just the model
msm_recovered = pyemma.load('msm.pyemma')
print(msm_recovered.stationary_distribution)  # this works with model parameters alone
We can implement object storage with the pickle or cPickle modules.
- __getstate__() and __setstate__() need to be overloaded in order to save/load the desired content of Estimators, Models etc.
- Does pickle have efficient protocols (compressed and fast)? Compare to np.savez_compressed
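A minimal sketch of the pickle route, assuming a hypothetical ToyModel class (not a PyEMMA class) that holds one array of its own data plus a link to an external resource. The lambda is only a stand-in for something like an open data reader that cannot or should not be pickled.

```python
import pickle

import numpy as np


class ToyModel:
    """Hypothetical model with own data (P) and a link to another resource."""

    def __init__(self, P, data_source=None):
        self.P = np.asarray(P)          # data belonging to the object
        self.data_source = data_source  # link to another resource, not own data

    def __getstate__(self):
        # Store a copy of the attributes, dropping the external link.
        state = self.__dict__.copy()
        state['data_source'] = None
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)


# A lambda cannot be pickled, so without __getstate__ the dump would fail.
model = ToyModel([[0.9, 0.1], [0.2, 0.8]], data_source=lambda: None)

# Binary protocol; pickle itself does not compress the payload.
blob = pickle.dumps(model, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(blob)
```

After the round trip, restored.P equals model.P and restored.data_source is None. Because pickle has no built-in compression, a comparison with np.savez_compressed would probably need the pickle stream to be written through something like gzip.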
We can implement object storage with np.savez_compressed and np.load
- suggested here: http://www.benfrederickson.com/dont-pickle-your-data/
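A minimal sketch of the NumPy route, assuming the object's content has already been reduced to a dictionary of arrays (for example by a get_model_params()-style method). The helper names save_params and load_params and the file name are hypothetical.

```python
import os
import tempfile

import numpy as np


def save_params(path, params):
    """Write a dict of array-like values to a compressed .npz archive."""
    np.savez_compressed(path, **params)


def load_params(path):
    """Read all arrays of a .npz archive back into a dict."""
    with np.load(path) as archive:
        return {name: archive[name] for name in archive.files}


params = {
    'P': np.array([[0.9, 0.1], [0.2, 0.8]]),  # e.g. a transition matrix
    'lag': np.array(1),                       # scalars are stored as 0-d arrays
}

path = os.path.join(tempfile.mkdtemp(), 'msm_params.npz')
save_params(path, params)
restored = load_params(path)
```

Everything comes back as a NumPy array, so scalar parameters such as the lag time need to be converted back (for instance with int(restored['lag'])) when the object is rebuilt.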