Automatically formulating machine learning tasks for temporal datasets
Trane is a software package that automatically generates prediction problems, and the labels needed to train supervised learning models on them, from temporal datasets. It is designed to advance the automation of the machine learning problem-solving pipeline.
To install Trane, run the following command:
python -m pip install trane
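To confirm that the installation succeeded, one option is to query the installed package metadata from Python. This is a minimal sketch using only the standard library; the `installed_version` helper is a hypothetical name introduced here and is not part of Trane.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent.

    Hypothetical helper for illustration; not provided by Trane.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints a version string if Trane is installed, otherwise None.
print(installed_version("trane"))
```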
Below is an example of using Trane:
import trane
data = trane.datasets.load_covid()
table_meta = trane.datasets.load_covid_metadata()
entity_col = "Country/Region"
window_size = "2d"
minimum_data = "2020-01-22"
maximum_data = "2020-03-29"
cutoff_strategy = trane.CutoffStrategy(
    entity_col=entity_col,
    window_size=window_size,
    minimum_data=minimum_data,
    maximum_data=maximum_data,
)
time_col = "Date"
problem_generator = trane.PredictionProblemGenerator(
    df=data,
    entity_col=entity_col,
    time_col=time_col,
    cutoff_strategy=cutoff_strategy,
    table_meta=table_meta,
)
problems = problem_generator.generate(data, generate_thresholds=True)
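To make the roles of an entity column, a time column, and a window size more concrete, here is a minimal pandas sketch of window-based labeling. It does not call Trane and is not Trane's implementation, whose cutoff semantics may differ. The toy data, the `label_windows` helper, the `value` column, and the choice of summing within each window are all assumptions made for illustration.

```python
import pandas as pd


def label_windows(df, entity_col, time_col, value_col, window):
    """Sum value_col per entity over consecutive fixed-size time windows.

    Hypothetical helper for illustration only; not part of Trane.
    Each window starts at a cutoff time and covers [cutoff, cutoff + window).
    """
    rows = []
    for entity, group in df.groupby(entity_col):
        cutoff = group[time_col].min()
        last = group[time_col].max()
        while cutoff <= last:
            in_window = (group[time_col] >= cutoff) & (
                group[time_col] < cutoff + window
            )
            rows.append(
                {
                    entity_col: entity,
                    "cutoff": cutoff,
                    "label": group.loc[in_window, value_col].sum(),
                }
            )
            cutoff += window
    return pd.DataFrame(rows)


# Toy data (made up): four daily observations for each of two entities.
toy = pd.DataFrame(
    {
        "Country/Region": ["A"] * 4 + ["B"] * 4,
        "Date": list(pd.date_range("2020-01-22", periods=4, freq="D")) * 2,
        "value": [1, 2, 3, 4, 10, 20, 30, 40],
    }
)

# A two-day window, analogous to window_size = "2d" in the example above.
labels = label_windows(
    toy, "Country/Region", "Date", "value", pd.Timedelta(days=2)
)
print(labels)
```

Each row of `labels` pairs an entity and a cutoff time with one aggregated target value, which is the general shape of a labeled training example for a temporal prediction problem.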
If you use Trane, please consider citing the following paper:
Ben Schreck, Kalyan Veeramachaneni. What Would a Data Scientist Ask? Automatically Formulating and Solving Predictive Problems. IEEE DSAA 2016, 440-451
BibTeX entry:
@inproceedings{schreck2016would,
  title={What Would a Data Scientist Ask? Automatically Formulating and Solving Predictive Problems},
  author={Schreck, Benjamin and Veeramachaneni, Kalyan},
  booktitle={Data Science and Advanced Analytics (DSAA), 2016 IEEE International Conference on},
  pages={440--451},
  year={2016},
  organization={IEEE}
}