Trane is a software package that automatically generates prediction problems for temporal datasets and produces the labels needed for supervised learning. Its goal is to streamline the process of formulating and solving machine learning problems.
Install Trane using pip:
python -m pip install trane
Here's a quick demonstration of Trane in action:
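import trane
# Load the bundled Airbnb sample data together with its metadata.
data, metadata = trane.load_airbnb()
# Column(s) identifying the entity each prediction is made for.
entity_columns = ["location"]
# Length of the prediction window (here, two days).
window_size = "2d"
problem_generator = trane.ProblemGenerator(
    metadata=metadata,
    window_size=window_size,
    entity_columns=entity_columns,
)
# Enumerate the candidate prediction problems for this dataset.
problems = problem_generator.generate()
print(f"Generated {len(problems)} problems.")
# Show one problem's description, then the first labels it produces.
print(problems[108])
print(problems[108].create_target_values(data).head(5))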
Output:
Generated 168 problems.
For each <location> predict the majority <rating> in all related records in the next 2 days.
location time target
0 London 2021-01-01 5
1 London 2021-01-03 4
2 London 2021-01-05 5
3 London 2021-01-07 4
4 London 2021-01-09 5
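To make the problem statement above more concrete, here is a minimal pandas sketch of what "the majority rating in the next 2 days" could mean for each location. It is not Trane's implementation: the toy records, the `majority` helper, and the choice of fixed two-day bins via `pd.Grouper` are all assumptions made for illustration.

```python
import pandas as pd

# Toy records (invented values); column names mirror the demo output above.
records = pd.DataFrame(
    {
        "location": ["London"] * 5,
        "time": pd.to_datetime(
            ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"]
        ),
        "rating": [5, 5, 4, 4, 4],
    }
)


def majority(values: pd.Series):
    # Hypothetical helper: most frequent value in the window.
    # Series.mode() returns sorted modes, so ties resolve to the smallest.
    return values.mode().iloc[0]


# Group by entity and by consecutive two-day windows, then label each
# window with its majority rating.
labels = (
    records.groupby(["location", pd.Grouper(key="time", freq="2D")])["rating"]
    .agg(majority)
    .reset_index(name="target")
)

print(labels)
```

Each row of `labels` pairs an entity and a window start time with a target value, which is the same shape as the table printed by `create_target_values` in the demonstration.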
- Questions or Issues? Create a GitHub issue.
- Want to Chat? Join our Slack community.
If you find Trane beneficial, consider citing our paper:
Ben Schreck, Kalyan Veeramachaneni. What Would a Data Scientist Ask? Automatically Formulating and Solving Predictive Problems. IEEE DSAA 2016, 440-451.
BibTeX entry:
@inproceedings{schreck2016would,
title={What Would a Data Scientist Ask? Automatically Formulating and Solving Predictive Problems},
author={Schreck, Benjamin and Veeramachaneni, Kalyan},
booktitle={Data Science and Advanced Analytics (DSAA), 2016 IEEE International Conference on},
pages={440--451},
year={2016},
organization={IEEE}
}