tubular

tubular implements transformers for pre-processing steps commonly used in machine learning pipelines.

The transformers are compatible with scikit-learn Pipelines: each has a transform method that applies the pre-processing step to data and, where applicable, a fit method that learns the relevant information from the data.

The transformers in tubular work with data in pandas DataFrames.
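
To illustrate the fit/transform contract described above, here is a minimal sketch of a Pipeline-compatible transformer that works on a pandas DataFrame. The MeanImputer class is a hypothetical stand-in written for this example; it is not one of tubular's transformers and does not reflect tubular's API.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline


class MeanImputer(BaseEstimator, TransformerMixin):
    """Hypothetical transformer (not part of tubular): fills nulls with column means."""

    def __init__(self, columns):
        self.columns = columns

    def fit(self, X, y=None):
        # learn the mean of each column from the data
        self.means_ = X[self.columns].mean().to_dict()
        return self

    def transform(self, X):
        # apply the learned means to a copy, leaving the input untouched
        X = X.copy()
        X[self.columns] = X[self.columns].fillna(self.means_)
        return X


df = pd.DataFrame({"a": [1.0, None, 3.0], "b": [10.0, 20.0, None]})

# the transformer can be used as a step in a scikit-learn Pipeline
pipeline = Pipeline([("impute", MeanImputer(columns=["a", "b"]))])
df_imputed = pipeline.fit_transform(df)
```

Here fit stores what it learned on the instance and transform only applies it, which is what lets a fitted Pipeline be reused on new data.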

There are a variety of transformers to assist with:

capping
imputation
mapping
date differencing
categorical encoding
numeric operations
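
For readers unfamiliar with some of these steps, the sketch below shows what mapping, date differencing and categorical encoding mean using plain pandas. It only illustrates the operations themselves on made-up data; it does not use tubular and says nothing about how tubular implements them.

```python
import pandas as pd

# toy data invented for this illustration
df = pd.DataFrame(
    {
        "grade": ["a", "b", "a"],
        "start": pd.to_datetime(["2020-01-01", "2020-01-10", "2020-02-01"]),
        "end": pd.to_datetime(["2020-01-31", "2020-01-15", "2020-03-01"]),
    }
)

# mapping: replace category labels using a lookup
df["grade_mapped"] = df["grade"].map({"a": 1, "b": 2})

# date differencing: whole days between two datetime columns
df["days_between"] = (df["end"] - df["start"]).dt.days

# categorical encoding: one indicator column per category
encoded = pd.get_dummies(df["grade"], prefix="grade")
```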

Here is a simple example of capping 2 columns at a specified value:

from tubular.capping import CappingTransformer
import pandas as pd
from sklearn.datasets import load_boston

# load the boston dataset
# (load_boston was removed in scikit-learn 1.2, so this requires an earlier version)
boston = load_boston()
y = boston.target
X = pd.DataFrame(boston.data, columns=boston.feature_names)

# initialise a capping transformer for 2 columns
capper = CappingTransformer(columns=['INDUS', 'RM'], cap_value_max=20)

# transform the data
X_capped = capper.transform(X)
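
As a rough picture of what capping at a maximum value does to the data, the same operation can be written in plain pandas with DataFrame.clip. This is an illustration on a few made-up values (not the Boston data), and it is not a description of how CappingTransformer works internally.

```python
import pandas as pd

# toy values invented for this illustration
df = pd.DataFrame({"INDUS": [2.3, 18.1, 27.7], "RM": [6.5, 5.9, 8.7]})

# cap both columns at 20: values above 20 are replaced by 20
df_capped = df.copy()
df_capped[["INDUS", "RM"]] = df[["INDUS", "RM"]].clip(upper=20)
```

Only the value 27.7 exceeds the cap here, so it is the only one that changes.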

Installation

tubular can be installed from PyPI with:

pip install tubular

Documentation

To build the documentation locally, set the environment variable SPHINX_BUILD_DIR, and then run the following from the docs/ directory:

make apidoc
make html

Examples

To help you get started, there are example notebooks in the examples folder that show how to use each transformer, as well as an example of putting several together in a Pipeline.
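
As a sketch of the general pattern of chaining several pre-processing steps in a Pipeline, the example below uses scikit-learn's FunctionTransformer around two small pandas functions. The functions fill_nulls and cap_at_20 are hypothetical stand-ins for this illustration; the notebooks in the examples folder show the actual tubular transformers.

```python
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer


def fill_nulls(X):
    # hypothetical imputation step: replace nulls with 0
    return X.fillna(0)


def cap_at_20(X):
    # hypothetical capping step: limit values to at most 20
    return X.clip(upper=20)


df = pd.DataFrame({"x": [5.0, None, 42.0]})

# steps run in order: imputation first, then capping
pipeline = Pipeline(
    [
        ("impute", FunctionTransformer(fill_nulls)),
        ("cap", FunctionTransformer(cap_at_20)),
    ]
)
df_out = pipeline.fit_transform(df)
```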

Build and test

This project uses pytest as its test framework. To run the tests, follow the steps below.

First clone the repo and move to the root directory:

git clone https://github.com/lvgig/tubular.git
cd tubular

Then install tubular in editable mode, along with the development requirements:

pip install -e . -r requirements-dev.txt

Then run the tests with pytest:

pytest
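
For anyone new to pytest, a test is a function whose name starts with test_ and that checks behaviour with plain assert statements. The sketch below shows the general shape of such a test; cap_column is a hypothetical helper written for this example and is not taken from tubular or its test suite.

```python
import pandas as pd


def cap_column(df, column, cap_value_max):
    """Hypothetical helper (not part of tubular): cap one column at a maximum."""
    out = df.copy()
    out[column] = out[column].clip(upper=cap_value_max)
    return out


def test_cap_column_limits_large_values():
    df = pd.DataFrame({"a": [1, 50, 100]})
    result = cap_column(df, "a", cap_value_max=20)
    # values above the cap are replaced by the cap
    assert result["a"].tolist() == [1, 20, 20]
    # the input frame should not be modified
    assert df["a"].tolist() == [1, 50, 100]
```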

Contribute

tubular is under active development, and we're excited if you're interested in contributing! See CONTRIBUTING.md for the full details of our working practices.

For bugs and feature requests please open an issue.
