Python data pipelines

Features

This package implements the basics for building pipelines similar to magrittr in R. Pipelines are created with the >> operator. Internally, the package uses singledispatch to provide a unified API for different kinds of inputs (SQL databases, HDF, simple dicts, ...).

A basic example of what can be built with this package:

>>> from my_library import append_col
>>> import pandas as pd

>>> pd.DataFrame({"a" : [1,2,3]}) >> append_col(x=3)
   a  x
0  1  3
1  2  3
2  3  3
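
To illustrate how a verb like append_col could be wired up, here is a minimal sketch of the mechanism described above: an object that implements __rrshift__ so it can sit on the right-hand side of >>, combined with functools.singledispatch to select an implementation by input type. The names PipeVerb and pipeverb are hypothetical helpers invented for this sketch and are not necessarily this package's API. A plain dict stands in for a DataFrame so the example needs only the standard library.

```python
from functools import singledispatch


class PipeVerb:
    """Hypothetical wrapper: stores the arguments now, receives the data via >> later."""

    def __init__(self, func, *args, **kwargs):
        self.func = func
        self.args = args
        self.kwargs = kwargs

    def __rrshift__(self, data):
        # Python calls this for `data >> verb` when type(data) does not
        # handle >> itself (true for dict and list).
        return self.func(data, *self.args, **self.kwargs)


def pipeverb(func):
    """Hypothetical decorator: turn func(data, ...) into a verb usable with >>."""
    dispatcher = singledispatch(func)

    def make_verb(*args, **kwargs):
        return PipeVerb(dispatcher, *args, **kwargs)

    # Expose register() so implementations for other input types can be added.
    make_verb.register = dispatcher.register
    return make_verb


@pipeverb
def append_col(data, **cols):
    # Fallback used when no implementation is registered for type(data).
    raise NotImplementedError(
        f"append_col is not implemented for {type(data).__name__}"
    )


@append_col.register(dict)
def _append_col_dict(data, **cols):
    # Treat the dict as a column-oriented table: repeat each new value
    # once per existing row.
    n_rows = len(next(iter(data.values()), []))
    new_cols = {name: [value] * n_rows for name, value in cols.items()}
    return {**data, **new_cols}


result = {"a": [1, 2, 3]} >> append_col(x=3)
print(result)
```

Under these assumptions, supporting another input type only requires registering one more implementation with append_col.register; the >> syntax stays the same for every input.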

In the future, this package might also implement the verbs from the R packages dplyr and tidyr for pandas.DataFrame, and/or I will fold it into one of the other available implementations of dplyr-style pipelines and verbs for pandas.

Documentation

The documentation can be found on ReadTheDocs: https://pydatapipes.readthedocs.io

License

Free software: MIT license

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.