Python data pipelines¶
Features¶
This package implements the basics for building pipelines similar to magrittr in R. Pipelines are
created using the >> operator. Internally, it uses singledispatch to provide a unified API
for different kinds of inputs (SQL databases, HDF, simple dicts, ...).
A basic example of what can be built with this package:
>>> from my_library import append_col
>>> import pandas as pd
>>> pd.DataFrame({"a" : [1,2,3]}) >> append_col(x=3)
   a  x
0  1  3
1  2  3
2  3  3
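The mechanism described above (>> for chaining, singledispatch for input-specific behaviour) can be illustrated with a minimal sketch. This is not the package's actual implementation: the PipeVerb class and the private _append_col function are hypothetical names introduced here for illustration, and the sketch dispatches on a plain dict of columns rather than a pandas.DataFrame so that it runs with the standard library alone.

```python
from functools import singledispatch


class PipeVerb:
    """Hypothetical wrapper: stores a function and its arguments until data arrives via >>."""

    def __init__(self, func, **kwargs):
        self.func = func
        self.kwargs = kwargs

    def __rrshift__(self, data):
        # Python calls this for `data >> verb` when `data` itself does not handle >>.
        return self.func(data, **self.kwargs)


@singledispatch
def _append_col(data, **cols):
    # Fallback for input types without a registered implementation.
    raise NotImplementedError(
        f"append_col is not implemented for {type(data).__name__}"
    )


@_append_col.register(dict)
def _append_col_dict(data, **cols):
    # Assumption: a dict maps column names to equally long lists of values.
    n_rows = len(next(iter(data.values()), []))
    result = dict(data)  # shallow copy, so the input dict is not modified
    for name, value in cols.items():
        result[name] = [value] * n_rows
    return result


def append_col(**cols):
    # Calling the verb only records the arguments; the work happens at >>.
    return PipeVerb(_append_col, **cols)


table = {"a": [1, 2, 3]} >> append_col(x=3)
print(table)  # {'a': [1, 2, 3], 'x': [3, 3, 3]}
```

Under this design, supporting another input type would only require registering one more implementation on _append_col; the >> syntax stays the same for every input.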
In the future, this package might also implement the verbs from the R packages dplyr and
tidyr for pandas.DataFrame, and/or I will fold this into one of the other available
implementations of dplyr-style pipelines/verbs for pandas.
Documentation¶
The documentation can be found on ReadTheDocs: https://pydatapipes.readthedocs.io
License¶
Free software: MIT license
Credits¶
- magrittr and its usage in dplyr / tidyr for the idea of using pipelines in this way
- lots of Python implementations of dplyr-style pipelines: dplython, pandas_ply, dfply
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.