๐ง Mage is an open-source data pipeline tool for transforming and integrating data
Watch demo ย ย ย ๐ย ย ย Live demo ย ย ย ๐ฅย ย ย Documentation ย ย ย ๐ช๏ธย ย ย Community chat
๐ถ | Orchestration | Schedule and manage data pipelines with observability. |
๐ | Notebook | Interactive Python, SQL, & R editor for coding data pipelines. |
๐๏ธ | Data integrations | Synchronize data from 3rd party sources to your internal destinations. |
๐ฐ | Streaming pipelines | Ingest and transform real-time data. |
A sample data pipeline defined across 3 files โ
- Load data โ
@data_loader def load_csv_from_file(): return pd.read_csv('default_repo/titanic.csv')
- Transform data โ
@transformer def select_columns_from_df(df, *args): return df[['Age', 'Fare', 'Survived']]
- Export data โ
@data_exporter def export_titanic_data_to_disk(df) -> None: df.to_csv('default_repo/titanic_transformed.csv')
What the data pipeline looks like in the UI โ
New? We recommend reading about blocks and learning from a hands-on tutorial.
You can install and run Mage using Docker (recommended), pip
, or conda
.
-
Create a new project and launch tool (change
demo_project
to any other name if you want):docker run -it -p 6789:6789 -v $(pwd):/home/src mageai/mageai \ mage start demo_project
Want to use Spark or other integrations? Read more about integrations.
-
Open http://localhost:6789 in your browser and build a pipeline.
-
Install Mage
pip install mage-ai
or
conda install -c conda-forge mage-ai
For additional packages (e.g.
spark
,postgres
, etc), please see Installing extra packages.If you run into errors, please see Install errors.
-
Create new project and launch tool (change
demo_project
to any other name if you want):mage start demo_project
-
Open http://localhost:6789 in your browser and build a pipeline.
Build and run a data pipeline with our demo app.
WARNING
The live demo is public to everyone, please donโt save anything sensitive (e.g. passwords, secrets, etc).
Click the image to play video
- Train model on Titanic dataset
- Load data from API, transform it, and export it to PostgreSQL
- Integrate Mage into an existing Airflow project
๐๏ธ Core design principles
Every user experience and technical design decision adheres to these principles.
๐ป | Easy developer experience | Open-source engine that comes with a custom notebook UI for building data pipelines. |
๐ข | Engineering best practices built-in | Build and deploy data pipelines using modular code. No more writing throwaway code or trying to turn notebooks into scripts. |
๐ณ | Data is a first-class citizen | Designed from the ground up specifically for running data-intensive workflows. |
๐ช | Scaling is made simple | Analyze and process large data quickly for rapid iteration. |
๐ธ Core abstractions
These are the fundamental concepts that Mage uses to operate.
Project | Like a repository on GitHub; this is where you write all your code. |
Pipeline | Contains references to all the blocks of code you want to run, charts for visualizing data, and organizes the dependency between each block of code. |
Block | A file with code that can be executed independently or within a pipeline. |
Data product | Every block produces data after it's been executed. These are called data products in Mage. |
Trigger | A set of instructions that determine when or how a pipeline should run. |
Run | Stores information about when it was started, its status, when it was completed, any runtime variables used in the execution of the pipeline or block, etc. |
Add features and instantly improve the experience for everyone.
Check out the contributing guide to setup your development environment and start building.
Individually, weโre a mage.
๐ง Mage
Magic is indistinguishable from advanced technology. A mage is someone who uses magic (aka advanced technology).
Together, weโre Magers!
๐งโโ๏ธ๐ง Magers (
/หmฤjษr/
)A group of mages who help each other realize their full potential!
Letโs hang out and chat together โ
For real-time news, fun memes, data engineering topics, and more, join us on โ
GitHub | |
Slack |
Check out our FAQ page to find answers to some of our most asked questions.
See the LICENSE file for licensing information.