A GStreamer-like Workflow Framework, supporting NVIDIA Omniverse, Python and Web UI, powered by K8S & Rust.
It is under heavy construction. Unfinished features may change significantly in composition and usage. Please read the feature support tables below carefully.
| Type | How to read |
|---|---|
| Feature Kind | e.g. model |
| Feature Group | e.g. builtin/ |
| Feature Name | e.g. doc |
| Feature's Usage | e.g. model/builtin/doc -> docmodel (ignore the group name, and swap the kind and name) |
| Model Function Name | e.g. :split |
| Model Function's Usage | e.g. model/builtin/doc:split -> doc:split |
| Status | ✅ Yes, 🚧 WIP, 🔎 TBA, 🔲 TBD |
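As a concrete illustration of the naming rule above, here is a hypothetical helper (not part of XLake) that derives an element's usage name from its feature path:

```python
def feature_usage(path: str) -> str:
    """Derive an element's usage name from its feature path.

    'model/builtin/doc'       -> 'docmodel'  (group dropped, kind appended)
    'model/builtin/doc:split' -> 'doc:split' (model functions keep their name)
    'src/local/file'          -> 'filesrc'
    """
    path, colon, func = path.partition(":")
    kind, _group, name = path.split("/")
    if colon:
        # Model functions are invoked as '<name>:<function>'.
        return f"{name}:{func}"
    # Plain features drop the group and append the kind: '<name><kind>'.
    return f"{name}{kind}"
```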
- 🔎 cluster (Parallel Computing on HPC)
  - 🔎 local (Current Process; by default)
  - 🔲 ray (Ray Cluster; Python-only)
- 🔎 engine (Scalable Cluster Management & Job Scheduling System)
  - 🔲 k8s (Kubernetes for Containerized Applications, HPC-Ready with OpenARK)
  - 🔎 local (Host Machine; by default)
  - 🔲 slurm (Slurm Workload Manager for HPC)
  - 🔲 terraform (Terraform by HashiCorp for Cloud Providers)
- 🚧 format (Data File Format)
  - 🔲 batch/ (Data Table, Data Catalog)
    - 🔲 delta (Delta Lake)
    - 🔲 lance (100x faster random access than Parquet)
  - ✅ stream (In-Memory, by default)
    - ✅ Dynamic type casting
    - ✅ Lazy Evaluation
    - 🔲 batch/ (Data Table, Data Catalog)
- 🚧 model (Data Schema & Metadata)
  - 🚧 builtins/ (Primitives)
    - 🔲 batch (Auto-derived by the batch format)
      - 🔲 :sql
    - ✅ binary
    - 🔎 content
      - 🔎 :prompt (LLM Prompt)
    - 🚧 doc
      - 🔲 :split
    - 🔲 embed
      - 🔲 :vector_search
    - ✅ file
    - ✅ hash (Hashable -> Storable)
    - 🔲 metadata (Nested, Unsafe, for additional description)
  - 🔲 document/ (LibreOffice, etc.)
    - 🔲 markdown
    - 🔲 tex
  - 🔲 media/ (GStreamer)
    - 🔲 audio
    - 🔲 image
    - 🔲 video
  - 🔲 ml/ (Machine Learning, not Artificial Intelligence)
    - 🔲 torch (PyTorch)
      - 🔲 eval
      - 🔲 train
  - 🔲 twin/ (Digital Twin)
    - 🔲 loc (Location)
    - 🔲 rot (Rotation)
    - 🔲 usd (OpenUSD)
- 🚧 sink (Data Visualization & Workload Automation)
  - 🚧 local/
    - 🔲 file
    - 🔲 media (GStreamer)
    - ✅ stdout
  - 🔲 twin/ (Digital Twin & Robotics)
    - 🔲 omni (NVIDIA Omniverse)
- 🚧 src (Data Source)
  - 🔲 cloud/
    - 🔲 gmail (Google Gmail)
  - 🔲 desktop/
    - 🔲 screen (Screen Capture & Recording)
  - 🚧 local/
    - 🚧 file
      - ✅ Content-based Hash
      - ✅ Lazy Evaluation
      - 🔲 Metadata-based Hash
    - ✅ stdin
  - 🔲 ml/ (Machine Learning Models & Datasets)
    - 🔲 huggingface (Hugging Face Models & Datasets)
    - 🔲 kaggle (Kaggle Datasets)
  - 🔲 monitoring/ (Time series database, etc.)
  - 🔲 rtls/ (Real-Time Location System)
    - 🔲 sewio (Sewio UWB)
  - 🔲 twin/ (Digital Twin)
    - 🔲 omni (NVIDIA Omniverse)
- 🚧 store (Object Store, Cacheable)
  - 🔲 cdl (Connected Data Lake)
  - 🔲 cloud/
    - 🔲 gdrive (Google Drive)
    - 🔲 s3 (Amazon S3)
  - ✅ local (FileSystem)
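The src/local/file entry above advertises a content-based hash, which is what makes items hashable, storable, and cacheable. A minimal sketch of that idea, using hypothetical names (not XLake's actual implementation):

```python
import hashlib
from pathlib import Path


def content_hash(data: bytes) -> str:
    # Content-based hash: identical bytes always map to the same key,
    # so a store can deduplicate items and serve cached results.
    return hashlib.sha256(data).hexdigest()


class LocalStore:
    """Toy filesystem-backed object store keyed by content hash."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        key = content_hash(data)
        path = self.root / key
        if not path.exists():  # already cached -> skip the write
            path.write_bytes(data)
        return key

    def get(self, key: str) -> bytes:
        return (self.root / key).read_bytes()
```

Because the key depends only on the bytes, re-running a pipeline over unchanged input hits the cache instead of recomputing.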
| Type | How to read |
|---|---|
| Status | ✅ Yes, 🚧 WIP, 🔎 TBA, 🔲 TBD |
- 🔎 API
  - 🔲 Python
  - 🔎 Rust
- 🚧 CLI
  - ✅ Command-line arguments (GStreamer-like Inline Pipeline)
  - 🔎 Container images
  - 🔲 YAML templates
- 🔎 Web UI
  - 🔎 Backend
  - 🔲 Frontend
    - 🔲 Cluster Management
    - 🔲 Dashboard
    - 🔲 Graph-based Pipeline Visualization
    - 🔲 Interactive Pipeline Composition
      - 🔲 Run & Stop
      - 🔲 Save as YAML templates
    - 🔲 Job Scheduling
    - 🔲 Storage Management
- 🔲 Helm Chart
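The CLI's inline pipeline syntax chains elements with `!`, GStreamer-style. As a rough illustration (not XLake's actual parser, and ignoring `!` inside quoted values for simplicity), such a string decomposes into elements and key=value properties like this:

```python
import shlex


def parse_pipeline(pipeline: str) -> list[tuple[str, dict[str, str]]]:
    """Split a GStreamer-like inline pipeline into (element, properties) pairs."""
    elements = []
    for stage in pipeline.split("!"):
        # shlex honours the single quotes used around property values.
        name, *props = shlex.split(stage)
        elements.append((name, dict(p.split("=", 1) for p in props)))
    return elements
```

For example, `"filesrc path='a.pdf' ! stdoutsink"` yields two stages, the first carrying a `path` property.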
```bash
# Install essential packages
sudo apt-get update && sudo apt-get install \
    default-jre \
    libreoffice-java-common \
    rustup

# Install the latest stable rustc
rustup default stable
```
Change the file path and the store type to your preferred ones.
```bash
cargo run --release -- xlake "filesrc path='my_file.pdf'
    ! localstore path='my_cache_dir'
    ! stdoutsink"
```
```bash
cargo run --release -- xlake "gmailsrc k=10
    ! localstore
    ! doc:split to=paragraph
    ! doc:embed embeddings=openai
    ! localstore
    ! embed:vector_search query='my query' k=5
    ! content:prompt prompt='Summarize the email contents in bullets'
    ! stdoutsink"
```
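Conceptually, the pipeline above splits documents into paragraphs, embeds them, retrieves the top-k matches for the query, and hands them to an LLM prompt. A self-contained sketch of the retrieval step, substituting a toy bag-of-words embedding for a real embedding model (illustrative only):

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    # Toy 'embedding': bag-of-words term counts
    # (a stand-in for real embeddings such as OpenAI's).
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) \
        * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def vector_search(query: str, paragraphs: list[str], k: int = 5) -> list[str]:
    # doc:split -> doc:embed -> embed:vector_search, in miniature:
    # rank every paragraph by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(paragraphs, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]
```

The retrieved paragraphs would then be interpolated into the prompt, which is what content:prompt does with the search results.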
```bash
cargo run --release -- xlake "emptysrc
    ! content:prompt prompt='Which is better: coke zero vs normal coke'
    ! stdoutsink"
```
```bash
docker run --rm quay.io/ulagbulag/xlake:latest "emptysrc
    ! content:prompt prompt='Which is better: coke zero vs normal coke'
    ! stdoutsink"
```
Licensed under either of Apache License, Version 2.0 or MIT license at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in XLake by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.