-
llm-driven-data-engineering Public
Forked from DataExpert-io/llm-driven-data-engineering
This is a public repository to go over all the LLM-driven data engineering concepts.
Python Updated Oct 24, 2024 -
cuallee Public
Forked from canimus/cuallee
Possibly the fastest DataFrame-agnostic quality check library in town.
Python Apache License 2.0 Updated Aug 19, 2024 -
llm-app Public
Forked from pathwaycom/llm-app
Dynamic RAG for enterprise. Ready to run with Docker, ⚡ in sync with Sharepoint, Google Drive, S3, Kafka, PostgreSQL, real-time data APIs, and more.
MIT License Updated Jul 11, 2024 -
llm-twin-course Public
Forked from decodingml/llm-twin-course
🤖 Learn for free how to build an end-to-end production-ready LLM & RAG system using LLMOps best practices: ~ source code + 11 hands-on lessons
Python MIT License Updated Jul 5, 2024 -
delta-examples Public
Forked from delta-io/delta-examples
Delta Lake examples
Jupyter Notebook Apache License 2.0 Updated Jun 13, 2024 -
phoenix-spark-connector-demo Public
Forked from Abhey/phoenix-spark-connector-demo
This repository hosts code that demos the Apache Phoenix connector for Apache Spark.
Jupyter Notebook GNU General Public License v3.0 Updated May 29, 2024 -
mlops-zoomcamp Public
Forked from DataTalksClub/mlops-zoomcamp
Free MLOps course from DataTalks.Club
Jupyter Notebook Updated May 22, 2024 -
Tech_Product_Reviews Public
Forked from Hrithik-Kumar/Tech_Product_Reviews
Jupyter Notebook Updated May 9, 2024 -
databricks-playground Public
Forked from alexott/databricks-playground
Code samples, etc. for Databricks
Python Updated May 8, 2024 -
quix-streams Public
Forked from quixio/quix-streams
Quix Streams - A library for data streaming and Python Stream Processing
Python Apache License 2.0 Updated May 5, 2024 -
posts Public
Forked from jaumpedro214/posts
A list of all my posts and personal projects
Jupyter Notebook Updated May 1, 2024 -
system-design Public
Forked from systemdesign42/system-design
A resource to help you learn system design.
Other Updated Apr 23, 2024 -
Data_engineering_project_electric_vehicle_population Public
Forked from bkglobal/Data_engineering_project_electric_vehicle_population
Python Updated Apr 19, 2024 -
spark-minio-delta-lakehouse-docker Public
Forked from kemonoske/spark-minio-delta-lakehouse-docker
A minimal Docker Compose setup for experimenting with cloud-agnostic Lakehouse architectures: Apache Spark with Hive Metastore + Delta Lake + MinIO
Shell Apache License 2.0 Updated Apr 17, 2024 -
spark-playground Public
Forked from alexott/spark-playground
Playing with different packages for Apache Spark
Scala Updated Mar 23, 2024 -
Banking-Database-Design-using-PostgreSQL-and-pgAdmin Public
Forked from wonderakwei/Banking-Database-Design-using-PostgreSQL-and-pgAdmin
Designed an efficient banking database using PostgreSQL and pgAdmin. Focused on optimizing transaction management, including key aspects like data modeling, database architecture, security, and perfo…
MIT License Updated Mar 1, 2024 -
DataPlatform Public
Forked from jaisingh12/DataPlatform
PySpark data platform
Python Updated Feb 16, 2024 -
terraform-databricks-examples Public
Forked from databricks/terraform-databricks-examples
Examples of using Terraform to deploy Databricks resources
HCL Other Updated Feb 8, 2024 -
github-actions-aws-terraform Public
Forked from harishkannarao/github-actions-aws-terraform
Repository to practise Infrastructure-as-Code (IaC) with GitHub Actions, AWS and Terraform
HCL Updated Feb 7, 2024 -
population-dashboard Public template
Forked from dataprofessor/population-dashboard
A dashboard web app template built in Python using Streamlit.
Jupyter Notebook Updated Feb 2, 2024 -
azure-data-labs-modules Public
Forked from Azure/azure-data-labs-modules
A list of Terraform modules to build your Azure Data IaC templates.
-
spark_playground Public
Forked from experientlabs/spark_playground
Python Apache License 2.0 Updated Jan 15, 2024 -
docker-hadoop-spark Public
Forked from Marcel-Jan/docker-hadoop-spark
Multi-container environment with Hadoop, Spark and Hive
Shell Updated Jan 6, 2024 -
generative-ai Public
Forked from GoogleCloudPlatform/generative-ai
Sample code and notebooks for Generative AI on Google Cloud
Jupyter Notebook Apache License 2.0 Updated Dec 14, 2023 -
lasagna Public
Forked from gmrqs/lasagna
A Docker Compose template that builds an interactive development environment for PySpark with Jupyter Lab, MinIO as object storage, Hive Metastore, Trino and Kafka
Jupyter Notebook Updated Dec 8, 2023 -
data-anonymizer Public
Forked from Vishwamitra/data-anonymizer
DataAnonymizer is an open-source personal data anonymization tool designed for GDPR compliance
TypeScript Apache License 2.0 Updated Dec 1, 2023 -
databricks-terraform-deployments Public
Forked from afsana-afzal/databricks-terraform-e2e-examples
HCL Apache License 2.0 Updated Nov 13, 2023