ahmadMuhammadGd/README.md

hey there 👋

I'm Ahmad, an Egyptian data engineer passionate about building efficient, reliable, and scalable data pipelines.


💼 Data Engineering Stack

| Category | Tools/Technologies |
|---|---|
| 🚀 Big Data Frameworks | PySpark |
| 📦 Data Storage and Management | Iceberg, MinIO, Nessie |
| 🔄 Workflow Orchestration | Airflow, SSIS |
| ✔️ Data Quality | Soda, dbt, Regex for failure detection |
| 🔧 Data Transformation | dbt (Data Build Tool), SQL, Jinja templating |
| 🔗 Version Control for Data | Implementing branching and versioning with Nessie |
| 📄 File Formats | Parquet, CSV, JSON, YAML |
| 🔁 CI/CD | GitHub Actions, act |
| 🐳 Containerization | Docker, Docker-Compose |
| 🧪 Testing | Python UnitTest, dbt unit tests, Soda quality tests, dbt data tests |
| 🏗️ Data Modeling | Kimball Approach, Data Vaults |
| 💻 Programming Languages | Python, JS, SH |

🛠️ Projects and Tools I Work With

| ⚙️ ETL Pipelines | 🤖 Orchestration and Automation |
|---|---|
| 🧊 Loading and Partitioning | 🌐 Orchestrating remote Spark jobs |
| ☁️ Object Storage Integration | 🛠️ Custom Airflow Operators via SSH |
| 🐳 Environment Orchestration | ⏱️ Data-Aware Scheduling |
Airflow · Docker · dbt · PostgreSQL · Python · Spark · MySQL · Iceberg · JavaScript · Bash · Debian

🧠 Core Principles

👾👾👾👾👾👾👾👾👾👾👾👾👾👾
📟 Precision Over Convenience
Efficiency First
🔋 Collaboration is Key
🧩 Modularity and Reusability

🎯 Future Goals

  • Deepen my knowledge of dbt and evaluate its potential against custom SQL workflows.
  • Continue refining incremental load strategies to support real-time analytics.
  • Explore advanced lakehouse concepts and cutting-edge tools.

🀝 Let's Connect!

I’m always open to learning and collaborating. If you’re working on an interesting data engineering project, I’d love to discuss and exchange ideas. Let’s build something amazing together!


Pinned

  1. Fraud-Detection-Data-LakeHouse

    Python 8 6

  2. Data-Quality-with-Nessie

    Python

  3. northwind-dbt

    An example dbt CI/CD data pipeline.

    Jupyter Notebook 1

  4. SALES_DWH_AIRFLOW

    Testing a data warehouse (DWH) implementation using MySQL.

    Python

  5. SQLify-SQL-inside-Google-Apps-Script

    JavaScript 1

  6. Pandas-Data-Frame-Corrupter-For-Data-Pipeline-Tests

    Python