- This is a curated list of delightful resources for everything you need to develop Machine Learning solutions.
- All resources are structured as follows: [Content level] [Page title] - [Description] ([Reading time]).
- There are three content levels:
- 🐥 Essential reading for all ML engineers
- 🐍 Advanced reading for professional ML engineers
- 🦄 Expert material for expert ML engineers
- Descriptions are written so that they complete the sentence "After reading this article you will have learned [to] ...".
- 🐥 BLUF: The Military Standard That Can Make Your Writing More Powerful - Make your communication more powerful. (5 min)
- 🐥 The XY Problem - Focus on explaining your end goal when asking for help. (5 min)
- 🐍 Understanding MECE - Write structured lists in your documents and communication. (10 min)
- 🦄 Nonviolent communication - Deliver constructive feedback in difficult situations. (10 min)
- 🐍 SMART criteria - Define goals in a structured way. (5 min)
- 🐥 Somebody else's problem - Don't make it SEP. (1 min)
- 🦄 The Halo effect - Account for the cognitive bias that might influence the way you view others. (10 min)
- 🐍 SCQA: What is it, how does it work, and how can it help me? - Structure your presentation, proposals, and sales outlines. (5 min)
- 🐥 E-mail like a boss - Write better e-mails. (1 min)
- 🦄 Mythical Man Month - The Cliff Notes - Understand the relationship between person-days and throughput time in a project. (5 min)
- 🐥 Bike-shedding: how mature are you as an engineer? - Call out and avoid bike-shedding. (5 min)
- 🐥 Presentation Rules - Make your slides satisfy essential best practices. (5 min)
- 🦄 Four-sides model - Carefully consider how you communicate to optimize its result. (15 min)
- 🐥 How to write in plain English - Write in plain English. (15 min)
- 🐍 No More Misunderstandings - Avoid misunderstandings by paraphrasing. (15 min)
- 🐍 FastAPI docs - Build RESTful APIs that correspond one-to-one with the OpenAPI spec. (1-X hours)
- 🦄 Zalando's RESTful API guidelines - Design sustainable REST APIs. (X hours)
- 🦄 Microsoft's REST API guidelines - Design sustainable REST APIs. (X hours)
- 🦄 gRPC compared to REST - Compare the two leading solutions for communication between services. (5 min)
- 🦄 HTTP response headers for the responsible developer - Optimize your APIs with HTTP headers. (2 min)
- 🐍 Falsehoods programmers believe about time - Avoid most of the assumptions made about time. (2 min)
- 🐍 Falsehoods programmers believe about names - Avoid most of the assumptions made about personal names. (5 min)
- 🐥 Semantic versioning - Assign and increment version numbers of your software. (10 min)
- 🦄 Keep a changelog - Keep a well-maintained changelog in your software. (10 min)
- 🐍 Learn Git Branching - Work on your version control skills at beginner or advanced level. (1 hour)
- 🐥 The seven rules of a great Git commit message - Write concise and consistent Git commit messages. (5 min)
- 🐍 Trunk Based Development - Use a simple branching approach that scales well to teams. (10 min)
- 🐍 Google's "How to do a code review" - Review code in a way your colleagues will love (Note: Change List ~= Pull Request). (30 min)
- 🐍 Code Health: Respectful Reviews == Useful Reviews - Resolve code review comments respectfully. (5 min)
- 🐥 The Definitive Guide to Python import Statements - Resolve common importing problems. (15 min)
- 🐍 PEP8 style guide, and why is it important? - What PEPs are and what PEP8 is. (5 min)
- 🐍 PEP20 "The Zen of Python" - Get to know the guiding principles for Python's design. (1 min)
- 🦄 Python Design Patterns - High-level software engineering architecture patterns in Python. (30 min)
- 🦄 SOLID - High-level software engineering architecture principles. (30 min)
- 🦄 The Little Book of Python Anti-Patterns - Low-level Python idioms. (45 min)
- 🐍 Understanding Python's logging module - Use the `logging` module effectively. (10 min)
- 🐍 Do not log - What you should be doing instead of logging. (15 min)
- 🐍 Please fix your decorators - Why you should probably use `wrapt` to write your decorators. (10 min)
- 🦄 Effective Python - Apply 59 ways to write better Python. (X hours)
- 🐥 Python Type Hints - How to apply both basic and advanced type hints. (5 min)
- 🐥 The state of type hints in Python - Why you should be using type hints. (5 min)
- 🐥 Leveraging type system to avoid mistakes - More motivation why you should be using type hints. (5 min)
- 🦄 Mypy protocols - Use advanced concepts such as Protocols. (15 min)
- 🐍 Pydantic overview - Stop writing `Dict[str, Any]` type hints and instead use `BaseModel`s. (10 min - 1 hour)
- 🐍 Enums - Stop writing magic `str`s and instead use `Enum`s. (5 min)
- 🐥 Black: The Uncompromising Code Formatter - Use Black to end all formatting discussions. (5 min)
- 🐍 bump2version - Release new versions of your packages with a single command. (15 min)
- 🐍 coloredlogs: Colored terminal output for Python's logging module - Scan logs more easily by coloring them. (1 min)
- 🐍 hvPlot: A high-level plotting API - Use pandas to create plots with HoloViews, rendered by Bokeh. (30 min)
- 🐥 Flake8 - Use Flake8: pyflakes for common errors, pycodestyle for PEP8 compliance, and mccabe for code complexity. (10 min)
- 🐍 Portray: Your Project with Great Documentation - Generate documentation for your projects with no configuration. (30 min)
- 🐍 pydocstyle - Use pydocstyle to check compliance with Python docstring conventions. (5 min)
- 🐍 birdseye - Graphically debug your Python code. (10 min)
- 🦄 hypothesis-auto - Write fully automatic unit tests based on type annotations. (30 min)
- 🐍 scalene: a high-performance CPU and memory profiler for Python - Profile CPU and memory usage by line in Python. (10 min)
- 🐍 SnakeViz - Use an interactive profiler in Jupyter Lab to identify bottlenecks. (10 min)
- 🐍 tqdm - Easily add progress bars to long-running jobs. (5 min)
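
As a rough illustration of the type hint, Protocol, and Enum entries above, here is a minimal standard-library sketch. The names `Level`, `Resource`, `Article`, and `describe` are hypothetical and chosen only for this example; they do not come from any of the linked articles.

```python
from enum import Enum
from typing import Protocol


class Level(str, Enum):
    """Content levels as an Enum instead of magic strings (hypothetical names)."""

    ESSENTIAL = "essential"
    ADVANCED = "advanced"
    EXPERT = "expert"


class Resource(Protocol):
    """Structural type: any object with a title and a level qualifies."""

    title: str
    level: Level


class Article:
    """Plain class that satisfies Resource without inheriting from it."""

    def __init__(self, title: str, level: Level) -> None:
        self.title = title
        self.level = level


def describe(resource: Resource) -> str:
    """Format a resource; a type checker such as Mypy verifies the argument's shape."""
    return f"[{resource.level.value}] {resource.title}"


article = Article("Semantic versioning", Level.ESSENTIAL)
print(describe(article))
```

Because `Level` is an `Enum`, a typo such as `Level.ESENTIAL` fails immediately with an `AttributeError` instead of silently passing around a wrong string, and the `Protocol` lets a type checker accept any class with the right attributes.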
While, in theory, you can just download TensorFlow and start making deep neural networks, it doesn’t hurt to know some of the theory and philosophy that lie behind the algorithms that so many of us know and love today.
- Learning Machine Learning: An online comic from Google AI - Understand the basics of supervised and unsupervised learning. (15 min)
- Rules of Machine Learning by Google
- Bias and variance - Distinguish between different types of prediction error. (5 min)
- Bias and variance and the .632 rule - Balance bias and variance when bootstrapping. (10 min)
- Generalization performance & model selection, nested cross-validation - Use best practices for cross-validation. (10 min)
- Stacking strategies with and without leaks - Choose the right cross-validation strategy when stacking. (15 min)
- Backpropagation is the chain rule to compute the gradient - Make the connection between backpropagation and the chain rule. (20 min)
- Backprop is not just the chain rule - Make the connection between backpropagation and Lagrange multipliers. (20 min)
- You're all calculating churn rates wrong - Correctly define what churn is. (15 min)
- 🐍 Modern Pandas series (Part 1 - 7) - Write idiomatic pandas. (1 hour)
- Custom Estimators - Create your own custom estimator. (20 min)
- Pipelines - Combine transformers and estimators into pipelines. (15 min)
- Pipelines and custom Estimators
- Tuning hyperparameters - Implement grid search and randomized search for parameter optimization. (10 min)
- 🐍 invoke - Implement common tasks you run on your projects as a CLI. (30 min)
- Understanding Conda and Pip - Know the advantages of Conda over Pip. (5 min)
- Conda tutorial - Manage packages and reproducible environments using one tool. (15 min)
- Conda package index - Search for packages in Anaconda Cloud. (1 min)
- Conda myths - Debunk some common myths and misconceptions about Conda. (5 min)
- Conda in-depth
- Docker getting started
- Dockerfile best practices - Build efficient images. (30 min)
- Dockerizing Python is hard
- Multi-stage builds #3: Why your build is surprisingly slow, and how to speed it up
- BuildKit Features You Might Want to Know About
- ZeroMQ: a socket library with message queue primitives
- Redis: a key-value store with optional persistence
- RabbitMQ: a message broker with persistence
- Kafka is the opposite of RabbitMQ with "smart consumers" and a "dumb broker"
- What Every Software Engineer Should Know about Apache Kafka
- The Missing Semester of Your CS Education - A collection of skills that are often expected to be self-taught.
- A Survey of Deep Learning for Scientific Discovery - An overview of Deep Learning tasks and approaches.
- Flake8 extensions - An overview of Flake8 extensions.
- TODO: Mypy strict mode.
- TODO: Raymond Hettinger
- TODO: Gridsearch vs random search vs Bayesian hyperparam optimization (gaussian processes)
- TODO: Comparison of Bayesian hyperparam optimizers (PyGPGO)
- TODO: conda vs virtualenv, pyenv, pipenv.
- TODO: explain how conda-forge works.
- TODO: explain registries (Docker Hub, ECR, GitLab)
- TODO: explain environment.yml + interactions with Docker.
- TODO: S3, DynamoDB, MongoDB
- TODO: CVE scans (frontend and backend)
- TODO: OSS license scan
- TODO: mutual TLS, IP whitelisting, (VPN)
- TODO: Kinesis streams
- TODO: Linting built-in to Terraform with `-check`.
- TODO: Tech in our pure cookiecutter scaffolding.
- TODO: Cherry picking?
- TODO: MLOps
- TODO: KISS, DRY
- TODO: mamba
- TODO: pre-commit
- TODO: selected Flake8 extensions
- TODO: selected Pytest extensions
- TODO: cookiecutter & cruft (as a standalone repo?)
- TODO: https://pypi.org/project/snoop/
- TODO: Slack etiquette
Radix is a Belgium-based Machine Learning company.
We invent, design and develop AI-powered software. Together with our clients, we identify which problems within organizations can be solved with AI, demonstrating the value of Artificial Intelligence for each problem.
Our team is constantly looking for novel and better-performing solutions and we challenge each other to come up with the best ideas for our clients and our company.
Here are some examples of what we do with Machine Learning, the technology behind AI:
- Help job seekers find great jobs that match their expectations. On the Belgian Public Employment Service website, you can find our job recommendations based on your CV alone.
- Help hospitals save time. We extract diagnoses from patient discharge letters.
- Help publishers estimate their impact by detecting copycat articles.
We work hard and we have fun together. We foster a culture of collaboration, where each team member feels supported when taking on a challenge, and trusted when taking on responsibility.